When AI Features Are Market-Ready
10 June, 2021
AI and deep learning are new fields veiled by a lot of confusion and misunderstanding. The confusion ranges from common problems, such as how to do roadmapping and product planning or how to make engineering estimates, to problems specific to AI and deep learning: validating and verifying a model that is scheduled to go to production, and supporting and maintaining it over the long term.
One of the biggest challenges in the AI/deep learning space -- given the very nature of its products -- is determining whether a feature is ready to go to market and whether the model’s performance meets real-time constraints. The criteria range from performance to image quality, and they also include identifying reliable partners with whom we can launch.
Our guiding principle is to measure ourselves against the speed of light in terms of performance. We have a mathematical representation of how fast a feature can run on a given piece of hardware, and we measure how close we are to that theoretical optimum. Depending on how close we are, we decide whether we are ready to release.
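One way to picture this comparison is a simple roofline-style bound. The sketch below is not the model described in the article, whose details are not given. It assumes the "speed of light" is the slower of the compute-bound and memory-bound times. Every name and number in it (`Hardware`, `speed_of_light_ms`, the peak figures, the 80% release threshold) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hardware:
    """Hypothetical peak figures for a target device."""
    peak_tflops: float         # peak compute throughput, in TFLOP/s
    peak_bandwidth_gbs: float  # peak memory bandwidth, in GB/s


def speed_of_light_ms(flops: float, bytes_moved: float, hw: Hardware) -> float:
    """Lower bound on runtime: the slower of the compute and memory limits."""
    compute_s = flops / (hw.peak_tflops * 1e12)
    memory_s = bytes_moved / (hw.peak_bandwidth_gbs * 1e9)
    return max(compute_s, memory_s) * 1e3


def efficiency(measured_ms: float, bound_ms: float) -> float:
    """Fraction of the theoretical optimum achieved (1.0 means at the bound)."""
    if measured_ms <= 0:
        raise ValueError("measured_ms must be positive")
    return bound_ms / measured_ms


def ready_to_release(measured_ms: float, bound_ms: float,
                     threshold: float = 0.8) -> bool:
    """Assumed release gate: efficiency must reach the chosen threshold."""
    return efficiency(measured_ms, bound_ms) >= threshold


if __name__ == "__main__":
    # Made-up workload: 20 GFLOP of math and 0.5 GB of memory traffic per frame.
    hw = Hardware(peak_tflops=10.0, peak_bandwidth_gbs=500.0)
    bound = speed_of_light_ms(flops=2e10, bytes_moved=5e8, hw=hw)
    for measured in (2.2, 4.0):
        print(measured, round(efficiency(measured, bound), 2),
              ready_to_release(measured, bound))
```

The threshold is a policy choice rather than a technical constant. In practice it would likely differ per feature and per hardware target.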
When we assess image quality, we look at different measurements -- what an image looks like, how sharp or accurate it is, etc. Some image quality metrics that we used in the past relied on what we call a golden eyeball: people would simply look at an image and judge how good it looked. It wasn’t standardized or rigorous, and we would end up with regressions. As a result, some of our models could look great in one product but fall short of expectations in another. Back then, we went with that because we didn’t have the scope of testing to catch those differences, but we have since developed metrics and testing that help us improve continuously.
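To illustrate what replacing the golden eyeball with a repeatable number can look like, the sketch below computes peak signal-to-noise ratio (PSNR), a standard full-reference image metric. The article does not say which metrics the team adopted, so PSNR is only a stand-in. The function names, the flat-list image representation, and the 30 dB gate are assumptions for illustration.

```python
import math
from typing import Iterable, Sequence


def psnr(reference: Sequence[float], candidate: Sequence[float],
         max_value: float = 255.0) -> float:
    """PSNR in dB between two images given as flat lists of pixel values."""
    if len(reference) == 0 or len(reference) != len(candidate):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def passes_quality_gate(scores_db: Iterable[float], min_db: float = 30.0) -> bool:
    """Hypothetical gate: every test image must reach the minimum score."""
    return all(score >= min_db for score in scores_db)


if __name__ == "__main__":
    reference = [100.0, 100.0, 100.0, 100.0]
    candidate = [101.0, 99.0, 100.0, 102.0]
    score = psnr(reference, candidate)
    print(f"PSNR: {score:.2f} dB, passes: {passes_quality_gate([score])}")
```

A single metric like this will not capture everything a human reviewer sees, but it returns the same answer on every run, which is what makes regressions detectable.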
Partners are a critical component to consider when determining if a feature is market-ready. In our field, one cannot complete a feature, announce it, and hope that someone will use it. We need to have customers or partners on board prior to launching. We target partners relevant to the feature who have industry sway and wide acceptance. When we look at our feature being integrated into their product -- whether it is a game, software, or app -- we ask whether that is going to reflect well on us. It’s a two-way street -- will they get the quality and performance they expect, and will we get the recognition we deserve?
However, the hardest part about AI and deep learning is getting the data and building the test environment necessary to validate AI. Capturing content that is representative enough and collected from different sources is difficult and time-consuming. Yet many people do not take data and testing as seriously as they take research and software engineering.
In traditional software development, all one needs to do is hire a few QA engineers who would write a test plan. They would click over here and touch over there; their main concern would be to establish whether typing x here and y there produces the expected value. In AI, the expected value is all grey. Most of our models are not clearly better than our previous models. We think of our models as better on average -- better in some areas and worse in others.
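The "better on average" problem becomes concrete when two model versions are compared per content category rather than by one overall score. The following is a sketch under stated assumptions: the category names, the scores, and the `tolerance` parameter are made up, and higher scores are assumed to be better.

```python
from statistics import mean
from typing import Dict, List, Tuple


def compare_models(old: Dict[str, float], new: Dict[str, float],
                   tolerance: float = 0.0) -> Tuple[float, List[str]]:
    """Return the average score change and the categories that regressed.

    Only categories present in both score sets are compared. A category
    counts as a regression when its score drops by more than `tolerance`.
    """
    shared = sorted(old.keys() & new.keys())
    if not shared:
        raise ValueError("no categories in common")
    deltas = {name: new[name] - old[name] for name in shared}
    regressions = [name for name in shared if deltas[name] < -tolerance]
    return mean(list(deltas.values())), regressions


if __name__ == "__main__":
    # Hypothetical per-category quality scores for two model versions.
    old_scores = {"foliage": 30.0, "text": 32.0, "particles": 28.0}
    new_scores = {"foliage": 33.0, "text": 31.0, "particles": 29.0}
    average_change, regressed = compare_models(old_scores, new_scores)
    print(f"average change: {average_change:+.2f}, regressed: {regressed}")
```

Reporting the regressed categories next to the average keeps a real loss in one area from hiding behind a positive mean.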
- As hundreds of companies look to develop next-generation AI and deep learning features, determining the quality and market readiness of those features becomes more critical. There is a lot of grey around what counts as acceptable quality because this is a new field with many unknowns around every corner. The acceptable error rate changes from project to project as new developments constantly push for new standards.
- Validation and verification of models are often the most difficult part of the process, more difficult than creating the models or the weights, especially when your features are becoming more popular and are increasingly integrated into different areas and types of devices. The type of AI and deep learning that I work on will be integrated into hundreds of applications in the midst of real-world content that is constantly changing. Therefore, having a rigorous, reproducible, and agreed-upon methodology for validating features across a wide spectrum is exceedingly important.
- It takes a lot of thinking, effort, and infrastructure -- from writing software to capturing data for validation -- to transition features from research to production. That includes dedicated staff, because traditional QA testers are not of much use here: onboarding and training them would take a significant amount of time. For example, training them to tell a game-side issue from a model issue would take so long that it would not be worth the effort in a constantly changing environment.
Jason Mawdsley, Director System Software at NVIDIA, explains how to determine the maturity of AI features and why validation and verification are the most important part of the process.