
When AI Features Are Market-Ready

Managing Expectations
Product

10 June, 2021

Jason Mawdsley, Director, System Software at NVIDIA, explains how to determine the maturity of AI features and why validation and verification are the most important part of the process.

Problem

AI and deep learning are new fields, still veiled by a lot of confusion and misunderstanding. The open questions range from common ones, such as how to do road mapping and product planning or how to make engineering estimates, to ones distinct to AI and deep learning: how to validate and verify a model that is scheduled to go to production, and how to support and maintain it over the long term.

One of the biggest challenges in the AI/deep learning space -- given the very nature of its products -- is determining whether a feature is ready to go to market and whether the model's performance meets real-time constraints. Those constraints range from performance to image quality, and readiness also means identifying reliable partners with whom we can launch.

Actions taken

Our guiding principle is to measure ourselves against the speed of light in terms of performance. We have a mathematical representation of how fast a feature could possibly run on a given piece of hardware, and we measure ourselves against that speed of light to see how close we are to the optimum. Depending on how close we are, we decide whether we are ready to release.
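As a minimal sketch of this idea, a roofline-style bound can serve as the "speed of light": the workload can run no faster than its compute or memory traffic allows on the target hardware. All numbers and names below are illustrative assumptions, not NVIDIA's actual methodology.

```python
# Sketch: compare measured runtime against a roofline-style theoretical
# minimum for a given piece of hardware. Figures are hypothetical.

def speed_of_light_ms(flops: float, bytes_moved: float,
                      peak_flops_per_s: float, peak_bytes_per_s: float) -> float:
    """Theoretical best-case runtime: the workload is bound either by
    compute throughput or by memory bandwidth, whichever dominates."""
    compute_bound_s = flops / peak_flops_per_s
    memory_bound_s = bytes_moved / peak_bytes_per_s
    return max(compute_bound_s, memory_bound_s) * 1e3  # milliseconds

# Hypothetical feature: 40 GFLOPs of work and 2 GB of traffic per frame,
# on hardware with 30 TFLOP/s peak compute and 900 GB/s bandwidth.
optimum_ms = speed_of_light_ms(40e9, 2e9, 30e12, 900e9)
measured_ms = 3.1  # from profiling the real implementation

print(f"speed of light: {optimum_ms:.2f} ms, measured: {measured_ms:.2f} ms")
print(f"running at {optimum_ms / measured_ms:.0%} of the optimum")
```

A team could then set a release bar such as "ship only above 70% of the speed of light," turning the guiding principle into a concrete gate.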

When we assess image quality, we look at different image quality measurements -- what an image looks like, how sharp or accurate it is, etc. Some image quality metrics we used in the past relied on what we call a golden eyeball: people would simply look at an image and assess how good it looked. That wasn't standardized or rigorous, and we would end up with regressions. As a result, some of our models could look great in one product but fall short of expectations in another. Back then we accepted that because we didn't have the scope of testing to catch those differences, but we have since developed metrics and tests that help us improve continuously over time.
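To illustrate how an automated metric can replace the golden eyeball, here is a small sketch using PSNR as the quality score. PSNR is purely an illustrative stand-in; the article does not say which metrics NVIDIA adopted, and the threshold below is a made-up example.

```python
# Sketch: gate releases on a measured image-quality score instead of a
# reviewer's impression. PSNR and the 35 dB threshold are assumptions.

import numpy as np

def psnr(reference: np.ndarray, candidate: np.ndarray,
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio between a reference frame and a model output."""
    mse = np.mean((reference.astype(np.float64) - candidate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

def passes_quality_gate(reference: np.ndarray, candidate: np.ndarray,
                        threshold_db: float = 35.0) -> bool:
    """Fail the build if quality drops below an agreed-upon threshold,
    rather than asking whether the image 'looks good' to someone."""
    return psnr(reference, candidate) >= threshold_db
```

Because the score and threshold are fixed, the same model output produces the same verdict on every run, which is exactly what the eyeball approach lacked.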

Partners are a critical component to consider when determining whether a feature is market-ready. Our field is such that one cannot simply complete a feature, announce it, and hope that someone will use it. We need customers or partners on board before launching. We target partners relevant to the feature who have industry sway and wide acceptance. When our feature is integrated into their product -- whether a game, software, or an app -- we consider whether that integration will reflect well on us. It's a two-way street: will they get the quality and performance they expect, and will we get the recognition we deserve?

However, the hardest part of AI and deep learning is getting the data and building up the test environment needed to validate AI. Capturing content that is representative enough, collected from different sources, is difficult and time-consuming. Yet many people do not take data and testing as seriously as they do research and software engineering.

In traditional software development, all one needs to do is hire a few QA engineers to write a test plan. They click here and touch there; their main concern is to establish that typing x here and y there produces the expected value. In AI, the expected value is all grey. Most of our models are not clearly better than our previous models; we think of a new model as better on average -- better in some areas and worse in others.
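The following sketch shows why "better on average" is not a pass/fail answer: a new model can improve the aggregate score while regressing on individual content categories. The categories and scores are made up for illustration.

```python
# Sketch: per-slice comparison of two models. An average-only test would
# ship the new model; a per-category check surfaces the regression.

old_scores = {"daytime": 0.91, "night": 0.78, "fog": 0.70, "rain": 0.82}
new_scores = {"daytime": 0.95, "night": 0.84, "fog": 0.66, "rain": 0.85}

old_avg = sum(old_scores.values()) / len(old_scores)
new_avg = sum(new_scores.values()) / len(new_scores)
print(f"average: {old_avg:.3f} -> {new_avg:.3f}")  # 0.803 -> 0.825, improves

for category in old_scores:
    delta = new_scores[category] - old_scores[category]
    flag = "REGRESSION" if delta < 0 else "ok"
    print(f"{category:8s} {delta:+.3f} {flag}")  # fog gets worse
```

A partner whose content is mostly "fog" would experience the new model as a downgrade, even though it is better on average.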

Lessons learned

  • As hundreds of companies look to develop the next generation of AI and deep learning features, determining the quality and market readiness of those features becomes more critical. There is a lot of grey around acceptable quality because this is a new field, with unknowns popping up around every corner. The acceptable error rate changes from project to project as new developments constantly push for new standards.
  • Validation and verification of models are often the most difficult part of the process, more difficult than creating the models or the weights, especially as features become more popular and are integrated into more areas and types of devices. The kind of AI and deep learning I work on will be integrated into hundreds of applications, amid real-world content that is constantly changing. Having a rigorous, reproducible, and agreed-upon methodology for validating features across a wide spectrum is therefore exceedingly important (see the sketch after this list).
  • It takes a lot of thinking, effort, and infrastructure -- from writing software to capturing data for validation -- to transition features from research to production. That includes dedicated staff, because traditional QA testers are not of much use here: their onboarding and training would take a significant amount of time. For example, training them to distinguish a game-side issue from a model issue takes so much time that it isn't worth the effort in a constantly changing environment.
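As referenced in the list above, here is a minimal sketch of what a reproducible, agreed-upon validation run could look like. The corpus layout, file format, and per-category thresholds are all hypothetical; the point is that the run is deterministic and the pass criteria are written down rather than judged by eye.

```python
# Sketch: score every clip in a fixed, versioned content corpus and check
# each category against its agreed-upon threshold. All names are assumed.

from pathlib import Path

def validate_feature(score_clip, corpus_dir: Path, thresholds: dict) -> dict:
    """score_clip: callable mapping a clip path to a quality score (the
    model under test, wrapped however the team prefers). Returns per-category
    mean scores and pass/fail verdicts."""
    results = {}
    for category_dir in sorted(p for p in corpus_dir.iterdir() if p.is_dir()):
        clips = sorted(category_dir.glob("*.mp4"))
        if not clips:
            continue  # skip empty categories rather than divide by zero
        scores = [score_clip(clip) for clip in clips]
        mean_score = sum(scores) / len(scores)
        results[category_dir.name] = {
            "mean": mean_score,
            "passed": mean_score >= thresholds[category_dir.name],
        }
    return results

# Same corpus + same thresholds -> the same verdict on every run, for every
# team, which is what makes the methodology "agreed-upon".
```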
