Why You Can’t Ignore AI Model Metrics (Even If You’re Using a Third-Party Model!)

October 29, 2024

Success metrics are everything for a product manager. They show how good your decisions were and help determine future enhancements.

During my AI Product Management capstone, my team focused on business metrics for our imaginary app—a tool that leverages GPT-4o mini from OpenAI to create personalized recipes based on ingredients users already have (through text and image input), helping reduce food waste.

Why use GPT-4o mini? Simple: it was the cheapest option for the complexity we needed.

But here’s where we slipped.

In our presentation, we proudly announced we wouldn't track AI model success metrics because we were using a third-party model. Makes sense, right?

WRONG.

It turns out that measuring the AI model's success is crucial, not to improve the model itself but to make sure we choose the right one in the first place. And this testing should happen before the app ships.

To measure the models’ success, we could:

1. Set benchmarks on core tasks such as recognizing ingredients and generating relevant recipes.

2. Run 1,000 prompts through each model on those benchmarks.

3. Score and compare models, factoring in performance and cost.
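
The three steps above can be sketched in a few lines of Python. This is only an illustration, not what my team built: the `BenchmarkCase` and `Candidate` names, the keyword-matching score, the 0.8 quality bar, and the per-prompt costs are all assumptions I made up for the example, and the two "models" are stand-in functions rather than real API calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class BenchmarkCase:
    prompt: str
    expected_ingredients: List[str]  # ground truth, labeled by hand


@dataclass
class Candidate:
    name: str
    generate: Callable[[str], str]  # would wrap the real model call
    cost_per_prompt_usd: float      # hypothetical figure, not real pricing


def score_case(output: str, case: BenchmarkCase) -> float:
    """Fraction of expected ingredients mentioned in the model output."""
    text = output.lower()
    hits = sum(1 for item in case.expected_ingredients if item.lower() in text)
    return hits / len(case.expected_ingredients)


def evaluate(candidate: Candidate, cases: List[BenchmarkCase]) -> Dict[str, float]:
    """Run every benchmark case and report average quality and total cost."""
    scores = [score_case(candidate.generate(c.prompt), c) for c in cases]
    return {
        "quality": sum(scores) / len(scores),
        "total_cost_usd": candidate.cost_per_prompt_usd * len(cases),
    }


def rank(candidates: List[Candidate], cases: List[BenchmarkCase],
         min_quality: float = 0.8) -> List[Tuple[str, Dict[str, float]]]:
    """Keep models that clear the quality bar, cheapest first."""
    results = [(c.name, evaluate(c, cases)) for c in candidates]
    passing = [r for r in results if r[1]["quality"] >= min_quality]
    return sorted(passing, key=lambda r: r[1]["total_cost_usd"])


# Two tiny benchmark cases; a real run would use many more (e.g. 1,000).
cases = [
    BenchmarkCase("I have eggs, spinach and feta.", ["eggs", "spinach", "feta"]),
    BenchmarkCase("I have rice and black beans.", ["rice", "black beans"]),
]

# Stand-in models so the sketch runs without any API key.
strong = Candidate("stub_strong", lambda p: "Recipe using: " + p, 0.002)
weak = Candidate("stub_weak", lambda p: "Try a simple omelette with eggs.", 0.001)

ranking = rank([strong, weak], cases)
for name, result in ranking:
    print(name, result)
```

Filtering on a minimum quality first and only then sorting by cost is one possible way to "factor in performance and cost". A weighted score would work too. The point is that the cheapest model only wins if it is good enough on the core tasks.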

This test score would then become the baseline for measuring the model's success after launch on an ongoing basis.
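
As a rough sketch of what that ongoing check could look like: re-run the same benchmark on a schedule and flag the model when its score drops meaningfully below the pre-launch baseline. The function name, the 0.05 tolerance, and the example scores are assumptions for illustration only.

```python
def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when the current benchmark score falls more than
    `tolerance` below the pre-launch baseline."""
    return current < baseline - tolerance


baseline_score = 0.90          # hypothetical pre-launch benchmark result
weekly_scores = [0.91, 0.88, 0.80]  # hypothetical post-launch re-runs

for week, score in enumerate(weekly_scores, start=1):
    status = "investigate" if has_regressed(baseline_score, score) else "ok"
    print(f"week {week}: {score:.2f} -> {status}")
```

A small tolerance keeps normal run-to-run variation from triggering false alarms, while a real drop (for example, after the provider updates the model) still shows up.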

Lesson learned: Business metrics are critical, but never underestimate the importance of AI model metrics—even when using third-party models.
