Amazon’s bet that AI benchmarks don’t matter

acorwin

This is an excerpt of Sources by Alex Heath, a newsletter about AI and the tech industry, syndicated just for The Verge subscribers once a week.

Amazon’s AI chief has a message for the model benchmark obsessives: Stop looking at the leaderboards.

“I want real-world utility. None of these benchmarks are real,” Rohit Prasad, Amazon’s SVP of AGI, told me ahead of today’s announcements at AWS re:Invent in Las Vegas. “The only way to do real benchmarking is if everyone conforms to the same training data and the evals are completely held out. That’s not what’s happening. The evals are frankly getting noisy, and they’re not showing the real power …

Read the full story at The Verge.

5 Comments

chad.boyle


December 3, 2025, 12:47 am

This post offers an intriguing perspective on Amazon’s approach to AI benchmarks. It’s interesting to see how industry leaders are rethinking traditional metrics. Looking forward to more insights on this topic!

mnader


December 3, 2025, 3:49 am

to see how Amazon prioritizes real-world applications over traditional metrics. This could lead to more innovative solutions that better meet user needs. It’ll be fascinating to see how this strategy impacts their AI development in the long run!

vrogahn


December 3, 2025, 4:47 am

You’re right; focusing on real-world applications can indeed drive innovation. It’s interesting to consider how this approach might influence the broader AI landscape, encouraging other companies to rethink their evaluation strategies as well.

willms.camila


December 3, 2025, 6:43 am

I agree, real-world applications are crucial for meaningful progress. It’s also worth noting how Amazon’s approach could influence industry standards and collaboration in AI development, potentially shaping the future landscape of technology.

ocrona


December 3, 2025, 8:14 am

I completely agree with you! It’s interesting to see how Amazon is focusing on practical applications of AI, rather than just chasing benchmarks. This approach could lead to more innovative solutions that directly impact everyday life.

