Google's AI Overviews still lies constantly, and a 90% accuracy rate proves it

If you’ve used Google recently, you’ve probably seen AI Overviews — that Gemini-powered block of text that now sits at the top of search results before any actual links. It launched in 2024 and immediately became a punching bag for anyone who asked it anything beyond “what’s the weather.” Remember when it told people to eat glue? Yeah.

It’s gotten better since then. But “better” is a low bar when the starting point was actively dangerous.

A new analysis from The New York Times, done in collaboration with the AI startup Oumi, tried to actually measure how often AI Overviews gets things wrong. They used SimpleQA — a benchmark OpenAI released in 2024 that contains over 4,000 questions with verifiable answers — to test Google’s system. The result: AI Overviews is correct about 90 percent of the time.

That sounds decent until you think about what 10 percent means at Google’s scale.

Oumi started running these tests last year when Gemini 2.5 was the top model, and the accuracy sat at 85 percent. After the Gemini 3 update, they reran the test and got 91 percent. So yes, it’s improving. But here’s the thing nobody wants to say out loud: Google processes billions of searches every day. Even if only a small fraction of those searches show an AI Overview, a 9 percent error rate means hundreds of thousands of incorrect answers every hour. Millions per day.
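To put rough numbers on that, here is a back-of-the-envelope sketch. The daily Overview volumes are hypothetical placeholders I picked for illustration, since nothing above says how many searches actually show an AI Overview. Only the 9 percent error rate comes from the Oumi test, and the function name is mine.

```python
def wrong_answers(overviews_per_day: int, error_rate: float) -> dict:
    """Estimate incorrect answers for a given daily volume and error rate."""
    per_day = overviews_per_day * error_rate
    return {
        "per_day": per_day,
        "per_hour": per_day / 24,
        "per_minute": per_day / (24 * 60),
    }


# Error rate implied by the 91 percent accuracy result (from the article).
ERROR_RATE = 0.09

# ASSUMPTION: these daily volumes are illustrative, not reported figures.
HYPOTHETICAL_VOLUMES = (100_000_000, 1_000_000_000, 5_000_000_000)

for volume in HYPOTHETICAL_VOLUMES:
    estimate = wrong_answers(volume, ERROR_RATE)
    print(
        f"{volume:>13,} overviews/day: "
        f"{estimate['per_day']:,.0f} wrong per day, "
        f"{estimate['per_hour']:,.0f} per hour"
    )
```

At the hypothetical 100 million Overviews per day, that works out to roughly 9 million wrong answers a day, or about 375,000 an hour. Push the volume into the billions and the daily total climbs into the tens or hundreds of millions. The point is that the error rate barely matters next to the volume.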

And that’s just from a benchmark test with a little over 4,000 curated questions. Real-world queries are messier, more ambiguous, and way more likely to trip up an AI. The 90 percent figure is probably the best-case scenario.

I’ve been testing AI Overviews myself since it launched, and my experience tracks with this. It’s fine for factual stuff like “capital of France” or “who won the Super Bowl in 2020.” But ask it anything that requires nuance, recent context, or interpretation, and it starts hallucinating confidently. The worst part is how it presents everything with the same matter-of-fact tone, whether it’s right or completely made up.

What bothers me more is that Google knows this and still pushes AI Overviews as a default feature. There’s no opt-out for regular users. You have to install browser extensions or use workarounds to get the old search results back. The company is prioritizing AI adoption over accuracy, and the numbers prove it.

To be fair, 90 percent accuracy is competitive for generative AI models. GPT-4 and Claude land in similar ranges on SimpleQA. But those aren’t being served to billions of users as authoritative search results. Google has a responsibility that other AI companies don’t, and it’s not taking it seriously enough.

The Times article goes into more detail, but the headline says it all: Google’s AI is wrong roughly 10 percent of the time, and at Google’s scale, that’s a lot of lies.
