Google’s AMIE Tried Taking Patient Histories in a Real Clinic. Here’s What Happened.

Google’s been showing off AMIE, their conversational medical AI, for a while now. We saw it ace diagnostic challenges in simulated settings and chat with patient actors. But as anyone who’s worked in healthcare knows, there’s a chasm between a controlled demo and a busy clinic floor.

Now they’ve taken the next step. In partnership with Beth Israel Deaconess Medical Center (BIDMC), Google ran a prospective, single-center feasibility study where AMIE actually talked to real patients before their primary care visits. The results just dropped, and they’re worth a look.

The setup was straightforward but smart. Patients booked for new, non-emergency appointments got an invite to chat with AMIE via a secure web link before their in-person or telehealth visit. The AI handled the clinical history-taking, asking about symptoms, timing, severity, that sort of thing. But here’s the key: a physician watched the whole conversation over a live video call with screen sharing, ready to jump in if the AI went off the rails. Think of it as training wheels for an AI intern.

The system then generated a transcript and summary for the actual doctor to review before the appointment. That’s a workflow that could actually save time if it works right.

What I find interesting is the safety architecture. They had pre-defined criteria for when the supervising physician should intervene: not just “if something seems wrong,” but specific thresholds. That’s the kind of rigor you need when you’re dealing with real patients, not actors. The IRB-approved protocol also made sure patients knew their participation was optional and wouldn’t affect their care.

The study was single-arm, so there’s no control group comparing AI-assisted vs. standard care. That limits what we can conclude about effectiveness. But for a feasibility study, that’s fine. The goal was to see if the system could work in a real clinical environment at all, not to prove it’s better than humans.

What’s not in the press release? The messy details. How many patients declined to participate? How often did the supervisor have to step in? What did the actual diagnostic accuracy look like compared to the clinicians’ final assessments? Those numbers matter, and I hope they’re in the full paper.

This is a step in the right direction. Too many AI-in-healthcare papers stop at “it worked in a simulation.” Moving into real-world testing, even with heavy supervision, is how we learn what actually breaks. And things always break when they hit reality.

I’d like to see the next study loosen the training wheels a bit. It could have the AI run unsupervised for a subset of low-risk cases, or compare its history-taking completeness against human medical assistants. That would tell us whether the technology is ready to actually reduce clinician workload, or if it’s just another demo that looks good in a video.
