OpenAI’s Whisper is used as a transcription tool by doctors and hospitals. It should come as no surprise that it sometimes hallucinates.
As discussed in The Verge (Hospitals use a transcription tool powered by a hallucination-prone OpenAI model, written by Wes Davis and available here), researchers cited by ABC News have found that OpenAI’s Whisper, which powers a transcription tool many hospitals use, sometimes just makes things up entirely.
Whisper is used by a company called Nabla for a medical transcription tool that it estimates has transcribed 7 million medical conversations, according to ABC News. More than 30,000 clinicians and 40 health systems use it, the outlet writes. Nabla is reportedly aware that Whisper can hallucinate, and is “addressing the problem.”
Not sure exactly what that means.
A group of researchers from Cornell University, the University of Washington, and others found in a study that Whisper hallucinated in about 1 percent of transcriptions, making up entire sentences, sometimes with violent sentiments or nonsensical phrases, during silences in recordings. The researchers, who gathered audio samples from TalkBank’s AphasiaBank as part of the study, note that silence is particularly common when someone with a language disorder called aphasia is speaking.
One of the researchers, Allison Koenecke of Cornell University, posted examples of these hallucinations in a thread about the study.
The researchers found that hallucinations also included invented medical conditions or phrases you might expect from a YouTube video, such as “Thank you for watching!” (OpenAI reportedly used Whisper to transcribe over a million hours of YouTube videos to train GPT-4.)
Transcription tools, even AI-based tools, are far from perfect, but at least they provide a reasonable attempt to capture what was actually said. Making things up through hallucinations is another matter. Regardless, automated transcriptions (whether AI-generated or not) require a lot of QC review on the backend. That’s even more important in the medical field, where the stakes are as high as they get.
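For readers curious what that kind of backend QC might look like in practice, here is a minimal sketch using the open-source whisper Python package. Whisper reports, for each transcript segment, its own estimate of whether the audio contained speech and how confident it was in the words it produced; flagging segments with weak scores is one way to prioritize what a human reviewer listens to first. The thresholds and file name below are illustrative assumptions, not values from the study or from Nabla’s product.

```python
# Minimal sketch: flag Whisper transcript segments that may warrant human review.
# Assumes the open-source "openai-whisper" package (pip install openai-whisper);
# thresholds and the audio file name are illustrative, not taken from the study.
import whisper

NO_SPEECH_THRESHOLD = 0.5    # model's probability that the segment contains no speech
AVG_LOGPROB_THRESHOLD = -1.0  # low average log-probability suggests low confidence

model = whisper.load_model("base")
result = model.transcribe("patient_visit.mp3")  # hypothetical audio file

for seg in result["segments"]:
    suspicious = (
        seg["no_speech_prob"] > NO_SPEECH_THRESHOLD
        or seg["avg_logprob"] < AVG_LOGPROB_THRESHOLD
    )
    flag = "REVIEW" if suspicious else "ok"
    print(f"[{flag}] {seg['start']:6.1f}s-{seg['end']:6.1f}s  {seg['text'].strip()}")
```

None of this replaces a human checking the transcript against the audio; it simply helps decide which passages get that scrutiny first.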
So, what do you think? Are you concerned that medical professionals are relying on transcriptions from OpenAI’s Whisper? Please share any comments you might have, or let me know if you’d like to know more about a particular topic.
Image created using GPT-4o’s Image Creator Powered by DALL-E, using the term “robot whispering in the ear of another robot”.
Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by my employer, my partners or my clients. eDiscovery Today is made available solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Today should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.