3 Comments

#AIhallucinations

As part of my daily trial-and-error experiment with Gen AI, I asked ChatGPT the following question:

Who wrote the song "Shir Ahava Bedou’i" (Bedouin Love Song) by David Broza?

I received an answer detailing who wrote, composed, and performed the song. Initially, I didn’t realize the answer was incorrect since I wasn’t familiar with the identity of the actual writer. Once I discovered the error, I repeated the question and received a different set of incorrect answers. When I pointed out the mistake, I received a response saying, “I apologize, the correct answer is…”—but that, too, was incorrect.

To investigate further, I repeated the question approximately 10 more times and received 10 different incorrect responses. ChatGPT seemed to spiral into a "negative feedback loop."
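This kind of loop is easy to reproduce programmatically. Below is a minimal sketch, assuming the openai Python client and an illustrative model name (neither is confirmed as what was used above); sampling the identical prompt several times and tallying the distinct answers makes the inconsistency visible.

```python
# Minimal sketch: ask the same question N times and tally the distinct
# answers. Assumes the openai Python client is installed and the
# OPENAI_API_KEY environment variable is set; the model name is
# illustrative only.
from collections import Counter

from openai import OpenAI

client = OpenAI()
QUESTION = ('Who wrote the song "Shir Ahava Bedou’i" '
            '(Bedouin Love Song) by David Broza?')

answers = Counter()
for _ in range(10):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": QUESTION}],
    )
    answers[response.choices[0].message.content.strip()] += 1

# A confident model would converge on one answer; many distinct
# answers to identical prompts suggest the model is guessing.
for answer, count in answers.most_common():
    print(f"{count}x  {answer}")
```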

For comparison, I switched to Claude AI and asked the same question. While I also received an incorrect answer there, the response to my correction was different. Claude replied with something like: “Please help me improve—can you tell me who wrote the song?” This indicated that it, too, could not provide the correct answer.

I then turned to Gemini by Google, where I received a response in a different style—this time correct. The reply was something like:

“There is some uncertainty regarding the identity of the writer, but most evidence points to [the correct name].”

It seemed to be statistically derived, yet accurate.

Note: Before composing this reflection, I asked ChatGPT the question again—and it still doesn’t provide the correct answer…


Example of a total hallucination: I asked ChatGPT whether RadLex is included in UMLS and requested evidence with links. It responded that yes, it is, and even provided two links as evidence. But the truth is, RadLex is not part of UMLS. As for the evidence, the first link was real but did not support the answer, and the second link didn’t even exist. A relatively benign hallucination, but still annoying.
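One cheap sanity check for this class of hallucination is to confirm that cited links at least resolve before trusting them. Here is a minimal sketch using the requests library; the URLs are placeholders, since the links from the original chat are not reproduced here.

```python
# Minimal sketch: check whether model-cited URLs actually resolve.
# This would catch the second, nonexistent link; note that a link can
# resolve and still fail to support the claim, as the first one did.
import requests

# Placeholder URLs; the links from the original chat are not reproduced.
cited_links = [
    "https://example.org/real-but-irrelevant",
    "https://example.org/nonexistent-evidence",
]

for url in cited_links:
    try:
        status = requests.head(url, allow_redirects=True, timeout=10).status_code
        print(f"{status}  {url}")
    except requests.RequestException as exc:
        print(f"unreachable  {url}  ({type(exc).__name__})")
```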


In my opinion, “confabulation” has been a better term for Gen AI output that is incorrect. Unfortunately, “hallucination” is the de facto term among the general population.
