Are GenAI Hallucinations Fixable? Maybe Not.: Artificial Intelligence Trends

Sure, GenAI chatbots tend to “hallucinate”. But those GenAI hallucinations are fixable, right? According to one article, maybe not.

As reported in Fortune (Tech experts are starting to doubt that ChatGPT and A.I. ‘hallucinations’ will ever go away: ‘This isn’t fixable’, written by Matt O’Brien and available here), Anthropic, ChatGPT-maker OpenAI and other major developers of AI systems known as large language models say they’re working to make them more truthful.

How long that will take — and whether they will ever be good enough to, say, safely dole out medical advice — remains to be seen.

“I don’t think that there’s any model today that doesn’t suffer from some hallucination,” said Daniela Amodei, co-founder and president of Anthropic, maker of the chatbot Claude 2.

“They’re really just sort of designed to predict the next word,” Amodei said. “And so there will be some rate at which the model does that inaccurately.”

“This isn’t fixable,” said Emily Bender, a linguistics professor and director of the University of Washington’s Computational Linguistics Laboratory. “It’s inherent in the mismatch between the technology and the proposed use cases.”

A lot is riding on the reliability of generative AI technology. The McKinsey Global Institute projects it will add the equivalent of $2.6 trillion to $4.4 trillion to the global economy. Chatbots are only one part of that frenzy, which also includes technology that can generate new images, video, music and computer code. Nearly all of the tools include some language component.

Google is already pitching a news-writing AI product to news organizations, for which accuracy is paramount. The Associated Press is also exploring use of the technology as part of a partnership with OpenAI, which is paying to use part of AP’s text archive to improve its AI systems. Maybe the chatbots will give the term “fake news” a whole new meaning? 😉

Sam Altman, the CEO of OpenAI, expressed optimism, if not an outright commitment, about addressing the hallucination issue.

“I think we will get the hallucination problem to a much, much better place,” Altman said. “I think it will take us a year and a half, two years. Something like that. But at that point we won’t still talk about these. There’s a balance between creativity and perfect accuracy, and the model will need to learn when you want one or the other.”

Even Altman, as he markets the products for a variety of uses, doesn’t count on the models to be truthful when he’s looking for information for himself.

“I probably trust the answers that come out of ChatGPT the least of anybody on Earth,” Altman told a recent gathering to laughter.

Are GenAI hallucinations fixable? Maybe not. Test and verify those results – always.

By the way, OpenAI has filed a trademark application for “GPT-5”, so they’re already working on the next iteration of their model. Does this mean that “Number 5 is alive?” 😀

So, what do you think? Are you concerned that GenAI hallucinations may not be fixable? Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by my employer, my partners or my clients. eDiscovery Today is made available solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Today should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

2 comments

  1. There is growing evidence that hallucinations happen when you ask an LLM for an answer that is based on its internal training. If the LLM doesn’t know the answer, it will sometimes make one up. And sometimes this can cause serious embarrassment, like when you file a court brief based on cases made up by GPT.

    We don’t see hallucinations when we ask GPT to base its answers on the documents and other information we submit for analysis. The LLMs seem to be more reliable in answering based on the information we provide and will regularly tell us when the information they need to form an answer isn’t present in the materials.

    It is hard to prove a negative, but that is our experience so far. In our system we provide links to the documents used as a basis for the answer. The original text is just a click away and easily verified (see the sketch after the comments for an illustration of this kind of grounded approach).

    These are growing pains for a precocious toddler just learning to walk. Have no doubt that fundamental change in legal workflow has started and will not turn back.

  2. Emily Bender has it right: this isn’t fixable. And every computational linguistics expert agrees with her. As she says, it’s inherent in the mismatch between the technology and the proposed use cases. For lawyers, an issue. But for marketing firms and the “creative side”, not an issue. And remember: Google first raised the issue waaaay back in 2018 when it released a paper that noted deep-neural-network software is exciting but it often perceives things that aren’t there. And they are the guys that developed the “T” in GPT.
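For readers who want to picture the grounded, document-based approach described in the first comment, here is a minimal, hypothetical sketch. It is not the commenter’s actual system or any vendor’s API: the Document class, the prompt wording, and the call_llm placeholder are all illustrative assumptions, with call_llm standing in for whatever chat-completion call a given provider exposes.

```python
# A minimal, hypothetical sketch of "grounded" question answering: the model is
# given only the documents we supply, is told to answer from them alone, and the
# result carries links back to the source documents for verification.
# Nothing here is a real vendor API; call_llm() is a labeled stand-in.

from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str  # identifier used for citations, e.g. "doc-001"
    url: str     # link back to the original text
    text: str    # extracted document text


def build_grounded_prompt(question: str, documents: list[Document]) -> str:
    """Assemble a prompt that restricts the model to the supplied documents."""
    sources = "\n\n".join(f"[{d.doc_id}] ({d.url})\n{d.text}" for d in documents)
    return (
        "Answer the question using ONLY the documents below. "
        "Cite the document IDs you relied on in square brackets. "
        "If the documents do not contain the answer, say exactly: "
        "'The provided documents do not contain this information.'\n\n"
        f"DOCUMENTS:\n{sources}\n\nQUESTION: {question}"
    )


def call_llm(prompt: str) -> str:
    """Stand-in for a real chat-completion call (OpenAI, Anthropic, etc.).
    Returns a canned string so the sketch is self-contained and runnable."""
    return "Example answer drawn from the supplied documents. [doc-001]"


def grounded_answer(question: str, documents: list[Document]) -> dict:
    """Return the model's answer plus the source links used for verification."""
    answer = call_llm(build_grounded_prompt(question, documents))
    cited = [d for d in documents if f"[{d.doc_id}]" in answer]
    return {
        "answer": answer,
        "sources": {d.doc_id: d.url for d in cited},  # original text one click away
    }


if __name__ == "__main__":
    docs = [Document("doc-001", "https://example.com/doc-001", "The contract text...")]
    print(grounded_answer("What does the contract say?", docs))
```

The design choice that matters here is the one the commenter highlights: the model is restricted to the supplied documents and every answer comes back with links to its sources, so verification is one click away rather than an act of faith.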
