
Decoding AI Hallucinations: Understanding, Preventing, and Advancing

AI is rapidly becoming an integral part of our daily lives, driving innovations in everything from healthcare to finance. But there’s a perplexing phenomenon lurking beneath the surface: AI hallucinations.

AI systems have made strange predictions, generated fake news, insulted users, and professed their love. Given the headlines, we obviously have an emotional reaction to hallucinations. But is that really fair?

Are they just doing what they’re trained to do? And if that’s the case, how can we limit hallucination behavior?

To learn how to prevent AI hallucinations, we have to understand what they are, and how training large language models influences their behavior.

What Are AI Hallucinations?

Hallucinations are fabricated or incorrect outputs from large language models and other AI systems, such as image generators and recognition software. For example, you may have seen ChatGPT invent a historical event or provide a fictional biography of a real person. Or you may have used DALL·E to generate an image that includes most of the elements you asked for but adds random new elements you never requested.

You might’ve experienced hallucinations with AI-based translation services, too, such as translating a description of a famous cultural festival into a completely different fictional one. So, what’s driving this odd behavior?

Hallucinations Are a Result of How Large Language Models Learn

To understand why hallucinations happen, we have to understand how AI models learn. The starting point for most LLM learning is ingestion. Engineers feed models enormous amounts of text, trillions of words drawn from Wikipedia, novels, encyclopedias, and webpages. These sources serve as semantic and syntactic examples for the LLM, showing what words mean and how to arrange them properly in a sentence.

During training, the model is repeatedly asked to predict a withheld word, either a word masked out of a sentence or the next word in a sequence, based on the patterns it has already seen. This is why, for the most part, LLMs like ChatGPT work so well. You ask a question, and the model serves up a semantically and syntactically reasonable answer. It has done this millions of times over.
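To make this concrete, here is a minimal sketch of that prediction step, assuming the Hugging Face transformers library and the small open gpt2 model (chosen purely for illustration, not what ChatGPT actually runs). Notice that nothing in the process checks whether a continuation is true; the model only ranks what is statistically plausible.

    # pip install transformers torch
    from transformers import pipeline

    # Load a small, openly available language model as a stand-in for a production LLM.
    generator = pipeline("text-generation", model="gpt2")

    prompt = "The first person to walk on the Moon was"

    # Sample a few continuations. The model picks words that are statistically plausible
    # given its training text, not words it has verified to be factually correct.
    completions = generator(prompt, max_new_tokens=10, num_return_sequences=3, do_sample=True)

    for c in completions:
        print(c["generated_text"])

Run this a few times and you will see fluent continuations that may or may not be accurate, which is exactly the behavior we label a hallucination.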

But here's the catch: LLMs are not scored on factual accuracy. That means the answer you get from ChatGPT or Bard or Gemini could be semantically and syntactically reasonable and still completely inaccurate. All the LLM knows how to do is generate text that is plausible in the context of what it has learned.

So hallucinations aren't a sign of sentience or something we should take offense at; they're to be expected.

When we have an emotional reaction to a hallucination, we are really treating these models unfairly—they haven’t been trained to do what we’re expecting them to do, which is to produce factually accurate replies. It’d be like reprimanding an 11-year-old for crashing a car when they’ve never driven before.

Risks Posed by AI Hallucinations

While we shouldn't judge AI models by their hallucinations, we should take the hallucinations themselves seriously. Depending on the context, they can:

Erode Trust

When users encounter nonsensical outputs, their confidence in AI technology diminishes. This loss of trust can hinder the adoption of AI solutions in critical areas such as healthcare, finance, and customer service, where integrity is paramount.

For example, hallucinations could:

  • Mislead researchers by identifying wildlife that’s not actually present in a dense forest.
  • Misdiagnose a patient’s condition or share incorrect treatment recommendations, leading to severe health complications.
  • Misinterpret sensor data, forcing a self-driving car to make incorrect navigation decisions that cause life-threatening accidents.
  • Make erroneous financial predictions that cause significant financial losses and market instability.

Have Ethical and Legal Implications

AI hallucinations contribute to the spread of misinformation and disinformation. In the age of digital media, AI-generated content can easily be mistaken for accurate information, leading to the proliferation of false narratives. This is particularly problematic in areas like news, social media, and online platforms, where misinformation can influence public opinion and behavior.

At work, hallucinations can lead to legal disputes and complicate regulatory compliance, particularly if biased or unfair decisions are made based on hallucinated data. This could potentially exacerbate issues of discrimination and inequality.

Disrupt Operations

Incorporating AI systems into our workplace processes often improves overall company efficiency and employee productivity. But AI hallucinations have the opposite effect, interrupting operations and causing delays. In supply chain management, for example, hallucinations in demand forecasting can result in overstocking or stockouts, impacting the entire supply chain.

2 Ways To Combat AI Hallucinations

To decrease AI hallucinations, you have to change your model’s approach. And there are two ways to do it:

1. Adopt a Retrieval-Augmented Generation-Supported Architecture

The number one way to combat hallucination is to take the work of gathering and assessing facts away from the language model and hand it to a retrieval component designed with accuracy in mind. That's where retrieval-augmented generation (RAG) comes into play.

RAG first retrieves the most relevant documents from a curated knowledge base, using ranking techniques like BM25 or dense (embedding-based) retrieval, and then passes that retrieved context to the generative model along with the user's question. Because the answer is grounded in retrieved source material rather than whatever the model happens to remember, the responses you get are coherent, contextually appropriate, more precise, and far easier to verify.

If we keep the pieces most LLMs are good at—understanding language and responding in syntactically correct ways—and we replace the factual step with RAG, we get the best of both worlds.
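Here is a minimal sketch of that division of labor. It assumes the open-source rank_bm25 package for the retrieval step, and it leaves the final generation call as a commented-out placeholder (answer_with_llm), since any instruction-following LLM could fill that role.

    # pip install rank_bm25
    from rank_bm25 import BM25Okapi

    # A tiny stand-in for a trusted knowledge base.
    documents = [
        "The Journal of the American Academy of Dermatology publishes peer-reviewed research.",
        "BM25 is a classic keyword-based ranking function used in search engines.",
        "Retrieval-augmented generation grounds an LLM's answer in retrieved documents.",
    ]

    # Index the documents with BM25 (simple whitespace tokenization, for illustration only).
    bm25 = BM25Okapi([doc.lower().split() for doc in documents])

    def build_prompt(question: str, n: int = 2) -> str:
        """Retrieve the n most relevant documents and fold them into the prompt."""
        context = bm25.get_top_n(question.lower().split(), documents, n=n)
        return (
            "Answer the question using ONLY the context below. "
            "If the context does not contain the answer, say you don't know.\n\n"
            "Context:\n- " + "\n- ".join(context)
            + "\n\nQuestion: " + question + "\nAnswer:"
        )

    prompt = build_prompt("What does retrieval-augmented generation do?")
    print(prompt)
    # answer_with_llm(prompt)  # hypothetical call to whatever generative model you already use

The design point is that the generator never has to remember the facts; it only has to restate what the retriever hands it, which is exactly the part LLMs are already good at.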

Perplexity AI is a great example of a product that incorporates RAG into its architecture. It's an AI-powered search engine and chatbot that uses natural language processing (NLP) and machine learning to answer user questions, and it's built to give you the facts, complete with citations to its sources.

2. Conduct Last-Mile Training With High-Quality Data

If you don’t or can’t take a RAG approach, you can create “specialty” systems from open-source models.

For example, say you're building a chatbot for dermatologists. The model's outputs not only need to be factually correct; they also need to be tailored specifically to dermatology.

When building that model, you would start from a base model that has ingested the same broad texts as any other large language model. Then you'd conduct extra last-mile training (fine-tuning) on high-quality data from sources such as the Journal of the American Academy of Dermatology, the International Journal of Dermatology, and dermatology reference texts, giving the model the specialty knowledge its domain requires. You can repeat this in any context to develop your own expert models.
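As a rough illustration of what that last-mile step can look like, here is a sketch using the Hugging Face transformers Trainer. The base model gpt2 and the file dermatology_corpus.txt are placeholders for whichever open-source model and licensed, high-quality domain text you actually use.

    # pip install transformers datasets torch
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    BASE_MODEL = "gpt2"                # placeholder: swap in your preferred open-source LLM
    CORPUS = "dermatology_corpus.txt"  # placeholder: your curated domain text

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

    # Load the domain corpus as plain text and tokenize it.
    dataset = load_dataset("text", data_files={"train": CORPUS})
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True,
        remove_columns=["text"],
    )

    # Standard causal language modeling: the model keeps predicting the next word,
    # but now on specialty text, which pulls its outputs toward the domain.
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="dermatology-llm",
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=tokenized["train"],
        data_collator=collator,
    )
    trainer.train()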

Future-Proof Your AI Systems

Enhancing user trust and ensuring reliability involves thorough testing, ongoing monitoring, and continuous refinement of AI systems. Taking this proactive approach not only reduces the likelihood of AI hallucinations but encourages engagement and increases adoption, ultimately making your organization more productive and efficient.

But designing a repeatable process takes substantial time, thought, and effort. And it’s tough to do without expert guidance behind you.

Interested in learning more about AI hallucinations? Kevin McCall explains in the video below!

Launch is on a mission to help every company on the planet embrace AI—whether they’re just getting started or are looking to apply AI in exciting ways. Accelerate your goals by booking one of our AI Transformation Workshops today.
