When Retrieval Augmented Generation (RAG) Fails

Matt Furnari
11/25/2024

Retrieval Augmented Generation (RAG) sounds like a dream come true for anyone working with AI language models. The idea is simple: enhance models like ChatGPT with external data so they can provide answers based on information beyond their original training. Need your AI to answer questions about your company's internal documents or recent events not covered in its training data? RAG seems like the perfect solution.

But when we roll up our sleeves and implement RAG in the real world, things get messy. Let's dive into why RAG isn't always the magic fix we hope for and explore the hurdles that can trip us up along the way.

The Allure of RAG

At its heart, RAG is about bridging gaps in an AI's knowledge:
  • Compute Embeddings: Break down your documents into chunks and convert them into embeddings—numerical representations that capture the essence of the text.
  • Store and Retrieve: Keep these embeddings in a database. When a question comes in, find the chunks whose embeddings are most similar to the question.
  • Augment the AI: Feed these relevant chunks to the AI alongside the question, giving it the context it needs to generate an informed answer.
In theory, this means your AI can tap into any knowledge source you provide, even if that information isn't part of its original training.
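
To make the three steps concrete, here is a minimal sketch. It assumes a toy word-count "embedding" in place of a real embedding model, and the sample documents and helper names (embed, build_index, retrieve, build_prompt) are made up for illustration. A real system would call an embedding model and a vector database instead.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9$]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(chunks):
    # Step 1 and 2: compute an embedding per chunk and store it next to the text.
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, question, k=2):
    # Step 2: return the k chunks whose embeddings are most similar to the question.
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, context_chunks):
    # Step 3: hand the retrieved chunks to the model alongside the question.
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical documents, invented for this example.
chunks = [
    "Our office is closed on public holidays.",
    "Refunds are processed within 10 business days.",
    "The cafeteria serves lunch from noon to 2 pm.",
]
index = build_index(chunks)
question = "How long do refunds take?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The prompt string is what would be sent to the language model. Everything that follows in this article is about the ways the retrieve step can pick the wrong chunks.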

The Reality Check

Despite its promise, implementing RAG isn't all smooth sailing. Here are some of the bumps you might hit on the road.

1. The Ever-Changing Embeddings

Embeddings are the foundation of RAG: they're how we represent text in a way that can be compared numerically. But here's the catch: embedding models keep evolving. New models offer better performance, but each one maps text into its own vector space, so vectors from a new model can't be meaningfully compared with vectors from an old one.

So, you're faced with a dilemma:
  • Recompute All Embeddings: Every time a new model comes out, you could reprocess your entire document library to generate new embeddings. But if you're dealing with millions or billions of chunks, that's a hefty computational bill.
  • Stick with the Old Model: You might decide to keep using the old embeddings to save on costs. But over time, you miss out on quality improvements, and you may end up paying more to keep running an older, less efficient model.
  • Mix and Match: Use new embeddings for new documents and keep the old ones for existing data. But now your database is fragmented, and searching across different embedding spaces gets complicated.
There's no perfect solution. Some platforms, like SemDB.ai, try to ease the pain by allowing multiple embeddings in the same database, but the underlying challenge remains.
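
To see why the mix-and-match option gets complicated, consider this rough sketch of a store that tags every vector with the model that produced it. This is not how SemDB.ai or any particular product works; the class, the two toy "models" (embed_v1, embed_v2), and the sample texts are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Two hypothetical embedding models with different dimensions.
def embed_v1(text):
    t = text.lower()
    return [t.count("cookie"), t.count("price")]

def embed_v2(text):
    t = text.lower()
    return [t.count("cookie"), t.count("price"), t.count("shipping")]

class MultiModelStore:
    def __init__(self, embedders):
        self.embedders = embedders   # model name -> embedding function
        self.records = []            # (model name, vector, text)

    def add(self, text, model):
        self.records.append((model, self.embedders[model](text), text))

    def search(self, question, k=1):
        # The question must be embedded once per model, and each query vector
        # is compared only against vectors produced by that same model.
        results = {}
        for model, embed in self.embedders.items():
            q = embed(question)
            scored = [(cosine(q, vec), text)
                      for m, vec, text in self.records if m == model]
            scored.sort(reverse=True)
            results[model] = [text for _, text in scored[:k]]
        return results

store = MultiModelStore({"v1": embed_v1, "v2": embed_v2})
store.add("Cookie price list", "v1")       # older documents, old model
store.add("Shipping policy", "v1")
store.add("Cookie price and shipping", "v2")  # newer documents, new model
store.add("Shipping rates", "v2")

results = store.search("cookie price")
print(results)
```

Notice that the search returns one result list per model. Scores from different models are not on a common scale, so merging those lists into a single ranking is a separate problem that this sketch leaves open.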

2. The Pronoun Problem

Language is messy. People use pronouns, references, and context that computers struggle with. Let's look at an example:
Original Text: "Chocolate cookies are made from the finest imported cocoa. They sell for $4 a dozen."
When we break this text into chunks for embeddings, we might get:
Chunk 1: "Chocolate cookies are made from the finest imported cocoa."
Chunk 2: "They sell for $4 a dozen."
Now, if someone asks, "How much do chocolate cookies cost?", the system searches for embeddings similar to the question. But Chunk 2 doesn't mention "chocolate cookies" explicitly; it uses "they." The retrieval step might miss this chunk because its embedding doesn't match the question well, so the AI never sees the price.

Solving It

One way to tackle this is by cleaning up the text before creating embeddings:
Chunk 1: "Chocolate cookies are made from the finest imported cocoa."
Chunk 2: "Chocolate cookies sell for $4 a dozen."
By replacing pronouns with the nouns they refer to (a step often called coreference resolution), we make each chunk self-contained and easier to match with questions.
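
Here is a small sketch of the effect, again using a toy word-count similarity rather than a real embedding model. Real embeddings capture more than word overlap, so the gap is usually less extreme than shown here. The resolve_leading_pronoun helper is hypothetical and deliberately naive: it only swaps a pronoun at the start of a chunk for a subject you supply, whereas real pronoun resolution needs an NLP model or an LLM.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9$]+", text.lower()))

def cosine(a, b):
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

PRONOUNS = {"they", "it", "these", "those"}

def resolve_leading_pronoun(chunk, subject):
    # Naive: replace a pronoun at the start of the chunk with the given subject.
    first, _, rest = chunk.partition(" ")
    if first.lower() in PRONOUNS:
        return f"{subject} {rest}"
    return chunk

question = "How much do chocolate cookies cost?"
chunk = "They sell for $4 a dozen."
resolved = resolve_leading_pronoun(chunk, "Chocolate cookies")

before = cosine(embed(question), embed(chunk))
after = cosine(embed(question), embed(resolved))
print(resolved)
print(before, after)
```

With the pronoun in place, the chunk shares no words with the question and scores zero. After the rewrite, it shares "chocolate" and "cookies" and becomes retrievable.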

3. Navigating Domain-Specific Knowledge

Things get trickier with specialized or branded products. Imagine you have a product description like this:
"Introducing Darlings—the ultimate cookie experience that brings together the timeless flavors of vanilla and chocolate in perfect harmony... And at just $5 per dozen, indulgence has never been so affordable."
Extracting key facts:
Darlings are cookies.
Darlings combine vanilla and chocolate.
Darlings cost $5 per dozen.
Now, if someone asks, "How much are the chocolate and vanilla cookies?", they might not mention "Darlings" by name. Retrieval might then rank more general chunks about chocolate or vanilla cookies above the Darlings price, and the specific answer gets missed.
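
A quick way to see the problem is to rank the extracted facts against the question. The sketch below uses plain shared-word counts as a crude stand-in for embedding similarity, which is an assumption made to keep the example short; the shared_words helper is made up for illustration.

```python
import re

def words(text):
    return set(re.findall(r"[a-z0-9$]+", text.lower()))

def shared_words(question, chunk):
    # Crude similarity: how many distinct words the two texts have in common.
    return len(words(question) & words(chunk))

chunks = [
    "Chocolate cookies sell for $4 a dozen.",
    "Darlings are cookies.",
    "Darlings combine vanilla and chocolate.",
    "Darlings cost $5 per dozen.",
]
question = "How much are the chocolate and vanilla cookies?"

ranked = sorted(chunks, key=lambda c: shared_words(question, c), reverse=True)
for chunk in ranked:
    print(shared_words(question, chunk), chunk)
```

The fact that actually answers the question, "Darlings cost $5 per dozen.", shares no words with it and lands at the bottom, below the generic chocolate cookie price. The link between "chocolate and vanilla cookies" and "Darlings" lives in a different chunk from the price.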

4. The Limits of Knowledge Graphs

To overcome these issues, some suggest using Knowledge Graphs alongside RAG. Knowledge Graphs store information as simple (subject, relationship, object) triples:
(Darlings, are, cookies)
(Darlings, cost, $5 per dozen)
(Darlings, contain, chocolate and vanilla)
In theory, this structure makes it easy to retrieve specific facts. But reality isn't so tidy.
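
For the tidy case, a lookup over triples can be sketched in a few lines. The query function below is a made-up helper, not the API of any graph database.

```python
triples = [
    ("Darlings", "are", "cookies"),
    ("Darlings", "cost", "$5 per dozen"),
    ("Darlings", "contain", "chocolate and vanilla"),
]

def query(triples, subject=None, relation=None, obj=None):
    # Return every triple that matches the fields that were specified.
    return [
        (s, r, o)
        for s, r, o in triples
        if (subject is None or s == subject)
        and (relation is None or r == relation)
        and (obj is None or o == obj)
    ]

price = query(triples, subject="Darlings", relation="cost")
print(price)
```

Once the question has been mapped to a subject and a relation, the answer is an exact match with no similarity scoring involved. The hard part is everything that doesn't fit this shape.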

The Complexity of Real-World Information

Not all knowledge fits neatly into simple relationships. Consider:
"Bob painted the room red on Tuesday because he was feeling inspired."
Trying to capture all the nuances of this sentence in a simple graph gets complicated quickly. You need more than just triples: you need context, causation, and temporal information.
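
One common workaround, sketched here as an assumption rather than a recommendation, is to turn the event itself into a node and hang each detail off it. The node name event1 and the relation names are invented for this example.

```python
# A single triple keeps who did what, but drops the color, the day, and the reason.
flat_triple = ("Bob", "painted", "the room")

# Treating the event as its own node lets each detail attach to it.
event_triples = [
    ("event1", "type", "painting"),
    ("event1", "agent", "Bob"),
    ("event1", "object", "the room"),
    ("event1", "color", "red"),
    ("event1", "day", "Tuesday"),
    ("event1", "reason", "Bob was feeling inspired"),
]

def describe(triples, node):
    # Collect everything the graph says about one node.
    return {relation: obj for subj, relation, obj in triples if subj == node}

details = describe(event_triples, "event1")
print(details)
```

One short sentence has become six triples and an artificial node, and the "reason" is still just a string rather than something the graph can reason about.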

Conflicting Information

Knowledge Graphs also struggle with contradictions or exceptions. For example:
(Richard Nixon, is a, Quaker)
(Quakers, are, pacifists)
(Richard Nixon, escalated, the Vietnam War)
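Does the graph conclude that Nixon is a pacifist? A system that naively chains the first two facts would, even though the third fact points the other way. Real-world logic isn't always straightforward, and AI can stumble over these nuances.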
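
The failure is easy to reproduce with a naive inference rule. The sketch below is a deliberately simplistic illustration: the rule, the function name, and the plural handling (adding or stripping a trailing "s") are all assumptions made to keep it short.

```python
facts = {
    ("Richard Nixon", "is a", "Quaker"),
    ("Quakers", "are", "pacifists"),
    ("Richard Nixon", "escalated", "the Vietnam War"),
}

def infer_memberships(facts):
    # Naive rule: (X, is a, G) and (G + "s", are, P) implies (X, is a, P without its final "s").
    inferred = set()
    for subj, relation, group in facts:
        if relation != "is a":
            continue
        for other_subj, other_relation, prop in facts:
            if other_relation == "are" and other_subj == group + "s":
                inferred.add((subj, "is a", prop.rstrip("s")))
    return inferred

conclusions = infer_memberships(facts)
print(conclusions)
```

The rule happily concludes that Nixon is a pacifist. Nothing in the graph tells it that "Quakers are pacifists" is a generalization with exceptions, or that the third fact should count as evidence against the conclusion.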

5. The Human vs. Machine Conundrum

Humans are flexible thinkers. We handle ambiguity, context, and exceptions with ease. Computers, on the other hand, need clear, structured data. When we try to force the richness of human language and knowledge into rigid formats, we lose something important.

The Database Dilemma

All these challenges highlight a broader issue: how we store and retrieve data for AI systems. Balancing the need for detailed, accurate information with the limitations of current technology isn't easy.

Embedding databases can become unwieldy as they grow. Knowledge Graphs can help organize information but may oversimplify complex concepts. We're still searching for the best way to bridge the gap between human language and machine understanding.

So, What Now?

RAG isn't a lost cause—it just isn't a one-size-fits-all solution. To make it work better, we might need to:
  • Develop Smarter Preprocessing: Clean and prepare text in ways that make it easier for AI to understand, like resolving pronouns and simplifying sentences.
  • Embrace Hybrid Approaches: Combine embeddings with other methods, like traditional search algorithms or domain-specific rules, to improve accuracy.
  • Accept Imperfection: Recognize that AI has limitations and set realistic expectations about what it can and can't do.
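
As one example of a hybrid approach, the sketch below blends a similarity score with a plain keyword-overlap score. The vector scores are invented placeholder numbers standing in for what an embedding search might return, and the function names and the alpha weighting are assumptions, not a standard recipe.

```python
import re

def keyword_score(question, chunk):
    # Fraction of the question's distinct words that also appear in the chunk.
    q = set(re.findall(r"[a-z0-9$]+", question.lower()))
    c = set(re.findall(r"[a-z0-9$]+", chunk.lower()))
    return len(q & c) / len(q) if q else 0.0

def hybrid_rank(question, candidates, alpha=0.5):
    # candidates: (chunk, vector_score) pairs, with vector_score assumed to be in [0, 1].
    # alpha weights the vector score; (1 - alpha) weights the keyword score.
    scored = [
        (alpha * vector_score + (1 - alpha) * keyword_score(question, chunk), chunk)
        for chunk, vector_score in candidates
    ]
    return [chunk for _, chunk in sorted(scored, reverse=True)]

question = "How much do Darlings cost?"
# Placeholder vector scores, made up for illustration only.
candidates = [
    ("Chocolate cookies are made from the finest imported cocoa.", 0.45),
    ("Darlings cost $5 per dozen.", 0.40),
]

vector_only = [chunk for chunk, _ in sorted(candidates, key=lambda c: c[1], reverse=True)]
hybrid = hybrid_rank(question, candidates)
print(vector_only[0])
print(hybrid[0])
```

With these placeholder numbers, the vector score alone prefers the cocoa chunk, while the exact match on "Darlings" and "cost" pushes the right chunk to the top of the hybrid ranking. Choosing alpha is a tuning decision that depends on your data.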

Final Thoughts

Retrieval Augmented Generation holds a lot of promise, but it's not a magic wand. By understanding its limitations and working to address them, we can build better AI systems that come closer to meeting our needs. It's an ongoing journey, and with each challenge, we learn more about how to bridge the gap between human knowledge and artificial intelligence.
