AI agents are increasingly used in real-time customer service conversations, and as adoption grows, so do the risks of low-quality output: hallucinated facts, misinterpreted inputs, and unresolved objectives. These aren’t abstract concerns; they erode trust, create confusion, and often leave customer needs unmet.
This project was designed to solve exactly that: we built a real-time hallucination detection and correction pipeline for AI Voice Agents inside Feeding Frenzy CRM. The system runs on Buffaly, our in-house implementation of OGAR (Ontology-Guided Augmented Retrieval), and operates at runtime to monitor AI behavior, catch mistakes, and redirect the conversation.
AI hallucinations are common in generative dialogue systems. They manifest as unsupported claims, incorrect assumptions, or confident answers to questions that were never asked. In real-time voice contexts, this is compounded by noisy or accent-heavy audio, ambiguous transcriptions, and the pressure to keep the conversation moving instead of pausing to verify what was actually said.
In short, conversations fail in subtle but critical ways. When this happens during a support call, it leads to dissatisfaction, failed automation, and the very problems AI was supposed to fix.
We built a real-time hallucination detection system by embedding OGAR directly into Feeding Frenzy Voice Agent pipelines via Buffaly. Buffaly continuously monitors AI-agent conversations in real time, structured around three key capabilities: extracting the claims and assumptions the agent is about to act on, verifying them against supporting data and the conversation’s objectives, and redirecting the agent when something is unsupported or incomplete.
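To make those capabilities concrete, here is a minimal sketch, assuming a hypothetical turn-level hook between the voice agent and the caller. None of these names are Buffaly’s actual API, and the simple transcript-containment check is only a stand-in for the ontology-guided verification described above.

```python
# Minimal sketch of a turn-level monitor sitting between the voice agent and
# the caller. Illustrative only: not the Buffaly API, and the containment
# check below stands in for ontology-guided verification.
from dataclasses import dataclass, field


@dataclass
class Turn:
    transcript: str                  # what the ASR heard from the caller
    proposed_response: str           # what the LLM agent wants to say next
    extracted_claims: dict = field(default_factory=dict)  # slot -> asserted value


class TurnMonitor:
    """Applies the three capabilities (extract, verify, redirect) to each turn."""

    def __init__(self, ontology, objectives):
        self.ontology = ontology      # structured domain knowledge (see the later ontology sketch)
        self.objectives = objectives  # e.g. ["caller_name", "location"]
        self.confirmed = {}           # facts with supporting evidence so far

    def extract(self, turn: Turn) -> dict:
        # 1) Extract the factual claims the agent is about to assert.
        #    (In practice this would be slot extraction over proposed_response.)
        return turn.extracted_claims

    def verify(self, claims: dict, turn: Turn) -> list:
        # 2) Verify: a claim is unsupported if it never appeared in the
        #    transcript and is not already a confirmed fact.
        return [
            slot for slot, value in claims.items()
            if value.lower() not in turn.transcript.lower()
            and self.confirmed.get(slot) != value
        ]

    def review(self, turn: Turn) -> str:
        # 3) Redirect: if anything is unsupported or an objective is still
        #    open, replace the confident answer with a clarifying question.
        unsupported = self.verify(self.extract(turn), turn)
        missing = [o for o in self.objectives if o not in self.confirmed]
        needs = list(dict.fromkeys(unsupported + missing))
        if needs:
            return "Sorry, could you confirm your " + " and ".join(needs) + "?"
        return turn.proposed_response
```

The important design choice is the placement: the monitor reviews the agent’s proposed response before it is spoken, so an unsupported claim can be replaced with a clarifying question rather than corrected after the fact.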
A real-world customer service scenario highlights this clearly. The agent is supposed to gather the customer’s name and location before dispatch. However, the customer has a thick Florida accent, and the voice transcription is ambiguous. The AI agent confidently responds with an incorrect name (“Thank you, Paul Johnson”) and proceeds to the next objective, even though it never received a last name.
Buffaly intercepts this, extracts the agent’s assumptions, and checks for supporting data. It finds none, flags the response as a hallucination, and identifies the objective as incomplete. It then prompts the agent to clarify instead of moving forward.

Later in the call, the transcription renders the caller’s reply as “I’m a watermelon,” and the agent interprets it as “Mims, Florida.” Buffaly uses ontology-based reasoning to determine that this location is supported neither by the audio nor by the conversation logic. It corrects the location and redirects the agent to confirm it.
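Under the same illustrative assumptions as the earlier sketch, here is how that hypothetical TurnMonitor would handle the two moments in this call. The transcripts and responses below are stand-ins, not logs from the actual system.

```python
# Running the hypothetical TurnMonitor from the earlier sketch over this call.
monitor = TurnMonitor(ontology=None, objectives=["caller_name", "location"])

# Moment 1: the ASR only caught a garbled first name, but the agent asserts a
# full name and tries to move on to the next objective.
turn1 = Turn(
    transcript="uh, it's Paul ...",
    proposed_response="Thank you, Paul Johnson. Where are you located?",
    extracted_claims={"caller_name": "Paul Johnson"},
)
print(monitor.review(turn1))
# "Paul Johnson" has no transcript support and caller_name is still open,
# so a clarifying question is returned instead of the confident reply.

# Moment 2: the transcript reads "I'm a watermelon", yet the agent fills in
# "Mims, Florida" as the location.
turn2 = Turn(
    transcript="I'm a watermelon",
    proposed_response="Got it, we'll send a truck to Mims, Florida.",
    extracted_claims={"location": "Mims, Florida"},
)
print(monitor.review(turn2))
# The location claim is unsupported, so the dispatch line is held back until
# the caller confirms where they actually are.
```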
Buffaly reduced hallucinations and unmet objectives by over 30% across trial deployments. More importantly, it allowed customer service AI to behave with greater integrity. Conversations that would previously have continued on incorrect assumptions were corrected in flight. Agents were no longer “just sounding confident”: they were provably correct, or they asked for clarification.
This project represents a shift away from generative freeform responses and toward structured, supervised, ontology-aware automation. Buffaly ensures that AI agents don’t invent facts, don’t skip steps, and don’t hide behind language models.

All of this runs locally, in real time, on top of structured ontologies tailored to the domain, whether that’s roadside assistance, billing, or insurance.
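For illustration only, a per-domain ontology can be thought of as a declaration of which objectives must be confirmed and what evidence each requires; the structure below is an assumption for this sketch, not Buffaly’s actual schema.

```python
# Illustrative shape for a per-domain ontology; field names and rules are
# assumptions for this sketch, not Buffaly's actual schema.
ROADSIDE_ASSISTANCE = {
    "objectives": [
        {"slot": "caller_name", "requires": ["first_name", "last_name"]},
        {"slot": "location", "requires": ["city", "state"],
         "constraint": "must resolve to a known place in the service area"},
    ],
    "rules": [
        "do not advance to dispatch until every objective is confirmed",
        "every slot value must be traceable to transcript evidence",
    ],
}

# The same runtime checks would then apply unchanged to a billing or insurance
# ontology; only the objectives and rules differ.
```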
We’re expanding Buffaly’s hallucination detection framework to support multimodal interactions and broadening its domain ontologies for deployment into finance, healthcare, and telecom.

If you’re shipping AI into production environments and still treating hallucinations as just a model-tuning problem, you’re not solving the right problem. OGAR is how we address it. Buffaly is how we run it.