In-House Knowledge Base for Specialty Billing

Overview

I built this project because ChatGPT was becoming a problem. Not because it’s useless—quite the opposite—it’s too good at confidently saying things that sound correct. In the world of specialty billing, especially for Medicare and insurer-specific rules, that can mean confidently filing incorrect claims. I started seeing this more and more with our billing partner, Medek Provider Network. Billers would argue about the right way to handle something, and it turned out they were quoting competing versions of ChatGPT like two LLMs debating through human proxies.

This isn’t the future we want. At Intelligence Factory, our billing automation system (FairPath) needs one source of truth. That’s why I built an internal knowledge base—grounded, curated, and semantically indexed—to serve as the backbone for how our systems understand medical billing.

The Problem

Specialty billing is dense, exception-driven, and constantly changing. Our internal documents go back years and reflect hard-won experience—millions of claims processed and thousands of rules verified, tweaked, or annotated by actual professionals. But none of that matters if your retrieval layer is brittle, or worse, if your AI just makes things up.

We needed a system that could:

  • Accept ongoing input from real billers and clinicians.
  • Handle updates without invalidating embeddings or introducing drift.
  • Map informal guidance back to structured policies.
  • Surface precise answers, grounded in real data, with traceability.

And we needed to do all of it without sacrificing flexibility, accessibility, or speed.

How It Works

The entire system is built on top of SemDB, our semantic database infrastructure.

We start with ingestion, but unlike most RAG systems, we didn’t stop at “dump it in and chunk it.” First, we support live ingestion from multiple real-world sources: Google Docs, emails, attachments, and even conversational correction—if an experienced biller catches a mistake, they can submit a correction just by chatting with the system.
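To make that concrete, here is a minimal sketch of what a conversational correction could turn into once it reaches the knowledge base. The `kb` handle and its `add_document` / `mark_superseded` methods are hypothetical stand-ins, not SemDB’s actual API; the point is that corrections supersede chunks rather than overwrite them, so nothing is silently lost.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Correction:
    """A correction a biller submits by chatting with the system."""
    author: str
    flagged_chunk_id: str   # the chunk the biller says is wrong
    corrected_text: str

def ingest_correction(kb, correction: Correction) -> str:
    """Store the correction as a new, provenance-tagged document.

    The flagged chunk is superseded rather than deleted, so existing
    embeddings stay valid and the audit trail is preserved.
    """
    doc_id = kb.add_document(
        text=correction.corrected_text,
        metadata={
            "type": "correction",
            "supersedes": correction.flagged_chunk_id,
            "author": correction.author,
            "submitted_at": datetime.now(timezone.utc).isoformat(),
        },
    )
    kb.mark_superseded(correction.flagged_chunk_id, replaced_by=doc_id)
    return doc_id
```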

What happens after that is where most of the work lives (each step is sketched in code after the list):

  • Chunking: We don’t use fixed-length chunks. Instead, we apply context-based chunking, drawing inspiration from Anthropic’s techniques but pushing them further. Chunks respect section boundaries, semantic scope, and document structure.
  • Preprocessing: Before embedding, we rewrite text. Sentences are simplified, pronouns are resolved, and passive constructions are clarified.
  • Hierarchical Embedding: Embeddings are generated at multiple granularities—paragraphs, sections, documents—so the system can retrieve with the right level of specificity for each query.
  • Ontology Construction: This is what sets us apart. We build an ontology on top of the document set, tagging relationships, codes, exceptions, and cross-references. If a CPT code is mentioned alongside a billing modifier and a rule about documentation requirements, that linkage is preserved.
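First, a minimal sketch of context-based chunking. It assumes markdown-style headings as the section signal and an illustrative 1,800-character budget—neither is our production configuration, but the shape of the logic is the same: sections stay whole, and oversized sections split at paragraph breaks, never mid-sentence.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+\S")  # markdown-style heading lines

def context_chunks(document: str, max_chars: int = 1800) -> list[dict]:
    """Split on section boundaries rather than fixed lengths."""
    chunks: list[dict] = []
    heading = ""
    buf: list[str] = []

    def flush():
        text = "\n".join(buf).strip()
        if text:
            chunks.append({"heading": heading, "text": text})
        buf.clear()

    for line in document.splitlines():
        if HEADING.match(line):
            flush()                        # close the previous section
            heading = line.lstrip("# ").strip()
        else:
            buf.append(line)
            # Oversized sections split only at paragraph breaks.
            if sum(len(l) for l in buf) > max_chars and line == "":
                flush()
    flush()
    return chunks
```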
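Second, the preprocessing rewrite can be sketched as a single LLM pass. This assumes an OpenAI-style chat API and an illustrative model name; the prompt shown here is a simplification of what we actually run.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

REWRITE_PROMPT = (
    "Rewrite the passage for embedding: resolve every pronoun to its "
    "referent, convert passive voice to active, and split long sentences. "
    "Preserve all codes, amounts, and policy names exactly."
)

def preprocess_for_embedding(chunk_text: str, context: str) -> str:
    """Rewrite a chunk so each sentence stands alone at query time."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": REWRITE_PROMPT},
            {"role": "user",
             "content": f"Context:\n{context}\n\nPassage:\n{chunk_text}"},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content
```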
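Third, hierarchical embedding, sketched with an off-the-shelf sentence-transformers model as a stand-in for our embedding stack. Each record carries its granularity and a traceable id.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model

def hierarchical_embeddings(doc_id: str, sections: list[dict]) -> list[dict]:
    """Embed at three granularities so retrieval can match query scope."""
    records = []
    doc_text = []
    for s_idx, section in enumerate(sections):
        paragraphs = [p for p in section["text"].split("\n\n") if p.strip()]
        for p_idx, para in enumerate(paragraphs):
            records.append({
                "id": f"{doc_id}/s{s_idx}/p{p_idx}",
                "level": "paragraph",
                "vector": model.encode(para),
            })
        records.append({
            "id": f"{doc_id}/s{s_idx}",
            "level": "section",
            "vector": model.encode(section["text"]),
        })
        doc_text.append(section["text"])
    records.append({
        "id": doc_id,
        "level": "document",
        "vector": model.encode("\n\n".join(doc_text)),
    })
    return records
```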
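And fourth, a toy version of the ontology linkage, using networkx and deliberately naive regexes for CPT codes and modifiers (the real extraction is far more careful). What matters is that the co-occurrence of a code, a modifier, and the chunk that mentions them is preserved as edges, not lost in a flat index.

```python
import re
import networkx as nx

CPT = re.compile(r"\b\d{4}[0-9A-Z]\b")                   # e.g. 99213, 0275T
MODIFIER = re.compile(r"\bmodifier\s+(\d{2}|[A-Z]{2})\b", re.I)

def link_chunk(graph: nx.MultiDiGraph, chunk_id: str, text: str) -> None:
    """Tag a chunk with the codes and modifiers it mentions.

    Code-modifier edges carry the chunk id as provenance, so the
    'CPT + modifier + documentation rule' linkage stays traceable.
    """
    codes = set(CPT.findall(text))
    mods = {m.upper() for m in MODIFIER.findall(text)}
    graph.add_node(chunk_id, kind="chunk")
    for code in codes:
        graph.add_node(code, kind="cpt")
        graph.add_edge(chunk_id, code, rel="mentions")
        for mod in mods:
            graph.add_node(mod, kind="modifier")
            graph.add_edge(code, mod, rel="billed_with", source=chunk_id)
```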

Retrieval and Execution

When a query comes in—either from a human or from a downstream system like Buffaly—it’s resolved semantically against both the ontology and the vector index. OGAR (Ontology-Guided Augmented Retrieval) lets us balance structure with similarity: ontologies for precision, embeddings for context.

Results are retrieved and scored. Then, and only then, does the generator step in. The final answer is composed using a standard LLM pipeline, but all outputs are grounded—every assertion links back to a source document. There’s no free-text generation without backing. It’s not optional.
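Here is a sketch of how that balance could look in code. The linear blend of ontology overlap and cosine similarity below is an illustration of the idea, not SemDB’s actual scoring; `query_entities` stands for the CPT codes, modifiers, and policy names recognized in the query, and `alpha` is an assumed weighting.

```python
import numpy as np

def ogar_retrieve(query_vec: np.ndarray, query_entities: set[str],
                  records: list[dict], graph, k: int = 5,
                  alpha: float = 0.6) -> list[dict]:
    """Rank candidates by blending ontology overlap with vector similarity.

    alpha weights structure (precision) against embeddings (context);
    every returned record keeps its id, so answers stay traceable.
    """
    scored = []
    for rec in records:
        sim = float(
            np.dot(query_vec, rec["vector"])
            / (np.linalg.norm(query_vec) * np.linalg.norm(rec["vector"]))
        )
        neighbors = (
            set(graph.neighbors(rec["id"])) if graph.has_node(rec["id"]) else set()
        )
        overlap = len(neighbors & query_entities) / max(len(query_entities), 1)
        scored.append((alpha * overlap + (1 - alpha) * sim, rec))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored[:k]]
```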

Impact and Outcome

This system powers FairPath’s automated claims workflows. It’s the reason we can say, with confidence, that our pre-authorization logic and CPT code mappings are auditable and correct. It also serves as a live resource for billers—giving them confidence that what they’re reading isn’t AI speculation but vetted knowledge, curated by people who do this work every day.

We’ve built a living, growing body of billing intelligence. It evolves as the rules do. And every component of the system—ingestion, retrieval, generation—is traceable, testable, and safe.

If You’re Building Anything That Depends on Regulatory Precision

Don’t rely on text-to-text LLMs. Build the semantic layer. Create grounding. Make it editable by people who know the edge cases. That’s what we did. And now our systems—and our clients—are safer, faster, and more accurate because of it.