Top 5 Signs Your Content Model Is Hurting Your AI Agent

Your AI agent isn't hallucinating because the model is bad. It's hallucinating because the content underneath it was never modeled to be retrieved. A content model built for rendering pages (flat strings, untyped blobs, embedded HTML, ungoverned instructions) quietly sabotages every grounding attempt downstream. The symptoms show up as confident wrong answers, stale citations, and retrieval that returns the right document but the wrong passage. Here are the five signs your content model is the real problem, ranked by how much damage each one does to agent reliability.

Sanity Context, via its Context MCP endpoint, shows up in several of the fixes below as a reference point for what structured retrieval actually looks like when the content model is sound.

1. Your content is one big rich-text blob with no structure to query

The clearest sign your model is hurting your agent: everything important lives inside an undifferentiated rich-text or HTML field. When a release note, a pricing rule, and a troubleshooting step all sit in the same wall of markup, retrieval can't isolate the passage that actually answers the question. The agent gets the whole document or nothing, and the model fills the gap with invention. Sanity treats content as structured data in the Content Lake, where fields are typed and addressable: a price is a number, a status is an enum, a procedure is a list of steps. That structure is exactly what GROQ queries against, so the agent retrieves the field that matters instead of scrolling a blob and hoping the answer is in there. If you can't write a query that returns just the answer, your agent can't either. Modeling content as data isn't an editorial nicety here; it's the precondition for retrieval that returns a passage precise enough to ground a response rather than seed a hallucination.

The blob tax

A document that can only be retrieved whole forces the model to do the extraction the query should have done, and extraction-by-LLM is exactly where confident wrong answers come from.
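To make the blob tax concrete, here is a minimal Python sketch. It is not Sanity's API: the `PricingRule` type, its field names, and the sample values are all hypothetical, chosen only to contrast a typed, addressable field with a rendered blob.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"


@dataclass
class PricingRule:
    # Hypothetical typed document: every value is addressable on its own.
    plan: str
    monthly_price_usd: float   # a price is a number
    status: Status             # a status is an enum
    upgrade_steps: list[str]   # a procedure is a list of steps


# The same information flattened into one rendered blob.
BLOB = (
    "<h2>Team plan</h2><p>Costs $49.00 per month.</p>"
    "<ol><li>Open Billing</li><li>Choose Team</li></ol>"
)

RULES = [
    PricingRule("team", 49.0, Status.PUBLISHED, ["Open Billing", "Choose Team"]),
    PricingRule("enterprise", 199.0, Status.DRAFT, ["Contact sales"]),
]


def published_price(rules: list[PricingRule], plan: str) -> float | None:
    """Return only the field that answers 'what does this plan cost?'."""
    for rule in rules:
        if rule.plan == plan and rule.status is Status.PUBLISHED:
            return rule.monthly_price_usd
    return None


def retrieve_blob(_question: str) -> str:
    """With a blob, the only retrievable unit is the whole document."""
    return BLOB
```

`published_price(RULES, "team")` hands the agent a single number to ground on, while `retrieve_blob(...)` hands it markup and leaves the extraction to the model.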

2. Your embeddings live in a separate pipeline that's always behind

If your retrieval stack is a vector database bolted onto your CMS, you have two sources of truth and a sync job praying they agree. Someone edits a doc, and the embedding for it is stale until the next reindex runs, which could be minutes away, hours away, or whenever the pipeline next succeeds. In the meantime, the agent confidently cites the old version. With Sanity, dataset embeddings are tied to the content itself, so when content changes the embeddings propagate within minutes; there's no separate vector pipeline to own, monitor, or reconcile. The sign you have this problem: your team can name the reindex cadence off the top of their heads, because they've been burned by it. Freshness isn't a feature you schedule; it's a property of where the embeddings live. When they live next to the content in the same store the editor publishes to, the gap between 'published' and 'retrievable' closes on its own instead of becoming another on-call surface.
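If you run a separate vector pipeline today, the staleness described above is easy to measure. The following is a rough sketch under stated assumptions: the `IndexedDoc` record and its two timestamp fields are hypothetical, standing in for whatever your CMS and vector store actually expose.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class IndexedDoc:
    # Hypothetical record joining CMS metadata with vector-store metadata.
    doc_id: str
    content_updated_at: datetime  # last edit in the CMS
    embedded_at: datetime         # last time the document was re-embedded


def stale_docs(
    docs: list[IndexedDoc], tolerance: timedelta = timedelta(0)
) -> list[str]:
    """Return IDs of documents edited more than `tolerance` after embedding."""
    return [
        doc.doc_id
        for doc in docs
        if doc.content_updated_at - doc.embedded_at > tolerance
    ]


def at(hour: int, minute: int = 0) -> datetime:
    """Helper for readable sample timestamps on a single illustrative day."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


DOCS = [
    IndexedDoc("pricing", content_updated_at=at(11, 0), embedded_at=at(11, 5)),
    IndexedDoc("refunds", content_updated_at=at(11, 30), embedded_at=at(9, 0)),
]
```

Here `stale_docs(DOCS)` flags `refunds`, which was edited two and a half hours after its last embedding. A non-empty result on a schedule like this is the gap the agent is answering from.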

3. You're stitching keyword search and semantic search by hand

Pure semantic search misses exact tokens (SKUs, error codes, version numbers), while keyword search misses meaning. Most teams discover this the hard way and end up running two systems, then writing glue to merge and re-rank the results. That glue is brittle, hard to tune, and a third thing to keep alive. Inside Sanity's Content Lake, hybrid retrieval is native: a single GROQ query blends `text::semanticSimilarity()` for meaning with a BM25 `match()` for exact terms, combined through `score()` and `boost()` to weight what matters. One query, one ranked result set, no merge layer to maintain. The sign you've outgrown your setup: you're choosing between recall and precision instead of getting both, or you've built a re-ranking service whose only job is to paper over two retrieval systems that were never meant to talk to each other.

One query, both modes

`match()` catches the error code; `text::semanticSimilarity()` catches the intent; `score()` and `boost()` rank them together in a single GROQ query, with no hand-rolled merge layer.
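For contrast, this is roughly what the hand-rolled merge layer looks like. It is a minimal sketch using reciprocal rank fusion, one common way to combine two ranked lists; the document IDs and the constant `k = 60` are illustrative assumptions, and nothing here describes how Sanity ranks results internally.

```python
from __future__ import annotations


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two separate systems for the same question.
keyword_hits = ["err-4012-guide", "release-notes-3.2", "billing-faq"]
semantic_hits = ["login-troubleshooting", "err-4012-guide", "sso-setup"]

merged = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

The document that both systems agree on, `err-4012-guide`, rises to the top of `merged`. The function itself is short; the maintenance cost is everything around it: two indexes, two query paths, and a fusion constant somebody has to tune.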

4. Your agent's instructions live in code nobody can review or stage

The system prompt and retrieval rules that govern your agent are content too, and if they're buried in application code, the only people who can change them are the people who can deploy. Editors who actually own the product knowledge can't correct a wrong instruction without a release. In Sanity, agent instructions live in Studio, where the same people who own the content govern how the agent uses it, and Content Releases let you stage and preview changes to agent behavior the same way you stage the website before it ships. The sign you have this problem: every tweak to how the agent answers becomes an engineering ticket, and there's no safe way to test a new instruction against real content before it reaches users. Governance that lives in code is governance only engineers can exercise. Moving it into the editorial surface is what lets the team closest to the truth keep the agent honest.
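The idea of staging instructions like content can be sketched in a few lines. This is an illustration only, not the Content Releases API: the `Instruction` type, the two dictionaries, and the overlay logic are hypothetical stand-ins for published instructions and a pending release.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    key: str
    text: str


# Hypothetical instructions currently live for the agent.
PUBLISHED = {
    "tone": Instruction("tone", "Answer concisely and cite the source document."),
    "pricing": Instruction("pricing", "Quote list prices only."),
}

# Hypothetical staged release: pending edits, keyed the same way.
STAGED_RELEASE = {
    "pricing": Instruction(
        "pricing", "Quote list prices and mention annual discounts."
    ),
}


def resolve_instructions(
    published: dict[str, Instruction],
    release: dict[str, Instruction] | None = None,
) -> dict[str, Instruction]:
    """Overlay staged edits on a copy of the published set for preview."""
    resolved = dict(published)
    if release:
        resolved.update(release)
    return resolved


def build_system_prompt(instructions: dict[str, Instruction]) -> str:
    """Join instruction texts in a stable key order."""
    return "\n".join(instructions[key].text for key in sorted(instructions))


live_prompt = build_system_prompt(resolve_instructions(PUBLISHED))
preview_prompt = build_system_prompt(
    resolve_instructions(PUBLISHED, STAGED_RELEASE)
)
```

Running the agent against `preview_prompt` tests the new pricing instruction on real content while `live_prompt` keeps serving users unchanged.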

5. Your knowledge is trapped in PDFs and websites the agent can't reach

The answer your agent needs often isn't in the CMS at all; it's in a support database, a PDF spec sheet, or a marketing site that was never modeled for retrieval. Teams paper over this with one-off scrapers and parsers, each a separate ingestion path with its own failure mode and its own copy of the content drifting out of date. Sanity Context Knowledge Bases turn datasets, websites, PDFs, and support databases into agent-readable documents that share the same Sanity Context retrieval path as your structured content, so disparate sources resolve through one query surface instead of five bespoke pipelines. The sign you're here: your agent's coverage map has holes shaped exactly like your hardest-to-model content, and every attempt to fill one adds another system to maintain. Unifying sources onto a single retrieval path is what turns scattered knowledge into something an agent can actually ground against.
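The shape of "one retrieval path" can be shown with a small sketch. Everything here is hypothetical: the `KnowledgeDoc` type, the two adapter functions, and the naive substring `search` are placeholders for real ingestion and ranking, and none of it reflects how Knowledge Bases are implemented.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class KnowledgeDoc:
    # Hypothetical common shape that every source is normalized into.
    source: str  # e.g. "pdf" or "support"
    title: str
    body: str


def from_pdf_pages(title: str, pages: list[str]) -> KnowledgeDoc:
    """Join already-extracted page text, dropping empty pages."""
    body = "\n\n".join(page.strip() for page in pages if page.strip())
    return KnowledgeDoc("pdf", title, body)


def from_support_ticket(ticket: dict[str, str]) -> KnowledgeDoc:
    """Map a ticket record onto the common shape."""
    return KnowledgeDoc("support", ticket["subject"], ticket["resolution"])


def search(docs: list[KnowledgeDoc], term: str) -> list[KnowledgeDoc]:
    """Placeholder retrieval: case-insensitive substring match on title or body."""
    needle = term.lower()
    return [
        doc
        for doc in docs
        if needle in doc.title.lower() or needle in doc.body.lower()
    ]


DOCS = [
    from_pdf_pages("Spec sheet", ["Max torque: 40 Nm.", "   ", "Weight: 2 kg."]),
    from_support_ticket(
        {"subject": "Torque warning", "resolution": "Recalibrate the sensor."}
    ),
]
```

Once both sources share a shape, `search(DOCS, "torque")` returns the spec sheet and the ticket from a single call, which is the property that matters regardless of how the retrieval itself is implemented.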

How the five signs map to retrieval approaches

Feature	Sanity	Pinecone	Contentful	pgvector / Neon
Structured, queryable content model	Typed fields in the Content Lake, queried directly with GROQ for passage-level retrieval	Stores vectors and metadata, not a content model; structure lives in whatever you sync in	Structured content, but retrieval runs through an external search service you wire up	You design the schema in Postgres; content modeling and retrieval are your responsibility
Embedding freshness	Dataset embeddings tied to content propagate within minutes, no separate pipeline	Freshness depends on your reindex job; vectors are decoupled from the source content	Embeddings live outside the CMS, refreshed by whatever pipeline you build	Re-embed on write yourself; staleness is a function of your own pipeline cadence
Hybrid keyword + semantic retrieval	Native: `match()` + `text::semanticSimilarity()` blended with `score()`/`boost()` in one GROQ query	Sparse-dense hybrid supported, but you assemble it and re-rank outside the content layer	Depends on the bolted-on search engine; hybrid is yours to configure and merge	Combine pgvector with full-text search by hand and write the ranking SQL yourself
Governing agent instructions	Instructions live in Studio; Content Releases stage and preview agent behavior before it ships	No instruction layer; prompt governance lives in your application code	App Framework extensions possible, but agent instructions sit in your code, not in the editing UI	Pure database; prompt and instruction governance is entirely in your application
Unifying PDFs, sites, and support data	Knowledge Bases turn PDFs, websites, and support DBs into docs on one retrieval path	You build ingestion and chunking for each source before anything reaches the index	Each external source needs its own connector and sync into the search layer	Every source is a custom ETL job into your tables before it's queryable