Schema-Aware AI: How Your Content Model Becomes Your Agent's Secret Weapon

Most retrieval failures aren't model failures. They're content-model failures. An agent can only reason as well as the structure it queries, and when that structure is a flat blob of scraped text, the agent guesses. Schema-aware AI flips the dynamic: when your content is modeled as typed, related, queryable documents, the agent stops pattern-matching against prose and starts reading the same structure your editors maintain. This guide explains why your content model, not your vector index, is the part of the stack that decides whether an agent grounds or hallucinates.

Sanity Context, specifically its Context MCP endpoint, shows up in the examples as a concrete hosted surface for schema reads and GROQ queries. The patterns here, though, apply wherever your content is typed, relational, and queryable.

The hidden cost of unstructured retrieval

Teams adopting RAG usually start by scraping documentation, support tickets, and product pages into a pile of text, chunking it, and embedding the chunks. It demos well. Then it meets production. The agent answers confidently about a feature that shipped two releases ago, cites a deprecated price, or blends two product lines into one because both lived near each other in a chunk. None of this is the model being dumb; it's the retrieval layer handing the model context that has no notion of what a 'product', a 'version', or an 'owner' is. A chunk knows it is roughly 400 tokens of text. It does not know it is the pricing for the Enterprise tier as of the current release. That semantic context, the part that makes an answer correct rather than plausible, lived in your content model and got flattened away the moment you embedded raw prose. Every downstream guardrail is then an attempt to reconstruct structure you already had and threw out.

What 'schema-aware' actually means

A content model is the set of types, fields, and relationships that describe your content: a product has a name, a tier, a current version, and references to the docs that describe it. Schema-aware retrieval means the agent queries against those types and relationships, not against an undifferentiated text index. The difference shows up the moment a question has constraints. 'What's the rate limit on the current Pro plan?' is a filter-and-traverse problem (find the Pro plan, follow it to the active release, read one field) long before it is a similarity problem. A flat vector index can only approximate that by hoping the right sentence embedded near the question. A typed model answers it precisely because the constraints map onto fields. The schema is also where governance lives: who owns a fact, when it was last reviewed, whether it's published or still in draft. An agent that can read those fields can refuse to answer on stale or unpublished content instead of treating everything in the index as equally true.
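The filter-and-traverse idea can be sketched in a few lines of plain Python. This is a minimal illustration rather than any particular product's API: the `Plan` and `Release` types, their field names, the sample values, and the 180-day review window are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Release:
    version: str
    rate_limit_per_min: int
    published: bool
    last_reviewed: date


@dataclass
class Plan:
    name: str
    active_release: Release  # the reference followed during traversal


def rate_limit_for(plans: list[Plan], plan_name: str, today: date,
                   max_age_days: int = 180) -> Optional[int]:
    """Filter by plan name, traverse to the active release, read one field.

    Returns None when the plan is missing, the release is unpublished, or
    the content is overdue for review, so the caller can decline to answer.
    """
    plan = next((p for p in plans if p.name == plan_name), None)
    if plan is None:
        return None
    release = plan.active_release
    if not release.published:
        return None
    if today - release.last_reviewed > timedelta(days=max_age_days):
        return None
    return release.rate_limit_per_min


# Illustrative data only; the numbers are invented for the sketch.
plans = [
    Plan("Pro", Release("3.2", 600, True, date(2024, 5, 1))),
    Plan("Enterprise", Release("3.2", 5000, False, date(2024, 5, 1))),
]

print(rate_limit_for(plans, "Pro", today=date(2024, 6, 1)))         # 600
print(rate_limit_for(plans, "Enterprise", today=date(2024, 6, 1)))  # None
```

The point of the sketch is that every constraint in the question maps to a field, and the governance fields give the agent a principled reason to return nothing.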

Retrieval that reads the structure: Content Lake and GROQ

Sanity Context (previously Agent Context) is built on the Content Lake, Sanity's queryable content store and the backbone of the retrieval path. Because content lives there as typed, related documents rather than as a separate scraped copy, an agent can query the same structure your editors maintain. The query language is GROQ, and for grounding it does something most stacks split across two systems: hybrid retrieval in a single query. You blend `text::semanticSimilarity()` for meaning with a BM25-style `match()` for the exact terms that embeddings routinely fuzz (product names, error codes, SKUs), and combine them with `score()` and `boost()` to weight the result. That means an agent can find the conceptually right document and the literally correct field in one pass, filtered by type, by tier, by published status. The retrieval respects the schema instead of fighting it, so the context window fills with structured facts rather than adjacent prose.
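To make the blending concrete without guessing at GROQ function signatures, here is a toy hybrid ranker in plain Python. It is a sketch of the general technique (filter on typed fields, then add a boosted exact-term score to a semantic similarity score), not of how the Content Lake implements it. The document shape, the two-dimensional embeddings, the `errorDoc` type, and the `term_boost` weight are assumptions for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; stands in for the semantic half of the score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def exact_term_hits(query_terms: list[str], text: str) -> int:
    """Count query terms that appear verbatim as whitespace-separated tokens."""
    tokens = set(text.lower().split())
    return sum(1 for term in query_terms if term.lower() in tokens)


def hybrid_search(docs, query_vec, query_terms, doc_type, term_boost=2.0):
    """Filter by type and published status, then rank by the blended score."""
    candidates = [d for d in docs if d["type"] == doc_type and d["published"]]
    scored = []
    for d in candidates:
        semantic = cosine(query_vec, d["embedding"])
        lexical = term_boost * exact_term_hits(query_terms, d["body"])
        scored.append((semantic + lexical, d["id"]))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Hypothetical documents with made-up embeddings.
docs = [
    {"id": "a", "type": "errorDoc", "published": True,
     "embedding": [1.0, 0.0], "body": "Timeout while connecting"},
    {"id": "b", "type": "errorDoc", "published": True,
     "embedding": [0.6, 0.8], "body": "Error E4012 means the token expired"},
    {"id": "c", "type": "errorDoc", "published": False,
     "embedding": [0.6, 0.8], "body": "Draft notes on E4012"},
]

query_vec = [1.0, 0.0]
ranked = hybrid_search(docs, query_vec, ["E4012"], "errorDoc")
print(ranked)  # ['b', 'a']
```

With `term_boost=0.0` the same call ranks `a` first, which is the failure the paragraph describes: similarity alone prefers the nearby document over the one that literally contains the error code. The unpublished draft `c` never enters the ranking at all.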

Embeddings that stay true to the content

The quietest failure mode in retrieval is drift: the content changes, the index doesn't, and the agent answers from a stale copy with full confidence. It's quiet because nothing errors; the embedding is just describing a sentence that no longer exists in your source. Bolt-on RAG stacks make this structural. The vector database is a separate system from the content backend, so every edit has to fan out through a pipeline that re-chunks, re-embeds, and re-upserts, and the lag between 'editor fixed the docs' and 'agent knows' is however long that pipeline takes to catch up. With Sanity Context, dataset embeddings are tied to the content itself, so updates propagate within minutes and there's no separate vector pipeline to maintain. Knowledge Bases extends the same retrieval path to datasets, websites, PDFs, and support databases, turning them into agent-readable documents that live under the same model and the same freshness guarantee, rather than a second parallel index your team has to keep honest by hand.
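For teams still running a separate vector store, drift can at least be made visible. The sketch below assumes a hypothetical setup in which each index entry stores a hash of the text it was embedded from; comparing that hash against the current source flags entries that describe content that has since changed or been deleted. The document ids and sample strings are invented for the example.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of the text an embedding was computed from."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_entries(source: dict[str, str], index: dict[str, str]) -> list[str]:
    """Return ids whose indexed hash no longer matches the source,
    including ids that were removed from the source entirely."""
    stale = []
    for doc_id, indexed_hash in index.items():
        current = source.get(doc_id)
        if current is None or content_hash(current) != indexed_hash:
            stale.append(doc_id)
    return sorted(stale)


# Illustrative content; the index is built once, at embedding time.
source = {
    "pricing": "Pro costs $49 per month",
    "limits": "600 requests per minute",
}
index = {doc_id: content_hash(text) for doc_id, text in source.items()}

# An editor fixes the docs, but nothing re-embeds the change.
source["pricing"] = "Pro costs $59 per month"

print(stale_entries(source, index))  # ['pricing']
```

A check like this only detects drift; something still has to re-embed the flagged entries, which is the maintenance burden the paragraph above is describing.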

Governing what the agent is allowed to say

Schema-awareness isn't only about what the agent can find; it's also about what it's allowed to say and how that behavior changes over time. Agent instructions are content too, and treating them as code buried in a repo means only engineers can change them and nobody can preview the effect. In Sanity, editors govern agent instructions in Studio and stage behavior through Content Releases, the same workflow used to stage the website. You can draft a change to how the agent answers, review it, and ship it on a release boundary instead of hot-patching a prompt in production. Agent Actions provide schema-aware APIs for the LLM-driven workflows that produce content in the first place (generate, transform, translate), so the writing path and the reading path share one model. Production agents connect through the Sanity Context MCP endpoint, which is shaped to the product rather than to a generic text index, so the agent queries content the way the schema describes it.
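The staging idea can be illustrated with a small, self-contained model. This is not Sanity's API: the `InstructionVersion` type, the `release` field, and the release name `spring-launch` are hypothetical, and the sketch only shows the resolution rule, namely that production reads the published instructions while a preview of a named release reads the staged version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstructionVersion:
    text: str
    release: Optional[str]  # None means published; otherwise staged in that release


def resolve_instructions(versions: list[InstructionVersion],
                         preview_release: Optional[str] = None) -> str:
    """Return the staged text when previewing its release, else the published text."""
    if preview_release is not None:
        for version in versions:
            if version.release == preview_release:
                return version.text
    for version in versions:
        if version.release is None:
            return version.text
    raise LookupError("no published instructions found")


# Hypothetical instruction documents.
versions = [
    InstructionVersion("Answer only from published docs.", None),
    InstructionVersion(
        "Answer only from published docs and cite the source document.",
        "spring-launch",
    ),
]

print(resolve_instructions(versions))                    # published behavior
print(resolve_instructions(versions, "spring-launch"))   # staged behavior, for review
```

Because the staged version is just another document, reviewing an agent behavior change looks the same as reviewing any other content change, and shipping it means publishing the release rather than redeploying code.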