Schema-Aware AI: How Your Content Model Becomes Your Agent's Secret Weapon
Most AI agents see your content as a wall of text. Schema-aware agents understand your data model, field types, and document relationships, which is why they give better answers.
When you connect a standard RAG pipeline to your content, the AI sees text. It does not know that the number 149.99 is a price. It does not know that the string John Chen is an author reference, not a product name. It does not know that a specific paragraph is a legal disclaimer that should never be paraphrased.
Your AI agent is flying blind through your content, guessing at meaning from word proximity.
Schema-aware AI changes this completely.
When an agent understands your content model, it knows the exact shape of every document type: which fields are strings, which are numbers, which are references to other documents, and which are arrays of objects. This turns retrieval from a similarity guessing game into precise, structured querying. A Content Operating System makes this possible by treating schema as code and exposing it to agents at connection time.
What Schema Awareness Means in Practice
A schema-aware agent:
- Queries product.price as a number, not text mentioning "price".
- Follows article.author references to retrieve the full author profile instead of guessing from nearby text.
- Reads structured comparison arrays instead of scraping HTML tables.
This eliminates entire classes of hallucinations:
- It can't invent a price if it must read the price field.
- It can't misattribute a quote if it must follow the author reference.
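The two guarantees above can be sketched in a few lines of TypeScript. This is a minimal illustration, not Sanity's implementation: the Author, Article, and Product shapes and the sample data are hypothetical, and only the _ref key follows the way Sanity stores references.

```typescript
// Hypothetical document shapes for illustration; not taken from a real schema.
interface Author { _id: string; name: string }
interface Article { _id: string; title: string; author: { _ref: string } }
interface Product { _id: string; name: string; price: number }

// Read the price from the typed field rather than parsing it out of prose.
function readPrice(product: Product): number {
  return product.price;
}

// Resolve the author by following the reference ID, not by matching names in text.
function resolveAuthor(article: Article, authors: Map<string, Author>): Author | undefined {
  return authors.get(article.author._ref);
}

// Sample data (made up for this sketch).
const authors = new Map<string, Author>([
  ["author-1", { _id: "author-1", name: "John Chen" }],
]);
const article: Article = { _id: "article-1", title: "Trail Guide", author: { _ref: "author-1" } };
const product: Product = { _id: "product-1", name: "Trail Tent", price: 149.99 };

console.log(readPrice(product)); // 149.99
console.log(resolveAuthor(article, authors)?.name); // John Chen
```

Because the value comes from a number field and the author comes from a reference lookup, there is no step at which the agent has to infer either from surrounding text.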
How Agent Context Delivers Schema Awareness
Sanity's Agent Context exposes three MCP tools to connected agents:
initial_context
- A compressed overview of your entire schema.
- Lists document types, their fields, and document counts.
- Gives the agent a mental map of your content model before it queries anything.
schema_explorer
- Deep inspection of any specific type.
- Shows field types, validation rules, and reference targets.
- Lets the agent reason about how data is structured and constrained.
groq_query
- Executes GROQ queries directly against your dataset.
- Combines structural filters, projections, references, and search in one query.
Together, these tools let the agent understand your schema first, then query your content with precision.
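The "schema first, then query" ordering can be sketched as a small check that runs before any query is built. The SchemaOverview shape below is a hypothetical, simplified stand-in for what an agent might hold after inspecting the schema; it is not the actual output format of initial_context or schema_explorer, and buildProjectionQuery is an illustrative helper, not part of any Sanity package.

```typescript
// Hypothetical, simplified schema overview; the real tool output may look different.
type FieldType = "string" | "number" | "boolean" | "reference" | "array";
type SchemaOverview = Record<string, Record<string, FieldType>>;

const overview: SchemaOverview = {
  product: { name: "string", price: "number", inStock: "boolean", category: "reference" },
  category: { title: "string" },
};

// Build a GROQ query only from a type and fields that the schema actually declares.
function buildProjectionQuery(schema: SchemaOverview, typeName: string, fields: string[]): string {
  const declared = schema[typeName];
  if (!declared) {
    throw new Error(`Unknown document type: ${typeName}`);
  }
  const unknown = fields.filter((f) => !Object.prototype.hasOwnProperty.call(declared, f));
  if (unknown.length > 0) {
    throw new Error(`Unknown fields on ${typeName}: ${unknown.join(", ")}`);
  }
  return `*[_type == "${typeName}"]{${fields.join(", ")}}`;
}

console.log(buildProjectionQuery(overview, "product", ["name", "price"]));
// *[_type == "product"]{name, price}
```

A request for a field that is not in the schema fails before a query is issued, which is the behaviour that distinguishes a schema-aware agent from one that guesses field names.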
The Hybrid Search Advantage for Schema-Aware Agents
Schema awareness augments semantic search rather than replacing it.
For a conceptual question like "What are your best options for outdoor activities?" the agent can:
- Use text::semanticSimilarity() to find content related to outdoor activities by meaning.
- Apply typed filters using the schema:
  - Availability (e.g. inStock == true)
  - Price range (numeric comparisons on price)
  - Customer rating (e.g. rating > 4.5)
- Use BM25 match() to catch exact product names or keywords.
Sanity's native hybrid search lets all three signals (semantic similarity, BM25, and structured filters) run in a single GROQ query.
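As a rough sketch of the structured-filter part only, an agent could turn typed constraints into a parameterized GROQ filter like this. The ProductFilters shape and the buildFilter helper are hypothetical; the field names (inStock, price, rating) are the ones used in the examples above and would need to match your own schema. The semantic and keyword scoring would be piped onto the resulting filter.

```typescript
// Hypothetical filter shape; field names follow the examples in the text.
interface ProductFilters {
  inStock?: boolean;
  maxPrice?: number;
  minRating?: number;
}

interface BuiltFilter {
  groq: string;
  params: Record<string, number | boolean>;
}

// Translate typed constraints into a GROQ filter plus query parameters.
function buildFilter(filters: ProductFilters): BuiltFilter {
  const clauses: string[] = ['_type == "product"'];
  const params: Record<string, number | boolean> = {};

  if (filters.inStock !== undefined) {
    clauses.push("inStock == $inStock");
    params.inStock = filters.inStock;
  }
  if (filters.maxPrice !== undefined) {
    clauses.push("price <= $maxPrice");
    params.maxPrice = filters.maxPrice;
  }
  if (filters.minRating !== undefined) {
    clauses.push("rating > $minRating");
    params.minRating = filters.minRating;
  }
  return { groq: `*[${clauses.join(" && ")}]`, params };
}

const built = buildFilter({ inStock: true, maxPrice: 200, minRating: 4.5 });
console.log(built.groq);
// *[_type == "product" && inStock == $inStock && price <= $maxPrice && rating > $minRating]
```

Passing values as parameters rather than interpolating them keeps numbers as numbers and booleans as booleans, so the comparison happens on typed fields rather than on text.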
Schema-as-Code for Continuous Improvement
Because Sanity schemas are defined as code:
- Your content model evolves with your application.
- When you add a field like sustainabilityRating to the product schema, Agent Context exposes it automatically.
- Agents can immediately filter and sort by the new field, with no pipeline changes, middleware updates, or re-embedding required.
Every improvement to your schema compounds your agent's capabilities.
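To make the schema-as-code point concrete, here is a minimal sketch using plain objects in the general name/type/fields shape that Sanity schema definitions take. The productV1 and productV2 definitions and the numericFields helper are hypothetical and only illustrate how a field added in code becomes visible to anything that reads the schema.

```typescript
// Plain-object schema definitions (simplified; real schemas carry more options).
interface FieldDef { name: string; type: string }
interface DocumentDef { name: string; type: "document"; fields: FieldDef[] }

const productV1: DocumentDef = {
  name: "product",
  type: "document",
  fields: [
    { name: "name", type: "string" },
    { name: "price", type: "number" },
  ],
};

// The same schema after adding one field in code.
const productV2: DocumentDef = {
  ...productV1,
  fields: [...productV1.fields, { name: "sustainabilityRating", type: "number" }],
};

// Hypothetical helper: fields an agent could use in numeric comparisons or sorting.
function numericFields(doc: DocumentDef): string[] {
  return doc.fields.filter((f) => f.type === "number").map((f) => f.name);
}

console.log(numericFields(productV1)); // [ 'price' ]
console.log(numericFields(productV2)); // [ 'price', 'sustainabilityRating' ]
```

Nothing outside the schema definition changed between the two versions, yet the set of fields available for filtering and sorting grew.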
Getting Started
To see schema-aware AI in action:
- Connect Agent Context to an existing Sanity project.
- Install the plugin: @sanity/agent-context.
- Create an Agent Context document in your dataset.
- Test the MCP endpoint with a compatible agent framework (e.g. Vercel AI SDK).
- Ask the agent to describe your content model using initial_context and schema_explorer.
- Then ask a question that requires structural filtering (e.g. price ranges, availability, ratings) and compare the result to a standard RAG pipeline.
You'll see the difference between an agent guessing from text and an agent reasoning over your schema.
Why Schema Awareness Matters
Schema-Aware GROQ Query With Hybrid Search
An agent with schema awareness constructs this GROQ query dynamically: hybrid search for discovery, structural filters for correctness, reference traversal for category context. The schema is the agent's secret weapon.
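*[_type == "product"]
  // Hybrid scoring: semantic similarity plus boosted keyword matches on name and SKU
  | score(
      text::semanticSimilarity(description, $query),
      boost(match(name, $query), 2),
      boost(match(sku, $query), 3)
    )
  // Structural filters on typed fields
  [price <= $maxPrice && inventory.status == "inStock"]
  // Keep the five highest-scoring products
  | order(_score desc)[0...5]
  {
    _id, name, sku, price,
    inventory { status, quantity },
    variants[]{ color, size },
    // Follow the category reference to its title
    "category": category->title,
    _score
  }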