
Schema-Aware AI: How Your Content Model Becomes Your Agent's Secret Weapon

Most AI agents see your content as a wall of text. Schema-aware agents understand your data model, field types, and document relationships, which is why they give better answers.

When you connect a standard RAG pipeline to your content, the AI sees text. It does not know that the number 149.99 is a price. It does not know that the string John Chen is an author reference, not a product name. It does not know that a specific paragraph is a legal disclaimer that should never be paraphrased.

Your AI agent is flying blind through your content, guessing at meaning from word proximity.

Schema-aware AI changes this completely.

When an agent understands your content model, it knows the exact shape of every document type: which fields are strings, which are numbers, which are references to other documents, and which are arrays of objects. Retrieval stops being a similarity guessing game and becomes precise, structured querying. A Content Operating System makes this possible by treating schema as code and exposing it to agents at connection time.

What Schema Awareness Means in Practice

A schema-aware agent:

  • Queries product.price as a number, not text mentioning “price”.
  • Follows article.author references to retrieve the full author profile instead of guessing from nearby text.
  • Reads structured comparison arrays instead of scraping HTML tables.

This eliminates entire classes of hallucinations:

  • It can’t invent a price if it must read the price field.
  • It can’t misattribute a quote if it must follow the author reference.
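To make the contrast concrete, here is a minimal sketch comparing a text-only guess with a typed field read. The Product shape, the sample document, and both helper functions are hypothetical and only illustrate the idea; they are not part of any Sanity API.

```typescript
// Hypothetical product document shape; field names are illustrative only.
interface Product {
  _type: "product";
  name: string;
  price: number;
  description: string;
}

// Text-only approach: take the first number that appears in free text.
function guessPriceFromText(text: string): number | null {
  const m = text.match(/\d+(?:\.\d+)?/);
  return m ? Number(m[0]) : null;
}

// Schema-aware approach: read the typed field directly.
function readPrice(doc: Product): number {
  return doc.price;
}

// Sample document (invented for this example).
const tent: Product = {
  _type: "product",
  name: "Trail Tent",
  price: 149.99,
  description: "Sleeps 2 people. Was 199.99, now reduced.",
};

console.log(guessPriceFromText(tent.description)); // 2 (the sleeping capacity, not a price)
console.log(readPrice(tent)); // 149.99
```

The text heuristic picks up the first number it sees, while the typed read cannot return anything other than the value stored in the price field.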

How Agent Context Delivers Schema Awareness

Sanity’s Agent Context exposes three MCP tools to connected agents:

  1. initial_context
    • A compressed overview of your entire schema.
    • Lists document types, their fields, and document counts.
    • Gives the agent a mental map of your content model before it queries anything.
  2. schema_explorer
    • Deep inspection of any specific type.
    • Shows field types, validation rules, and reference targets.
    • Lets the agent reason about how data is structured and constrained.
  3. groq_query
    • Executes GROQ queries directly against your dataset.
    • Combines structural filters, projections, references, and search in one query.

Together, these tools let the agent understand your schema first, then query your content with precision.
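The order of operations can be sketched as follows. This is only an illustration of the flow: the ToolCall type, the stub, and the argument names (type, query) are assumptions made for this example, and the real tool inputs may differ. Only the three tool names come from the list above.

```typescript
// Hypothetical minimal tool-calling interface. A real MCP client exposes
// something comparable, but this signature is an assumption for the sketch.
type ToolCall = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// Understand the schema first, then query the content.
async function answerWithSchema(callTool: ToolCall, typeName: string, groq: string) {
  // 1. Get the compressed overview of the whole schema.
  const overview = await callTool("initial_context", {});
  // 2. Inspect one type in depth ("type" is an assumed argument name).
  const typeDetails = await callTool("schema_explorer", { type: typeName });
  // 3. Run the GROQ query ("query" is an assumed argument name).
  const results = await callTool("groq_query", { query: groq });
  return { overview, typeDetails, results };
}

// Stub that records the call order instead of talking to a server.
const calls: string[] = [];
const stubCallTool: ToolCall = async (name, args) => {
  calls.push(name);
  return { name, args };
};

answerWithSchema(stubCallTool, "product", '*[_type == "product"][0...3]').then(() => {
  console.log(calls.join(" -> ")); // initial_context -> schema_explorer -> groq_query
});
```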

The Hybrid Search Advantage for Schema-Aware Agents

Schema awareness augments semantic search rather than replacing it.

For a conceptual question like “What are your best options for outdoor activities?” the agent can:

  1. Use text::semanticSimilarity() to find content related to outdoor activities by meaning.
  2. Apply typed filters using the schema:
    • Availability (e.g. inStock == true)
    • Price range (numeric comparisons on price)
    • Customer rating (e.g. rating > 4.5)
  3. Use BM25 keyword matching (the GROQ match operator) to catch exact product names or keywords.

Sanity’s native hybrid search lets all three signals (semantic similarity, BM25, and structured filters) run in a single GROQ query.
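One way an agent could use field types when assembling such a query is to check each filter against the schema before building the GROQ string. The sketch below assumes a hypothetical slice of a product schema (productFields) and hypothetical helpers (buildFilter, buildHybridQuery). The score expression mirrors the functions named in this article rather than a verified API signature.

```typescript
type FieldType = "string" | "number" | "boolean";

// Hypothetical slice of a product schema: field name -> primitive type.
const productFields: Record<string, FieldType> = {
  name: "string",
  price: "number",
  rating: "number",
  inStock: "boolean",
};

interface Filter {
  field: string;
  op: "==" | ">" | ">=" | "<" | "<=";
  value: string | number | boolean;
}

// Reject filters that reference unknown fields or use the wrong value type.
function buildFilter(filters: Filter[], schema: Record<string, FieldType>): string {
  return filters
    .map(({ field, op, value }) => {
      const expected = schema[field];
      if (expected === undefined) throw new Error(`Unknown field: ${field}`);
      if (typeof value !== expected) {
        throw new Error(`Field ${field} expects ${expected}, got ${typeof value}`);
      }
      return `${field} ${op} ${JSON.stringify(value)}`;
    })
    .join(" && ");
}

// Combine typed filters with semantic and keyword scoring in one query string.
function buildHybridQuery(filters: Filter[]): string {
  const where = ['_type == "product"', buildFilter(filters, productFields)]
    .filter(Boolean)
    .join(" && ");
  return `*[${where}] | score(text::semanticSimilarity(description, $query), name match $query) | order(_score desc)[0...5]`;
}

console.log(
  buildHybridQuery([
    { field: "inStock", op: "==", value: true },
    { field: "rating", op: ">", value: 4.5 },
  ])
);
```

A filter such as { field: "price", op: "<=", value: "cheap" } throws before any query is sent, because the schema says price is a number.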

Schema-as-Code for Continuous Improvement

Because Sanity schemas are defined as code:

  • Your content model evolves with your application.
  • When you add a field like sustainabilityRating to the product schema, Agent Context exposes it automatically.
  • Agents can immediately filter and sort by the new field, with no pipeline changes, middleware updates, or re-embedding required.

Every improvement to your schema compounds your agent’s capabilities.
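As a rough illustration, adding the field could look like the sketch below. It uses the plain-object form of a schema definition so the snippet stays self-contained; in a Studio project these objects are typically wrapped in the defineType and defineField helpers from the sanity package. The other fields and the numeric type chosen for sustainabilityRating are assumptions.

```typescript
// Plain-object sketch of a product document schema.
// Field list and types are illustrative assumptions.
const product = {
  name: "product",
  type: "document",
  title: "Product",
  fields: [
    { name: "name", type: "string", title: "Name" },
    { name: "price", type: "number", title: "Price" },
    // New field: once the schema is deployed, it becomes part of the
    // content model that agents can inspect and filter on.
    { name: "sustainabilityRating", type: "number", title: "Sustainability rating" },
  ],
};

// List the field names an agent would see for this type.
const fieldNames = product.fields.map((f) => f.name);
console.log(fieldNames.join(", ")); // name, price, sustainabilityRating
```

With the field in place, an agent could add a clause such as sustainabilityRating >= 4 to its GROQ filter without any other change to the pipeline.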

Getting Started

To see schema-aware AI in action:

  1. Connect Agent Context to an existing Sanity project.
  2. Install the plugin: @sanity/agent-context.
  3. Create an Agent Context document in your dataset.
  4. Test the MCP endpoint with a compatible agent framework (e.g. Vercel AI SDK).
  5. Ask the agent to describe your content model using initial_context and schema_explorer.
  6. Then ask a question that requires structural filtering (e.g. price ranges, availability, ratings) and compare the result to a standard RAG pipeline.

You’ll see the difference between an agent guessing from text and an agent reasoning over your schema.


Why Schema Awareness Matters

Schema-aware AI turns your content model into an explicit contract between your data and your agent. Instead of guessing from unstructured text, the agent relies on typed fields, references, and validation rules to answer with higher precision and far fewer hallucinations.

Schema-Aware GROQ Query With Hybrid Search

An agent with schema awareness can construct a GROQ query like the one below dynamically: hybrid search for discovery, structural filters for correctness, and reference traversal for category context. The schema is the agent's secret weapon.

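*[_type == "product"]
  | score(
      // Semantic relevance of the description to the user's query.
      text::semanticSimilarity(description, $query),
      // Keyword matches on name and SKU, weighted higher.
      // In GROQ, match is an operator, not a function.
      boost(name match $query, 2),
      boost(sku match $query, 3)
    )
  // Structural filters on typed fields.
  [price <= $maxPrice && inventory.status == "inStock"]
  | order(_score desc)[0...5]
  {
    _id, name, sku, price,
    inventory { status, quantity },
    variants[]{ color, size },
    // Follow the category reference and read its title.
    "category": category->title,
    _score
  }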