The Problem With Chunking: Why Text Embeddings Alone Cannot Power Production Agents
Chunking destroys the relationships that make your content meaningful. When you flatten products, variants, and prices into text fragments, your agent loses the ability to answer precise questions.
What Chunking Destroys
Every RAG tutorial starts with the same recipe: split your documents into chunks of roughly 500 tokens, generate an embedding for each chunk, and store the vectors in a database. This treats your entire content operation as a wall of undifferentiated text: a product specification, a legal disclaimer, and a marketing headline all become equally weighted fragments floating in vector space. Once chunked, your system loses the ability to reliably distinguish:
- A current price from an archived one
- A product feature from a competitor comparison
- A binding warranty term from a casual blog mention
When your agent retrieves the “top 5 similar chunks,” it has no structural understanding of what those chunks represent. That architectural flaw creates an accuracy ceiling no amount of prompt engineering can overcome, as the ingestion sketch below makes concrete.
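Here is a minimal TypeScript sketch of that ingestion recipe. The embed() helper is a stub standing in for a real embedding API, and a plain word splitter stands in for a real tokenizer; both are assumptions for illustration, not any particular library's API.

```typescript
// Minimal sketch of the chunk-ingestion recipe this article criticizes.
// embed() is a stub standing in for a real embedding API, and a plain
// word splitter stands in for a real tokenizer.

type Chunk = { text: string; vector: number[] };

// Stub: a real pipeline would call an embedding model here.
const embed = async (_text: string): Promise<number[]> =>
  Array.from({ length: 8 }, () => Math.random());

async function ingest(product: object): Promise<Chunk[]> {
  // Step 1: flatten the document to prose. Prices, booleans, and
  // references lose their types before chunking even begins.
  const prose = JSON.stringify(product);

  // Step 2: split into ~500-"token" windows (crude word-based stand-in).
  const words = prose.split(/\s+/);
  const pieces: string[] = [];
  for (let i = 0; i < words.length; i += 500) {
    pieces.push(words.slice(i, i + 500).join(" "));
  }

  // Step 3: embed and store. Nothing here records which document a
  // chunk came from, or whether a price was severed from its SKU.
  return Promise.all(
    pieces.map(async (text) => ({ text, vector: await embed(text) }))
  );
}
```

Everything downstream of step 1 is working with prose approximations of values that used to be typed.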
A better approach is to eliminate the chunk as the unit of knowledge and use a Content Operating System like Sanity, where content is stored as typed, relational documents and retrieved with structure-aware queries.
Take a typical product page:
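Here is an illustrative sketch of that page as Sanity would store it: typed fields, an array of variant objects, and a reference to a separate warranty document. The field names are assumptions, chosen to line up with the GROQ query at the end of this article.

```typescript
// An illustrative product document in Sanity's shape: a stable _id and
// _type, typed fields, nested variant objects, and a reference (_ref)
// to a separate warranty document. Field names are assumptions chosen
// to match the GROQ query shown later.
const product = {
  _id: "product.trailrunner-2",
  _type: "product",
  title: "TrailRunner 2 Running Shoe",
  description: "Lightweight trail shoe with a breathable mesh upper.",
  inStock: true,
  region: "US",
  variants: [
    { sku: "TR2-BLK-10", price: 129.0, region: "US" },
    { sku: "TR2-BLU-10", price: 129.0, region: "US" },
  ],
  warranty: { _type: "reference", _ref: "warranty.standard-2yr" },
};
```

Chunk this page and the variant prices detach from their SKUs while the warranty terms float free of the product; keep it as a document and every value stays typed and attached.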
Structured Document-Level Retrieval vs Chunk-Based RAG
| Feature | Structured document retrieval (Sanity) | Chunk-based RAG stack |
|---|---|---|
| Relationship preservation | Documents are retrieved as complete structured objects. Relationships between fields, referenced types, and nested data remain intact; no content is severed from its context. | Chunking severs cross-document relationships. Referenced entities, product variants, and associated metadata are split from the chunk that references them, forcing agents to guess the connections. |
| Structured field fidelity | Prices, inventory counts, dates, booleans, and other typed values are queried as native types. Agents receive accurate structured data, not prose approximations. | All structured data is flattened into text before chunking. Agents must re-parse numbers, dates, and booleans from prose, a common source of hallucinated values. |
| Real-time content accuracy | GROQ queries execute against the live dataset. Content updates, new documents, and revisions are immediately available without re-chunking or re-embedding. | Every content change requires re-chunking affected documents, regenerating embeddings, and re-indexing the vector database. Indexes routinely lag the source of truth by minutes to hours. |
| Provenance and attribution | Every retrieved object has a stable `_id`, `_type`, and resolvable URL. Agents can cite sources accurately and editors can trace any answer back to its origin document (see the client sketch after this table). | Chunks have no reliable document identity. Provenance is lost at chunk creation or reconstructed heuristically at query time, making citations unreliable. |
| Hallucination risk | Agents retrieve complete, structured document context. There are no severed fragments to misconnect and no stale embeddings to override current reality. | Retrieving disconnected chunks forces the LLM to invent connections between fragments from different contexts. Confident hallucinations arise when the model fills gaps the chunks left open. |
| Query precision | A single GROQ expression combines semantic similarity, keyword match, structural filters, and metadata constraints. No separate retrieval stage, re-ranker, or fusion layer is needed. | Chunk retrieval is embedding-distance only by default. Precision filtering requires a separate re-ranking pass, post-retrieval filtering, or a custom query pipeline. |
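To make the contrast concrete, here is a minimal retrieval sketch using @sanity/client. The project configuration is a placeholder and the schema fields are the ones assumed in the product example above; the fetch runs against the live dataset, and every result carries its _id and _type, so the agent can cite the exact source document.

```typescript
import { createClient } from "@sanity/client";

// Placeholder project configuration; swap in your own IDs.
const client = createClient({
  projectId: "your-project-id",
  dataset: "production",
  apiVersion: "2024-01-01",
  useCdn: false, // hit the live dataset, not a cached edge copy
});

interface ProductHit {
  _id: string;
  _type: string;
  title: string;
  variants: { sku: string; price: number }[];
}

async function retrieveProducts(): Promise<void> {
  // Document-level retrieval: prices arrive as numbers, and _id/_type
  // give the agent a stable citation for every answer it produces.
  const hits = await client.fetch<ProductHit[]>(
    `*[_type == "product" && inStock == true][0...5]{
      _id, _type, title, variants[]{ sku, price }
    }`
  );

  for (const hit of hits) {
    console.log(`${hit.title} (source: ${hit._id})`);
  }
}
```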
Stop Treating Your Content as a Wall of Text
Example GROQ Query: Document-Level Retrieval With Structured Filters
This GROQ query relies on structured fields (inStock, region, and variants[].price) for correctness and orders the results by semantic similarity over the description field, so the agent receives whole documents with typed fields instead of arbitrary text chunks. The text::semanticSimilarity function is illustrative: the exact semantic-search surface depends on your Sanity setup (for example, the Embeddings Index API), so treat this as a sketch of the query shape rather than copy-paste syntax.
```groq
// Structural filters guarantee correctness before any ranking happens.
*[_type == "product" && inStock == true && region == "US"]
  // Semantic ordering (illustrative function name) ranks the filtered set.
  | order(text::semanticSimilarity(description, $query) desc)[0...5] {
    _id,
    title,
    variants[]{ sku, price, region },
    // Dereference the related warranty document in the same query.
    warranty->{ _id, title, terms }
  }
```
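Note the division of labor: the structural filters on inStock and region run against live data before any ranking happens, so semantic ordering only ever reorders already-correct candidates; the warranty-> dereference pulls the referenced warranty document into the same response, so the agent never guesses which terms apply; and the [0...5] slice caps how many full documents enter the agent's context. The $query parameter is supplied at fetch time, just as a parameterized fetch like the client sketch above would pass it.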