Top 5 Anti-Patterns for Connecting Agents to Your CMS

An agent answers a customer's billing question with a policy that was retired six months ago. The text is fluent, confident, and wrong, and it came straight out of your own CMS. The team's first instinct is to blame the model, so they swap to a bigger one and the same stale answer comes back, because the model was never the problem. The retrieval path was.

The retrieval path is also where patterns diverge fast. Tools like Sanity Context (Sanity's agent-facing product, surfaced primarily through Context MCP) expose schema, GROQ queries, and reference traversal directly to the agent loop, so the structure of your content becomes part of retrieval, not an afterthought. Most of the anti-patterns below are shortcuts that skip exactly that.

Connecting an agent to your CMS sounds like plumbing: point a retriever at the content, return the top chunks, let the model summarize. In practice the way you connect the two decides whether the agent grounds itself in current, governed content or improvises against fragments. Most production failures trace back to a handful of architectural shortcuts that looked reasonable on day one and quietly rotted by month three.

This article ranks the five anti-patterns we see most often, worst first. For each one we name the failure mode, show how it surfaces in production, and describe the connection pattern that avoids it. The throughline: retrieval is not a bolt-on you assemble next to your content; it belongs inside the content backend itself.

1. Copying content into a separate vector store

The most common anti-pattern, and the most expensive to unwind, is treating retrieval as a second database. You stand up a vector store and write an ETL job that reads your CMS, chunks every document, calls an embedding API, and writes the vectors out. It demos beautifully. Then content changes, and you discover you now own two sources of truth that drift apart every minute the sync job is behind.

What it does well: it gets you a working semantic search prototype in an afternoon, and the vector store itself is genuinely fast at nearest-neighbor lookup. For a static corpus that never changes, this is fine.

Where it fits poorly: CMS content is not static. Editors fix a price, retire a policy, or publish a new SKU, and the agent keeps answering from last week's embedding until the pipeline catches up. The failure is silent. Nobody gets paged, because the vector store did return a result; it just returned a stale one. A support agent quoting a discontinued return window does real damage before anyone notices.

Concrete example: a retailer updates a holiday shipping cutoff in the CMS at 9 a.m. The nightly embedding job does not run until 2 a.m. the next morning, so for seventeen hours the agent confidently promises delivery dates that no longer exist. Multiply by every edit, every day.
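The seventeen-hour figure follows directly from the schedule. Here is a minimal sketch of that arithmetic, assuming a once-daily job at a fixed hour; the function name and the calendar date are hypothetical.

```python
from datetime import datetime, timedelta

def stale_window(edit_time: datetime, job_hour: int) -> timedelta:
    """Time between an edit and the next daily embedding run at job_hour."""
    next_run = edit_time.replace(hour=job_hour, minute=0, second=0, microsecond=0)
    if next_run <= edit_time:
        # Today's run already happened, so the edit waits for tomorrow's.
        next_run += timedelta(days=1)
    return next_run - edit_time

# Hypothetical date; only the times of day come from the example above.
edit = datetime(2024, 12, 16, 9, 0)
print(stale_window(edit, job_hour=2))  # 17:00:00
```

Running the job more often shrinks this window but does not remove it. Under these assumptions, the worst case always equals the interval between runs.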

The pattern that avoids it: keep the embeddings tied to the content. Sanity Context uses dataset embeddings inside the Content Lake, so when a document changes the embedding updates within minutes and there is no separate vector pipeline to fall behind. Retrieval reads the same content store editors publish into, not a downstream copy of it.

2. Pure vector search with no keyword path

Once teams have embeddings, the next shortcut is to trust them for everything. Every query becomes a semantic similarity lookup, and the system has no way to match an exact string. This works until a user types a SKU, an error code, a part number, or a product name the model has never internalized, and semantic search returns things that are vaguely on topic but not the specific record asked for.

What it does well: vector search shines on fuzzy, intent-heavy questions. "How do I cancel without a fee" maps to the right policy even when the document never uses the word cancel. That is real and worth keeping.

Where it fits poorly: enterprise content is full of identifiers that demand exact matching. "Does part BX-4470 ship to Canada" is a keyword question wearing a sentence. Pure vector retrieval will happily surface BX-4471 because it is semantically adjacent, and the agent will answer for the wrong part with full confidence.

Concrete example: a documentation agent fielding "what changed in v2.14" returns the v2.4 changelog, because the embeddings see version notes as near-identical and the literal token 2.14 carried no weight. The user gets a plausible, wrong answer.

The pattern that avoids it: blend both signals in a single query. In Sanity Context this is native GROQ, where text::semanticSimilarity() handles intent and a BM25 match() handles the literal tokens, combined with score() and boost() so exact identifiers win when they should. You are not assembling two systems and reconciling their results in application code; the blend happens inside the Content Lake.
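To make the blend concrete, here is a toy illustration in plain Python. It is not GROQ and does not reproduce how Sanity Context scores results. The two-dimensional vectors, the boost value, and every function name are invented for the example. It shows only the general idea: a literal identifier match, when present, should outweigh a small difference in semantic similarity.

```python
import math
import re

def tokens(text: str) -> set[str]:
    """Lowercase tokens; identifiers such as 'BX-4470' or 'v2.14' stay whole."""
    return set(re.findall(r"[a-z0-9]+(?:[.\-][a-z0-9]+)*", text.lower()))

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical catalog: two parts whose toy embeddings are nearly identical.
DOCS = [
    {"id": "BX-4470", "vec": [0.90, 0.10]},
    {"id": "BX-4471", "vec": [0.92, 0.08]},
]

def rank(query: str, query_vec: list[float], boost: float) -> list[str]:
    """Order document ids by semantic score plus a boost per literal id match."""
    def score(doc: dict) -> float:
        exact = len(tokens(query) & tokens(doc["id"]))
        return cosine(query_vec, doc["vec"]) + boost * exact
    return [d["id"] for d in sorted(DOCS, key=score, reverse=True)]

QUERY = "Does part BX-4470 ship to Canada"
QUERY_VEC = [0.92, 0.08]  # assumed: the query embedding sits closest to BX-4471

print(rank(QUERY, QUERY_VEC, boost=0.0))  # ['BX-4471', 'BX-4470'] semantic only
print(rank(QUERY, QUERY_VEC, boost=2.0))  # ['BX-4470', 'BX-4471'] exact id boosted
```

With the boost at zero the ranking is pure vector similarity and the adjacent part wins. With a keyword signal in the score, the part the user actually named comes first.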

3. Dumping unstructured PDFs and HTML straight at the model

The third anti-pattern is feeding the agent your content in whatever shape it happened to live in. PDFs, exported HTML, support-ticket threads, and wiki pages get chunked by character count and thrown into retrieval with no structure. The model receives a soup of fragments where a table caption is separated from its table and a heading is divorced from the paragraph it governs.

What it does well: it is fast to onboard a large existing corpus, and for prose-heavy reference material the loss is tolerable. If your content is genuinely just articles, naive chunking gets you surprisingly far.

Where it fits poorly: business content carries meaning in its structure. A pricing PDF means nothing once the tier labels are split from the prices. A policy document loses its scope when the "applies to" clause lands in a different chunk than the rule. The agent then reassembles meaning that was never there and presents the reconstruction as fact.

Concrete example: a 40-page benefits PDF chunked at 800 characters splits a coverage table across three fragments. Asked "is dental covered for dependents," the agent stitches together a dependent row and a dental column that belonged to different plans, and invents a benefit nobody offers.

The pattern that avoids it: model the content so structure survives retrieval. Sanity Context Knowledge Bases turn datasets, websites, PDFs, and support databases into agent-readable documents that share the same retrieval path, so the agent queries structured records rather than guessing at the original layout from loose chunks.

4. Ungoverned agent instructions living in application code

Anti-pattern four is hiding the agent's behavior where no editor can see it. The system prompt, the retrieval filters, and the tone and escalation rules all live in a Python file or an environment variable. Changing what the agent is allowed to say requires a developer, a pull request, and a deploy. The people who own the content, the brand voice, and the compliance language have no way to touch the thing speaking on their behalf.

What it does well: config in code is versioned, reviewable, and familiar to engineers. For a small team where the prompt rarely changes, it is perfectly serviceable.

Where it fits poorly: agent instructions are content, and they change for content reasons. Legal revises a disclaimer. Marketing tightens the voice. Support adds a new escalation path. When all of that is trapped behind a deploy, the agent's behavior lags the business by exactly one engineering sprint, and there is no staging step where a non-engineer can preview the change before it reaches customers.

Concrete example: a compliance team needs a refund-policy caveat live before a regulatory deadline. Because the instruction lives in code, it waits behind an unrelated release train and ships three days late.

The pattern that avoids it: govern agent instructions where you govern the rest of your content. With Sanity Context, editors manage instructions in the Studio and stage changes through Content Releases, previewing and scheduling agent behavior the same way they stage the website, with Roles & Permissions deciding who can change what.
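As a rough sketch of what "instructions as content" means on the application side: the agent reads its instruction from a content record at request time instead of from a constant in code. The record shape, field names, and selection rule below are hypothetical. They are not Sanity's schema or the Content Releases API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class Instruction:
    key: str              # which behavior this text governs, e.g. "refund-caveat"
    text: str             # the wording editors control
    publish_at: datetime  # when editors scheduled it to go live
    approved: bool        # set by someone with permission to approve

def active_instruction(records: list[Instruction], key: str,
                       now: datetime) -> Optional[Instruction]:
    """Return the most recently published, approved instruction for `key`."""
    live = [r for r in records
            if r.key == key and r.approved and r.publish_at <= now]
    return max(live, key=lambda r: r.publish_at, default=None)

# Hypothetical records an editor might have staged.
RECORDS = [
    Instruction("refund-caveat", "Refunds take 5 days.", datetime(2025, 1, 1), True),
    Instruction("refund-caveat", "Refunds take 10 days.", datetime(2025, 3, 1), True),
    Instruction("refund-caveat", "Draft wording.", datetime(2025, 2, 1), False),
]

print(active_instruction(RECORDS, "refund-caveat", datetime(2025, 2, 15)).text)
# Refunds take 5 days.
print(active_instruction(RECORDS, "refund-caveat", datetime(2025, 3, 1)).text)
# Refunds take 10 days.
```

The point of the design is that the scheduled change takes effect when its publish time arrives, with no deploy in between, and the unapproved draft never reaches the agent.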

5. No evaluation loop, no audit trail, no way to see what the agent retrieved

The final anti-pattern is shipping the connection and walking away. There is no record of which documents the agent pulled for a given answer, no way to replay a bad response, and no governed process for changing retrieval and measuring whether it helped. When a customer reports a wrong answer, the team cannot reconstruct what the agent saw, so they tune blind and hope.

What it does well: nothing, but it is the default, because evaluation feels like a later problem when the demo works. Skipping it is how most teams ship their first agent.

Where it fits poorly: every production agent, immediately. Without traceability you cannot tell a retrieval failure from a model failure, so you waste cycles swapping models when the fix was a stale document or a bad chunk boundary. Without a staging path you push retrieval changes straight to production and learn from customer complaints.

Concrete example: an agent gives two different answers to the same question in one week. With no retrieval log, the team cannot tell whether content changed, the index drifted, or the model was sampled differently. The investigation stalls.
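A retrieval log does not need to be elaborate to unblock that investigation. Here is a minimal sketch, assuming each answer is stored with the documents and revisions it was grounded in plus the model settings. The field names and the comparison logic are illustrative, not any product's log format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetrievalTrace:
    question: str
    documents: tuple[tuple[str, str], ...]  # (document id, revision), in rank order
    model: str
    temperature: float

def explain_difference(a: RetrievalTrace, b: RetrievalTrace) -> list[str]:
    """List which recorded inputs differ between two answers to the same question."""
    reasons = []
    ids_a = [doc_id for doc_id, _ in a.documents]
    ids_b = [doc_id for doc_id, _ in b.documents]
    if ids_a != ids_b:
        reasons.append("retrieval returned different documents")
    elif a.documents != b.documents:
        reasons.append("same documents, different revisions (content changed)")
    if (a.model, a.temperature) != (b.model, b.temperature):
        reasons.append("model settings changed")
    if not reasons:
        reasons.append("recorded inputs identical; difference is likely sampling")
    return reasons

# Hypothetical traces for the two answers given in one week.
monday = RetrievalTrace("What is the refund window?",
                        (("policy-refunds", "rev1"),), "model-a", 0.2)
friday = RetrievalTrace("What is the refund window?",
                        (("policy-refunds", "rev2"),), "model-a", 0.2)

print(explain_difference(monday, friday))
# ['same documents, different revisions (content changed)']
```

With traces like these, the three hypotheses in the example (content changed, index drifted, model sampled differently) become a lookup rather than a guess.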

The pattern that avoids it: connect agents to a content backend that already governs change. Sanity Context exposes an MCP endpoint shaped to the product, so production agents query the same governed content, while Content Releases and Audit logs give editors a staging step and a record of what changed. Sanity is the Content Operating System for the AI era, the shared foundation where content, retrieval, and the instructions that drive agents are versioned and observable together rather than scattered across systems nobody can audit.

How the five anti-patterns play out across common stacks

| Feature | Sanity | Pinecone | Contentful | pgvector / Neon |
| --- | --- | --- | --- | --- |
| Content freshness in retrieval | Dataset embeddings live in the Content Lake and update within minutes of an edit, no separate sync job to fall behind. | Vector store is downstream of your CMS, so freshness depends on the ETL job you own and operate yourself. | Content is current in the CMS, but embeddings live in an external search service you sync and keep in step. | Vectors are fresh only as often as your application re-embeds and writes them, drift is on you to prevent. |
| Hybrid keyword + semantic search | Native: text::semanticSimilarity() and a BM25 match() blended with score() and boost() in a single GROQ query. | Sparse-dense hybrid is supported, but you assemble and tune the blend in application code around the index. | Achieved by wiring an external search provider through the App Framework, then reconciling results yourself. | Combine pgvector distance with tsvector full-text in SQL you write and maintain across both indexes. |
| Structure-preserving ingestion | Knowledge Bases turn datasets, websites, PDFs, and support databases into structured, agent-readable documents. | Stores vectors and metadata; chunking and structure preservation are entirely your pipeline's responsibility. | Strong structured modeling for native content; PDFs and external sources need custom ingestion you build. | A vector column on a row; parsing and structuring source documents happens before the data ever lands here. |
| Editor-governed agent instructions | Editors manage and stage agent instructions in the Studio with Content Releases, no deploy to change behavior. | Out of scope; prompts and instructions live in your application, governed by engineering workflows only. | Could model instructions as content, but staging and applying them to a live agent is custom plumbing. | Database only; instruction governance, staging, and roles are entirely outside its scope. |
| Staging and audit of changes | Content Releases stage retrieval and instruction changes; Audit logs record who changed what and when. | Index operations are loggable via your tooling, but there is no content-level staging or release concept. | Offers content versioning and roles; staging the agent's retrieval behavior specifically is not a built-in. | Standard Postgres logging and migrations; no content-level release or agent-aware audit trail. |
| How agents connect | A Sanity Context MCP endpoint shaped to the product; agents query governed content over one retrieval path. | Query the index over its API; the surrounding content and governance live in systems you join yourself. | Delivery and GraphQL APIs for content; the retrieval and agent layer is composed from additional services. | Connect over standard Postgres drivers; everything above raw query is application code you write. |