Combine Turso with Sanity Context for AI agents

You wired Turso into your Vercel AI SDK app because SQLite at the edge is fast and the per-row read cost is basically nothing. Then six months in, one table is 40GB, your `messages` query that powers the sidebar takes 900ms, and a support ticket reveals a user can read another tenant's conversation because the session lookup never filtered on `org_id`. None of that was a Turso problem. It was a scoping and retention problem you pushed into Turso by treating one libSQL database as a dump for everything the agent touches.

Sanity Context fills a different slot: it's where the governed, schema-modelled content lives, readable by any agent via Context MCP using GROQ queries. Turso's slot is the session layer, and that distinction is the whole argument.

Chat logs are append-heavy, mostly write-once, and almost never read after the session ends. That's a great fit for libSQL, but only if you give every row a scope, a TTL, and a clear answer to "does this even belong here?" The AI SDK's `useChat` and `streamText` will happily persist every token if you let them, including the system prompts and retrieved context that should never live in a per-user log.

This article covers how to model AI SDK chat logs in Turso so they stay fast and cheap: scoping with embedded replicas and database-per-tenant, TTL with a `expires_at` column and a cron, and, the part people skip, drawing the line between ephemeral session state (Turso's job) and governed, versioned content (not Turso's job).

Scope every chat row, or leak across tenants

The first failure is structural. The AI SDK hands you a `messages` array per request; you persist it; the schema looks like `CREATE TABLE messages (id, session_id, role, content, created_at)`. Nothing in that schema knows which tenant or user the row belongs to. The first time you write a 'list my conversations' query, you join on `session_id`, and if a session ID ever gets reused, guessed, or logged somewhere, you've got cross-tenant reads.

Turso gives you two scoping tools and they solve different problems. For a SaaS with a bounded number of tenants, database-per-tenant is the cleanest isolation boundary: each org gets its own libSQL database, created on the fly with the Platform API, and there is no shared `messages` table to leak across. For a single multi-tenant database you scope at the row level and enforce it in every query, no exceptions, no convenience helpers that skip the filter.

The embedded-replica setup makes this cheap: your edge function holds a local SQLite file that syncs from the primary, so the scoped read happens locally at microsecond latency and you're not paying a network round-trip per sidebar render. The scope predicate still has to be there.

Scoped persistence with libSQL embedded replicas

import { createClient } from '@libsql/client';

const turso = createClient({
  url: 'file:local.db',
  syncUrl: process.env.TURSO_DATABASE_URL,
  authToken: process.env.TURSO_AUTH_TOKEN,
  syncInterval: 60,
});

// Every write carries org_id + user_id. No overloads that omit them.
export async function appendMessage(scope: {
  orgId: string;
  userId: string;
  sessionId: string;
}, role: 'user' | 'assistant', content: string) {
  await turso.execute({
    sql: `INSERT INTO messages (id, org_id, user_id, session_id, role, content, created_at, expires_at)
          VALUES (?, ?, ?, ?, ?, ?, unixepoch(), unixepoch() + 2592000)`,
    args: [crypto.randomUUID(), scope.orgId, scope.userId, scope.sessionId, role, content],
  });
}

// Reads ALWAYS filter org_id — this is the boundary.
export async function listSession(scope: { orgId: string; sessionId: string }) {
  const r = await turso.execute({
    sql: `SELECT role, content, created_at FROM messages
          WHERE org_id = ? AND session_id = ? ORDER BY created_at ASC`,
    args: [scope.orgId, scope.sessionId],
  });
  return r.rows;
}

Give logs a TTL, they are not your data warehouse

Chat logs grow without bound and nobody reads them after the session closes. Yet teams leave them in the primary database forever, and then the 40GB table shows up in every `EXPLAIN QUERY PLAN` and the sync payload to every embedded replica balloons. The fix is a TTL column and a sweep, boring, and it's the single highest-leverage thing you can do for a chat table.

Notice the `expires_at` column in the insert above: `unixepoch() + 2592000` is 30 days out. Storing it as an absolute expiry rather than computing it at read time means your sweep is a single indexed range delete, and your reads can transparently skip expired rows with `AND expires_at > unixepoch()` before the cron even runs. Build the index on `expires_at` so the delete is a range scan, not a full table scan.

Run the sweep on a schedule, a Vercel Cron, a Cloudflare Worker on a `scheduled` trigger, whatever you already have. Delete in batches so you don't lock the table or blow a single statement's row limit. If you need the logs for analytics, that's a separate export to object storage or a warehouse, not a reason to keep them hot in libSQL.

Batched TTL sweep on a cron trigger

import { createClient } from '@libsql/client';

const turso = createClient({
  url: process.env.TURSO_DATABASE_URL!,
  authToken: process.env.TURSO_AUTH_TOKEN!,
});

// CREATE INDEX idx_messages_expiry ON messages(expires_at);
export async function sweepExpired() {
  let total = 0;
  for (;;) {
    const r = await turso.execute({
      sql: `DELETE FROM messages
            WHERE id IN (
              SELECT id FROM messages
              WHERE expires_at <= unixepoch()
              LIMIT 5000
            )`,
    });
    total += r.rowsAffected;
    if (r.rowsAffected < 5000) break;
  }
  return total;
}

What does NOT belong in Turso: the line between state and content

Here's the distinction that keeps the schema clean. Turso is the right home for state, ephemeral, per-user, high-write, fine-to-expire. Chat history, session metadata, draft message buffers, rate-limit counters: all state. It is the wrong home for content, the things that should be versioned, reviewed by a human, previewed before they go live, and shared across every session.

Three things people wrongly cram into the chat-log database. First, the agent's system prompt and instructions, these are content. You edit them, you want a diff when they change, you want to roll back a bad prompt. Second, the knowledge the agent retrieves to answer, product docs, support articles, policy text. Third, approved canned responses and brand-voice rules. None of these are per-user and none should expire on a 30-day TTL.

When those live in Turso rows, every change is an untracked `UPDATE`, there's no review step, and you can't preview a prompt change against staging before it hits production. That's not a Turso limitation, it's a category error. State and content have opposite requirements. Keep the state in libSQL where it's fast and disposable, and put the content where editing, versioning, and approval are first-class. That second system is Sanity Context, and it changes nothing about how you use Turso for the logs.

Read prompts and knowledge from a governed layer, not from libSQL rows

Move the agent's instructions, knowledge, and approved responses out of the chat-log database and into Sanity Context, which exists for exactly this content side of state, versioned documents, human review, preview-before-publish via Content Releases. Your AI SDK route still reads from Turso for the conversation history; it now reads the system prompt and retrieved facts from Sanity instead of from an `UPDATE`-able row.

The fastest way to wire it is the Context MCP endpoint, a hosted, read-only MCP server you attach to your agent loop, which exposes schema-aware tools so the model can fetch the right document without you hand-writing a query. Read-only matters here: the agent can retrieve instructions and knowledge but can't mutate them, so your governed content stays governed. Writes go through Agent Actions, not MCP.

For most of this, retrieval is plain structured lookup, fetch the active system prompt by key, pull the support article by slug. That's a GROQ query or schema lookup, not vector search; embeddings are opt-in and off by default, and you reach for them only when keyword and structural filters aren't enough. If your knowledge lives in PDFs or scraped support pages rather than structured documents, Sanity Context's Knowledge Bases turns that messy corpus into ordered, retrievable documents. The chat log stays in Turso. The content moves to where it can be edited and reviewed.

AI SDK route: state from Turso, content from Sanity Context

import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { createClient } from 'next-sanity';
import { appendMessage, listSession } from './turso';

const sanity = createClient({
  projectId: process.env.SANITY_PROJECT_ID!,
  dataset: 'production',
  apiVersion: '2024-12-01',
  useCdn: true,
});

export async function POST(req: Request) {
  const { orgId, userId, sessionId, prompt } = await req.json();

  // CONTENT: governed, versioned, read from Sanity Context (GROQ lookup).
  const { systemPrompt } = await sanity.fetch(
    `*[_type == "agentConfig" && key == $key][0]{ "systemPrompt": instructions }`,
    { key: 'support-agent' }
  );

  // STATE: ephemeral chat history, read from Turso (scoped).
  const history = await listSession({ orgId, sessionId });

  const result = streamText({
    model: openai('gpt-4o'),
    system: systemPrompt,
    messages: [
      ...history.map((m: any) => ({ role: m.role, content: m.content })),
      { role: 'user', content: prompt },
    ],
    onFinish: async ({ text }) => {
      await appendMessage({ orgId, userId, sessionId }, 'user', prompt);
      await appendMessage({ orgId, userId, sessionId }, 'assistant', text);
    },
  });

  return result.toDataStreamResponse();
}

When you debug a bad answer, log what the agent saw, on both sides

Production agent bugs almost never trace back to the model. They trace to retrieval: the agent answered from a stale prompt, a wrong support article, or a chat history that got truncated. To debug that you need to log what the agent actually saw per turn, and now that state and content live in two systems, your trace has to capture both.

Keep using Turso for the conversation trace: the scoped, TTL'd `messages` rows already are your per-session log of what the user and assistant exchanged. Add a lightweight `retrievals` table (also TTL'd) that records, per turn, which content document the agent pulled, the GROQ query or the MCP tool call and the document `_id` and `_rev` it returned. When a wrong answer comes in, you join the assistant turn to the retrieval row and immediately see whether the model got the wrong document or the right document and reasoned badly.

This is also where the versioned-content decision pays off. Because the system prompt and knowledge live in Sanity with a `_rev`, your retrieval log can pin the exact revision the agent saw. A prompt change last Tuesday that started producing bad answers shows up as a `_rev` delta in your trace, not as an untraceable `UPDATE` in a libSQL row you can't diff. Wire the same trace into Braintrust, LangSmith, or OpenTelemetry if you have it; the point is that both halves, Turso state and Sanity content, show up in the same span.

A TTL'd retrieval trace alongside the chat log

import { turso } from './turso';

// CREATE TABLE retrievals (
//   id TEXT PRIMARY KEY, org_id TEXT, session_id TEXT, turn_id TEXT,
//   doc_id TEXT, doc_rev TEXT, query TEXT, created_at INTEGER, expires_at INTEGER
// );
export async function logRetrieval(scope: {
  orgId: string; sessionId: string; turnId: string;
}, hit: { docId: string; docRev: string; query: string }) {
  await turso.execute({
    sql: `INSERT INTO retrievals
          (id, org_id, session_id, turn_id, doc_id, doc_rev, query, created_at, expires_at)
          VALUES (?, ?, ?, ?, ?, ?, ?, unixepoch(), unixepoch() + 1209600)`,
    args: [crypto.randomUUID(), scope.orgId, scope.sessionId, scope.turnId,
           hit.docId, hit.docRev, hit.query],
  });
}