You can’t fix what you can’t see. The manager writes one row to ai_call per attempt — success, error, or cancelled — including tokens, cost, latency, and the prompt version that drove it. That single table powers the admin dashboard at /admin/cost and is all you need to spot a runaway model or a broken provider.

Prerequisites

  • Migrated DB so the ai_call table exists (pnpm db:push).
  • A few real AI calls behind you — the table is empty until then.
  • Read recordCall() in src/ai/manager.ts — that’s where every row is born.

The columns that matter

// src/db/ai.schema.ts (excerpt)
ai_call {
  id              text PRIMARY KEY
  userId          text NULL                            // FK to user
  provider        text NOT NULL                        // 'openai' | 'mock' | ...
  model           text NOT NULL                        // 'gpt-4o-mini'
  operation       text NOT NULL                        // 'chat' | 'chat_stream' | 'image'
  promptId        text NULL                            // FK to prompt
  promptVersionId text NULL                            // FK to prompt_version
  status          text NOT NULL DEFAULT 'pending'      // 'success'|'error'|'cancelled'|...
  streamed        boolean NOT NULL DEFAULT false
  cached          boolean NOT NULL DEFAULT false
  inputTokens     integer NOT NULL DEFAULT 0
  outputTokens    integer NOT NULL DEFAULT 0
  costMicroCents  bigint  NOT NULL DEFAULT 0           // millionths of a dollar; divide by 1_000_000 for $
  ttftMs          integer NULL                         // streaming only
  totalMs         integer NULL
  errorMessage    text NULL
  createdAt       timestamp NOT NULL DEFAULT now()
}
Indexes on userId, (provider, model), status, createdAt, promptId cover the queries the dashboard runs.
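
In application code the same shape can be expressed as a plain TypeScript type. This is a sketch mirroring the schema excerpt above, not the actual Drizzle definition, and the newAiCall() helper is hypothetical (the real defaulting happens in recordCall()):

```typescript
// Row shape of ai_call, mirroring the schema excerpt above (illustrative).
type AiCallStatus = 'pending' | 'success' | 'error' | 'cancelled';

interface AiCall {
  id: string;
  userId: string | null;
  provider: string;                            // 'openai' | 'mock' | ...
  model: string;                               // e.g. 'gpt-4o-mini'
  operation: 'chat' | 'chat_stream' | 'image';
  promptId: string | null;
  promptVersionId: string | null;
  status: AiCallStatus;
  streamed: boolean;
  cached: boolean;
  inputTokens: number;
  outputTokens: number;
  costMicroCents: number;                      // divide by 1_000_000 for $
  ttftMs: number | null;                       // streaming only
  totalMs: number | null;
  errorMessage: string | null;
  createdAt: Date;
}

// Hypothetical helper: build a row with the schema's column defaults.
function newAiCall(
  id: string,
  provider: string,
  model: string,
  operation: AiCall['operation'],
): AiCall {
  return {
    id, provider, model, operation,
    userId: null, promptId: null, promptVersionId: null,
    status: 'pending', streamed: false, cached: false,
    inputTokens: 0, outputTokens: 0, costMicroCents: 0,
    ttftMs: null, totalMs: null, errorMessage: null,
    createdAt: new Date(),
  };
}
```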

Why micro-cents?

OpenAI charges around $0.0001 per 1k input tokens for some models, so a 200-token call costs 0.002¢. Storing cents would force floats and round-off error. A micro-cent here is one millionth of a dollar (0.0001¢), so every cost stays a small integer:
  • $1.00 = 1_000_000 micro-cents
  • $0.0001 = 100 micro-cents
  • Display in dollars: microCents / 1_000_000
The <CostBadge /> primitive does this conversion. Don’t roll your own.
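
The conversion amounts to one division; a sketch of what a formatter like <CostBadge /> does under the hood (function names are mine, not the component's API):

```typescript
// Convert stored micro-cents to dollars: divide by 1_000_000,
// the same convention the dashboard queries use.
function microCentsToDollars(microCents: number): number {
  return microCents / 1_000_000;
}

// Six decimal places keep single-call costs (fractions of a cent) visible.
function formatCost(microCents: number): string {
  return `$${microCentsToDollars(microCents).toFixed(6)}`;
}

// formatCost(1_000_000) → "$1.000000"
// formatCost(100)       → "$0.000100"
```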

Step-by-step: add a usage report

You’ll write three queries: daily cost, per-model error rate, p95 latency. All run straight against ai_call.

Daily spend, last 30 days

SELECT
  date_trunc('day', created_at)            AS day,
  SUM(cost_micro_cents) / 1000000.0        AS dollars,
  COUNT(*)                                 AS calls
FROM ai_call
WHERE created_at >= now() - interval '30 days'
  AND status = 'success'
GROUP BY 1
ORDER BY 1 DESC;

Error rate per model

SELECT
  model,
  COUNT(*) FILTER (WHERE status = 'error')::float / COUNT(*) AS error_rate,
  COUNT(*)                                                   AS attempts
FROM ai_call
WHERE created_at >= now() - interval '7 days'
GROUP BY model
HAVING COUNT(*) > 10
ORDER BY error_rate DESC;
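
The FILTER clause has a direct in-memory analogue. If you ever need the same per-model error rate over rows you have already fetched, it can be sketched like this (the row shape is illustrative):

```typescript
interface CallRow { model: string; status: string }

// Per-model error rate: errors / attempts, the in-memory equivalent of
// COUNT(*) FILTER (WHERE status = 'error')::float / COUNT(*).
// Models with too few attempts are dropped, like the HAVING clause.
function errorRates(rows: CallRow[], minAttempts = 10): Map<string, number> {
  const attempts = new Map<string, number>();
  const errors = new Map<string, number>();
  for (const r of rows) {
    attempts.set(r.model, (attempts.get(r.model) ?? 0) + 1);
    if (r.status === 'error') {
      errors.set(r.model, (errors.get(r.model) ?? 0) + 1);
    }
  }
  const out = new Map<string, number>();
  for (const [model, n] of attempts) {
    if (n > minAttempts) out.set(model, (errors.get(model) ?? 0) / n);
  }
  return out;
}
```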

p95 total latency per model

SELECT
  model,
  percentile_cont(0.95) WITHIN GROUP (ORDER BY total_ms) AS p95_ms
FROM ai_call
WHERE created_at >= now() - interval '24 hours'
  AND status = 'success'
  AND total_ms IS NOT NULL
GROUP BY model;
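
If you need the same p95 off the database (say, over latencies already in memory), percentile_cont's linear-interpolation semantics are easy to replicate; a sketch:

```typescript
// Continuous percentile with linear interpolation, matching Postgres
// percentile_cont semantics for 0 <= p <= 1.
function percentileCont(values: number[], p: number): number {
  if (values.length === 0) throw new Error('empty input');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = p * (sorted.length - 1); // fractional index into sorted values
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  if (lo === hi) return sorted[lo];
  // Interpolate between the two neighbouring values.
  return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
}

// percentileCont([1, 2, 3, 4, 5], 0.5) → 3
```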

The admin dashboard

/admin/cost (gated by adminActionClient, which requires user.role === 'admin') reads ai_call and renders these three queries plus a leaderboard of top spenders. To add your own widget, drop a server component in src/app/[locale]/(app)/admin/cost/, import db, run a query, render. There’s no extra wiring.
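
The top-spenders leaderboard reduces to a group-by-user sum. Assuming the rows are already fetched, the aggregation step can be sketched in plain TypeScript (the row shape and helper name are illustrative, not the dashboard's actual code):

```typescript
interface SpendRow { userId: string | null; costMicroCents: number }

// Sum cost per user and return the top N spenders in dollars.
function topSpenders(
  rows: SpendRow[],
  n: number,
): { userId: string; dollars: number }[] {
  const totals = new Map<string, number>();
  for (const r of rows) {
    if (r.userId === null) continue; // skip calls with no user attached
    totals.set(r.userId, (totals.get(r.userId) ?? 0) + r.costMicroCents);
  }
  return [...totals.entries()]
    .map(([userId, mc]) => ({ userId, dollars: mc / 1_000_000 }))
    .sort((a, b) => b.dollars - a.dollars)
    .slice(0, n);
}
```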

Verify it works

  1. Make a chat call — any model, any provider.
  2. psql and run SELECT provider, model, status, cost_micro_cents, total_ms FROM ai_call ORDER BY created_at DESC LIMIT 1; — one row.
  3. Visit /admin/cost (signed in as admin). The “today” total should reflect your call.
  4. Force an error: pass a bogus model id. Re-check the table — a status = 'error' row appears with errorMessage populated.

Common pitfalls

  • Forgetting a pricing.ts entry → cost = 0. getTokenPrice falls back to mock:any, which still emits a tiny number, but anything you charge real money for should have an explicit row. Audit with SELECT model, SUM(cost_micro_cents) FROM ai_call GROUP BY model HAVING SUM(cost_micro_cents) < 100;
  • Confusing micro-cents with cents (off by 100). Spent two hours wondering why margins look amazing? You divided by 10_000 instead of 1_000_000.
  • Ignoring failed calls. They cost money too — Anthropic charges for failed streams past TTFT. Filter status = 'success' only when you mean “what worked”; for total cost, sum over all rows.
  • Cached rows look free. The manager sets cached = true and costMicroCents = 0 for cache hits. That’s correct — you didn’t pay — but if you’re sizing capacity, exclude cached rows from latency reports too.
  • ttftMs null for non-streaming. It’s only populated for operation = 'chat_stream'. Don’t aggregate ttftMs over chat (non-streaming) rows; they’re all NULL.
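
The first pitfall — a missing pricing entry silently producing near-zero cost — can be sketched like this. The price table, fallback, and function are hypothetical; the real lookup lives in pricing.ts / getTokenPrice:

```typescript
// Hypothetical per-1k-input-token prices in micro-cents (millionths of a
// dollar). A model missing from this table falls through to a tiny fallback
// price — exactly the bug the audit query above hunts for.
const PRICE_PER_1K_INPUT: Record<string, number> = {
  'gpt-4o-mini': 150, // $0.00015 per 1k input tokens, illustrative
};
const FALLBACK_PER_1K = 1; // near-zero mock price that masks a missing entry

function inputCostMicroCents(model: string, inputTokens: number): number {
  const per1k = PRICE_PER_1K_INPUT[model] ?? FALLBACK_PER_1K;
  return Math.round((inputTokens / 1000) * per1k);
}

// inputCostMicroCents('gpt-4o-mini', 2000)   → 300 (correct)
// inputCostMicroCents('no-such-model', 2000) → 2   (suspiciously cheap)
```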

Official docs