You can’t fix what you can’t see. The manager writes one row to ai_call per attempt — success, error, or cancelled — including tokens, cost, latency, and the prompt version that drove it. That single table powers the admin dashboard at /admin/cost and is all you need to spot a runaway model or a broken provider.

Prerequisites

  • Migrated DB so the ai_call table exists (pnpm db:push).
  • A few real AI calls behind you — the table is empty until then.
  • Read recordCall() in src/ai/manager.ts — that’s where every row is born.

The columns that matter

// src/db/ai.schema.ts (excerpt)
ai_call {
  id              text PRIMARY KEY
  userId          text NULL                            // FK to user
  provider        text NOT NULL                        // 'openai' | 'mock' | ...
  model           text NOT NULL                        // 'gpt-4o-mini'
  operation       text NOT NULL                        // 'chat' | 'chat_stream' | 'image'
  promptId        text NULL                            // FK to prompt
  promptVersionId text NULL                            // FK to prompt_version
  status          text NOT NULL DEFAULT 'pending'      // 'success'|'error'|'cancelled'|...
  streamed        boolean NOT NULL DEFAULT false
  cached          boolean NOT NULL DEFAULT false
  inputTokens     integer NOT NULL DEFAULT 0
  outputTokens    integer NOT NULL DEFAULT 0
  costMicroCents  bigint  NOT NULL DEFAULT 0           // millionths of a dollar; divide by 1_000_000 for $
  ttftMs          integer NULL                         // streaming only
  totalMs         integer NULL
  errorMessage    text NULL
  createdAt       timestamp NOT NULL DEFAULT now()
}
Indexes on userId, (provider, model), status, createdAt, promptId cover the queries the dashboard runs.
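
In application code the same shape can be expressed as a plain TypeScript type. This is a sketch mirroring the schema excerpt above, not the actual Drizzle definition, and the newAiCall() helper is hypothetical (the real defaulting happens in recordCall()):

```typescript
// Row shape of ai_call, mirroring the schema excerpt above (illustrative).
type AiCallStatus = 'pending' | 'success' | 'error' | 'cancelled';

interface AiCall {
  id: string;
  userId: string | null;
  provider: string;                            // 'openai' | 'mock' | ...
  model: string;                               // e.g. 'gpt-4o-mini'
  operation: 'chat' | 'chat_stream' | 'image';
  promptId: string | null;
  promptVersionId: string | null;
  status: AiCallStatus;
  streamed: boolean;
  cached: boolean;
  inputTokens: number;
  outputTokens: number;
  costMicroCents: number;                      // divide by 1_000_000 for $
  ttftMs: number | null;                       // streaming only
  totalMs: number | null;
  errorMessage: string | null;
  createdAt: Date;
}

// Hypothetical helper: build a row with the schema's column defaults.
function newAiCall(
  id: string,
  provider: string,
  model: string,
  operation: AiCall['operation'],
): AiCall {
  return {
    id, provider, model, operation,
    userId: null, promptId: null, promptVersionId: null,
    status: 'pending', streamed: false, cached: false,
    inputTokens: 0, outputTokens: 0, costMicroCents: 0,
    ttftMs: null, totalMs: null, errorMessage: null,
    createdAt: new Date(),
  };
}
```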

Why micro-cents?

OpenAI charges around $0.0001 per 1k input tokens for some models, so a 200-token call costs 0.002¢. Storing cents would force floats and round-off error. A micro-cent here is one millionth of a dollar (0.0001¢), so every cost stays a small integer:
  • $1.00 = 1_000_000 micro-cents
  • $0.0001 = 100 micro-cents
  • Display in dollars: microCents / 1_000_000
The <CostBadge /> primitive does this conversion. Don’t roll your own.
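
The conversion amounts to one division; a sketch of what a formatter like <CostBadge /> does under the hood (function names are mine, not the component's API):

```typescript
// Convert stored micro-cents to dollars: divide by 1_000_000,
// the same convention the dashboard queries use.
function microCentsToDollars(microCents: number): number {
  return microCents / 1_000_000;
}

// Six decimal places keep single-call costs (fractions of a cent) visible.
function formatCost(microCents: number): string {
  return `$${microCentsToDollars(microCents).toFixed(6)}`;
}

// formatCost(1_000_000) → "$1.000000"
// formatCost(100)       → "$0.000100"
```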

Step-by-step: add a usage report

You’ll write three queries: daily cost, per-model error rate, p95 latency. All run straight against ai_call.

Daily spend, last 30 days

SELECT
  date_trunc('day', created_at)            AS day,
  SUM(cost_micro_cents) / 1000000.0        AS dollars,
  COUNT(*)                                 AS calls
FROM ai_call
WHERE created_at >= now() - interval '30 days'
  AND status = 'success'
GROUP BY 1
ORDER BY 1 DESC;

Error rate per model

SELECT
  model,
  COUNT(*) FILTER (WHERE status = 'error')::float / COUNT(*) AS error_rate,
  COUNT(*)                                                   AS attempts
FROM ai_call
WHERE created_at >= now() - interval '7 days'
GROUP BY model
HAVING COUNT(*) > 10
ORDER BY error_rate DESC;
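
The FILTER clause has a direct in-memory analogue. If you ever need the same per-model error rate over rows you have already fetched, it can be sketched like this (the row shape is illustrative):

```typescript
interface CallRow { model: string; status: string }

// Per-model error rate: errors / attempts, the in-memory equivalent of
// COUNT(*) FILTER (WHERE status = 'error')::float / COUNT(*).
// Models with too few attempts are dropped, like the HAVING clause.
function errorRates(rows: CallRow[], minAttempts = 10): Map<string, number> {
  const attempts = new Map<string, number>();
  const errors = new Map<string, number>();
  for (const r of rows) {
    attempts.set(r.model, (attempts.get(r.model) ?? 0) + 1);
    if (r.status === 'error') {
      errors.set(r.model, (errors.get(r.model) ?? 0) + 1);
    }
  }
  const out = new Map<string, number>();
  for (const [model, n] of attempts) {
    if (n > minAttempts) out.set(model, (errors.get(model) ?? 0) / n);
  }
  return out;
}
```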

p95 total latency per model

SELECT
  model,
  percentile_cont(0.95) WITHIN GROUP (ORDER BY total_ms) AS p95_ms
FROM ai_call
WHERE created_at >= now() - interval '24 hours'
  AND status = 'success'
  AND total_ms IS NOT NULL
GROUP BY model;
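
If you need the same p95 off the database (say, over latencies already in memory), percentile_cont's linear-interpolation semantics are easy to replicate; a sketch:

```typescript
// Continuous percentile with linear interpolation, matching Postgres
// percentile_cont semantics for 0 <= p <= 1.
function percentileCont(values: number[], p: number): number {
  if (values.length === 0) throw new Error('empty input');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = p * (sorted.length - 1); // fractional index into sorted values
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  if (lo === hi) return sorted[lo];
  // Interpolate between the two neighbouring values.
  return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
}

// percentileCont([1, 2, 3, 4, 5], 0.5) → 3
```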

The admin dashboard

/admin/cost (gated by adminActionClient, which requires user.role === 'admin') reads ai_call and renders these three queries plus a leaderboard of top spenders. To add your own widget, drop a server component in src/app/[locale]/(app)/admin/cost/, import db, run a query, render. There’s no extra wiring.
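
The top-spenders leaderboard reduces to a group-by-user sum. Assuming the rows are already fetched, the aggregation step can be sketched in plain TypeScript (the row shape and helper name are illustrative, not the dashboard's actual code):

```typescript
interface SpendRow { userId: string | null; costMicroCents: number }

// Sum cost per user and return the top N spenders in dollars.
function topSpenders(
  rows: SpendRow[],
  n: number,
): { userId: string; dollars: number }[] {
  const totals = new Map<string, number>();
  for (const r of rows) {
    if (r.userId === null) continue; // skip calls with no user attached
    totals.set(r.userId, (totals.get(r.userId) ?? 0) + r.costMicroCents);
  }
  return [...totals.entries()]
    .map(([userId, mc]) => ({ userId, dollars: mc / 1_000_000 }))
    .sort((a, b) => b.dollars - a.dollars)
    .slice(0, n);
}
```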

Verify it works

  1. Make a chat call — any model, any provider.
  2. psql and run SELECT provider, model, status, cost_micro_cents, total_ms FROM ai_call ORDER BY created_at DESC LIMIT 1; — one row.
  3. Visit /admin/cost (signed in as admin). The “today” total should reflect your call.
  4. Force an error: pass a bogus model id. Re-check the table — a status = 'error' row appears with errorMessage populated.

Common pitfalls

  • Forgetting a pricing.ts entry → cost = 0. getTokenPrice falls back to mock:any, which still emits a tiny number, but anything you charge real money for should have an explicit row. Audit with SELECT model, SUM(cost_micro_cents) FROM ai_call GROUP BY model HAVING SUM(cost_micro_cents) < 100;
  • Confusing micro-cents with cents (off by 100). Spent two hours wondering why margins look amazing? You divided by 10_000 instead of 1_000_000.
  • Ignoring failed calls. They cost money too — Anthropic charges for failed streams past TTFT. Filter status = 'success' only when you mean “what worked”; for total cost, sum over all rows.
  • Cached rows look free. The manager sets cached = true and costMicroCents = 0 for cache hits. That’s correct — you didn’t pay — but if you’re sizing capacity, exclude cached rows from latency reports too.
  • ttftMs null for non-streaming. It’s only populated for operation = 'chat_stream'. Don’t aggregate ttftMs over chat (non-streaming) rows; they’re all NULL.
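
The first pitfall — a missing pricing entry silently producing near-zero cost — can be sketched like this. The price table, fallback, and function are hypothetical; the real lookup lives in pricing.ts / getTokenPrice:

```typescript
// Hypothetical per-1k-input-token prices in micro-cents (millionths of a
// dollar). A model missing from this table falls through to a tiny fallback
// price — exactly the bug the audit query above hunts for.
const PRICE_PER_1K_INPUT: Record<string, number> = {
  'gpt-4o-mini': 150, // $0.00015 per 1k input tokens, illustrative
};
const FALLBACK_PER_1K = 1; // near-zero mock price that masks a missing entry

function inputCostMicroCents(model: string, inputTokens: number): number {
  const per1k = PRICE_PER_1K_INPUT[model] ?? FALLBACK_PER_1K;
  return Math.round((inputTokens / 1000) * per1k);
}

// inputCostMicroCents('gpt-4o-mini', 2000)   → 300 (correct)
// inputCostMicroCents('no-such-model', 2000) → 2   (suspiciously cheap)
```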

Official docs