
Persisting Generation History in Postgres

A schema for fal generation metadata with indexes that stay fast at 100k rows. Ships with SQL.


The one table you need

You do not need a star schema. You need one table with enough metadata to answer the five questions that come up weekly:

  • How much did we spend this week, by model?
  • Which user's job is stuck?
  • How often does endpoint X fail?
  • What was the exact prompt for this render?
  • Can I replay last Tuesday's failed jobs?

Here is the minimum that answers all of them.

The schema

SQL

CREATE TABLE generations (
  id          BIGSERIAL PRIMARY KEY,
  request_id  TEXT UNIQUE NOT NULL,
  user_id     TEXT NOT NULL,
  endpoint    TEXT NOT NULL,
  status      TEXT NOT NULL CHECK (status IN
    ('IN_QUEUE','IN_PROGRESS','COMPLETED','FAILED','CANCELLED')),
  inputs      JSONB NOT NULL,
  output_url  TEXT,
  seed        BIGINT,
  duration_s  REAL,
  cost_cents  INTEGER,
  error       TEXT,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  finished_at TIMESTAMPTZ
);

Thirteen columns. inputs holds the whole request payload as JSONB, so you can query prompt parameters without a second table.

The indexes that keep you fast at 100k rows

SQL

CREATE INDEX generations_user_created_idx
  ON generations (user_id, created_at DESC);

CREATE INDEX generations_endpoint_status_idx
  ON generations (endpoint, status);

CREATE INDEX generations_inputs_gin
  ON generations USING GIN (inputs);

Three statements, four indexes: the UNIQUE constraint on request_id already created a unique index for you, so declaring it again would only duplicate it. The first two handle OLTP queries. The GIN index lets you run WHERE inputs @> '{"resolution":"4k"}' without a sequential scan.

Writing the row at submit time

Every job writes a row before the network call returns. That row is your recovery handle.

TYPESCRIPT

const estimated = Math.round(durationSec * pricePerSec * 100);

const { request_id } = await fal.queue.submit(endpoint, {
  input,
  webhookUrl: `${process.env.APP_URL}/api/fal/webhook`,
});

await db.query(
  `INSERT INTO generations (request_id, user_id, endpoint, status, inputs, duration_s, cost_cents)
   VALUES ($1,$2,$3,'IN_QUEUE',$4,$5,$6)`,
  [request_id, userId, endpoint, JSON.stringify(input), durationSec, estimated],
);

cost_cents is an estimate at submit time. For Wan 2.7 ($0.10/sec) and Veo 3.1 ($0.40/sec), the price is deterministic; for Seedance 2.0 ($0.014 per unit), reconcile on completion.
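Reconciliation on completion can be the same arithmetic run against the actual duration from the webhook payload. A minimal sketch; the rate table and function name are illustrative, with the only entry being the Veo 3.1 rate quoted above:

```typescript
// Sketch: recompute cost_cents once the actual duration is known.
// Rates mirror the article's figures; treat this map as config.
const PRICE_PER_SEC: Record<string, number> = {
  "fal-ai/veo3.1": 0.4,
};

function reconcileCostCents(endpoint: string, actualDurationSec: number): number {
  const rate = PRICE_PER_SEC[endpoint];
  if (rate === undefined) {
    throw new Error(`no per-second rate configured for ${endpoint}`);
  }
  // Same formula as the submit-time estimate, now with the real duration.
  return Math.round(actualDurationSec * rate * 100);
}
```

Write the result back in the same webhook UPDATE so cost_cents stops being an estimate once the job finishes.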

Updating on webhook

SQL

UPDATE generations
SET status = $1,
    output_url = $2,
    seed = $3,
    finished_at = now()
WHERE request_id = $4
  AND status IN ('IN_QUEUE','IN_PROGRESS');

The AND status IN (...) clause is replay safety. A duplicate webhook updates zero rows and the handler still returns 200.
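The failure path can reuse the same guard and also fill the error column. A sketch along the same lines:

```sql
-- Record the failure reason; the status guard keeps this replay-safe too.
UPDATE generations
SET status = 'FAILED',
    error = $1,
    finished_at = now()
WHERE request_id = $2
  AND status IN ('IN_QUEUE','IN_PROGRESS');
```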

Queries you will write in week one

SQL

-- spend this week by model
SELECT endpoint, count(*) AS jobs, sum(cost_cents)/100.0 AS dollars
FROM generations
WHERE created_at > now() - interval '7 days' AND status = 'COMPLETED'
GROUP BY endpoint ORDER BY dollars DESC;

-- failure rate per endpoint
SELECT endpoint,
       count(*) FILTER (WHERE status='FAILED')::float / NULLIF(count(*),0) AS fail_rate
FROM generations
WHERE created_at > now() - interval '7 days'
GROUP BY endpoint HAVING count(*) > 20;

-- 4K Veo 3.1 renders this week
SELECT request_id, user_id, cost_cents/100.0 AS dollars
FROM generations
WHERE endpoint='fal-ai/veo3.1' AND inputs @> '{"resolution":"4k"}'
  AND created_at > now() - interval '7 days';

-- stuck jobs
SELECT request_id, endpoint, created_at FROM generations
WHERE status IN ('IN_QUEUE','IN_PROGRESS')
  AND created_at < now() - interval '30 minutes' LIMIT 50;

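That covers four of the five questions from the top. For the fifth, replaying last Tuesday's failed jobs, select the day's rows and resubmit their stored inputs through fal.queue.submit as in the submit-time snippet. A sketch; the row type and helper name are illustrative:

```typescript
// Sketch: choose which rows from a day's dump are safe to resubmit.
// Row shape mirrors the generations columns needed for replay.
type GenerationRow = {
  request_id: string;
  endpoint: string;
  inputs: Record<string, unknown>;
  status: "IN_QUEUE" | "IN_PROGRESS" | "COMPLETED" | "FAILED" | "CANCELLED";
};

// Only FAILED rows are replayable: active rows may still finish,
// and COMPLETED/CANCELLED are final states.
function replayable(rows: GenerationRow[]): GenerationRow[] {
  return rows.filter((r) => r.status === "FAILED");
}
```

Each selected row's inputs goes back into fal.queue.submit under the same endpoint; the new request_id inserts a fresh row, so the failed original stays in the history.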
Retention

At 5000 renders/day that is 1.8M rows/year. Tiny for Postgres with these indexes. At 10M rows, partition by month or split inputs into a sibling table keyed on request_id. Do not do either before you have to.
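When you do have to, the declarative form of monthly partitioning looks roughly like this. A sketch only; the table and partition names are illustrative, and note the Postgres rule that a unique constraint on a partitioned table must include the partition key:

```sql
-- Sketch: monthly range partitioning on created_at.
-- Caveat: request_id uniqueness must become (request_id, created_at),
-- or move to application-level enforcement.
CREATE TABLE generations_partitioned (
  LIKE generations INCLUDING DEFAULTS
) PARTITION BY RANGE (created_at);

CREATE TABLE generations_2025_01 PARTITION OF generations_partitioned
  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
```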

What not to do

Do not store video bytes in Postgres; store the URL only. Do not duplicate prompt as a top-level column alongside inputs; two sources of truth drift. Do not skip the unique constraint on request_id; it is your webhook idempotency key. Do not use a boolean is_completed; you will need FAILED and CANCELLED later.

Four indexes, one JSONB, one status enum. Survives the first million rows without a migration.