
Persisting Generation History in Postgres

A schema for fal generation metadata with indexes that stay fast at 100k rows. Ships with SQL.


The one table you need

You do not need a star schema. You need one table with enough metadata to answer the five questions that come up weekly:

  • How much did we spend this week, by model?
  • Which user's job is stuck?
  • How often does endpoint X fail?
  • What was the exact prompt for this render?
  • Can I replay last Tuesday's failed jobs?

Here is the minimum that answers all of them.

The schema

SQL

CREATE TABLE generations (
  id          BIGSERIAL PRIMARY KEY,
  request_id  TEXT UNIQUE NOT NULL,
  user_id     TEXT NOT NULL,
  endpoint    TEXT NOT NULL,
  status      TEXT NOT NULL CHECK (status IN
    ('IN_QUEUE','IN_PROGRESS','COMPLETED','FAILED','CANCELLED')),
  inputs      JSONB NOT NULL,
  output_url  TEXT,
  seed        BIGINT,
  duration_s  REAL,
  cost_cents  INTEGER,
  error       TEXT,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  finished_at TIMESTAMPTZ
);

Thirteen columns. inputs holds the whole request payload as JSONB, so you can query prompt parameters without a second table.

The indexes that keep you fast at 100k rows

SQL

CREATE INDEX generations_user_created_idx
  ON generations (user_id, created_at DESC);

CREATE INDEX generations_endpoint_status_idx
  ON generations (endpoint, status);

CREATE INDEX generations_inputs_gin
  ON generations USING GIN (inputs);

Three statements, four indexes: the UNIQUE constraint on request_id already created a unique index for you, so declaring it again would only duplicate it. The first two handle OLTP queries. The GIN index lets you run WHERE inputs @> '{"resolution":"4k"}' without a sequential scan.

Writing the row at submit time

Every job writes a row before the network call returns. That row is your recovery handle.

TYPESCRIPT

const estimated = Math.round(durationSec * pricePerSec * 100);

const { request_id } = await fal.queue.submit(endpoint, {
  input,
  webhookUrl: `${process.env.APP_URL}/api/fal/webhook`,
});

await db.query(
  `INSERT INTO generations (request_id, user_id, endpoint, status, inputs, duration_s, cost_cents)
   VALUES ($1,$2,$3,'IN_QUEUE',$4,$5,$6)`,
  [request_id, userId, endpoint, JSON.stringify(input), durationSec, estimated],
);

cost_cents is an estimate at submit time. For Wan 2.7 ($0.10/sec) and Veo 3.1 ($0.40/sec), the price is deterministic; for Seedance 2.0 ($0.014 per unit), reconcile on completion.
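Reconciliation on completion can be the same arithmetic run against the actual duration from the webhook payload. A minimal sketch; the rate table and function name are illustrative, with the only entry being the Veo 3.1 rate quoted above:

```typescript
// Sketch: recompute cost_cents once the actual duration is known.
// Rates mirror the article's figures; treat this map as config.
const PRICE_PER_SEC: Record<string, number> = {
  "fal-ai/veo3.1": 0.4,
};

function reconcileCostCents(endpoint: string, actualDurationSec: number): number {
  const rate = PRICE_PER_SEC[endpoint];
  if (rate === undefined) {
    throw new Error(`no per-second rate configured for ${endpoint}`);
  }
  // Same formula as the submit-time estimate, now with the real duration.
  return Math.round(actualDurationSec * rate * 100);
}
```

Write the result back in the same webhook UPDATE so cost_cents stops being an estimate once the job finishes.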

Updating on webhook

SQL

UPDATE generations
SET status = $1,
    output_url = $2,
    seed = $3,
    finished_at = now()
WHERE request_id = $4
  AND status IN ('IN_QUEUE','IN_PROGRESS');

The AND status IN (...) clause is replay safety. A duplicate webhook updates zero rows and the handler still returns 200.
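The failure path can reuse the same guard and also fill the error column. A sketch along the same lines:

```sql
-- Record the failure reason; the status guard keeps this replay-safe too.
UPDATE generations
SET status = 'FAILED',
    error = $1,
    finished_at = now()
WHERE request_id = $2
  AND status IN ('IN_QUEUE','IN_PROGRESS');
```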

Queries you will write in week one

SQL

-- spend this week by model
SELECT endpoint, count(*) AS jobs, sum(cost_cents)/100.0 AS dollars
FROM generations
WHERE created_at > now() - interval '7 days' AND status = 'COMPLETED'
GROUP BY endpoint ORDER BY dollars DESC;

-- failure rate per endpoint
SELECT endpoint,
       count(*) FILTER (WHERE status='FAILED')::float / NULLIF(count(*),0) AS fail_rate
FROM generations
WHERE created_at > now() - interval '7 days'
GROUP BY endpoint HAVING count(*) > 20;

-- 4K Veo 3.1 renders this week
SELECT request_id, user_id, cost_cents/100.0 AS dollars
FROM generations
WHERE endpoint='fal-ai/veo3.1' AND inputs @> '{"resolution":"4k"}'
  AND created_at > now() - interval '7 days';

-- stuck jobs
SELECT request_id, endpoint, created_at FROM generations
WHERE status IN ('IN_QUEUE','IN_PROGRESS')
  AND created_at < now() - interval '30 minutes' LIMIT 50;

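That covers four of the five questions from the top. For the fifth, replaying last Tuesday's failed jobs, select the day's rows and resubmit their stored inputs through fal.queue.submit as in the submit-time snippet. A sketch; the row type and helper name are illustrative:

```typescript
// Sketch: choose which rows from a day's dump are safe to resubmit.
// Row shape mirrors the generations columns needed for replay.
type GenerationRow = {
  request_id: string;
  endpoint: string;
  inputs: Record<string, unknown>;
  status: "IN_QUEUE" | "IN_PROGRESS" | "COMPLETED" | "FAILED" | "CANCELLED";
};

// Only FAILED rows are replayable: active rows may still finish,
// and COMPLETED/CANCELLED are final states.
function replayable(rows: GenerationRow[]): GenerationRow[] {
  return rows.filter((r) => r.status === "FAILED");
}
```

Each selected row's inputs goes back into fal.queue.submit under the same endpoint; the new request_id inserts a fresh row, so the failed original stays in the history.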
Retention

At 5000 renders/day that is 1.8M rows/year. Tiny for Postgres with these indexes. At 10M rows, partition by month or split inputs into a sibling table keyed on request_id. Do not do either before you have to.
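When you do have to, the declarative form of monthly partitioning looks roughly like this. A sketch only; the table and partition names are illustrative, and note the Postgres rule that a unique constraint on a partitioned table must include the partition key:

```sql
-- Sketch: monthly range partitioning on created_at.
-- Caveat: request_id uniqueness must become (request_id, created_at),
-- or move to application-level enforcement.
CREATE TABLE generations_partitioned (
  LIKE generations INCLUDING DEFAULTS
) PARTITION BY RANGE (created_at);

CREATE TABLE generations_2025_01 PARTITION OF generations_partitioned
  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
```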

What not to do

Do not store video bytes in Postgres; store the URL only. Do not duplicate prompt as a top-level column alongside inputs; two sources of truth drift. Do not skip the unique constraint on request_id; it is your webhook idempotency key. Do not use a boolean is_completed; you will need FAILED and CANCELLED later.

Four indexes, one JSONB, one status enum. Survives the first million rows without a migration.