Presigned URLs and Your Own CDN for AI Video
fal.media URLs are persistent but not yours. A short pipeline gives you ownership and still delivers in under 100 ms.
Why fal.media is not your forever URL
fal.media URLs resolve fast, survive retries, persist for a reasonable window. What they are not: yours. If a specific video must be playable at a specific URL six months from now, serve it from your CDN.
The move is standard: fal generates, you mirror on completion, deliver through your CDN with presigned URLs for gated content. Inference latency stays fal-fast; delivery stays edge-fast; the canonical URL is one you control.
The lifecycle

- Generate: call fal, pocket the fal.media URL.
- Mirror: on webhook completion, download from fal.media, upload to your bucket.
- Deliver: serve from your CDN, sign if gated.
Each step is boring. The trick is making them idempotent.
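One way to read "idempotent" here, as a minimal sketch: derive the object key from the request ID so a retried upload overwrites the same object, and decide what to do from the stored row rather than from the incoming event. The `GenerationRow` shape and the `objectKey` and `nextAction` helpers are hypothetical names for illustration, not part of fal's API.

```typescript
// Hypothetical row shape; the fields mirror the columns used later in this post.
type GenerationRow = {
  requestId: string;
  falUrl: string | null;
  cdnUrl: string | null;
};

type Action = "skip" | "mirror";

// Deterministic key: the same request always maps to the same object,
// so a retried upload overwrites instead of duplicating.
function objectKey(requestId: string): string {
  return `videos/${requestId}.mp4`;
}

// Decide from stored state, not from the event, so a redelivered
// webhook for an already-mirrored row is a no-op.
function nextAction(row: GenerationRow): Action {
  return row.cdnUrl ? "skip" : "mirror";
}
```

Calling `nextAction` twice on the same mirrored row returns `"skip"` both times, which is the property every step in the lifecycle should have.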
The mirror step

```typescript
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

// `verify` (webhook signature check) and `db` (Postgres client) are
// app-specific helpers defined elsewhere in your codebase.

const s3 = new S3Client({ region: "auto", endpoint: process.env.R2_ENDPOINT });

export async function POST(req: Request) {
  // Read the raw body first: the signature is checked against the exact text.
  const body = await req.text();
  if (!verify(body, req.headers.get("x-fal-signature"))) {
    return new Response("bad", { status: 401 });
  }
  const payload = JSON.parse(body);
  if (payload.status !== "COMPLETED") return new Response("ok");

  const falUrl = payload.payload.video.url as string;
  // Deterministic key: a retried upload overwrites the same object.
  const key = `videos/${payload.request_id}.mp4`;

  // Idempotency check: a redelivered webhook for a mirrored row is a no-op.
  const exists = await db.oneOrNone(
    "SELECT cdn_url FROM generations WHERE request_id=$1",
    [payload.request_id],
  );
  if (exists?.cdn_url) return new Response("ok");

  const res = await fetch(falUrl);
  // Fail instead of uploading an error body as video/mp4.
  if (!res.ok) return new Response("fetch failed", { status: 502 });
  const buf = Buffer.from(await res.arrayBuffer());
  await s3.send(new PutObjectCommand({
    Bucket: process.env.R2_BUCKET!,
    Key: key,
    Body: buf,
    ContentType: "video/mp4",
  }));

  // Write cdn_url last, so a failure above leaves the row eligible for retry.
  const cdnUrl = `https://cdn.yourapp.com/${key}`;
  await db.query(
    "UPDATE generations SET cdn_url=$1, status='COMPLETED' WHERE request_id=$2",
    [cdnUrl, payload.request_id],
  );

  return new Response("ok");
}
```
Run this on the Node runtime, not edge. Fetching a 50 MB video and buffering it for an S3 put is not for edge CPU and memory budgets.
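If the handler happens to be a Next.js App Router route (an assumption; the post does not name a framework), the runtime can be pinned per route with route segment config. The file path and the 60-second `maxDuration` are illustrative, and the real ceiling depends on your hosting plan.

```typescript
// Assumes a Next.js App Router route file such as app/api/fal-webhook/route.ts
// (hypothetical path). These exports sit next to the POST handler.
export const runtime = "nodejs"; // run on the Node.js runtime, not edge
export const maxDuration = 60; // seconds; illustrative, subject to platform limits
```

Other frameworks expose the same choice differently, so treat this as one instance of the idea rather than the required setup.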
Offload if runtime is tight
For a 15 second Kling v3 Pro clip (50 to 80 MB), the mirror round trip adds 2 to 6 seconds of runtime. If your function is capped at 10 seconds, offload.
```typescript
await db.query(
  "UPDATE generations SET fal_url=$1, status='MIRRORING' WHERE request_id=$2",
  [falUrl, payload.request_id],
);
await queueWorker.send({ type: "mirror", requestId: payload.request_id });
```
A dedicated worker with a longer duration limit pulls the job, does the mirror, and writes cdn_url.
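A minimal sketch of what that worker might do, with storage and database access injected so the logic can be exercised without a real bucket. `MirrorDeps`, `handleMirrorJob`, and the result strings are hypothetical names for illustration; wire the functions to your own queue consumer and clients.

```typescript
// Hypothetical dependencies; back these with your real database and bucket client.
type MirrorDeps = {
  loadRow: (requestId: string) => Promise<{ falUrl: string; cdnUrl: string | null } | null>;
  download: (url: string) => Promise<Uint8Array>;
  upload: (key: string, body: Uint8Array) => Promise<void>;
  saveCdnUrl: (requestId: string, cdnUrl: string) => Promise<void>;
};

type MirrorResult = "mirrored" | "already-mirrored" | "unknown-request";

async function handleMirrorJob(
  deps: MirrorDeps,
  requestId: string,
  cdnBase = "https://cdn.yourapp.com", // placeholder host used in this post
): Promise<MirrorResult> {
  const row = await deps.loadRow(requestId);
  if (!row) return "unknown-request";
  // Queues can redeliver; a row that already has cdn_url is done.
  if (row.cdnUrl) return "already-mirrored";

  const key = `videos/${requestId}.mp4`; // deterministic, so retries overwrite
  const bytes = await deps.download(row.falUrl);
  await deps.upload(key, bytes);
  // Write cdn_url last: if anything above throws, the job can be retried safely.
  await deps.saveCdnUrl(requestId, `${cdnBase}/${key}`);
  return "mirrored";
}
```

Running the same job twice uploads once and returns `"already-mirrored"` the second time, so at-least-once queue delivery is safe.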
Presigned URLs for gated content
```typescript
import { GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// `s3` is the S3Client configured in the mirror step.
export async function signedUrlFor(key: string, ttlSeconds = 900) {
  return getSignedUrl(
    s3,
    new GetObjectCommand({ Bucket: process.env.R2_BUCKET!, Key: key }),
    { expiresIn: ttlSeconds },
  );
}
```
A 15-minute TTL is reasonable for a user about to watch. For shared links, use a longer TTL or serve from a public namespace.
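As a sketch of that policy: pick a TTL by audience and cap it at 7 days (604,800 seconds), the maximum lifetime of a SigV4 presigned URL. The `Audience` type, the one-day default for shares, and the `public/` key prefix are assumptions for illustration.

```typescript
type Audience = "viewer" | "share" | "public";

type Delivery =
  | { mode: "signed"; ttlSeconds: number }
  | { mode: "public"; key: string };

const MAX_PRESIGN_SECONDS = 7 * 24 * 60 * 60; // 604,800 s, the SigV4 presign ceiling

function deliveryFor(audience: Audience, key: string, shareTtlSeconds = 86400): Delivery {
  switch (audience) {
    case "viewer":
      // 15 minutes: enough for someone about to press play.
      return { mode: "signed", ttlSeconds: 900 };
    case "share":
      // Longer-lived link, clamped to what the signer will accept.
      return { mode: "signed", ttlSeconds: Math.min(shareTtlSeconds, MAX_PRESIGN_SECONDS) };
    case "public":
      // Hypothetical convention: objects under public/ are served unsigned by the CDN.
      return { mode: "public", key: `public/${key}` };
  }
}
```

For the signed modes, `ttlSeconds` feeds straight into `signedUrlFor(key, ttlSeconds)`.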
Keep both URLs
In your database, keep fal_url and cdn_url. The fal URL is useful for debugging and support threads.
```sql
ALTER TABLE generations ADD COLUMN fal_url TEXT;
ALTER TABLE generations ADD COLUMN cdn_url TEXT;
ALTER TABLE generations ADD COLUMN mirrored_at TIMESTAMPTZ;
```
What this costs
R2 egress to end users: free through Cloudflare.
R2 storage: about $0.015/GB/month. 10,000 videos at 30 MB each = 300 GB = $4.50/month.
S3 egress: $0.05 to $0.09/GB. Same 10,000 videos watched ten times each = 3 TB out = $150 to $270/month.
Generation still dominates: 10,000 six-second Veo 3.1 Lite clips at $1.80 each come to $18,000. Delivery is a rounding error on R2 and under 2 percent of that on S3.
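The arithmetic above, written out so you can plug in your own volumes. The prices are the ones quoted in this section, not live rates, and `estimateCosts` is an illustrative name.

```typescript
// Prices as quoted in this section (USD); check current rate cards before relying on them.
const R2_STORAGE_PER_GB_MONTH = 0.015;
const S3_EGRESS_PER_GB_LOW = 0.05;
const S3_EGRESS_PER_GB_HIGH = 0.09;
const GENERATION_PER_CLIP = 1.8; // six-second Veo 3.1 Lite clip, per this post

function estimateCosts(videos: number, mbPerVideo: number, viewsPerVideo: number) {
  const storedGb = (videos * mbPerVideo) / 1000; // decimal GB
  const egressGb = storedGb * viewsPerVideo;
  return {
    storedGb,
    egressGb,
    r2StoragePerMonth: storedGb * R2_STORAGE_PER_GB_MONTH,
    r2Egress: 0, // free through Cloudflare
    s3EgressLow: egressGb * S3_EGRESS_PER_GB_LOW,
    s3EgressHigh: egressGb * S3_EGRESS_PER_GB_HIGH,
    generation: videos * GENERATION_PER_CLIP, // one-time, not monthly
  };
}

const costs = estimateCosts(10000, 30, 10);
console.log(costs.r2StoragePerMonth.toFixed(2)); // "4.50"
console.log(costs.s3EgressLow.toFixed(2)); // "150.00"
console.log(costs.s3EgressHigh.toFixed(2)); // "270.00"
console.log(costs.generation.toFixed(2)); // "18000.00"
```

Even at the high S3 rate, egress is 270 / 18,000 = 1.5 percent of the generation bill.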
The anti-pattern
Do not let end users stream from fal.media in production. It works. It is fast. It is not yours. Analytics, cache control, retention, and auth all become problems when the URL is on someone else's domain.
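One cheap guard, sketched under the assumption that fal-hosted files live on `fal.media` or a subdomain of it: never hand such a URL to a client, and return nothing until the mirror has landed. `isFalHosted` and `playbackUrl` are hypothetical helper names.

```typescript
// Treats fal.media and any subdomain as fal-hosted (an assumption about fal's hostnames).
function isFalHosted(url: string): boolean {
  const host = new URL(url).hostname.toLowerCase();
  return host === "fal.media" || host.endsWith(".fal.media");
}

// Returns the URL a client may play, or null while the mirror is still pending.
// The fal URL stays in the row for debugging but is never returned.
function playbackUrl(row: { falUrl: string | null; cdnUrl: string | null }): string | null {
  if (!row.cdnUrl || isFalHosted(row.cdnUrl)) return null;
  return row.cdnUrl;
}
```

A `null` result is the cue for the UI to show a "processing" state instead of falling back to the fal URL.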