Record & Replay - Lumenfall

Record & Replay captures a real AI response once and serves it back on demand — free, instant, and identical on every call. And when you don’t want to record at all, mock mode synthesizes a realistic, correctly-shaped response on the spot, so you can build and test before you spend a cent.

How to turn it on

Per request — add a header

Add one header to any image or video request:

X-Lumenfall-Replay: replay-or-mock

replay-or-mock is one of several activation modes — it serves a recording if one exists and returns an instant, free mock if not, so the call never fails or hits the provider on a miss. See all activation modes for the full set, or follow the walkthrough below to record and replay end to end.
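With the OpenAI Python SDK, the same header can be attached per call through the SDK's `extra_headers` argument. The sketch below assumes a client whose base_url already points at Lumenfall. The helper name `generate_image` is illustrative, and the model and size are borrowed from the walkthrough examples.

```python
REPLAY_HEADER = "X-Lumenfall-Replay"


def generate_image(client, prompt, mode="replay-or-mock"):
    """Send one image request with a per-request replay mode.

    `client` is expected to be an OpenAI SDK client whose base_url
    points at Lumenfall. `generate_image` is an illustrative helper,
    not part of any SDK.
    """
    return client.images.generate(
        model="gpt-image-1.5",
        prompt=prompt,
        size="1024x1024",
        # extra_headers is the SDK's per-call hook for custom HTTP headers.
        extra_headers={REPLAY_HEADER: mode},
    )
```

Passing `mode="off"` to the same helper would bypass replay for that one call.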

Per API key — set a default

Rather than set a header on every call, give an API key a default mode in your dashboard under API Keys, then point your existing code at it — same code, just a different key. Per-request headers still take precedence (send X-Lumenfall-Replay: off to bypass replay for one call), so a single key can drive a whole environment:

replay-or-mock for local dev — never fails, never costs.
replay-or-error for CI — fail loudly if a fixture is missing.
replay-or-record for staging — capture new fixtures.
no default for production — normal operation.

See Authentication for creating and managing keys.
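One way to picture the per-key setup is a key per environment plus a small model of the precedence rule. This is only a sketch: the environment-variable names are hypothetical, and `effective_mode` models the behavior described above rather than reproducing Lumenfall's own logic.

```python
import os

# Hypothetical variable names: one Lumenfall API key per environment.
# Each key's default mode is set in the dashboard, not in code.
KEY_ENV_VARS = {
    "dev": "LUMENFALL_KEY_DEV",          # dashboard default: replay-or-mock
    "ci": "LUMENFALL_KEY_CI",            # dashboard default: replay-or-error
    "staging": "LUMENFALL_KEY_STAGING",  # dashboard default: replay-or-record
    "production": "LUMENFALL_KEY_PROD",  # no default
}


def api_key_for(environment):
    """Look up the API key for an environment from the process environment."""
    return os.environ[KEY_ENV_VARS[environment]]


def effective_mode(key_default, header_value=None):
    """Model the documented precedence: a per-request header beats the key default.

    Returns the active mode, or None when replay is not in effect
    (no default on the key, or the header is "off").
    """
    mode = header_value if header_value is not None else key_default
    return None if mode in (None, "off") else mode
```

The application code stays the same across environments. Only the key handed to the client changes.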

Replay covers image and video generation on Lumenfall’s /openai/v1/* endpoints. It doesn’t apply to chat/completions or streaming responses.

How recordings are matched

Replay is configured by two settings:

X-Lumenfall-Replay (activation) — what happens: record, one of the replay-or-* outcomes, or mock.
X-Lumenfall-Replay-Match (match) — how a recording is matched when replaying.

Matching is the part worth understanding up front. By default Lumenfall uses standard matching: it hashes the request’s canonical parameters and ignores cosmetic differences, most importantly the prompt, so a recording keeps matching as you iterate on wording. Parameters that actually change the output, such as size, seed, or quality, do change the match. Switch to strict for byte-exact matching, or to pinned / specific for finer control.

Recordings match across providers. Lumenfall can route the same model to different providers, and standard matching follows the model rather than the provider: a recording captured against one provider can answer a request that routes to another, as long as both resolve to the same model.

See Modes & matching for every activation and match strategy.
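To make the difference between standard and strict matching concrete, here is a toy model of the two kinds of match key. It is not Lumenfall's actual canonicalization, which this page does not spell out. The set of ignored fields is an assumption limited to the one field named above, and both function names are hypothetical.

```python
import hashlib
import json

# Assumption: the only field treated as cosmetic here is the prompt,
# because it is the only one the text names.
IGNORED_FIELDS = {"prompt"}


def standard_match_key(body):
    """Hash the parameters that affect the output, ignoring cosmetic ones."""
    relevant = {k: v for k, v in body.items() if k not in IGNORED_FIELDS}
    # Sorted keys and fixed separators give one canonical encoding per request.
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def strict_match_key(raw_body):
    """Hash the raw request bytes, so any difference at all changes the key."""
    return hashlib.sha256(raw_body).hexdigest()
```

Under this model, two requests that differ only in prompt share a standard key, while a change of size produces a new one. To choose a strategy on a real request, send X-Lumenfall-Replay-Match (for example with the value strict) alongside the activation header.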

Why use it

Build before you integrate — mock mode returns realistic, free responses with zero setup. Develop your UI and code paths before the real model is even wired up.
Spend nothing in development — replay a recorded response (or a mock) instead of paying for the same generation again.
Deterministic tests — pin a response and get the exact same bytes on every run, so snapshot and integration tests stop flaking.
Debug real flows — capture a tricky generation once and replay it as many times as you need to reproduce a bug.
No new dependencies — if your SDK can set a base_url, it works.

How it works

The recorder sits between your code and the provider, and one setting chooses what happens:

Record forwards to the real provider and stores the exact response bytes.
Replay returns the stored response instead of calling the provider.
Mock skips the provider and storage entirely and synthesizes a response in the endpoint’s shape.

record   your code ─▶ Lumenfall ─▶ provider ─▶ recording stored
replay   your code ─▶ Lumenfall ─▶ recording      (no provider call)
mock     your code ─▶ Lumenfall ─▶ synthesized     (no provider, no storage)

In all three modes you’re exercising your real code: the provider’s response shape, your parsing, and your error handling all run identically.
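The three paths can be modeled with a small in-memory dispatcher. This is a toy illustration of the behavior described above, not Lumenfall's implementation: `handle`, `MissingRecording`, and the callables passed in are all hypothetical, and the fallback behavior of each replay-or-* mode is inferred from its name.

```python
class MissingRecording(Exception):
    """Raised when replay-or-error finds no stored recording."""


REPLAY_MODES = {"replay-or-mock", "replay-or-error", "replay-or-record"}


def handle(mode, key, store, call_provider, synthesize):
    """Return a response for `mode`, using `store` as the recording storage."""
    if mode == "mock":
        # No provider call and no storage access.
        return synthesize()
    if mode == "record":
        response = call_provider()
        store[key] = response  # keep the exact response for later replay
        return response
    if mode not in REPLAY_MODES:
        raise ValueError(f"unknown mode: {mode}")
    if key in store:
        return store[key]  # replay: no provider call
    # Miss: the suffix of the mode name decides what happens next.
    if mode == "replay-or-mock":
        return synthesize()
    if mode == "replay-or-record":
        response = call_provider()
        store[key] = response
        return response
    raise MissingRecording(key)
```

Only record, and replay-or-record on a miss, ever reach the provider in this model, which is why the other modes cost nothing.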

Walkthrough

This records a response, replays it, and mocks one — all with per-request headers. Set your base_url to Lumenfall and use your normal API key; no other setup.

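import os

from openai import OpenAI

# Point the standard OpenAI SDK at Lumenfall; the key is your normal Lumenfall API key.
client = OpenAI(
    api_key=os.environ["LUMENFALL_API_KEY"],  # same variable the curl examples use
    base_url="https://api.lumenfall.ai/openai/v1",
)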

Step 1 — Record

Send a normal request with X-Lumenfall-Replay: record. You get a real response back, and Lumenfall stores it. The response carries an X-Lumenfall-Recording-Id header.

curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Lumenfall-Replay: record" \
  -d '{
    "model": "gpt-image-1.5",
    "prompt": "A capybara wearing a tiny hat, sitting in a hot spring",
    "size": "1024x1024"
  }'
# Response header: X-Lumenfall-Recording-Id: rec_2jKx8mNpQ4abc123
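The same recording step can be sketched in Python. Reading a response header through the OpenAI SDK requires its raw-response accessor (`with_raw_response`), since the parsed result does not expose headers. The helper name `record_image` is illustrative, and `client` is assumed to be configured as shown at the start of the walkthrough.

```python
def record_image(client, prompt):
    """Record one image generation and return (parsed response, recording ID).

    `record_image` is an illustrative helper. `client` is an OpenAI SDK
    client whose base_url points at Lumenfall.
    """
    raw = client.images.with_raw_response.generate(
        model="gpt-image-1.5",
        prompt=prompt,
        size="1024x1024",
        extra_headers={"X-Lumenfall-Replay": "record"},
    )
    # The recording ID arrives as a response header, not in the JSON body.
    recording_id = raw.headers.get("X-Lumenfall-Recording-Id")
    return raw.parse(), recording_id
```

Logging the returned ID next to the test or fixture that triggered the recording makes it easier to trace later.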

Step 2 — Replay

Send the same request body with X-Lumenfall-Replay: replay-or-mock. Lumenfall matches it to your recording and returns the stored response — instantly, with no provider call. And if nothing matches, you get a synthesized mock instead of an error, so the call never fails or costs anything.

curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Lumenfall-Replay: replay-or-mock" \
  -d '{
    "model": "gpt-image-1.5",
    "prompt": "A capybara wearing a tiny hat, sitting in a hot spring",
    "size": "1024x1024"
  }'
# Response header: X-Lumenfall-Replay-Result: replay

The replayed response is byte-for-byte what the provider returned — including any generated image, which Lumenfall downloads and stores so the link doesn’t expire.
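Because replay-or-mock falls back silently on a miss, a test that depends on a specific recording may want to confirm that the response really was replayed. A minimal sketch, assuming you have the response headers as a plain mapping; both helper names are hypothetical:

```python
RESULT_HEADER = "x-lumenfall-replay-result"


def replay_result(headers):
    """Return the X-Lumenfall-Replay-Result value, matching the name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == RESULT_HEADER:
            return value
    return None


def require_replay(headers):
    """Fail if the response was not served from a recording."""
    result = replay_result(headers)
    if result != "replay":
        raise AssertionError(f"expected a replayed response, got {result!r}")
```

Calling `require_replay` right after the request turns a silent mock fallback into a visible test failure.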

Step 3 — Mock (no recording needed)

Send X-Lumenfall-Replay: mock and Lumenfall skips the provider and storage entirely, synthesizing a well-formed response in the endpoint’s shape — instantly and for free, with no recording required. Reach for it to build and demo before a real integration exists, to develop offline, or in CI where you just need a valid-shaped response on every call.

curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Lumenfall-Replay: mock" \
  -d '{
    "model": "gpt-image-1.5",
    "prompt": "anything you like",
    "size": "1024x1024"
  }'
# Response header: X-Lumenfall-Replay-Result: mock

Next steps

Modes & matching

Every activation and match strategy, what happens on a miss, response signals, and latency simulation.
