Record & Replay captures a real AI response once and serves it back on demand — free, instant, and identical on every call. And when you don’t want to record at all, mock mode synthesizes a realistic, correctly-shaped response on the spot, so you can build and test before you spend a cent.

How to turn it on

Per request — add a header

Add one header to any image or video request:
X-Lumenfall-Replay: replay-or-mock
replay-or-mock is one of several activation modes — it serves a recording if one exists and returns an instant, free mock if not, so the call never fails or hits the provider on a miss. See all activation modes for the full set, or follow the walkthrough below to record and replay end to end.
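As a rough sketch of what "add one header" looks like outside curl, the snippet below builds (but does not send) an image request with Python's standard library. The endpoint and body fields are the ones used in the walkthrough on this page; the `build_image_request` helper and the `LUMENFALL_API_KEY` environment variable name are assumptions made for illustration.

```python
import json
import os
import urllib.request

# Endpoint as used in the walkthrough on this page.
IMAGES_URL = "https://api.lumenfall.ai/openai/v1/images/generations"


def build_image_request(prompt: str, replay_mode: str = "replay-or-mock") -> urllib.request.Request:
    """Build an image request carrying the replay header (hypothetical helper)."""
    body = {"model": "gpt-image-1.5", "prompt": prompt, "size": "1024x1024"}
    return urllib.request.Request(
        IMAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumed variable name; load your key however you normally do.
            "Authorization": f"Bearer {os.environ.get('LUMENFALL_API_KEY', '')}",
            "Content-Type": "application/json",
            # The one header that activates replay for this call.
            "X-Lumenfall-Replay": replay_mode,
        },
    )


req = build_image_request("A capybara wearing a tiny hat")
# urllib stores header names via str.capitalize(), hence the lowercase tail.
print(req.get_header("X-lumenfall-replay"))
```

Nothing is sent until the request is passed to `urllib.request.urlopen`, so the sketch is safe to run without a key.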

Per API key — set a default

Rather than set a header on every call, give an API key a default mode in your dashboard under API Keys, then point your existing code at it — same code, just a different key. Per-request headers still take precedence (send X-Lumenfall-Replay: off to bypass replay for one call), so a single key can drive a whole environment:
  • replay-or-mock for local dev — never fails, never costs.
  • replay-or-error for CI — fail loudly if a fixture is missing.
  • replay-or-record for staging — capture new fixtures.
  • no default for production — normal operation.
See Authentication for creating and managing keys.
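One way to picture "same code, just a different key" is the sketch below. The environment names, the `LUMENFALL_KEY_*` variable names, and both helper functions are hypothetical; the default mode for each key lives in the dashboard, so the code only selects a key and, when needed, sends the per-request `off` override.

```python
import os
from typing import Dict, Mapping

# Hypothetical mapping from environment name to the variable holding that
# environment's API key. Each key's default mode is configured in the
# dashboard, not in code.
KEY_VARS = {
    "dev": "LUMENFALL_KEY_DEV",          # dashboard default: replay-or-mock
    "ci": "LUMENFALL_KEY_CI",            # dashboard default: replay-or-error
    "staging": "LUMENFALL_KEY_STAGING",  # dashboard default: replay-or-record
    "production": "LUMENFALL_KEY_PROD",  # no default
}


def api_key_for(env_name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the API key configured for the given environment."""
    return environ[KEY_VARS[env_name]]


def request_headers(api_key: str, bypass_replay: bool = False) -> Dict[str, str]:
    """Build request headers; optionally bypass the key's default for one call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if bypass_replay:
        # A per-request header takes precedence over the key's default mode.
        headers["X-Lumenfall-Replay"] = "off"
    return headers


fake_env = {"LUMENFALL_KEY_CI": "key-for-ci"}
headers = request_headers(api_key_for("ci", fake_env), bypass_replay=True)
print(headers["X-Lumenfall-Replay"])
```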
Replay covers image and video generation on Lumenfall’s /openai/v1/* endpoints. It doesn’t apply to chat/completions or streaming responses.

How recordings are matched

Replay is configured by two settings:
  • X-Lumenfall-Replay (activation) — what happens: record, one of the replay-or-* outcomes, or mock.
  • X-Lumenfall-Replay-Match (match) — how a recording is matched when replaying.
Matching is the part worth understanding up front. By default Lumenfall uses standard matching: it hashes the request’s canonical parameters and ignores cosmetic differences, most importantly the prompt, so a recording keeps matching as you iterate on wording. Parameters that actually change the output, such as size, seed, or quality, do change the match. Switch to strict for byte-exact matching, or to pinned or specific for finer control.
Recordings also match across providers. Lumenfall can route the same model to different providers, and standard matching does not depend on which one handles the call: a recording captured against one provider can answer a request that routes to another, as long as both resolve to the same model. See Modes & matching for every activation and match strategy.
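To make the difference between standard and strict matching concrete, here is a purely illustrative sketch. It is not Lumenfall's actual algorithm: the canonicalization, the choice of SHA-256, and the `IGNORED_FIELDS` set are all assumptions. It only demonstrates the idea that a prompt change keeps the standard key stable while a size change does not.

```python
import hashlib
import json

# Assumption: the fields treated as cosmetic. This page names only the prompt;
# the real list is defined by Lumenfall.
IGNORED_FIELDS = {"prompt"}


def standard_key(body: dict) -> str:
    """Hash the parameters that remain after dropping the ignored fields."""
    canonical = {k: v for k, v in body.items() if k not in IGNORED_FIELDS}
    encoded = json.dumps(canonical, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def strict_key(raw_body: bytes) -> str:
    """Hash the request bytes exactly as sent."""
    return hashlib.sha256(raw_body).hexdigest()


a = {"model": "gpt-image-1.5", "prompt": "A capybara in a hot spring", "size": "1024x1024"}
b = {**a, "prompt": "A capybara wearing a tiny hat"}  # wording changed
c = {**a, "size": "1536x1024"}                        # output-affecting change

print(standard_key(a) == standard_key(b))  # True: only the prompt differs
print(standard_key(a) == standard_key(c))  # False: size differs
print(strict_key(json.dumps(a).encode()) == strict_key(json.dumps(b).encode()))  # False
```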

Why use it

  • Build before you integrate: mock mode returns realistic, free responses with zero setup. Develop your UI and code paths before the real model is even wired up.
  • Spend nothing in development — replay a recorded response (or a mock) instead of paying for the same generation again.
  • Deterministic tests — pin a response and get the exact same bytes on every run, so snapshot and integration tests stop flaking.
  • Debug real flows — capture a tricky generation once and replay it as many times as you need to reproduce a bug.
  • No new dependencies — if your SDK can set a base_url, it works.

How it works

The recorder sits between your code and the provider, and one setting chooses what happens:
  • Record forwards to the real provider and stores the exact response bytes.
  • Replay returns the stored response instead of calling the provider.
  • Mock skips the provider and storage entirely and synthesizes a response in the endpoint’s shape.
record   your code ─▶ Lumenfall ─▶ provider ─▶ recording stored
replay   your code ─▶ Lumenfall ─▶ recording      (no provider call)
mock     your code ─▶ Lumenfall ─▶ synthesized     (no provider, no storage)
In all three modes you’re exercising your real code: the provider’s response shape, your parsing, and your error handling all run identically.
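The three paths can be modelled with a small in-memory toy. This is a conceptual sketch only, not how Lumenfall is implemented: `ToyRecorder`, the placeholder mock payload, and the use of `LookupError` for a `replay-or-error` miss are all invented for illustration, and the behavior of `replay-or-record` on a miss (record it) is an assumption based on its "capture new fixtures" description.

```python
from typing import Callable, Dict, List


class ToyRecorder:
    """In-memory model of the record / replay / mock decision (illustrative only)."""

    def __init__(self, provider: Callable[[], bytes]) -> None:
        self.provider = provider  # stands in for the real provider call
        self.recordings: Dict[str, bytes] = {}

    def _record(self, match_key: str) -> bytes:
        response = self.provider()             # forwarded to the provider
        self.recordings[match_key] = response  # exact bytes are stored
        return response

    @staticmethod
    def _mock() -> bytes:
        # Placeholder payload; a real mock follows the endpoint's shape.
        return b'{"mock": true}'

    def handle(self, mode: str, match_key: str) -> bytes:
        if mode == "mock":
            return self._mock()                # no provider, no storage
        if mode == "record":
            return self._record(match_key)
        if mode.startswith("replay-or-"):
            if match_key in self.recordings:
                return self.recordings[match_key]  # hit: no provider call
            fallback = mode[len("replay-or-"):]
            if fallback == "mock":
                return self._mock()
            if fallback == "record":
                return self._record(match_key)     # assumed miss behavior
            if fallback == "error":
                # Assumption: the real service reports a miss as an error response.
                raise LookupError(f"no recording for {match_key!r}")
        raise ValueError(f"unknown mode: {mode!r}")


calls: List[int] = []


def fake_provider() -> bytes:
    calls.append(1)  # count provider calls
    return b"real-bytes"


recorder = ToyRecorder(fake_provider)
first = recorder.handle("record", "k1")
second = recorder.handle("replay-or-mock", "k1")  # served from storage
miss = recorder.handle("replay-or-mock", "k2")    # falls back to a mock
print(first == second, len(calls), miss)
```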

Walkthrough

This records a response, replays it, and mocks one — all with per-request headers. Set your base_url to Lumenfall and use your normal API key; no other setup.
from openai import OpenAI

client = OpenAI(
    api_key="your-lumenfall-api-key",
    base_url="https://api.lumenfall.ai/openai/v1",
)

Step 1 — Record

Send a normal request with X-Lumenfall-Replay: record. You get a real response back, and Lumenfall stores it. The response carries an X-Lumenfall-Recording-Id header.
curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Lumenfall-Replay: record" \
  -d '{
    "model": "gpt-image-1.5",
    "prompt": "A capybara wearing a tiny hat, sitting in a hot spring",
    "size": "1024x1024"
  }'
# Response header: X-Lumenfall-Recording-Id: rec_2jKx8mNpQ4abc123
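For reference, a Python version of the same step might look like the sketch below, using only the standard library. The `record_image` helper and its injectable `urlopen` parameter (handy for testing without network access) are assumptions for illustration. Because record mode forwards to the real provider, actually calling it is billable.

```python
import json
import urllib.request
from typing import Callable, Optional

IMAGES_URL = "https://api.lumenfall.ai/openai/v1/images/generations"


def record_image(api_key: str, body: dict,
                 urlopen: Callable = urllib.request.urlopen) -> Optional[str]:
    """Send one request in record mode and return the recording ID, if present."""
    request = urllib.request.Request(
        IMAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-Lumenfall-Replay": "record",
        },
    )
    with urlopen(request) as response:
        response.read()  # drain the body; a real caller would parse it
        # Header name as shown in the curl example above.
        return response.headers.get("X-Lumenfall-Recording-Id")
```

Calling `record_image(key, body)` with the body from the curl example should return an ID in the `rec_...` form shown above; keeping it alongside your test fixtures is one option if you later need to refer to a specific recording.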

Step 2 — Replay

Send the same request body with X-Lumenfall-Replay: replay-or-mock. Lumenfall matches it to your recording and returns the stored response — instantly, with no provider call. And if nothing matches, you get a synthesized mock instead of an error, so the call never fails or costs anything.
curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Lumenfall-Replay: replay-or-mock" \
  -d '{
    "model": "gpt-image-1.5",
    "prompt": "A capybara wearing a tiny hat, sitting in a hot spring",
    "size": "1024x1024"
  }'
# Response header: X-Lumenfall-Replay-Result: replay
The replayed response is byte-for-byte what the provider returned — including any generated image, which Lumenfall downloads and stores so the link doesn’t expire.
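In a test suite it can be useful to confirm that a call really was served without touching the provider. The sketch below reads the `X-Lumenfall-Replay-Result` header from any header mapping. The helper names are hypothetical, and only the two values shown on this page (`replay` and `mock`) are treated as known; any other value, or a missing header, is conservatively treated as "the provider may have been called".

```python
from typing import Mapping, Optional

# Result values shown on this page; other values may exist.
NO_PROVIDER_RESULTS = {"replay", "mock"}


def replay_result(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Lumenfall-Replay-Result value, matching the name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-lumenfall-replay-result":
            return value
    return None


def served_without_provider(headers: Mapping[str, str]) -> bool:
    """True only when the header reports a replayed or mocked response."""
    return replay_result(headers) in NO_PROVIDER_RESULTS


print(served_without_provider({"x-lumenfall-replay-result": "replay"}))  # True
print(served_without_provider({"Content-Type": "application/json"}))     # False
```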

Step 3 — Mock (no recording needed)

Send X-Lumenfall-Replay: mock and Lumenfall skips the provider and storage entirely, synthesizing a well-formed response in the endpoint’s shape — instantly and for free, with no recording required. Reach for it to build and demo before a real integration exists, to develop offline, or in CI where you just need a valid-shaped response on every call.
curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Lumenfall-Replay: mock" \
  -d '{
    "model": "gpt-image-1.5",
    "prompt": "anything you like",
    "size": "1024x1024"
  }'
# Response header: X-Lumenfall-Replay-Result: mock
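Since a mock is meant to follow the endpoint's response shape, it is a convenient input for exercising your own parsing code. The sketch below shows the kind of parser you might run against it. The `image_refs` helper is hypothetical, and the sample payload is hand-written in the OpenAI images response shape (a `data` list whose items carry `url` or `b64_json`); it is not actual Lumenfall mock output.

```python
import base64
from typing import List, Tuple


def image_refs(payload: dict) -> List[Tuple[str, str]]:
    """Return ("url", value) or ("b64_json", value) for each image in the payload."""
    refs: List[Tuple[str, str]] = []
    for item in payload.get("data", []):
        if item.get("url"):
            refs.append(("url", item["url"]))
        elif item.get("b64_json"):
            # Raises binascii.Error if the data is not valid base64.
            base64.b64decode(item["b64_json"], validate=True)
            refs.append(("b64_json", item["b64_json"]))
        else:
            raise ValueError("image entry has neither url nor b64_json")
    return refs


# Hand-written sample in the OpenAI images response shape (not real mock output).
sample = {
    "created": 1700000000,
    "data": [
        {"url": "https://example.com/image.png"},
        {"b64_json": "aGVsbG8="},
    ],
}
print(image_refs(sample))
```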

Next steps

Modes & matching

Every activation and match strategy, what happens on a miss, response signals, and latency simulation.