How to turn it on
Per request — add a header
Add one header, X-Lumenfall-Replay, to any image or video request. Its replay-or-mock value is one of several activation modes: it serves a recording if one exists and returns an instant, free mock if not, so the call never fails or hits the provider on a miss. See all activation modes for the full set, or follow the walkthrough below to record and replay end to end.
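As a minimal sketch of the per-request header, the snippet below builds (but does not send) an image request with Python's standard library. Only the X-Lumenfall-Replay header and the /openai/v1/ prefix come from this page; the host, the images/generations path, the model name, and the bearer-token auth scheme are assumptions for illustration.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute your real Lumenfall base URL.
BASE_URL = "https://lumenfall.example/openai/v1"


def build_image_request(
    api_key: str, prompt: str, replay_mode: str = "replay-or-mock"
) -> urllib.request.Request:
    """Build an unsent image-generation request carrying the replay header."""
    # Assumption: hypothetical model name and OpenAI-style request body.
    body = json.dumps({"model": "example-image-model", "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/images/generations",  # assumed OpenAI-style path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
            "X-Lumenfall-Replay": replay_mode,
        },
    )


request = build_image_request("YOUR_API_KEY", "a lighthouse at dusk")
# urllib stores header names in capitalized form, so look the header up that way.
print(request.get_header("X-lumenfall-replay"))  # replay-or-mock
```

Passing a different replay_mode (for example record or mock) changes only the header value; the request body stays the same.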
Per API key — set a default
Rather than set a header on every call, give an API key a default mode in your dashboard under API Keys, then point your existing code at it: same code, just a different key. Per-request headers still take precedence (send X-Lumenfall-Replay: off to bypass replay for one call), so a single key can drive a whole environment:
- replay-or-mock for local dev: never fails, never costs.
- replay-or-error for CI: fail loudly if a fixture is missing.
- replay-or-record for staging: capture new fixtures.
- no default for production: normal operation.
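The precedence rule above can be modeled in a few lines. This is an illustrative sketch, not Lumenfall's implementation: the environment names and the effective_mode helper are hypothetical, while the mode names and the off value come from the text.

```python
from typing import Optional

# Default replay mode configured on each environment's API key (None = no default).
KEY_DEFAULTS = {
    "local": "replay-or-mock",
    "ci": "replay-or-error",
    "staging": "replay-or-record",
    "production": None,
}


def effective_mode(environment: str, header: Optional[str] = None) -> Optional[str]:
    """Return the mode that applies: a per-request header beats the key default."""
    mode = header if header is not None else KEY_DEFAULTS[environment]
    # "off" bypasses replay, which is modeled here as having no mode at all.
    return None if mode == "off" else mode


print(effective_mode("ci"))                # replay-or-error
print(effective_mode("ci", header="off"))  # None
print(effective_mode("production"))        # None
```

Keeping the mode on the key means application code needs no environment checks; only the key it is given changes.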
Replay covers image and video generation on Lumenfall’s /openai/v1/* endpoints. It doesn’t apply to chat/completions or streaming responses.
How recordings are matched
Replay is configured by two settings:
- X-Lumenfall-Replay (activation): what happens. It is record, one of the replay-or-* outcomes, or mock.
- X-Lumenfall-Replay-Match (match): how a recording is matched when replaying.
Standard matching hashes the request’s canonical parameters and ignores cosmetic differences, most importantly the prompt, so a recording keeps matching as you iterate on wording. The parameters that actually change the output, like size, seed, or quality, do change the match. Switch to strict for byte-exact matching, or pinned / specific for finer control.
Recordings match across providers. Because Lumenfall can route the same model to different providers, standard matching follows the model rather than the provider: a recording captured against one provider can answer a request that routes to another, as long as both resolve to the same model.
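To make the idea behind standard matching concrete, here is a rough sketch of a match key that ignores the prompt and the provider but is sensitive to output-affecting parameters. It illustrates the concept only; the field names, the canonical JSON form, and the use of SHA-256 are assumptions, not Lumenfall's actual algorithm.

```python
import hashlib
import json

# Assumption: fields treated as cosmetic under standard matching.
IGNORED_FIELDS = {"prompt"}


def standard_match_key(model: str, params: dict) -> str:
    """Hash the model plus the output-affecting parameters into a stable key.

    The provider is deliberately not part of the key, so a recording made
    through one provider can match a request routed to another.
    """
    canonical = {k: v for k, v in params.items() if k not in IGNORED_FIELDS}
    # sort_keys makes the serialization independent of parameter order.
    payload = json.dumps(
        {"model": model, "params": canonical}, sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = standard_match_key("example-image-model", {"prompt": "a red fox", "size": "1024x1024", "seed": 7})
b = standard_match_key("example-image-model", {"seed": 7, "size": "1024x1024", "prompt": "a crimson fox"})
c = standard_match_key("example-image-model", {"prompt": "a red fox", "size": "512x512", "seed": 7})
print(a == b)  # True: only the prompt wording and key order differ
print(a == c)  # False: size changed
```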
See Modes & matching for every activation and match strategy.
Why use it
- Build before you integrate: mock mode returns realistic, free responses with zero setup. Develop your UI and code paths before the real model is even wired up.
- Spend nothing in development: replay a recorded response (or a mock) instead of paying for the same generation again.
- Deterministic tests: pin a response and get the exact same bytes on every run, so snapshot and integration tests stop flaking.
- Debug real flows: capture a tricky generation once and replay it as many times as you need to reproduce a bug.
- No new dependencies: if your SDK can set a base_url, it works.
How it works
The recorder sits between your code and the provider, and one setting chooses what happens:
- Record forwards to the real provider and stores the exact response bytes.
- Replay returns the stored response instead of calling the provider.
- Mock skips the provider and storage entirely and synthesizes a response in the endpoint’s shape.
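The three behaviors above can be pictured with a toy in-memory recorder. This is a conceptual sketch under stated assumptions, not Lumenfall's code: the Recorder class, the opaque string keys, and the mock payload are hypothetical, and the miss behavior of replay-or-record (forward and store) is inferred from its name.

```python
from typing import Callable, Dict

Provider = Callable[[str], bytes]


class Recorder:
    """Toy stand-in for the recorder; requests are identified by opaque keys."""

    def __init__(self, provider: Provider) -> None:
        self.provider = provider
        self.store: Dict[str, bytes] = {}

    def handle(self, mode: str, key: str) -> bytes:
        if mode == "mock":
            return self._mock()  # no provider call, no storage access
        if mode == "record":
            return self._record(key)
        if mode.startswith("replay-or-"):
            if key in self.store:
                return self.store[key]  # hit: return the stored bytes
            fallback = mode[len("replay-or-"):]
            if fallback == "mock":
                return self._mock()
            if fallback == "record":
                return self._record(key)
            if fallback == "error":
                raise LookupError(f"no recording for {key!r}")
        raise ValueError(f"unknown mode: {mode}")

    def _record(self, key: str) -> bytes:
        response = self.provider(key)  # forward to the real provider
        self.store[key] = response     # keep the exact response bytes
        return response

    @staticmethod
    def _mock() -> bytes:
        return b'{"mock": true}'  # placeholder payload, not Lumenfall's mock format


calls = []


def fake_provider(key: str) -> bytes:
    calls.append(key)
    return b"real-bytes"


recorder = Recorder(fake_provider)
recorder.handle("record", "req-1")
print(recorder.handle("replay-or-error", "req-1"))  # b'real-bytes'
print(len(calls))  # 1: the replay did not reach the provider
```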
Walkthrough
This walkthrough records a response, replays it, and mocks one, all with per-request headers. Set your base_url to Lumenfall and use your normal API key; no other setup is needed.
Step 1 — Record
Send a normal request with X-Lumenfall-Replay: record. You get a real response back, and Lumenfall stores it. The response carries an X-Lumenfall-Recording-Id header.
Step 2 — Replay
Send the same request body with X-Lumenfall-Replay: replay-or-mock. Lumenfall matches it to your recording and returns the stored response instantly, with no provider call. If nothing matches, you get a synthesized mock instead of an error, so the call never fails or costs anything.
Step 3 — Mock (no recording needed)
Send X-Lumenfall-Replay: mock and Lumenfall skips the provider and storage entirely, synthesizing a well-formed response in the endpoint’s shape, instantly and for free, with no recording required. Reach for it to build and demo before a real integration exists, to develop offline, or in CI where you just need a valid-shaped response on every call.
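The three steps can be strung together as in the sketch below. To keep it runnable offline, the HTTP call is injected as a send function and a stand-in transport is used here; in practice you would pass a function that posts to your Lumenfall base_url. The request body fields, the bearer-token auth scheme, the fake transport, and the rec_example ID are assumptions for illustration; the header names and mode values come from the steps above.

```python
import json
from typing import Callable, Dict, Tuple

# A transport takes (headers, body) and returns (response_headers, response_body).
Transport = Callable[[Dict[str, str], bytes], Tuple[Dict[str, str], bytes]]


def run_walkthrough(send: Transport, api_key: str) -> Dict[str, object]:
    """Record a response, replay it, then request a mock, reusing one body."""
    # Assumption: hypothetical model name and OpenAI-style request body.
    body = json.dumps(
        {"model": "example-image-model", "prompt": "a lighthouse at dusk"}
    ).encode("utf-8")

    def call(mode: str) -> Tuple[Dict[str, str], bytes]:
        headers = {
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
            "X-Lumenfall-Replay": mode,
        }
        return send(headers, body)

    record_headers, recorded = call("record")   # Step 1
    _, replayed = call("replay-or-mock")        # Step 2: same body, stored response
    _, mocked = call("mock")                    # Step 3: no recording needed
    return {
        # A real HTTP client may normalize header case; adjust the lookup if so.
        "recording_id": record_headers.get("X-Lumenfall-Recording-Id"),
        "replay_matches_recording": replayed == recorded,
        "mock_body": mocked,
    }


def fake_send(headers: Dict[str, str], body: bytes) -> Tuple[Dict[str, str], bytes]:
    """Stand-in transport so the sketch runs offline; it is not Lumenfall."""
    mode = headers["X-Lumenfall-Replay"]
    if mode == "record":
        return {"X-Lumenfall-Recording-Id": "rec_example"}, b"stored-bytes"
    if mode == "replay-or-mock":
        return {}, b"stored-bytes"
    return {}, b"mock-bytes"


result = run_walkthrough(fake_send, "YOUR_API_KEY")
print(result["recording_id"])              # rec_example
print(result["replay_matches_recording"])  # True
```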
Next steps
Modes & matching
Every activation and match strategy, what happens on a miss, response signals, and latency simulation.