# Record & Replay

> Replay recorded AI responses, or get instant free mock responses. Switch it on per API key in the dashboard, or per request with a single header.

Record & Replay captures a real AI response once and serves it back on demand: free, instant, and identical on every call. And when you don't want to record at all, **mock mode** synthesizes a realistic, correctly-shaped response on the spot, so you can build and test before you spend a cent.

## How to turn it on

### Per request: add a header

Add one header to any image or video request:

```http theme={null}
X-Lumenfall-Replay: replay-or-mock
```

`replay-or-mock` is one of several **activation modes**. It serves a recording if one exists and returns an instant, free **mock** if not, so a miss neither fails nor reaches the provider. See [all activation modes](/replay/modes) for the full set, or follow the [walkthrough](#walkthrough) below to record and replay end to end.

### Per API key: set a default

Rather than set a header on every call, give an API key a default mode in your [dashboard](https://lumenfall.ai/app) under **API Keys**, then point your existing code at it. Same code, just a different key. Per-request headers still take precedence: send `X-Lumenfall-Replay: off` to bypass replay for one call. That lets a single key drive a whole environment:

* `replay-or-mock` for **local dev**: never fails, never costs.
* `replay-or-error` for **CI**: fail loudly if a fixture is missing.
* `replay-or-record` for **staging**: capture new fixtures.
* no default for **production**: normal operation.

See [Authentication](/authentication) for creating and managing keys.
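
If you would rather choose the mode per request than per key, the environment mapping in the list above can be sketched in code. This is a minimal sketch, not part of Lumenfall's API: `APP_ENV`, `MODE_BY_ENV`, `replay_headers`, and the environment names are hypothetical. Only the header name and the mode values come from this page.

```python
import os

# Mode per environment, mirroring the list above. The environment names
# and the APP_ENV variable are assumptions made for this example.
MODE_BY_ENV = {
    "development": "replay-or-mock",   # never fails, never costs
    "ci": "replay-or-error",           # fail loudly on a missing fixture
    "staging": "replay-or-record",     # capture new fixtures
    "production": None,                # no header: normal operation
}


def replay_headers(env=None):
    """Return the extra headers to send for the given environment."""
    env = env or os.environ.get("APP_ENV", "production")
    mode = MODE_BY_ENV.get(env)
    # Unknown environments fall through to normal operation (no header).
    return {"X-Lumenfall-Replay": mode} if mode else {}
```

With the Python SDK calls shown in the walkthrough, the result can be passed as `extra_headers=replay_headers()`.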

<Note>
  Replay covers **image and video generation** on Lumenfall's `/openai/v1/*` endpoints. It doesn't apply to chat/completions or streaming responses.
</Note>

## How recordings are matched

Replay is configured by two settings:

* **`X-Lumenfall-Replay`** (activation) sets *what happens*: `record`, one of the `replay-or-*` outcomes, or `mock`.
* **`X-Lumenfall-Replay-Match`** (match) sets *how a recording is matched* when replaying.

Matching is the part worth understanding up front. By default Lumenfall uses **`standard`** matching: it hashes the request's *canonical* parameters and ignores cosmetic differences (most importantly the prompt), so a recording keeps matching as you iterate on wording. The parameters that actually change the output, like size, seed, or quality, do change the match. Switch to `strict` for byte-exact matching, or `pinned` / `specific` for finer control.

**Recordings match across providers.** Lumenfall [routes](/routing) the same model to different providers, and `standard` matching keys on the model rather than the provider: a recording captured against one provider can answer a request that routes to another, as long as both resolve to the same model.

See [Modes & matching](/replay/modes) for every activation and match strategy.
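
To make the difference between `standard` and `strict` concrete, here is an illustrative sketch of the idea. It is not Lumenfall's actual hashing: `standard_key`, `strict_key`, the choice of SHA-256, and the set of ignored fields are all assumptions. This page only states that the prompt is ignored and that parameters such as size, seed, or quality change the match.

```python
import hashlib
import json

# Assumption: the fields treated as cosmetic. This page only names the prompt.
COSMETIC_FIELDS = {"prompt"}


def standard_key(body: dict) -> str:
    """Hash only the parameters assumed to affect the output."""
    canonical = {k: v for k, v in body.items() if k not in COSMETIC_FIELDS}
    # Sorted keys and fixed separators make the encoding order-independent.
    encoded = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def strict_key(raw_body: bytes) -> str:
    """Hash the raw request bytes, so any difference changes the key."""
    return hashlib.sha256(raw_body).hexdigest()


base = {"model": "gpt-image-1.5", "prompt": "A capybara", "size": "1024x1024"}
reworded = {**base, "prompt": "A capybara wearing a tiny hat"}
resized = {**base, "size": "512x512"}

# Rewording the prompt keeps the standard key; resizing changes it.
prompt_still_matches = standard_key(base) == standard_key(reworded)
size_still_matches = standard_key(base) == standard_key(resized)
```

Under these assumptions, a reworded prompt still resolves to the same key, while a byte-level hash of the two request bodies would differ.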

## Why use it

* **Build before you integrate**: `mock` mode returns realistic, free responses with zero setup. Develop your UI and code paths before the real model is even wired up.
* **Spend nothing in development**: replay a recorded response (or a mock) instead of paying for the same generation again.
* **Deterministic tests**: pin a response and get the exact same bytes on every run, so snapshot and integration tests stop flaking.
* **Debug real flows**: capture a tricky generation once and replay it as many times as you need to reproduce a bug.
* **No new dependencies**: if your SDK can set a `base_url`, it works.

## How it works

The recorder sits between your code and the provider, and one setting chooses what happens:

* **Record** forwards to the real provider and stores the exact response bytes.
* **Replay** returns the stored response instead of calling the provider.
* **Mock** skips the provider and storage entirely and synthesizes a response in the endpoint's shape.

```
record   your code ─▶ Lumenfall ─▶ provider ─▶ recording stored
replay   your code ─▶ Lumenfall ─▶ recording      (no provider call)
mock     your code ─▶ Lumenfall ─▶ synthesized     (no provider, no storage)
```

In all three modes you're exercising your real code: the response has the provider's shape, so your parsing and error handling run exactly as they would on a live call.
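
The dispatch described above can be modeled with a toy in-memory recorder. This is a conceptual sketch only, not Lumenfall's implementation: `ToyRecorder`, `ReplayMiss`, the placeholder mock body, and the `"record"` source label are hypothetical. The miss behavior of each `replay-or-*` mode follows the descriptions on this page.

```python
REPLAY_MODES = {"replay-or-mock", "replay-or-record", "replay-or-error"}
MOCK_BODY = b'{"mock": true}'  # placeholder for a synthesized response


class ReplayMiss(Exception):
    """Raised when replay-or-error finds no recording."""


class ToyRecorder:
    def __init__(self, provider):
        self.provider = provider   # callable standing in for the real provider
        self.recordings = {}       # match key -> stored response bytes
        self.provider_calls = 0

    def _record(self, key, request):
        # Forward to the provider and store the exact response bytes.
        self.provider_calls += 1
        response = self.provider(request)
        self.recordings[key] = response
        return response, "record"

    def handle(self, mode, key, request):
        """Return (response_bytes, source) for one request."""
        if mode == "mock":
            return MOCK_BODY, "mock"            # no provider, no storage
        if mode == "record":
            return self._record(key, request)
        if mode not in REPLAY_MODES:
            raise ValueError(f"unknown mode: {mode}")
        if key in self.recordings:
            return self.recordings[key], "replay"   # no provider call
        # Miss: the suffix of the mode decides what happens next.
        if mode == "replay-or-mock":
            return MOCK_BODY, "mock"
        if mode == "replay-or-record":
            return self._record(key, request)
        raise ReplayMiss(key)
```

Only `record` and a `replay-or-record` miss ever reach the provider in this model, which is why the other modes cost nothing.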

## Walkthrough

This records a response, replays it, and mocks one, all with per-request headers. Set your `base_url` to Lumenfall and use your normal API key; no other setup.

<CodeGroup>
  ```python Python theme={null}
  from openai import OpenAI

  client = OpenAI(
      api_key="your-lumenfall-api-key",
      base_url="https://api.lumenfall.ai/openai/v1",
  )
  ```

  ```typescript TypeScript theme={null}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: "your-lumenfall-api-key",
    baseURL: "https://api.lumenfall.ai/openai/v1",
  });
  ```

  ```bash cURL theme={null}
  export LUMENFALL_API_KEY="your-lumenfall-api-key"
  ```
</CodeGroup>

### Step 1: Record

Send a normal request with `X-Lumenfall-Replay: record`. You get a real response back, and Lumenfall stores it. The response carries an `X-Lumenfall-Recording-Id` header.

<CodeGroup>
  ```bash cURL theme={null}
  curl https://api.lumenfall.ai/openai/v1/images/generations \
    -H "Authorization: Bearer $LUMENFALL_API_KEY" \
    -H "Content-Type: application/json" \
    -H "X-Lumenfall-Replay: record" \
    -d '{
      "model": "gpt-image-1.5",
      "prompt": "A capybara wearing a tiny hat, sitting in a hot spring",
      "size": "1024x1024"
    }'
  # Response header: X-Lumenfall-Recording-Id: rec_2jKx8mNpQ4abc123
  ```

  ```python Python theme={null}
  response = client.images.generate(
      model="gpt-image-1.5",
      prompt="A capybara wearing a tiny hat, sitting in a hot spring",
      size="1024x1024",
      extra_headers={"X-Lumenfall-Replay": "record"},
  )
  ```

  ```typescript TypeScript theme={null}
  const response = await client.images.generate({
    model: "gpt-image-1.5",
    prompt: "A capybara wearing a tiny hat, sitting in a hot spring",
    size: "1024x1024",
  }, {
    headers: { "X-Lumenfall-Replay": "record" },
  });
  ```
</CodeGroup>

### Step 2: Replay

Send the same request body with `X-Lumenfall-Replay: replay-or-mock`. Lumenfall matches it to your recording and returns the stored response instantly, with no provider call. If nothing matches, you get a synthesized mock instead of an error, so the call neither fails nor costs anything.

<CodeGroup>
  ```bash cURL theme={null}
  curl https://api.lumenfall.ai/openai/v1/images/generations \
    -H "Authorization: Bearer $LUMENFALL_API_KEY" \
    -H "Content-Type: application/json" \
    -H "X-Lumenfall-Replay: replay-or-mock" \
    -d '{
      "model": "gpt-image-1.5",
      "prompt": "A capybara wearing a tiny hat, sitting in a hot spring",
      "size": "1024x1024"
    }'
  # Response header: X-Lumenfall-Replay-Result: replay
  ```

  ```python Python theme={null}
  replayed = client.images.generate(
      model="gpt-image-1.5",
      prompt="A capybara wearing a tiny hat, sitting in a hot spring",
      size="1024x1024",
      extra_headers={"X-Lumenfall-Replay": "replay-or-mock"},
  )
  ```

  ```typescript TypeScript theme={null}
  const replayed = await client.images.generate({
    model: "gpt-image-1.5",
    prompt: "A capybara wearing a tiny hat, sitting in a hot spring",
    size: "1024x1024",
  }, {
    headers: { "X-Lumenfall-Replay": "replay-or-mock" },
  });
  ```
</CodeGroup>

The replayed response is byte-for-byte what the provider returned, including any generated image, which Lumenfall downloads and stores so the link doesn't expire.
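
In a test suite it can help to check which path served a response. The sketch below reads the two response headers shown in the cURL comments above from any header mapping. `replay_signals` and `assert_served_from_recording` are hypothetical helper names, and only the `replay` and `mock` result values are taken from this page.

```python
def replay_signals(headers):
    """Pull the replay-related headers out of a response, case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "recording_id": lowered.get("x-lumenfall-recording-id"),
        "result": lowered.get("x-lumenfall-replay-result"),
    }


def assert_served_from_recording(headers):
    """Fail a test if the response did not come from a stored recording."""
    result = replay_signals(headers)["result"]
    if result != "replay":
        raise AssertionError(f"expected a replayed response, got {result!r}")


# Example with the header values shown in the cURL comments.
signals = replay_signals({"x-lumenfall-replay-result": "replay"})
```

The mapping can come from whatever HTTP client you use. The OpenAI Python SDK should expose raw response headers through its `with_raw_response` accessor, though that is worth confirming against the SDK version you have installed.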

### Step 3: Mock (no recording needed)

Send `X-Lumenfall-Replay: mock` and Lumenfall skips the provider and storage entirely. It synthesizes a well-formed response in the endpoint's shape, instantly and for free, with no recording required. Reach for it to build and demo before a real integration exists, to develop offline, or in CI where you just need a valid-shaped response on every call.

<CodeGroup>
  ```bash cURL theme={null}
  curl https://api.lumenfall.ai/openai/v1/images/generations \
    -H "Authorization: Bearer $LUMENFALL_API_KEY" \
    -H "Content-Type: application/json" \
    -H "X-Lumenfall-Replay: mock" \
    -d '{
      "model": "gpt-image-1.5",
      "prompt": "anything you like",
      "size": "1024x1024"
    }'
  # Response header: X-Lumenfall-Replay-Result: mock
  ```

  ```python Python theme={null}
  mocked = client.images.generate(
      model="gpt-image-1.5",
      prompt="anything you like",
      size="1024x1024",
      extra_headers={"X-Lumenfall-Replay": "mock"},
  )
  ```

  ```typescript TypeScript theme={null}
  const mocked = await client.images.generate({
    model: "gpt-image-1.5",
    prompt: "anything you like",
    size: "1024x1024",
  }, {
    headers: { "X-Lumenfall-Replay": "mock" },
  });
  ```
</CodeGroup>
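
Since mock mode is aimed at checking that your code handles a valid-shaped response, a CI test might assert on the shape alone. The sketch below assumes the OpenAI-style image response layout (a non-empty `data` list whose items carry either `url` or `b64_json`). This page does not spell out the mock payload, so treat both the layout and the helper name `looks_like_image_response` as assumptions.

```python
def looks_like_image_response(payload) -> bool:
    """Check for an OpenAI-style image response shape (assumed layout)."""
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, list) or not data:
        return False
    # Every item should point at an image, by URL or inline base64.
    return all(
        isinstance(item, dict) and ("url" in item or "b64_json" in item)
        for item in data
    )


# Hypothetical payloads, for illustration only.
well_formed = looks_like_image_response({"data": [{"url": "https://example.com/img.png"}]})
malformed = looks_like_image_response({"data": []})
```

With the Python SDK, a payload like this can be obtained from the `mocked` object above, for example via its `model_dump()` method if your SDK version returns Pydantic models.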

## Next steps

<CardGroup cols={1}>
  <Card title="Modes & matching" icon="sliders" href="/replay/modes">
    Every activation and match strategy, what happens on a miss, response signals, and latency simulation.
  </Card>
</CardGroup>