# Modes & matching

> The two settings that control replay: the outcome you want, and how recordings are matched.

Replay is configured by two independent settings. Send each as a request header for one-off control, or set it as a default on an API key:

* **`X-Lumenfall-Replay`** (the activation) sets *the outcome*: what you get on a hit and on a miss.
* **`X-Lumenfall-Replay-Match`** (the match) sets *the lookup*: how Lumenfall decides whether a recording matches.

They're orthogonal. The activation fully determines the outcome; the match only changes how a recording is found. Any `replay-or-*` activation works with any match strategy.

## Activation: `X-Lumenfall-Replay`

The activation decides what happens, both when a recording is found and when one isn't.

| Activation                   | On a hit        | On a miss                                  |
| ---------------------------- | --------------- | ------------------------------------------ |
| `off`                        | n/a             | Normal request (not routed through replay) |
| `record`                     | Always forwards | Forward to provider, store recording       |
| `replay-or-mock` *(default)* | Serve recording | Synthesize a **mock** response             |
| `replay-or-error`            | Serve recording | `404 RECORDING_NOT_FOUND`                  |
| `replay-or-live`             | Serve recording | Forward to provider (no recording)         |
| `replay-or-record`           | Serve recording | Forward to provider **and** record         |
| `mock`                       | Never looks     | Synthesize a **mock** response             |

<Note title="Mock mode">
  `mock` never touches a provider or storage. It synthesizes a well-formed response in the endpoint's shape (including placeholder media), instantly and for free. Use it to build and demo your app before a real integration exists, to develop offline, or in CI where you just need a valid-shaped response on every call. `replay-or-mock` is the same synthesis used only as the fallback when a recording isn't found.
</Note>

## Match: `X-Lumenfall-Replay-Match`

The match strategy decides how Lumenfall looks for a recording. It is consulted **only** by the `replay-or-*` activations. `off`, `record`, and `mock` ignore it.

| Match                  | How it matches                                         | Companion header                               |
| ---------------------- | ------------------------------------------------------ | ---------------------------------------------- |
| `standard` *(default)* | Canonical parameters, tolerant of cosmetic differences | n/a                                            |
| `strict`               | Byte-exact request body                                | n/a                                            |
| `specific`             | Only the fields you choose                             | `X-Lumenfall-Replay-Fields: <comma-separated>` |
| `pinned`               | One explicit recording, by ID                          | `X-Lumenfall-Replay-Recording: rec_…`          |

`pinned` and `specific` each require their companion header. Omitting it returns `400` (`PINNED_MODE_REQUIRES_RECORDING` / `SPECIFIC_MODE_REQUIRES_FIELDS`).
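
A client can mirror this rule before sending, so a missing companion header fails locally instead of coming back as a `400`. The sketch below is a hypothetical helper (`replay_headers` is not part of any Lumenfall SDK); only the header names and values come from the tables above.

```python
from typing import Dict, Optional, Sequence

# Match strategies listed in the table above.
MATCHES = {"standard", "strict", "specific", "pinned"}


def replay_headers(
    activation: str = "replay-or-mock",
    match: str = "standard",
    fields: Optional[Sequence[str]] = None,
    recording: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: build replay headers and enforce companion headers."""
    if match not in MATCHES:
        raise ValueError(f"unknown match strategy: {match!r}")
    headers = {
        "X-Lumenfall-Replay": activation,
        "X-Lumenfall-Replay-Match": match,
    }
    if match == "specific":
        if not fields:
            raise ValueError("specific requires X-Lumenfall-Replay-Fields")
        headers["X-Lumenfall-Replay-Fields"] = ",".join(fields)
    elif match == "pinned":
        if not recording:
            raise ValueError("pinned requires X-Lumenfall-Replay-Recording")
        headers["X-Lumenfall-Replay-Recording"] = recording
    return headers


# Pin one recording and fail loudly if it is gone.
pinned = replay_headers("replay-or-error", "pinned", recording="rec_abc123")
```

Merge the returned dictionary into whatever headers your HTTP client already sends. The field names you pass for `specific` depend on the endpoint's request body.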

<Tip>
  The match and the activation are independent, so pinning composes with any outcome. With `X-Lumenfall-Replay-Match: pinned` and `X-Lumenfall-Replay-Recording: rec_abc123`, the activation still decides what happens if that recording is gone: `replay-or-error` returns a `404`, while `replay-or-mock` returns a mock instead.
</Tip>

### How matching works

When you replay, Lumenfall builds a match key from your request and looks for a recording with the same key.

| Field            | Source                                   |
| ---------------- | ---------------------------------------- |
| **Organization** | The owner of the API key                 |
| **Provider**     | The provider that handles the request    |
| **Endpoint**     | The API endpoint (e.g. image generation) |
| **Model**        | The model identifier                     |
| **Request hash** | A hash of the request body               |

The first four fields are always part of the key. The match strategy changes how the **request hash** is computed:

* **`standard`** hashes the *canonical* parameters and ignores fields that don't change the nature of the request: the **prompt**, the response/output format, and (for text-to-image and text-to-video) reference media. A request with a different prompt still matches; a request that changes something defining the generation (size, seed, quality) does not.
* **`strict`** hashes the exact request body bytes. Any change, even whitespace, produces a different key and won't match.
* **`specific`** hashes only the fields you name in `X-Lumenfall-Replay-Fields`; everything else is ignored.
* **`pinned`** skips hashing entirely and serves the recording named in `X-Lumenfall-Replay-Recording`.
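
To make the difference between the strategies concrete, here is a minimal sketch of how a request hash could vary by strategy. Everything in it is an assumption for illustration: Lumenfall's actual canonicalization, hash function, and ignored-field list are not described on this page, and `IGNORED_BY_STANDARD` and the body field names are hypothetical.

```python
import hashlib
import json
from typing import Iterable, Optional

# Hypothetical field names standing in for "prompt" and "response/output format".
IGNORED_BY_STANDARD = {"prompt", "response_format"}


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _canonical(params: dict) -> bytes:
    # Sorted keys and fixed separators make the hash independent of
    # key order and whitespace in the original body.
    return json.dumps(params, sort_keys=True, separators=(",", ":")).encode("utf-8")


def request_hash(
    raw_body: bytes, match: str, fields: Optional[Iterable[str]] = None
) -> Optional[str]:
    """Illustrative only: compute a request hash for the given match strategy."""
    if match == "pinned":
        return None  # no hashing: the recording is named explicitly
    if match == "strict":
        return _digest(raw_body)  # exact bytes, so whitespace matters
    params = json.loads(raw_body)
    if match == "specific":
        wanted = set(fields or ())
        return _digest(_canonical({k: v for k, v in params.items() if k in wanted}))
    if match == "standard":
        kept = {k: v for k, v in params.items() if k not in IGNORED_BY_STANDARD}
        return _digest(_canonical(kept))
    raise ValueError(f"unknown match strategy: {match!r}")
```

Under this sketch, two bodies that differ only in prompt, key order, and whitespace share a `standard` hash but not a `strict` one, while a body with a different size gets a different `standard` hash.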

### Cross-provider matches

`standard` matching isn't limited to the provider that created the recording. Because Lumenfall can [route](/routing) the same model to different providers, a recording made against one can satisfy a request that routes to another, as long as both resolve to the same model. Same-provider recordings are preferred; a cross-provider match is used only when there's no same-provider hit.

This applies to `standard` only. `strict` and `pinned` stay within a single recording. When a cross-provider recording answers, the response carries `X-Lumenfall-Provider-Used` (whose recording served it) and, if the body was adapted to the requested provider's shape, `X-Lumenfall-Wire-Provider`.
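
The preference order can be sketched as a small selection function. This is an illustration of the rule as stated, not Lumenfall's implementation: the `Recording` type and `pick_recording` are hypothetical, the provider names are placeholders, and the candidates are assumed to already match on organization, endpoint, model, and request hash.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Recording:
    # Hypothetical shape: only the fields this sketch needs.
    id: str
    provider: str


def pick_recording(
    candidates: Sequence[Recording], routed_provider: str
) -> Optional[Recording]:
    """Prefer a same-provider recording; otherwise fall back to any other."""
    for rec in candidates:
        if rec.provider == routed_provider:
            return rec
    # No same-provider hit: use a cross-provider recording if one exists.
    return candidates[0] if candidates else None


recordings = [Recording("rec_1", "provider-a"), Recording("rec_2", "provider-b")]
chosen = pick_recording(recordings, routed_provider="provider-b")
```

How ties between several cross-provider recordings are broken is not specified on this page; the sketch simply takes the first candidate.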

## Latency simulation

By default a replay returns instantly. Set `X-Lumenfall-Replay-Latency` (per request, or as a key default) to delay a replayed response, either for as long as the original took or for a timing you specify. This is useful for testing loading states and timeouts.

| Value                 | Behavior                                                                                                                        |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `instant` *(default)* | Return immediately                                                                                                              |
| `real`                | Wait for the originally recorded duration before returning                                                                      |
| `{ttfb},{duration}`   | Split timing in milliseconds: delay before the response starts, then stream the body over the given duration (e.g. `1200,8000`) |

```bash
curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Lumenfall-Replay: replay-or-error" \
  -H "X-Lumenfall-Replay-Latency: real" \
  -d '{ ... }'
```

<Note>
  `real` is capped at 60 seconds, a safety bound to match typical client timeouts. When the cap applies, the response carries `X-Lumenfall-Warning: LATENCY_CLAMPED`.
</Note>
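
Validating the value client-side catches a typo before it becomes an `INVALID_LATENCY_POLICY` error. The following is a minimal sketch under assumptions: `parse_latency` and `real_delay` are hypothetical helpers, and the exact syntax the server accepts (for example, whether whitespace is tolerated) is not specified here, so the parser is deliberately strict.

```python
import re
from typing import Tuple, Union

LatencyPolicy = Union[str, Tuple[int, int]]

REAL_CAP_SECONDS = 60.0  # cap on `real`, from the note above


def parse_latency(value: str) -> LatencyPolicy:
    """Parse 'instant', 'real', or '{ttfb},{duration}' (both in milliseconds)."""
    if value in ("instant", "real"):
        return value
    m = re.fullmatch(r"(\d+),(\d+)", value)
    if m is None:
        raise ValueError(f"invalid latency policy: {value!r}")
    return int(m.group(1)), int(m.group(2))


def real_delay(recorded_seconds: float) -> Tuple[float, bool]:
    """Return the delay `real` would apply and whether the cap was hit."""
    clamped = recorded_seconds > REAL_CAP_SECONDS
    return min(recorded_seconds, REAL_CAP_SECONDS), clamped
```

For example, `parse_latency("1200,8000")` yields the pair `(1200, 8000)`, and a recording that originally took 90 seconds would replay in 60 with the clamp flag set. Whether the cap also applies to the explicit `{ttfb},{duration}` form is not stated, so the sketch applies it to `real` only.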

## Knowing what happened

Every replay-engaged response carries headers describing what happened, so a test can assert it never silently called the provider.

| Header                      | Values                                         | Meaning                                                                                                           |
| --------------------------- | ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| `X-Lumenfall-Replay-Result` | `replay` · `record` · `live` · `mock` · `miss` | What the replay layer did. Its **presence** means the request was handled by replay; its absence means it wasn't. |
| `X-Lumenfall-Replay-Match`  | `standard` · `strict` · `specific` · `pinned`  | Which match strategy resolved the request                                                                         |
| `X-Lumenfall-Recording-Id`  | `rec_…`                                        | The recording served or created (absent for `mock` and `live`)                                                    |
| `X-Lumenfall-Provider-Used` | provider name                                  | The provider whose recording served the request, differing from the routed provider on a cross-provider match     |
| `X-Lumenfall-Wire-Provider` | provider name                                  | Present when the recorded body was adapted to the requested provider's response shape                             |

The result values correspond to the outcomes in the activation table, so you can assert exactly what happened: a `replay-or-live` that hit reports `replay`; the same request on a miss reports `live`.
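
A test suite can encode these expectations once and reuse them. The table below is a sketch inferred from the activation and error tables; this page states the `replay-or-live` and `replay-or-error` cases explicitly, while the remaining entries (including a missing header for `off`) are assumptions. `EXPECTED_RESULT` and `provider_was_called` are hypothetical names.

```python
from typing import Mapping

# (result on a hit, result on a miss) per activation.
# None means the X-Lumenfall-Replay-Result header is expected to be absent.
EXPECTED_RESULT = {
    "off": (None, None),
    "record": ("record", "record"),
    "replay-or-mock": ("replay", "mock"),
    "replay-or-error": ("replay", "miss"),
    "replay-or-live": ("replay", "live"),
    "replay-or-record": ("replay", "record"),
    "mock": ("mock", "mock"),
}

# Results that imply the request was forwarded to a provider.
PROVIDER_RESULTS = {"record", "live"}


def provider_was_called(headers: Mapping[str, str]) -> bool:
    """Assume a provider call when the header is absent or reports record/live."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    result = lowered.get("x-lumenfall-replay-result")
    return result is None or result in PROVIDER_RESULTS
```

In a test, `assert not provider_was_called(response_headers)` guards against a request silently reaching the provider.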

## Errors

| Scenario                                        | HTTP status | `X-Lumenfall-Replay-Result` | Error code                       |
| ----------------------------------------------- | ----------- | --------------------------- | -------------------------------- |
| `replay-or-error`, nothing matched              | 404         | `miss`                      | `RECORDING_NOT_FOUND`            |
| `pinned` without `X-Lumenfall-Replay-Recording` | 400         | n/a                         | `PINNED_MODE_REQUIRES_RECORDING` |
| `specific` without `X-Lumenfall-Replay-Fields`  | 400         | n/a                         | `SPECIFIC_MODE_REQUIRES_FIELDS`  |
| Invalid `X-Lumenfall-Replay-Latency` value      | 400         | n/a                         | `INVALID_LATENCY_POLICY`         |

A `replay-or-error` miss returns HTTP `404` with `X-Lumenfall-Replay-Result: miss`. Branch on that header, not the bare status. A provider's own "model not found" `404` won't carry it.
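
That branch can be a one-line predicate. A minimal sketch, assuming your HTTP client exposes the status code and a mapping of response headers (`is_replay_miss` is a hypothetical helper):

```python
from typing import Mapping


def is_replay_miss(status: int, headers: Mapping[str, str]) -> bool:
    """True only for a replay miss, not for any other 404."""
    # Header names are case-insensitive; normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return status == 404 and lowered.get("x-lumenfall-replay-result") == "miss"
```

A `404` without the header, such as a provider's own "model not found", makes the predicate return `False`.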

Only `replay-or-error` ever errors on a miss. A `pinned` or `specific` request with a missing companion header is rejected with `400` as an invalid request, not as a miss. `replay-or-mock`, `replay-or-live`, and `replay-or-record` always produce a response.
