Lumenfall automatically routes your requests to the best available AI provider, based on the model you specify and current provider availability. Lumenfall manages this routing for you to keep requests reliable and performant.

How routing works

When you make a request to Lumenfall:
  1. Model matching — Lumenfall matches your requested model to available providers using routing rules
  2. Provider selection — The system selects providers based on priority and weight
  3. Execution — Your request is sent to the selected provider
  4. Failover — If a provider fails, the request automatically routes to the next available provider
Request → Model Matching → Provider Selection → Execute → Response
                                                   │ (on failure)
                                                   ↓
                                  Next Provider → Execute → Response
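
For instance, a routed request specifies only the model name and lets Lumenfall choose the provider. The sketch below assumes the same images endpoint and model used in the examples later on this page; the prompt is only illustrative.

curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "imagen-3",
    "prompt": "A mountain landscape"
  }'

Because the model name carries no provider prefix, routing decides which provider actually serves the request.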

Priority groups

Providers are organized into priority groups. Lumenfall tries providers in the highest priority group first (priority 0), then falls back to lower priority groups if needed.
Routing configuration is currently managed by Lumenfall. If you need a specific routing setup, contact support.
For example, a routing configuration might look like:
Provider        Priority   Weight
Google Vertex   0          100%
OpenAI          1          70%
Replicate       1          30%
In this setup:
  • Google Vertex is always tried first (priority 0)
  • If Google Vertex fails, the request falls back to priority group 1
  • Within priority group 1, 70% of requests go to OpenAI and 30% to Replicate
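
Conceptually, the selection for this example behaves like the shell sketch below. This is an illustration only: Lumenfall performs the selection server-side, and the providers, priorities, and weights come from the hypothetical table above.

# Illustration only: Lumenfall runs this selection logic server-side.
# Priority 0 wins whenever it is healthy; priority 1 splits traffic 70/30.
vertex_healthy=false                  # pretend the priority-0 provider just failed

if [ "$vertex_healthy" = true ]; then
  provider="vertex"
else
  roll=$((RANDOM % 100))              # random number 0-99
  if [ "$roll" -lt 70 ]; then
    provider="openai"                 # ~70% of priority-1 traffic
  else
    provider="replicate"              # ~30% of priority-1 traffic
  fi
fi

echo "Selected provider: $provider"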

Weighted load balancing

Within a priority group, Lumenfall can split traffic between providers using weights. This enables:
  • Cost optimization — Routing more traffic to cheaper providers
  • Capacity management — Distributing load across providers
  • Quality optimization — Favoring providers with better output for specific models

Model pattern matching

Routing rules use glob patterns to match model names:
Pattern     Matches
imagen-3    Exact match for imagen-3
imagen-*    Any model starting with imagen-
dall-e-*    Any model starting with dall-e-
*           Any model (catch-all)
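
Shell case statements use the same style of glob matching, so a quick way to see how these patterns behave is a sketch like the one below. The model name being tested is made up for illustration.

model="imagen-3-fast"    # hypothetical model name to test against the patterns

case "$model" in
  imagen-3)  echo "exact match: imagen-3" ;;
  imagen-*)  echo "prefix match: imagen-*" ;;
  dall-e-*)  echo "prefix match: dall-e-*" ;;
  *)         echo "catch-all: *" ;;
esac

In a case statement the patterns are tried in order, which is why the exact match is listed before the broader globs here.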

Forcing a specific provider

While routing is managed by Lumenfall, you can bypass it and force a specific provider by prefixing the model name:
curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/imagen-3",
    "prompt": "A mountain landscape"
  }'
Supported prefixes:
  • vertex/ — Google Vertex AI
  • openai/ — OpenAI
  • replicate/ — Replicate
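
For example, to force OpenAI instead, prefix the model with openai/. The model name below is an assumption for illustration; substitute whichever OpenAI model you actually use.

curl https://api.lumenfall.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $LUMENFALL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/dall-e-3",
    "prompt": "A mountain landscape"
  }'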

Automatic failover

If a provider returns an error or times out, Lumenfall automatically:
  1. Logs the failure with timing information
  2. Selects the next provider based on priority/weight
  3. Retries the request
  4. Returns the result from the successful provider
This happens transparently—you always receive a successful response if any provider in the chain succeeds.
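
Conceptually, the failover sequence behaves like the loop sketched below. This is purely an illustration of the order of operations: Lumenfall runs it server-side, so you never implement retries yourself, and the provider order and stubbed request helper are hypothetical.

# Illustration only: Lumenfall performs this failover for you, server-side.
# send_request_via is a stub standing in for the real provider call.
send_request_via() {
  [ "$1" = "openai" ]                 # pretend only OpenAI succeeds in this run
}

for provider in vertex openai replicate; do
  echo "Trying provider: $provider"
  if send_request_via "$provider"; then
    echo "Success: returning the response from $provider"
    break
  fi
  echo "$provider failed or timed out; failure logged, moving to the next provider"
done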

Viewing routing decisions

Each response includes headers showing which provider handled the request:
X-Lumenfall-Provider: vertex
X-Lumenfall-Model: imagen-3
X-Lumenfall-Request-Id: req_abc123
Use these headers for debugging and monitoring your provider distribution.
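
For example, you can spot-check your provider distribution from the command line by dumping the response headers with curl's -D flag. The sketch below reuses the images request from earlier on this page; the five-request sample size is arbitrary.

# Tally which provider handled each of a handful of requests.
for i in 1 2 3 4 5; do
  curl -s -o /dev/null -D - https://api.lumenfall.ai/openai/v1/images/generations \
    -H "Authorization: Bearer $LUMENFALL_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "imagen-3", "prompt": "A mountain landscape"}' \
  | grep -i '^X-Lumenfall-Provider'
done | sort | uniq -c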