Request lifecycle
This page describes the lifecycle of a request through the gateway, from the client request to the provider response, including retries and fallback.
Lifecycle stages
- Ingress: The gateway receives an OpenAI-compatible request.
- Validation: Authentication and request shape checks run.
- Policy checks: Rate limits, plugins, and controls are applied.
- Routing decision: The routing strategy picks the primary provider and model.
- Execution: The request is sent upstream.
- Recovery: Retry or fallback runs on eligible failures.
- Egress: The response and metadata are returned to the client.
- Telemetry: Logs and metrics are emitted.
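The stage order above can be sketched as a simple ordered pipeline. This is a minimal illustration, not the gateway's actual implementation: `Context`, `run_lifecycle`, and the handler mapping are hypothetical names, and recovery is modeled as an ordinary stage whose handler would do nothing unless execution failed.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Stage names mirror the list above; everything else is a hypothetical sketch.
STAGES = [
    "ingress",
    "validation",
    "policy",
    "routing",
    "execution",
    "recovery",
    "egress",
    "telemetry",
]


@dataclass
class Context:
    """Carries the request, the eventual response, and a trace of stages run."""
    request: dict
    response: Optional[dict] = None
    trace: list = field(default_factory=list)


Handler = Callable[[Context], None]


def run_lifecycle(ctx: Context, handlers: dict) -> Context:
    """Run each stage in order; stages without a handler are recorded as no-ops."""
    for stage in STAGES:
        handler = handlers.get(stage)
        if handler is not None:
            handler(ctx)
        ctx.trace.append(stage)
    return ctx


def fake_execute(ctx: Context) -> None:
    # Stand-in for the upstream call (assumption: no real provider is contacted).
    ctx.response = {"ok": True}


ctx = run_lifecycle(Context(request={"model": "example"}), {"execution": fake_execute})
```

A handler that raises stops the pipeline at that stage, which is one simple way to model a failed validation or policy check.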
Detailed sequence
Error handling state flow
Practical notes
- Keep retries bounded to avoid amplifying upstream incidents.
- Prefer fallback across providers, not only across models within one provider.
- Track per-stage latency to spot bottlenecks early.
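The notes above can be combined in one small sketch: a bounded number of attempts per provider, fallback to the next provider when attempts run out, and a latency record for every attempt. All names here (`RetryableError`, `call_with_fallback`, the provider tuples, and the default limits) are assumptions made for illustration and do not reflect the gateway's real configuration or API.

```python
import time
from typing import Callable, Sequence


class RetryableError(Exception):
    """Hypothetical marker for failures that are eligible for retry or fallback."""


def call_with_fallback(
    providers: Sequence[tuple],  # (name, send) pairs, tried in order
    request: dict,
    max_attempts_per_provider: int = 2,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
):
    """Return (response, attempts); attempts holds (provider, attempt, seconds)."""
    attempts = []
    for name, send in providers:
        for attempt in range(1, max_attempts_per_provider + 1):
            start = time.perf_counter()
            try:
                response = send(request)
            except RetryableError:
                attempts.append((name, attempt, time.perf_counter() - start))
                # Back off exponentially before retrying the same provider.
                if attempt < max_attempts_per_provider:
                    sleep(base_delay_s * 2 ** (attempt - 1))
                continue
            attempts.append((name, attempt, time.perf_counter() - start))
            return response, attempts
    # Total upstream calls are capped at len(providers) * max_attempts_per_provider.
    raise RuntimeError(f"all providers failed after {len(attempts)} attempts")
```

Errors other than `RetryableError` propagate immediately, so only failures explicitly marked as eligible trigger a retry or fallback. Injecting `sleep` keeps the backoff testable without real delays.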