# Ferro Labs AI Gateway

Ferro Labs AI Gateway is an open-source, high-performance control plane for AI traffic. It exposes a single OpenAI-compatible API that routes requests across 19 providers and 2,500+ models, enforces safety policies with 11 built-in plugins, and provides full production observability — all without changing your existing client code.

## What is it?

Drop the gateway in front of your LLM traffic and set `base_url` to the gateway endpoint. That's it — your OpenAI SDK, LangChain, LlamaIndex, or curl commands continue to work unchanged.

```shell
docker run -d -p 8080:8080 \
  -e OPENAI_API_KEY=sk-... \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  ghcr.io/ferro-labs/ai-gateway:latest
```

Then send requests to `http://localhost:8080/v1/chat/completions` exactly as you would to OpenAI.
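For example, a chat completion through the gateway has the same wire shape as a direct OpenAI call; only the URL changes. A minimal sketch using only the standard library (the model name and `build_request` helper below are illustrative, not part of the gateway):

```python
import json
import urllib.request

# The gateway speaks the standard OpenAI wire format; only the base URL differs.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"  # local gateway from the docker run above

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at the gateway."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("gpt-4o-mini", "Say hello")
# urllib.request.urlopen(req) would send it once the gateway container is running
```

In practice you would more likely point an existing OpenAI SDK client's `base_url` at the gateway rather than hand-building requests.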

## Key capabilities

| Capability | Details |
| --- | --- |
| 19 AI providers | OpenAI, Anthropic, Gemini, Mistral, Groq, Cohere, DeepSeek, Together, Perplexity, Fireworks, AI21, Azure OpenAI, Azure Foundry, xAI, Ollama, Replicate, AWS Bedrock, Vertex AI, Hugging Face |
| 6 routing strategies | Single, Fallback, Weighted, Conditional, Least-Latency, Cost-Optimized |
| 11 built-in plugins | Word filter, max-token, response cache, request logger, rate limit, PII redact, secret scan, prompt shield, schema guard, regex guard |
| MCP integration | Agentic tool-calling loop via Model Context Protocol servers (v0.8.0+) |
| Observability | Prometheus metrics, structured JSON logs with trace IDs, deep `/health` per provider |
| Resiliency | Per-target circuit breakers, retry with exponential backoff, per-status-code retry config |
| OpenAI compatible | Chat completions, embeddings, images, and model listing — same wire format |
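As a mental model for the Weighted routing strategy, traffic is split across provider targets in proportion to configured weights. The sketch below is illustrative only, not the gateway's actual implementation (the `pick_target` helper and the 80/20 split are assumptions):

```python
import random

def pick_target(weights: dict[str, float], rng: random.Random) -> str:
    """Illustrative weighted routing: choose a target in proportion to its weight."""
    targets, w = zip(*weights.items())
    return rng.choices(targets, weights=w, k=1)[0]

# Over many requests, an 80/20 weighting sends roughly 80% of traffic to openai.
rng = random.Random(0)  # seeded for reproducibility
counts = {"openai": 0, "anthropic": 0}
for _ in range(1000):
    counts[pick_target({"openai": 0.8, "anthropic": 0.2}, rng)] += 1
```

The other strategies swap out this selection step: Fallback tries targets in order, Least-Latency picks by observed response time, Cost-Optimized by per-token price.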

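The resiliency row can be pictured as a retry loop with exponential backoff gated on the response status code. A minimal sketch — the delay values and the retryable status set here are assumptions for illustration, not the gateway's defaults:

```python
# Assumed retryable statuses for illustration; the gateway lets you configure
# retries per status code.
RETRYABLE = {429, 500, 502, 503}

def backoff_delays(base: float = 0.5, factor: float = 2.0, max_attempts: int = 4) -> list[float]:
    """Delay before each retry: base, base*factor, base*factor^2, ..."""
    return [base * factor ** i for i in range(max_attempts - 1)]

def should_retry(status: int, attempt: int, max_attempts: int = 4) -> bool:
    """Retry only on a retryable status and while attempts remain."""
    return status in RETRYABLE and attempt < max_attempts - 1
```

A circuit breaker sits one level above this loop: once a target keeps failing despite retries, it is taken out of rotation until it recovers.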
## Docs map

- Getting started
- Guides
- Operations & reference