Open Source · Apache 2.0 · v0.8.0

One Gateway for
Every AI Model

A high-performance, OpenAI-compatible proxy for 19 providers and 2,500+ models. Sub-millisecond routing, 11 built-in safety plugins, circuit breakers, and first-class MCP integration — self-hosted and production-ready.

19 AI Providers · 2,500+ Models · 11 Plugins · 6 Routing Strategies
Start in 30 seconds
$docker run -p 8080:8080 -e OPENAI_API_KEY=sk-... ghcr.io/ferro-labs/ai-gateway:latest
Provider Support

19 providers out of the box

Credentials are registered via environment variables. Enable any provider in seconds — no code changes, no rebuilds.
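For example, passing several provider keys at launch enables those providers together. A minimal sketch, assuming the gateway reads the conventional per-provider variable names (only OPENAI_API_KEY appears in the quickstart above; the others are illustrative, so check the documentation for the exact names):

```shell
# Illustrative: enable multiple providers at once by passing their API keys
# as environment variables. Variable names other than OPENAI_API_KEY are
# assumptions here; consult the gateway docs for the exact names it reads.
docker run -p 8080:8080 \
  -e OPENAI_API_KEY=sk-... \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -e GROQ_API_KEY=gsk_... \
  ghcr.io/ferro-labs/ai-gateway:latest
```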

OpenAI · Anthropic · Google Gemini · Mistral · Groq · Cohere · DeepSeek · Together AI · Perplexity · Fireworks AI · AI21 · Azure OpenAI · Azure Foundry · Ollama · AWS Bedrock · Replicate · Vertex AI · Hugging Face · xAI Grok

Up and running in minutes

Point any OpenAI-compatible client at http://localhost:8080 and the gateway handles provider credentials, routing, retries, and observability for you.

Switch providers transparently by changing only the model name, or define conditional routing rules that send different models to different backends.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="any",          # gateway manages provider creds
)

# Route to Anthropic — no SDK changes needed
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# With cost-optimized routing, the gateway picks the cheapest
# backend that serves gpt-4o
response2 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this."}],
)

Start routing AI traffic today

Open source under Apache 2.0. Self-host in minutes, scale to production, and never depend on a single LLM provider again.