⚙ Open Source · Apache 2.0 · v1.0.0

One Gateway for
Every AI Model

A high-performance, OpenAI-compatible proxy for 29 providers and 2,500+ models, built in Go for sub-millisecond overhead. 8 routing strategies, 11 safety plugins, circuit breakers, and first-class MCP integration: self-hosted and production-ready.

29 AI Providers · 2,500+ Models · 11 Plugins · 8 Routing Strategies
Start in 30 seconds
$docker run -p 8080:8080 -e OPENAI_API_KEY=sk-... ghcr.io/ferro-labs/ai-gateway:latest
Capabilities

Everything a production gateway needs

Built in Go for low latency and high concurrency, with under 1 ms of p99 overhead at 500 RPS in published benchmarks. Designed to sit in front of your LLM traffic without changing your existing client code.
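The circuit breakers mentioned above can be pictured as per-provider failure trackers that shed traffic to an unhealthy backend and probe it again after a cooldown. The sketch below is conceptual Python, not the gateway's actual Go implementation, and the threshold and cooldown values are placeholders:

```python
import time


class CircuitBreaker:
    """Illustrative circuit breaker: after enough consecutive failures,
    stop sending a provider traffic until a cooldown has elapsed."""

    def __init__(self, failure_threshold=3, cooldown_seconds=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the breaker tripped

    def allow_request(self, now=None):
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        if now - self.opened_at >= self.cooldown_seconds:
            # Half-open: cooldown elapsed, let a probe request through.
            self.opened_at = None
            self.failures = 0
            return True
        return False  # open: shed traffic to this provider

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now=None):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic() if now is None else now
```

A gateway would keep one such breaker per upstream provider and route around any that are open.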

Provider Support

29 providers out of the box

Credentials are registered via environment variables. Enable any provider in seconds: no code changes, no rebuilds.

OpenAI, Anthropic, Google Gemini, Mistral, Groq, Cohere, DeepSeek, Together AI, Perplexity, Fireworks AI, AI21, Azure OpenAI, Azure Foundry, Ollama, AWS Bedrock, Replicate, Vertex AI, Hugging Face, xAI Grok, Cerebras, NVIDIA NIM, Cloudflare Workers AI, Databricks, Novita AI, Qwen, Moonshot AI, SambaNova, DeepInfra, OpenRouter
Managed Platform

Need multi-tenant, managed, and fully hosted?

Ferro Labs Managed wraps the open-source gateway engine with everything teams need to ship AI products.

✅ Isolated per-tenant gateway instances: each customer gets their own gateway with separate keys, limits, and logging
✅ Dashboard + billing + analytics: usage tracking, cost attribution, Stripe integration, and team management
✅ SSO + audit logs + enterprise plugins: SAML, PII redaction, prompt shield, secret scanning, and schema validation
Join Ferro Labs Managed Waitlist →

Up and running in minutes

Point any OpenAI-compatible client at http://localhost:8080 and the gateway handles provider credentials, routing, retries, and observability for you.

Keep the same model names while the gateway switches providers transparently, or define conditional routing rules to send different models to different backends.
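Conditional routing can be pictured as a first-match rule table keyed on the model name. This is a conceptual Python sketch, not the gateway's actual configuration format; the rule syntax and backend names are illustrative:

```python
# Illustrative first-match routing rules: (model-name prefix, backend).
ROUTING_RULES = [
    ("claude-", "anthropic"),
    ("gpt-", "openai"),
    ("gemini-", "google"),
]


def route(model: str, default: str = "openai") -> str:
    """Return the backend for a model using first-match prefix rules."""
    for prefix, backend in ROUTING_RULES:
        if model.startswith(prefix):
            return backend
    return default  # no rule matched: fall through to the default backend
```

With rules like these, the client keeps sending plain OpenAI-style requests and the gateway decides which upstream serves each model.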

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="any",          # gateway manages provider creds
)

# Route to Anthropic: no SDK changes needed
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# With a cost-optimized routing strategy, the gateway picks the cheapest provider
response2 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this."}],
)
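The cost-optimized strategy in that second call can be pictured as choosing the cheapest healthy provider that serves the requested model. The sketch below is conceptual, and the provider names and per-token prices are placeholders, not real rates:

```python
# Hypothetical per-1M-input-token prices for providers serving one model.
# These numbers are placeholders for illustration, not real pricing.
PRICES = {
    "provider-a": 2.50,
    "provider-b": 2.10,
    "provider-c": 3.00,
}


def cheapest_provider(prices: dict[str, float], healthy: set[str]) -> str:
    """Pick the lowest-priced provider among those currently healthy."""
    candidates = {p: cost for p, cost in prices.items() if p in healthy}
    if not candidates:
        raise RuntimeError("no healthy provider for this model")
    return min(candidates, key=candidates.get)
```

Combined with health checks and circuit breakers, this is the general shape of a cost-aware strategy: filter to usable backends first, then minimize price.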

Start routing AI traffic today

Open source under Apache 2.0. Self-host in minutes, scale to production, and never depend on a single LLM provider again.