# Migrate from LiteLLM
The Ferro Labs AI Gateway is a drop-in replacement for LiteLLM Proxy. Because both products implement the OpenAI Chat Completions API, migration is usually a base-URL swap and a config file rewrite. As of v1.0.0, the gateway is a stable release with semver guarantees: your config format and API surface are locked in.
For a detailed comparison of Ferro Labs vs LiteLLM and other alternatives, see Why Ferro Labs.
## What changes
| Concept | LiteLLM | Ferro Labs AI Gateway |
|---|---|---|
| Proxy config | `config.yaml` with `model_list` | `config.yaml` with `providers` + `targets` |
| Provider credentials | `LITELLM_*` env vars | `virtual_keys` block or `FERRO_*` env vars |
| Routing | `router_settings` | `strategy.mode` |
| Fallback | `fallbacks` list | `strategy: {mode: fallback}` |
| Load balancing | `routing_strategy: simple-shuffle` | `strategy: {mode: loadbalance}` |
| Rate limiting | `router_settings.rpm_limit` | `rate-limit` plugin |
| Caching | Redis/S3 integrations | `response-cache` plugin |
| Request logging | LiteLLM callbacks | `request-logger` plugin |
| Spend tracking | SQLite/Postgres via LiteLLM UI | `budget` plugin |
## Step 1: Start the gateway
```sh
docker run -d -p 8080:8080 \
  -v $(pwd)/config.yaml:/config.yaml \
  ghcr.io/ferro-labs/ai-gateway:latest
```
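If you were running LiteLLM Proxy under Docker Compose, an equivalent service definition might look like the sketch below. It uses only the image, port, and volume mount from the `docker run` command above; the environment entries assume your `config.yaml` references keys via `${...}` placeholders as in Step 2.

```yaml
services:
  ai-gateway:
    image: ghcr.io/ferro-labs/ai-gateway:latest
    ports:
      - "8080:8080"
    volumes:
      - ./config.yaml:/config.yaml
    environment:
      # Provider keys expanded by the ${...} placeholders in config.yaml
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
```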
## Step 2: Rewrite your config
### LiteLLM config (before)
```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: "os.environ/ANTHROPIC_API_KEY"

router_settings:
  routing_strategy: least-busy
  fallbacks:
    - gpt-4o: [claude-3-5-sonnet]
```
### Ferro Labs AI Gateway config (after)
```yaml
providers:
  - key: openai
    provider: openai
    api_key: "${OPENAI_API_KEY}"
  - key: anthropic
    provider: anthropic
    api_key: "${ANTHROPIC_API_KEY}"

strategy:
  mode: fallback

targets:
  - virtual_key: openai
    retry:
      attempts: 3
      retry_on_status: [429, 502, 503, 504]
  - virtual_key: anthropic
```
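The `strategy` and `targets` blocks above mean: send each request to the first target, retry transient errors up to the configured attempt count, then fall through to the next target. A minimal sketch of that control flow, assuming a hypothetical `send` callable (this is an illustration, not the gateway's actual implementation):

```python
RETRY_ON = {429, 502, 503, 504}

def route_with_fallback(targets, send, attempts=3):
    """Try each target in order; retry transient errors before falling through."""
    last_status = None
    for target in targets:
        for _ in range(attempts):
            status, body = send(target)
            if status == 200:
                return body
            last_status = status
            if status not in RETRY_ON:
                break  # non-retryable error: move on to the next target
    raise RuntimeError(f"all targets failed (last status {last_status})")

# Simulated transport: openai is rate-limited (429), anthropic succeeds.
responses = {"openai": (429, None), "anthropic": (200, "ok")}
result = route_with_fallback(["openai", "anthropic"], lambda t: responses[t])
```

Here the rate-limited first target is retried three times before the request falls through to the second target, mirroring the `retry_on_status` list in the config.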
## Step 3: Update the base URL in your code
```python
# Before (LiteLLM Proxy)
from openai import OpenAI

client = OpenAI(
    api_key="sk-anything",
    base_url="http://0.0.0.0:4000",
)
```

```python
# After (Ferro Labs AI Gateway)
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-openai-key",
    base_url="http://localhost:8080",
)
```
Model names stay the same (`gpt-4o`, `claude-3-5-sonnet-20241022`, and so on). No other code changes are needed.
## Step 4: Migrate plugins
### Rate limiting
```yaml
# Before: LiteLLM router_settings.rpm_limit
router_settings:
  rpm_limit: 100
```

```yaml
# After: Ferro rate-limit plugin
plugins:
  - name: rate-limit
    type: ratelimit
    stage: before_request
    enabled: true
    config:
      requests_per_second: 1.67  # 100 rpm / 60
      burst: 20
```
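The `requests_per_second` / `burst` pair describes token-bucket semantics: tokens refill at the sustained rate, and `burst` is the bucket capacity, so short spikes are absorbed while the long-run rate is enforced. A minimal sketch of that behavior (an illustration of the semantics, not the plugin's actual code):

```python
import time

class TokenBucket:
    """Allow short bursts while enforcing a sustained request rate."""

    def __init__(self, rate, burst):
        self.rate = rate        # tokens refilled per second
        self.capacity = burst   # maximum stored tokens
        self.tokens = burst     # start with a full burst allowance
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.67, burst=20)
# Fired back-to-back, the first 20 calls drain the burst allowance;
# the 21st is rejected because refill at 1.67/s hasn't restored a token yet.
results = [bucket.allow() for _ in range(21)]
```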
### Response caching
```yaml
# Before: LiteLLM Redis cache
litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: redis
```

```yaml
# After: Ferro in-memory cache (no Redis dependency)
plugins:
  - name: response-cache
    type: transform
    stage: before_request
    enabled: true
    config:
      max_age: 300
      max_entries: 1000
```
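`max_age` is a TTL in seconds and `max_entries` bounds memory by evicting the oldest entry once the cache is full. A rough sketch of that policy, keyed on a request identifier (illustrative only; the plugin's internals may differ):

```python
import time
from collections import OrderedDict

class ResponseCache:
    """In-memory cache with a TTL (max_age) and an entry cap (max_entries)."""

    def __init__(self, max_age=300, max_entries=1000):
        self.max_age = max_age
        self.max_entries = max_entries
        self.entries = OrderedDict()  # key -> (expires_at, response)

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        expires_at, response = item
        if time.monotonic() > expires_at:
            del self.entries[key]  # entry older than max_age: treat as a miss
            return None
        return response

    def put(self, key, response):
        if len(self.entries) >= self.max_entries:
            self.entries.popitem(last=False)  # evict the oldest insertion
        self.entries[key] = (time.monotonic() + self.max_age, response)

cache = ResponseCache(max_age=300, max_entries=2)
cache.put("req-a", "resp-a")
cache.put("req-b", "resp-b")
cache.put("req-c", "resp-c")  # cache full: evicts req-a
```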
### Request logging
```yaml
# Before: LiteLLM callback
litellm_settings:
  success_callback: ["langfuse"]
```

```yaml
# After: Ferro request-logger plugin
plugins:
  - name: request-logger
    type: logging
    stage: before_request
    enabled: true
    config:
      level: info
      persist: true
      backend: sqlite
      dsn: ferrogw-requests.db
```
## Routing strategy equivalents
| LiteLLM `routing_strategy` | Ferro `strategy.mode` |
|---|---|
| `simple-shuffle` | `loadbalance` |
| `least-busy` | `loadbalance` |
| `latency-based-routing` | `least-latency` |
| `cost-based-routing` | `cost-optimized` |
| `fallbacks` list | `fallback` |
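For example, LiteLLM's `routing_strategy: simple-shuffle` becomes a `strategy`/`targets` pair, using only the fields shown earlier in this guide:

```yaml
strategy:
  mode: loadbalance   # replaces routing_strategy: simple-shuffle

targets:
  - virtual_key: openai
  - virtual_key: anthropic
```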
## Feature parity notes
- LiteLLM UI (spend dashboard, model management): available in Ferro Labs Managed. The OSS gateway exposes an admin REST API for programmatic access.
- Virtual keys / teams: in Ferro Labs Managed. The OSS gateway supports API key passthrough via the `virtual_keys` config block.
- Spend tracking: the `budget` plugin provides in-memory per-key limits. Durable billing is in Ferro Labs Managed.
- PII redaction, prompt injection detection: Ferro Labs Managed only. See OSS vs Ferro Labs Managed.