
Migrate from LiteLLM

The Ferro Labs AI Gateway is a drop-in replacement for LiteLLM Proxy. Because both products implement the OpenAI Chat Completions API, migration is usually a base-URL swap and a config file rewrite. As of v1.0.0, the gateway is a stable release with semver guarantees, so your config format and API surface are locked in.

For a detailed comparison of Ferro Labs vs LiteLLM and other alternatives, see Why Ferro Labs.

What changes

| Concept | LiteLLM | Ferro Labs AI Gateway |
|---|---|---|
| Proxy config | `config.yaml` with `model_list` | `config.yaml` with `providers` + `targets` |
| Provider credentials | `LITELLM_*` env vars | `virtual_keys` block or `FERRO_*` env vars |
| Routing | `router_settings` | `strategy.mode` |
| Fallback | `fallbacks` list | `strategy: {mode: fallback}` |
| Load balancing | `routing_strategy: simple-shuffle` | `strategy: {mode: loadbalance}` |
| Rate limiting | `router_settings.rpm_limit` | `rate-limit` plugin |
| Caching | Redis/S3 integrations | `response-cache` plugin |
| Request logging | LiteLLM callbacks | `request-logger` plugin |
| Spend tracking | SQLite/Postgres via LiteLLM UI | `budget` plugin |

Step 1: Start the gateway

```bash
docker run -d -p 8080:8080 \
  -v $(pwd)/config.yaml:/config.yaml \
  ghcr.io/ferro-labs/ai-gateway:latest
```

Step 2: Rewrite your config

LiteLLM config (before)

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: "os.environ/OPENAI_API_KEY"

  - model_name: claude-3-5-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: "os.environ/ANTHROPIC_API_KEY"

router_settings:
  routing_strategy: least-busy
  fallbacks:
    - gpt-4o: [claude-3-5-sonnet]
```

Ferro Labs AI Gateway config (after)

```yaml
providers:
  - key: openai
    provider: openai
    api_key: "${OPENAI_API_KEY}"

  - key: anthropic
    provider: anthropic
    api_key: "${ANTHROPIC_API_KEY}"

strategy:
  mode: fallback

targets:
  - virtual_key: openai
    retry:
      attempts: 3
      retry_on_status: [429, 502, 503, 504]
  - virtual_key: anthropic
```
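If you have many LiteLLM model entries, the translation above can be scripted. A minimal sketch, assuming the field names shown in the two configs; the helper name and the dedup-by-provider behavior are our own, not part of either product:

```python
def litellm_to_ferro_providers(model_list):
    """Derive Ferro `providers` entries from a LiteLLM `model_list`.

    Illustrative only: assumes LiteLLM model IDs are "<provider>/<model>"
    and api_key uses the "os.environ/VAR" convention shown above.
    """
    providers = {}
    for entry in model_list:
        params = entry["litellm_params"]
        # "openai/gpt-4o" -> provider "openai"
        provider, _, _model = params["model"].partition("/")
        # "os.environ/OPENAI_API_KEY" -> "${OPENAI_API_KEY}"
        env_var = params["api_key"].split("/", 1)[1]
        providers[provider] = {
            "key": provider,
            "provider": provider,
            "api_key": "${" + env_var + "}",
        }
    # One provider entry per upstream, in first-seen order
    return list(providers.values())

litellm = [
    {"model_name": "gpt-4o",
     "litellm_params": {"model": "openai/gpt-4o",
                        "api_key": "os.environ/OPENAI_API_KEY"}},
    {"model_name": "claude-3-5-sonnet",
     "litellm_params": {"model": "anthropic/claude-3-5-sonnet-20241022",
                        "api_key": "os.environ/ANTHROPIC_API_KEY"}},
]
print(litellm_to_ferro_providers(litellm))
```

Dump the result with any YAML serializer and you have the `providers` block; `strategy` and `targets` still need to be written by hand.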

Step 3: Update the base URL in your code

```python
# Before (LiteLLM Proxy)
from openai import OpenAI
client = OpenAI(
    api_key="sk-anything",
    base_url="http://0.0.0.0:4000",
)

# After (Ferro Labs AI Gateway)
from openai import OpenAI
client = OpenAI(
    api_key="sk-your-openai-key",
    base_url="http://localhost:8080",
)
```

Model names stay the same: `gpt-4o`, `claude-3-5-sonnet-20241022`, and so on. No other code changes are needed.
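During cutover you may want to run both proxies side by side and flip traffic without touching code. A small sketch of that pattern; the `AI_GATEWAY_URL` variable name is our own convention, not part of either product:

```python
import os

# Hypothetical env var: point it at LiteLLM (http://0.0.0.0:4000) or
# Ferro (http://localhost:8080) per environment; default to Ferro.
base_url = os.environ.get("AI_GATEWAY_URL", "http://localhost:8080")

# Then construct the client as in Step 3:
# client = OpenAI(api_key=os.environ["OPENAI_API_KEY"], base_url=base_url)
print(base_url)
```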

Step 4: Migrate plugins

Rate limiting

```yaml
# Before: LiteLLM router_settings.rpm_limit
router_settings:
  rpm_limit: 100

# After: Ferro rate-limit plugin
plugins:
  - name: rate-limit
    type: ratelimit
    stage: before_request
    enabled: true
    config:
      requests_per_second: 1.67  # 100 rpm / 60
      burst: 20
```
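The `requests_per_second` value is just the LiteLLM per-minute budget spread over a second. A one-line conversion, with rounding to two decimals as our own choice:

```python
def rpm_to_rps(rpm, ndigits=2):
    """Convert a requests-per-minute budget to the per-second
    rate the rate-limit plugin config expects."""
    return round(rpm / 60, ndigits)

print(rpm_to_rps(100))  # 1.67, matching the config above
```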

Response caching

```yaml
# Before: LiteLLM Redis cache
litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: redis

# After: Ferro in-memory cache (no Redis dependency)
plugins:
  - name: response-cache
    type: transform
    stage: before_request
    enabled: true
    config:
      max_age: 300
      max_entries: 1000
```
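To make the two knobs concrete, here is a rough, unofficial sketch of the semantics that `max_age` (seconds before an entry expires) and `max_entries` (size cap with oldest-first eviction) imply; this is not the plugin's actual implementation:

```python
import time
from collections import OrderedDict

class TTLCache:
    """Illustrative in-memory cache: entries expire after max_age
    seconds, and the oldest entry is evicted once max_entries is
    exceeded."""

    def __init__(self, max_age=300, max_entries=1000):
        self.max_age = max_age
        self.max_entries = max_entries
        self._data = OrderedDict()  # key -> (timestamp, value)

    def set(self, key, value):
        self._data[key] = (time.monotonic(), value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict oldest entry

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        ts, value = item
        if time.monotonic() - ts > self.max_age:
            del self._data[key]  # expired
            return None
        return value
```

Because the cache is in-process, it empties on restart and is not shared across gateway replicas, unlike LiteLLM's Redis-backed cache.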

Request logging

```yaml
# Before: LiteLLM callback
litellm_settings:
  success_callback: ["langfuse"]

# After: Ferro request-logger plugin
plugins:
  - name: request-logger
    type: logging
    stage: before_request
    enabled: true
    config:
      level: info
      persist: true
      backend: sqlite
      dsn: ferrogw-requests.db
```

Routing strategy equivalents

| LiteLLM `routing_strategy` | Ferro `strategy.mode` |
|---|---|
| `simple-shuffle` | `loadbalance` |
| `least-busy` | `loadbalance` |
| `latency-based-routing` | `least-latency` |
| `cost-based-routing` | `cost-optimized` |
| Fallbacks list | `fallback` |
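If you are scripting the migration, the table above is small enough to encode directly. An illustrative mapping, not an official API; the fallback-wins rule mirrors the Step 2 example, where a config with both `least-busy` and `fallbacks` became `mode: fallback`:

```python
# LiteLLM routing_strategy -> Ferro strategy.mode (from the table above)
STRATEGY_MAP = {
    "simple-shuffle": "loadbalance",
    "least-busy": "loadbalance",  # nearest mode; no exact equivalent
    "latency-based-routing": "least-latency",
    "cost-based-routing": "cost-optimized",
}

def ferro_mode(litellm_strategy, has_fallbacks=False):
    """Pick a Ferro strategy.mode for a LiteLLM router config."""
    if has_fallbacks:
        # A fallbacks list takes precedence, as in the Step 2 example
        return "fallback"
    return STRATEGY_MAP.get(litellm_strategy, "loadbalance")

print(ferro_mode("least-busy", has_fallbacks=True))  # fallback
```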

Feature parity notes

- **LiteLLM UI** (spend dashboard, model management): available in Ferro Labs Managed. The OSS gateway exposes an admin REST API for programmatic access.
- **Virtual keys / teams**: in Ferro Labs Managed. The OSS gateway supports API key passthrough via the `virtual_keys` config block.
- **Spend tracking**: the `budget` plugin provides in-memory per-key limits. Durable billing is in Ferro Labs Managed.
- **PII redaction, prompt injection detection**: Ferro Labs Managed only. See OSS vs Ferro Labs Managed.