Providers

The gateway supports 30 AI providers. A provider becomes active when its required environment variables are set; no code changes or rebuilds are needed.

All providers

Provider	Virtual Key	Required Environment Variables
OpenAI	`openai`	`OPENAI_API_KEY`
Anthropic	`anthropic`	`ANTHROPIC_API_KEY`
Google Gemini	`gemini`	`GEMINI_API_KEY`
Mistral	`mistral`	`MISTRAL_API_KEY`
Groq	`groq`	`GROQ_API_KEY`
Cohere	`cohere`	`COHERE_API_KEY`
DeepSeek	`deepseek`	`DEEPSEEK_API_KEY`
Together AI	`together`	`TOGETHER_API_KEY`
Perplexity	`perplexity`	`PERPLEXITY_API_KEY`
Fireworks AI	`fireworks`	`FIREWORKS_API_KEY`
AI21	`ai21`	`AI21_API_KEY`
xAI (Grok)	`xai`	`XAI_API_KEY`
Azure OpenAI	`azure-openai`	`AZURE_OPENAI_API_KEY` + endpoint + deployment
Azure Foundry	`azure-foundry`	`AZURE_FOUNDRY_API_KEY` + `AZURE_FOUNDRY_ENDPOINT`
Ollama	`ollama`	`OLLAMA_HOST` (no API key required)
Ollama Cloud	`ollama-cloud`	`OLLAMA_API_KEY`
AWS Bedrock	`bedrock`	`AWS_REGION` or `AWS_ACCESS_KEY_ID`
Replicate	`replicate`	`REPLICATE_API_TOKEN`
Vertex AI	`vertex-ai`	`VERTEX_AI_PROJECT_ID` + `VERTEX_AI_REGION` + (`VERTEX_AI_API_KEY` or `VERTEX_AI_SERVICE_ACCOUNT_JSON`)
Hugging Face	`hugging-face`	`HUGGING_FACE_API_KEY`
Cerebras	`cerebras`	`CEREBRAS_API_KEY`
NVIDIA NIM	`nvidia-nim`	`NVIDIA_NIM_API_KEY`
Cloudflare Workers AI	`cloudflare`	`CLOUDFLARE_ACCOUNT_ID` + `CLOUDFLARE_API_KEY`
Databricks	`databricks`	`DATABRICKS_HOST` + `DATABRICKS_TOKEN`
Novita AI	`novita`	`NOVITA_API_KEY`
Qwen (Alibaba)	`qwen`	`QWEN_API_KEY`
Moonshot AI	`moonshot`	`MOONSHOT_API_KEY`
SambaNova	`sambanova`	`SAMBANOVA_API_KEY`
DeepInfra	`deepinfra`	`DEEPINFRA_API_KEY`
OpenRouter	`openrouter`	`OPENROUTER_API_KEY`
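The activation rule can be illustrated with a short Python sketch that reports which providers would be active for a given environment. It is not the gateway's actual detection code: the `REQUIREMENTS` structure and the `active_providers` helper are hypothetical, only a handful of rows from the table are included, and treating an empty value as "not set" is an assumption.

```python
import os
from typing import Mapping

# Hypothetical encoding of a few rows from the table above.
# Each provider lists requirement groups: every group must be satisfied,
# and a group is satisfied when at least one of its variables is set.
REQUIREMENTS: dict[str, list[tuple[str, ...]]] = {
    "openai": [("OPENAI_API_KEY",)],
    "anthropic": [("ANTHROPIC_API_KEY",)],
    "ollama": [("OLLAMA_HOST",)],
    # "AWS_REGION or AWS_ACCESS_KEY_ID": one group with two alternatives.
    "bedrock": [("AWS_REGION", "AWS_ACCESS_KEY_ID")],
    # "A + B": two groups, both required.
    "cloudflare": [("CLOUDFLARE_ACCOUNT_ID",), ("CLOUDFLARE_API_KEY",)],
    "vertex-ai": [
        ("VERTEX_AI_PROJECT_ID",),
        ("VERTEX_AI_REGION",),
        ("VERTEX_AI_API_KEY", "VERTEX_AI_SERVICE_ACCOUNT_JSON"),
    ],
}


def active_providers(env: Mapping[str, str]) -> list[str]:
    """Return the virtual keys whose required variables are all present."""

    def is_set(name: str) -> bool:
        # Assumption: an empty or whitespace-only value counts as unset.
        return bool(env.get(name, "").strip())

    return [
        key
        for key, groups in REQUIREMENTS.items()
        if all(any(is_set(var) for var in group) for group in groups)
    ]


if __name__ == "__main__":
    print(active_providers(os.environ))
```

Running the script in the same shell that launches the gateway gives a quick check of which of these providers the current environment would enable.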

Provider capabilities

All providers support chat completions and streaming. Support for other capabilities, and the deployment category each provider falls into, varies:

Embeddings — AWS Bedrock, Cloudflare Workers AI, Cohere, Databricks, Fireworks AI, Google Gemini, Hugging Face, Mistral, Novita AI, OpenAI, Together AI
Image generation — OpenAI (DALL·E), Hugging Face, Replicate
Local / self-hosted — Ollama
Managed cloud inference — AWS Bedrock, Vertex AI, Azure Foundry, Databricks
High-speed inference — Cerebras, SambaNova, Groq
Model aggregators — OpenRouter, Novita AI (access hundreds of models via one key)
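The embeddings and image generation rows above can be expressed as a small lookup, which is handy when client code needs to pick a provider for a given kind of request. This is a sketch only: the capability labels (`"embeddings"`, `"image-generation"`, and so on) and the helper names are hypothetical, and providers are identified by the virtual keys from the table.

```python
# Capabilities beyond chat completions and streaming, keyed by a
# hypothetical capability label. Values are provider virtual keys.
EXTRA_CAPABILITIES: dict[str, frozenset[str]] = {
    "embeddings": frozenset({
        "bedrock", "cloudflare", "cohere", "databricks", "fireworks",
        "gemini", "hugging-face", "mistral", "novita", "openai", "together",
    }),
    "image-generation": frozenset({"openai", "hugging-face", "replicate"}),
}

# Supported by every provider, so no per-provider entry is needed.
UNIVERSAL = frozenset({"chat-completions", "streaming"})


def supports(provider: str, capability: str) -> bool:
    """Report whether a provider supports a capability.

    Assumes `provider` is a valid virtual key; it is not validated here.
    """
    if capability in UNIVERSAL:
        return True
    return provider in EXTRA_CAPABILITIES.get(capability, frozenset())


def candidates(active: list[str], capability: str) -> list[str]:
    """Filter a list of active providers down to those with the capability."""
    return [p for p in active if supports(p, capability)]
```

For example, `candidates(["anthropic", "openai", "replicate"], "image-generation")` keeps only `openai` and `replicate`.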

Provider selection at runtime

The gateway selects a provider using the configured routing strategy. To force a specific provider for a single request, set the `X-Provider` request header to that provider's virtual key:

curl http://localhost:8080/v1/chat/completions \
  -H "X-Provider: anthropic" \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-3-5-sonnet-20241022", "messages": [{"role": "user", "content": "Hi"}]}'

If `X-Provider` is set, the routing strategy is bypassed for that request.
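The same request can be built from Python with the standard library. The sketch below mirrors the curl command; the `build_chat_request` helper and its `base_url` default are illustrative assumptions, and no authentication header is added because the example above does not use one.

```python
import json
from urllib.request import Request, urlopen


def build_chat_request(
    provider: str,
    model: str,
    prompt: str,
    base_url: str = "http://localhost:8080",  # assumed local gateway address
) -> Request:
    """Build a chat completions request pinned to one provider."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),  # a request body makes this a POST
        headers={
            "X-Provider": provider,  # bypasses the routing strategy
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_chat_request("anthropic", "claude-3-5-sonnet-20241022", "Hi")
    # Requires a running gateway with the Anthropic provider active.
    with urlopen(request) as response:
        print(json.load(response))
```

Separating request construction from sending keeps the header logic easy to inspect without a running gateway.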

Model catalog

The gateway ships with a built-in catalog of 2,500+ model entries, used for cost estimation and for the `/v1/models` response. Send `GET /v1/models` to list the models available from your configured providers.
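A minimal Python sketch for listing model identifiers follows. The response schema is not described here, so the code assumes an OpenAI-style list body of the form `{"data": [{"id": "..."}]}`; that shape, the helper names, and the `base_url` default are all assumptions to adjust against the real response.

```python
import json
from urllib.request import urlopen


def model_ids(payload: dict) -> list[str]:
    """Extract sorted model IDs from a models response.

    Assumption: the body looks like {"data": [{"id": ...}, ...]}.
    Entries without an "id" field are skipped.
    """
    return sorted(
        entry["id"] for entry in payload.get("data", []) if "id" in entry
    )


def fetch_model_ids(base_url: str = "http://localhost:8080") -> list[str]:
    """Call GET /v1/models on the gateway and return the model IDs."""
    with urlopen(f"{base_url}/v1/models") as response:
        return model_ids(json.load(response))


if __name__ == "__main__":
    # Requires a running gateway at the assumed address.
    for model_id in fetch_model_ids():
        print(model_id)
```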

Next steps

Provider configuration — environment variables for each provider
Routing policies — how to route across providers
