Provider configuration
Providers are automatically registered when their required environment variables are present at startup. Set at least one provider key before starting the gateway.
Simple providers (API key only)
These providers require a single API key:
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GEMINI_API_KEY=...
export MISTRAL_API_KEY=...
export GROQ_API_KEY=gsk-...
export COHERE_API_KEY=...
export DEEPSEEK_API_KEY=...
export TOGETHER_API_KEY=...
export PERPLEXITY_API_KEY=pplx-...
export FIREWORKS_API_KEY=fw-...
export AI21_API_KEY=...
export XAI_API_KEY=xai-...
export HUGGING_FACE_API_KEY=hf-...
export CEREBRAS_API_KEY=...
export NVIDIA_NIM_API_KEY=...
export NOVITA_API_KEY=...
export QWEN_API_KEY=...
export MOONSHOT_API_KEY=...
export SAMBANOVA_API_KEY=...
export DEEPINFRA_API_KEY=...
export OPENROUTER_API_KEY=...
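Since the gateway registers only the providers whose variables it finds at startup, an empty configuration fails silently with zero providers. A minimal pre-launch sanity check can be sketched in shell; the `has_provider_key` helper below is illustrative (not part of the gateway), and only two of the variables above are checked for brevity:

```shell
# Sketch: succeed if at least one common provider key is set.
# Extend the check with the other variables from the list above.
has_provider_key() {
  [ -n "${OPENAI_API_KEY:-}" ] || [ -n "${ANTHROPIC_API_KEY:-}" ]
}

if has_provider_key; then
  echo "provider key found"
else
  echo "warning: no provider key set; the gateway will register zero providers" >&2
fi
```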
Azure OpenAI
Requires endpoint, deployment name, and API version in addition to the key:
export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_DEPLOYMENT=gpt-4o
export AZURE_OPENAI_API_VERSION=2024-10-21
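These four values compose into the standard Azure OpenAI request URL (deployment name in the path, API version as a query parameter), which is useful to know when debugging 404s from a misnamed deployment. The sketch below just prints the resulting chat-completions URL:

```shell
# Standard Azure OpenAI chat-completions URL built from the variables
# above: deployment in the path, version as a query parameter.
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_API_VERSION=2024-10-21

url="$AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYMENT/chat/completions?api-version=$AZURE_OPENAI_API_VERSION"
echo "$url"
```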
Azure AI Foundry
export AZURE_FOUNDRY_API_KEY=...
export AZURE_FOUNDRY_ENDPOINT=https://your-project.services.ai.azure.com
Ollama (local / self-hosted)
Ollama does not require an API key. The host defaults to http://localhost:11434.
export OLLAMA_HOST=http://localhost:11434
export OLLAMA_MODELS=llama3.2,llama3.1,mistral
OLLAMA_MODELS is a comma-separated list of model tags to expose at /v1/models.
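As a sketch of how a comma-separated OLLAMA_MODELS value breaks down into individual model tags (the `list_ollama_models` helper is illustrative, not gateway code):

```shell
# Sketch: split OLLAMA_MODELS on commas, one model tag per line.
list_ollama_models() {
  old_ifs=$IFS
  IFS=,
  for tag in $OLLAMA_MODELS; do
    printf '%s\n' "$tag"
  done
  IFS=$old_ifs
}

OLLAMA_MODELS=llama3.2,llama3.1,mistral
list_ollama_models
```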
Replicate
export REPLICATE_API_TOKEN=r8_...
export REPLICATE_TEXT_MODELS=meta/llama-3-8b-instruct,mistralai/mistral-7b-instruct-v0.2
export REPLICATE_IMAGE_MODELS=stability-ai/sdxl
REPLICATE_TEXT_MODELS and REPLICATE_IMAGE_MODELS are comma-separated lists of Replicate model IDs to register.
AWS Bedrock
Bedrock uses the standard AWS credential chain. Configure either AWS_REGION alone (for IAM roles / instance profiles) or full static credentials:
# Option 1: IAM role or instance profile
export AWS_REGION=us-east-1
# Option 2: static credentials
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
Google Vertex AI
Vertex AI uses Application Default Credentials (ADC). Set the project ID and ensure ADC is configured:
export VERTEX_AI_PROJECT_ID=my-gcp-project
# ADC must be configured via gcloud or GOOGLE_APPLICATION_CREDENTIALS
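For local development, ADC can be configured with the gcloud CLI; for non-interactive environments, point GOOGLE_APPLICATION_CREDENTIALS at a service-account key file (the file path below is a placeholder):

```shell
# Local development: cache your user credentials as ADC
gcloud auth application-default login

# Non-interactive environments: use a service-account key file instead
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export VERTEX_AI_PROJECT_ID=my-gcp-project
```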
Cloudflare Workers AI
Requires both your Cloudflare account ID and an API token with Workers AI permissions:
export CLOUDFLARE_ACCOUNT_ID=...
export CLOUDFLARE_API_TOKEN=...
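The account ID ends up in the path of the Cloudflare Workers AI REST URL (`/client/v4/accounts/{account_id}/ai/run/{model}`), which helps when diagnosing authentication errors. The account ID and model name below are placeholders:

```shell
# Standard Workers AI REST URL: account ID and model name in the path.
CLOUDFLARE_ACCOUNT_ID=0123456789abcdef
model=@cf/meta/llama-3.1-8b-instruct

url="https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/$model"
echo "$url"
```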
Databricks
Use the workspace URL (without trailing slash) and a personal access token or service principal token:
export DATABRICKS_HOST=https://your-workspace.azuredatabricks.net
export DATABRICKS_TOKEN=dapi...
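Since the gateway expects DATABRICKS_HOST without a trailing slash, one can be stripped defensively with POSIX parameter expansion:

```shell
# Strip a single trailing slash, if present, using POSIX ${var%pattern}.
DATABRICKS_HOST=https://your-workspace.azuredatabricks.net/
DATABRICKS_HOST=${DATABRICKS_HOST%/}
echo "$DATABRICKS_HOST"
```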
Validate configured providers
After starting the gateway, confirm which providers are active:
# List all registered models (grouped by provider)
curl http://localhost:8080/v1/models
# Deep health check with per-provider status
curl http://localhost:8080/health
The /health endpoint returns 200 OK if at least one provider is reachable and includes per-provider latency in the JSON body.
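When jq is not available, model IDs can be pulled out of the /v1/models response with standard tools. The payload below is a stand-in sample, and the `{"data":[{"id":...}]}` shape is an assumption based on the OpenAI-compatible endpoint:

```shell
# Sketch: extract "id" fields from an OpenAI-style /v1/models payload.
# A live check would pipe from: curl -s http://localhost:8080/v1/models
extract_model_ids() {
  tr '{' '\n' | sed -n 's/.*"id" *: *"\([^"]*\)".*/\1/p'
}

sample='{"data":[{"id":"gpt-4o","owned_by":"openai"},{"id":"llama3.2","owned_by":"ollama"}]}'
printf '%s' "$sample" | extract_model_ids
```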