Provider configuration

Providers are automatically registered when their required environment variables are present at startup. Set at least one provider key before starting the gateway.

Simple providers (API key only)

These providers require a single API key:

export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GEMINI_API_KEY=...
export MISTRAL_API_KEY=...
export GROQ_API_KEY=gsk-...
export COHERE_API_KEY=...
export DEEPSEEK_API_KEY=...
export TOGETHER_API_KEY=...
export PERPLEXITY_API_KEY=pplx-...
export FIREWORKS_API_KEY=fw-...
export AI21_API_KEY=...
export XAI_API_KEY=xai-...
export HUGGING_FACE_API_KEY=hf-...
export CEREBRAS_API_KEY=...
export NVIDIA_NIM_API_KEY=...
export NOVITA_API_KEY=...
export QWEN_API_KEY=...
export MOONSHOT_API_KEY=...
export SAMBANOVA_API_KEY=...
export DEEPINFRA_API_KEY=...
export OPENROUTER_API_KEY=...
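Because registration is driven entirely by the environment, a quick pre-flight check can tell you which providers the gateway will pick up. The sketch below counts configured keys for a subset of the variables above; it uses bash indirect expansion (`${!var}`), and the shortened variable list is only for brevity:

```shell
# Count how many simple-provider keys are set before starting the gateway.
configured=0
for var in OPENAI_API_KEY ANTHROPIC_API_KEY GROQ_API_KEY MISTRAL_API_KEY; do
  if [ -n "${!var}" ]; then
    echo "found: $var"
    configured=$((configured + 1))
  fi
done
echo "providers configured: $configured"
```

If the count is zero, the gateway will start with no providers registered.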

Azure OpenAI

Requires endpoint, deployment name, and API version in addition to the key:

export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_DEPLOYMENT=gpt-4o
export AZURE_OPENAI_API_VERSION=2024-10-21

Azure AI Foundry

Requires the project endpoint in addition to the API key:

export AZURE_FOUNDRY_API_KEY=...
export AZURE_FOUNDRY_ENDPOINT=https://your-project.services.ai.azure.com

Ollama (local / self-hosted)

Ollama does not require an API key. The host defaults to http://localhost:11434.

export OLLAMA_HOST=http://localhost:11434
export OLLAMA_MODELS=llama3.2,llama3.1,mistral

OLLAMA_MODELS is a comma-separated list of model tags to expose at /v1/models.
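The comma-separated format has no spaces around the commas; each entry is an Ollama model tag exactly as `ollama list` would show it. A minimal sketch of how such a list splits into individual tags:

```shell
export OLLAMA_MODELS=llama3.2,llama3.1,mistral
# Split the comma-separated list into one model tag per line
tags=$(echo "$OLLAMA_MODELS" | tr ',' '\n')
echo "$tags"
```

Each resulting tag is what the gateway exposes as a model ID at /v1/models.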

Replicate

export REPLICATE_API_TOKEN=r8_...
export REPLICATE_TEXT_MODELS=meta/llama-3-8b-instruct,mistralai/mistral-7b-instruct-v0.2
export REPLICATE_IMAGE_MODELS=stability-ai/sdxl

REPLICATE_TEXT_MODELS and REPLICATE_IMAGE_MODELS are comma-separated lists of Replicate model IDs to register.

AWS Bedrock

Bedrock uses the standard AWS credential chain. Configure either AWS_REGION alone (for IAM roles / instance profiles) or full static credentials:

# Option 1: IAM role or instance profile
export AWS_REGION=us-east-1

# Option 2: static credentials
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
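Which of the two options applies is determined purely by what is present in the environment; the AWS SDK falls back to the default credential chain when no static keys are exported. A sketch of that decision:

```shell
export AWS_REGION=us-east-1
# Static keys present -> static credentials; otherwise the SDK's default chain
# (IAM role, instance profile, shared credentials file) is used.
if [ -n "$AWS_ACCESS_KEY_ID" ] && [ -n "$AWS_SECRET_ACCESS_KEY" ]; then
  bedrock_mode="static credentials"
else
  bedrock_mode="default credential chain"
fi
echo "Bedrock auth: $bedrock_mode in $AWS_REGION"
```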

Google Vertex AI

Vertex AI uses Application Default Credentials (ADC). Set the project ID and ensure ADC is configured:

export VERTEX_AI_PROJECT_ID=my-gcp-project
# ADC must be configured via gcloud or GOOGLE_APPLICATION_CREDENTIALS
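There are two common ways to satisfy the ADC requirement, shown below; both are standard Google Cloud mechanisms (the service-account key path is a placeholder):

```shell
# Interactive: authenticate your user account as ADC via the gcloud CLI
gcloud auth application-default login

# Non-interactive: point ADC at a service-account key file
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export VERTEX_AI_PROJECT_ID=my-gcp-project
```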

Cloudflare Workers AI

Requires both your Cloudflare account ID and an API token with Workers AI permissions:

export CLOUDFLARE_ACCOUNT_ID=...
export CLOUDFLARE_API_TOKEN=...

Databricks

Use the workspace URL (without trailing slash) and a personal access token or service principal token:

export DATABRICKS_HOST=https://your-workspace.azuredatabricks.net
export DATABRICKS_TOKEN=dapi...
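Since the workspace URL must have no trailing slash, it can be worth normalizing the value defensively. A small sketch using POSIX suffix stripping:

```shell
DATABRICKS_HOST=https://your-workspace.azuredatabricks.net/
# Strip a single trailing slash if present; the gateway expects none
DATABRICKS_HOST=${DATABRICKS_HOST%/}
echo "$DATABRICKS_HOST"
```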

Validate configured providers

After starting the gateway, confirm which providers are active:

# List all registered models (grouped by provider)
curl http://localhost:8080/v1/models

# Deep health check with per-provider status
curl http://localhost:8080/health

The /health endpoint returns 200 OK when at least one provider is reachable; its JSON body reports per-provider latency.