Provider configuration
Providers are automatically registered when their required environment variables are present at startup. Set at least one provider key before starting the gateway.
Simple providers (API key only)
These providers require a single API key:
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GEMINI_API_KEY=...
export MISTRAL_API_KEY=...
export GROQ_API_KEY=gsk_...
export COHERE_API_KEY=...
export DEEPSEEK_API_KEY=...
export TOGETHER_API_KEY=...
export PERPLEXITY_API_KEY=pplx-...
export FIREWORKS_API_KEY=fw-...
export AI21_API_KEY=...
export XAI_API_KEY=xai-...
export HUGGING_FACE_API_KEY=hf_...
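Since the gateway registers providers from whatever keys it finds, it can be useful to fail fast when none are exported. The sketch below checks a subset of the variables listed above; `has_provider_key` is a hypothetical helper, not part of the gateway.

```shell
# Sketch: fail fast if no provider key is exported before launching the gateway.
# has_provider_key is a hypothetical helper name; extend the list as needed.
has_provider_key() {
  for var in OPENAI_API_KEY ANTHROPIC_API_KEY GEMINI_API_KEY MISTRAL_API_KEY \
             GROQ_API_KEY COHERE_API_KEY DEEPSEEK_API_KEY; do
    eval "val=\${$var:-}"
    if [ -n "$val" ]; then
      echo "$var is set"
      return 0
    fi
  done
  echo "no provider keys set" >&2
  return 1
}

has_provider_key || echo "export at least one provider key before starting the gateway"
```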
Azure OpenAI
Requires endpoint, deployment name, and API version in addition to the key:
export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
export AZURE_OPENAI_DEPLOYMENT=gpt-4o
export AZURE_OPENAI_API_VERSION=2024-10-21
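To confirm the four Azure values work together before involving the gateway, you can call Azure OpenAI's standard chat-completions route directly. This is a smoke test against Azure itself, not a gateway endpoint:

```shell
# Smoke test: verify key, endpoint, deployment, and API version directly
# against Azure OpenAI (a 401/404 here means the values are wrong).
curl -sS "$AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYMENT/chat/completions?api-version=$AZURE_OPENAI_API_VERSION" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"ping"}],"max_tokens":1}'
```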
Azure AI Foundry
Requires the project endpoint in addition to the key:
export AZURE_FOUNDRY_API_KEY=...
export AZURE_FOUNDRY_ENDPOINT=https://your-project.services.ai.azure.com
Ollama (local / self-hosted)
Ollama does not require an API key. The host defaults to http://localhost:11434.
export OLLAMA_HOST=http://localhost:11434
export OLLAMA_MODELS=llama3.2,llama3.1,mistral
OLLAMA_MODELS is a comma-separated list of model tags to expose at /v1/models.
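Every tag listed in OLLAMA_MODELS must already be pulled locally, or requests for it will fail. A minimal sketch for expanding the list and pulling each tag (`split_models` is a hypothetical helper; the `ollama pull` line is commented out so the sketch runs on hosts without Ollama):

```shell
# Sketch: expand the comma-separated OLLAMA_MODELS list and ensure each
# tag exists locally. split_models is a hypothetical helper name.
split_models() {
  printf '%s\n' "$1" | tr ',' '\n'
}

OLLAMA_MODELS=${OLLAMA_MODELS:-llama3.2,llama3.1,mistral}
for tag in $(split_models "$OLLAMA_MODELS"); do
  echo "would pull: $tag"
  # ollama pull "$tag"   # uncomment on a host with Ollama installed
done
```

You can also confirm the daemon is reachable with `curl $OLLAMA_HOST/api/tags`, which lists locally available models.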
Replicate
export REPLICATE_API_TOKEN=r8_...
export REPLICATE_TEXT_MODELS=meta/llama-3-8b-instruct,mistralai/mistral-7b-instruct-v0.2
export REPLICATE_IMAGE_MODELS=stability-ai/sdxl
REPLICATE_TEXT_MODELS and REPLICATE_IMAGE_MODELS are comma-separated lists of Replicate model IDs to register.
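To check the token itself, Replicate's public HTTP API exposes a GET /v1/account endpoint; a 200 response with your account details confirms the token is valid (this checks the token only, not the gateway's model registration):

```shell
# Verify the Replicate token directly against Replicate's API.
curl -sS -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/account
```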
AWS Bedrock
Bedrock uses the standard AWS credential chain. Configure either AWS_REGION alone (for IAM roles / instance profiles) or full static credentials:
# Option 1 — IAM role or instance profile
export AWS_REGION=us-east-1
# Option 2 — static credentials
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
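Whichever option you use, the credential chain can be verified with the standard AWS CLI before starting the gateway (neither command is gateway-specific):

```shell
# Confirm the credential chain resolves to an identity...
aws sts get-caller-identity

# ...and that Bedrock is reachable in the chosen region.
aws bedrock list-foundation-models --region "$AWS_REGION" \
  --query 'modelSummaries[].modelId' --output text
```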
Google Vertex AI
Vertex AI uses Application Default Credentials (ADC). Set the project ID and ensure ADC is configured:
export VERTEX_AI_PROJECT_ID=my-gcp-project
# ADC must be configured via gcloud or GOOGLE_APPLICATION_CREDENTIALS
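A quick way to confirm ADC is in place, using standard gcloud commands:

```shell
# Succeeds only if Application Default Credentials are configured.
gcloud auth application-default print-access-token > /dev/null && echo "ADC OK"

# If it fails, configure ADC with one of:
#   gcloud auth application-default login
#   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```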
Validate configured providers
After starting the gateway, confirm which providers are active:
# List all registered models (grouped by provider)
curl http://localhost:8080/v1/models
# Deep health check with per-provider status
curl http://localhost:8080/health
The /health endpoint returns 200 OK if at least one provider is reachable and includes per-provider latency in the JSON body.
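For a quick summary at the shell, the two endpoints can be filtered with jq. The field names below (`providers` in the health payload, and the OpenAI-compatible `{"data":[{"id":...}]}` shape for /v1/models) are assumptions about this gateway's response bodies:

```shell
# Per-provider status (assumes a "providers" key in the health JSON).
curl -s http://localhost:8080/health | jq '.providers'

# Model IDs only (assumes the OpenAI-compatible list shape).
curl -s http://localhost:8080/v1/models | jq -r '.data[].id'
```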