# Python SDK Quickstart
## Installation

```bash
pip install ferrolabsai
```
## Authentication

The client reads credentials from environment variables or constructor arguments:

```bash
export FERRO_API_KEY=sk-ferro-...
export FERRO_BASE_URL=http://localhost:8080  # default
```
Or pass them directly:

```python
from ferrolabsai import FerroClient

client = FerroClient(
    api_key="sk-ferro-...",
    base_url="http://localhost:8080",
)
```
:::tip
The client also checks `OPENAI_API_KEY` as a fallback, so existing OpenAI key setups work out of the box.
:::
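The resulting lookup order can be pictured as a small helper. This is an illustrative sketch of the behavior described above, not the SDK's actual internals, and `resolve_api_key` is a hypothetical name:

```python
import os

def resolve_api_key(explicit=None, env=None):
    """Illustrative credential lookup: an explicit argument wins,
    then FERRO_API_KEY, then the OPENAI_API_KEY fallback."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    return env.get("FERRO_API_KEY") or env.get("OPENAI_API_KEY")
```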
## First request

```python
from ferrolabsai import FerroClient

client = FerroClient()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from Ferro Labs!"}],
)

print(response.content)  # shortcut to choices[0].message.content
print(f"Provider: {response.provider}")
print(f"Cost: ${response.usage.cost_usd:.6f}")
print(f"Trace ID: {response.trace_id}")
```
The response includes Ferro-specific fields (`provider`, `trace_id`, `latency_ms`, and `usage.cost_usd`) alongside the standard OpenAI response shape.
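Those gateway fields compose naturally into a one-line log entry. A minimal sketch, where `describe_response` is an illustrative helper rather than part of the SDK:

```python
def describe_response(resp):
    """Summarize the Ferro-specific metadata on a response object."""
    return (f"{resp.provider} | ${resp.usage.cost_usd:.6f} | "
            f"{resp.latency_ms} ms | trace {resp.trace_id}")
```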
## Streaming

```python
stream = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Write a haiku about AI gateways"}],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
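If you want the full text rather than live printing, the same loop can accumulate the deltas. This is a sketch assuming the OpenAI-style chunk shape used above:

```python
def collect_stream(stream):
    """Join streamed delta fragments into the complete response text."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # final chunks may carry a None delta
            parts.append(delta)
    return "".join(parts)
```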
## Embeddings

```python
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["Ferro routes LLM requests", "across 29 providers"],
)

for item in response.data:
    print(f"Dimension: {len(item.embedding)}")
```
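A common next step is comparing the returned vectors; cosine similarity needs only the standard library. This helper is not part of the SDK:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```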
## Image generation

```python
response = client.images.generate(
    model="dall-e-3",
    prompt="A futuristic AI gateway routing requests across the cosmos",
    size="1024x1024",
)

print(response.data[0].url)
```
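Generated image URLs are typically short-lived, so you may want to persist the file immediately. A minimal sketch using only the standard library; `save_image` is an illustrative helper, not an SDK method:

```python
import urllib.request

def save_image(url, path):
    """Download an image URL to a local file; returns the byte count written."""
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```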
## Model discovery

```python
# List all models from a specific provider
models = client.models.list(provider="anthropic")
for m in models:
    print(f"{m.id}: {m.context_window:,} tokens")

# Search the catalog
results = client.models.search("embedding")
```
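Catalog listings compose with ordinary Python filtering, for example keeping only large-context models. A sketch assuming the `context_window` attribute shown above:

```python
def filter_by_context(models, min_tokens):
    """Return only the models whose context window is at least min_tokens."""
    return [m for m in models if m.context_window >= min_tokens]
```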
## Ferro-specific features

The SDK supports gateway-specific parameters that the generic OpenAI SDK cannot access:

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this document"}],
    template_id="summarizer-v2",            # server-side prompt template
    template_variables={"tone": "formal"},  # template variables
    route_tag="premium",                    # override routing strategy
)
```
## Context manager

```python
with FerroClient() as client:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(response.content)
# httpx client is closed automatically
```
## Next steps

- API Reference: full method signatures and response types
- Async Usage: `AsyncFerroClient` for async/await workflows
- Error Handling: exception hierarchy and retry behavior
- Configuration: gateway-side routing and plugin setup