# Python SDK Quickstart
## Installation

```bash
pip install ferrolabsai
```
## Authentication

The client reads credentials from environment variables or constructor arguments:

```bash
export FERRO_API_KEY=sk-ferro-...
export FERRO_BASE_URL=http://localhost:8080  # default
```
Or pass them directly:

```python
from ferrolabsai import FerroClient

client = FerroClient(
    api_key="sk-ferro-...",
    base_url="http://localhost:8080",
)
```
:::tip
The client also checks `OPENAI_API_KEY` as a fallback, so existing OpenAI key setups work out of the box.
:::
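The resulting lookup order can be pictured as a small helper. This is an illustrative sketch of the behavior described above, not the SDK's actual internals, and `resolve_api_key` is a hypothetical name:

```python
import os

def resolve_api_key(explicit=None, env=None):
    """Illustrative credential lookup: an explicit argument wins,
    then FERRO_API_KEY, then the OPENAI_API_KEY fallback."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    return env.get("FERRO_API_KEY") or env.get("OPENAI_API_KEY")
```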
## First request

```python
from ferrolabsai import FerroClient

client = FerroClient()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from Ferro Labs!"}],
)

print(response.content)  # shortcut to choices[0].message.content
print(f"Provider: {response.provider}")
print(f"Cost: ${response.usage.cost_usd:.6f}")
print(f"Trace ID: {response.trace_id}")
```
The response includes Ferro-specific fields (`provider`, `trace_id`, `latency_ms`, and `usage.cost_usd`) alongside the standard OpenAI response shape.
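Those gateway fields compose naturally into a one-line log entry. A minimal sketch, where `describe_response` is an illustrative helper rather than part of the SDK:

```python
def describe_response(resp):
    """Summarize the Ferro-specific metadata on a response object."""
    return (f"{resp.provider} | ${resp.usage.cost_usd:.6f} | "
            f"{resp.latency_ms} ms | trace {resp.trace_id}")
```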
## Streaming

```python
stream = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Write a haiku about AI gateways"}],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
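If you want the full text rather than live printing, the same loop can accumulate the deltas. This is a sketch assuming the OpenAI-style chunk shape used above:

```python
def collect_stream(stream):
    """Join streamed delta fragments into the complete response text."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # final chunks may carry a None delta
            parts.append(delta)
    return "".join(parts)
```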
## Embeddings

```python
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["Ferro routes LLM requests", "across 29 providers"],
)

for item in response.data:
    print(f"Dimension: {len(item.embedding)}")
```
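A common next step is comparing the returned vectors; cosine similarity needs only the standard library. This helper is not part of the SDK:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```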
## Image generation

```python
response = client.images.generate(
    model="dall-e-3",
    prompt="A futuristic AI gateway routing requests across the cosmos",
    size="1024x1024",
)

print(response.data[0].url)
```
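Generated image URLs are typically short-lived, so you may want to persist the file immediately. A minimal sketch using only the standard library; `save_image` is an illustrative helper, not an SDK method:

```python
import urllib.request

def save_image(url, path):
    """Download an image URL to a local file; returns the byte count written."""
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```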
## Model discovery

```python
# List all models from a specific provider
models = client.models.list(provider="anthropic")
for m in models:
    print(f"{m.id}: {m.context_window:,} tokens")

# Search the catalog
results = client.models.search("embedding")
```
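Catalog listings compose with ordinary Python filtering, for example keeping only large-context models. A sketch assuming the `context_window` attribute shown above:

```python
def filter_by_context(models, min_tokens):
    """Return only the models whose context window is at least min_tokens."""
    return [m for m in models if m.context_window >= min_tokens]
```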
## Ferro-specific features

The SDK supports gateway-specific parameters that the generic OpenAI SDK cannot access:

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this document"}],
    template_id="summarizer-v2",            # server-side prompt template
    template_variables={"tone": "formal"},  # template variables
    route_tag="premium",                    # override routing strategy
)
```
## Context manager

```python
with FerroClient() as client:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(response.content)
# httpx client is closed automatically
```
## Next steps

- API Reference: full method signatures and response types
- Async Usage: `AsyncFerroClient` for async/await workflows
- Error Handling: exception hierarchy and retry behavior
- Configuration: gateway-side routing and plugin setup