
Async Usage

The AsyncFerroClient exposes the same API as FerroClient but uses httpx.AsyncClient under the hood. Use it in async frameworks such as FastAPI or Starlette, or in any asyncio application.

Setup

from ferrolabsai import AsyncFerroClient

client = AsyncFerroClient(
    api_key="sk-ferro-...",
    base_url="http://localhost:8080",
)

Basic request

response = await client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from async!"}],
)
print(response.content)

Async streaming

stream = await client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Count from 1 to 10"}],
    stream=True,
)

async for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
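If you need the full text after streaming as well as live output, accumulate the deltas as they arrive. A minimal sketch of that pattern, using a plain async generator as a stand-in for the stream (the `fake_stream` helper is illustrative, not part of the library):

```python
import asyncio

async def fake_stream():
    # Stand-in for the chat-completions stream: yields text deltas.
    for delta in ["1 ", "2 ", "3"]:
        yield delta

async def collect():
    parts = []
    async for delta in fake_stream():
        print(delta, end="", flush=True)  # live output, as in the example above
        parts.append(delta)               # accumulate for later use
    return "".join(parts)

text = asyncio.run(collect())
```

The same loop body works against the real stream; only the iterable changes.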

Async context manager

async with AsyncFerroClient() as client:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(response.content)
# httpx.AsyncClient is closed automatically
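Under the hood, `async with` simply calls the client's `__aenter__` and `__aexit__` hooks. A hypothetical sketch of that lifecycle (class and attribute names here are illustrative, not the library's actual internals):

```python
import asyncio

class ManagedClient:
    """Illustrative async context manager mimicking the client's lifecycle."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        # A real client would create its httpx.AsyncClient here.
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.aclose()

    async def aclose(self):
        # A real client would close its httpx.AsyncClient here.
        self.closed = True

async def main():
    async with ManagedClient() as client:
        pass  # make requests while the connection pool is open
    return client

client = asyncio.run(main())
```

Because `__aexit__` runs even when the body raises, the HTTP connection pool is released on both success and failure.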

Async embeddings

response = await client.embeddings.create(
    model="text-embedding-3-small",
    input=["async embedding request"],
)
print(len(response.data[0].embedding))
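Embedding vectors are typically compared with cosine similarity. A minimal stdlib-only helper you could apply to two `response.data[i].embedding` lists (the function is a sketch, not part of the SDK):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings:
score = cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0])
```

Scores range from -1 to 1, with higher values meaning more similar inputs.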

FastAPI example

from fastapi import FastAPI
from ferrolabsai import AsyncFerroClient

app = FastAPI()
client = AsyncFerroClient()

@app.post("/chat")
async def chat(message: str):
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": message}],
    )
    return {
        "reply": response.content,
        "provider": response.provider,
        "cost": response.usage.cost_usd,
    }

Parallel requests

import asyncio
from ferrolabsai import AsyncFerroClient

async def main():
    async with AsyncFerroClient() as client:
        tasks = [
            client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": f"What is {i} * {i}?"}],
            )
            for i in range(5)
        ]
        results = await asyncio.gather(*tasks)

    for r in results:
        print(f"{r.content} (via {r.provider})")

asyncio.run(main())
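Unbounded `asyncio.gather` fires every request at once, which can trip provider rate limits. One common refinement is to cap concurrency with an `asyncio.Semaphore`; a sketch using dummy coroutines in place of `client.chat.completions.create` calls:

```python
import asyncio

async def bounded_gather(coros, limit):
    # Allow at most `limit` coroutines in flight at a time.
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))

async def square(i):
    await asyncio.sleep(0)  # stand-in for an API call
    return i * i

results = asyncio.run(bounded_gather([square(i) for i in range(5)], limit=2))
```

`gather` still returns results in submission order, so the bounded version is a drop-in replacement for the pattern above.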

Current async coverage

Resource         | Async support
-----------------|--------------------------------------
chat.completions | ✅ Full (including streaming)
embeddings       | ✅ Full
images           | ⏳ Coming soon (uses sync internally)
models           | ⏳ Coming soon (uses sync internally)
admin            | ⏳ Coming soon (uses sync internally)
Info: Full async coverage for images, models, and admin is planned for v0.2.0. In the meantime, these namespaces work but use synchronous HTTP under the hood.

Next steps