OpenAI-Compatible SDKs
You can keep your existing OpenAI SDK and change only the base URL and API key. The gateway is a drop-in replacement for the OpenAI API: all models, routing, and plugins are transparent to the client.
JavaScript / TypeScript
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
baseURL: "http://localhost:8080/v1",
});
const response = await client.chat.completions.create({
model: "gpt-4o-mini",
messages: [{ role: "user", content: "Hello from the gateway!" }],
});
console.log(response.choices[0].message.content);
Python
from openai import OpenAI
client = OpenAI(
api_key="sk-your-key",
base_url="http://localhost:8080/v1",
)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello from the gateway!"}],
)
print(response.choices[0].message.content)
Go
package main
import (
"context"
"fmt"
"github.com/sashabaranov/go-openai"
)
func main() {
cfg := openai.DefaultConfig("sk-your-key")
cfg.BaseURL = "http://localhost:8080/v1"
client := openai.NewClientWithConfig(cfg)
resp, err := client.CreateChatCompletion(context.Background(),
openai.ChatCompletionRequest{
Model: "gpt-4o-mini",
Messages: []openai.ChatCompletionMessage{
{Role: openai.ChatMessageRoleUser, Content: "Hello from the gateway!"},
},
},
)
if err != nil {
panic(err)
}
fmt.Println(resp.Choices[0].Message.Content)
}
curl
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-your-key" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello from the gateway!"}]
}'
LangChain (Python)
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
model="gpt-4o-mini",
openai_api_key="sk-your-key",
openai_api_base="http://localhost:8080/v1",
)
response = llm.invoke("What is the Ferro Labs AI Gateway?")
print(response.content)
LlamaIndex (Python)
from llama_index.llms.openai import OpenAI as LlamaOpenAI
from llama_index.core import Settings
Settings.llm = LlamaOpenAI(
model="gpt-4o-mini",
api_key="sk-your-key",
api_base="http://localhost:8080/v1",
)
Streaming
All SDKs support streaming: set stream=True (Python) or stream: true (JavaScript) as usual. Streaming works through all routing strategies and plugins. When MCP tool servers are configured, the gateway runs the full agentic loop and returns the final answer as a single-chunk SSE stream. See MCP integration for details.
stream = client.chat.completions.create(
model="gpt-4o-mini",
stream=True,
messages=[{"role": "user", "content": "Count from 1 to 5."}],
)
for chunk in stream:
print(chunk.choices[0].delta.content or "", end="", flush=True)
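Each streamed chunk carries only a delta fragment, so clients typically accumulate the fragments into the full reply. A minimal sketch of that pattern, using simulated chunk objects in place of a live gateway (the dataclasses below are hypothetical stand-ins for the SDK's ChatCompletionChunk):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for the SDK's streaming chunk objects.
@dataclass
class Delta:
    content: Optional[str]

@dataclass
class Choice:
    delta: Delta

@dataclass
class Chunk:
    choices: list

def accumulate(stream) -> str:
    """Join the content deltas from a chat-completion stream."""
    parts = []
    for chunk in stream:
        fragment = chunk.choices[0].delta.content
        if fragment:  # the final chunk's delta is typically empty or None
            parts.append(fragment)
    return "".join(parts)

fake_stream = [
    Chunk([Choice(Delta("1, 2"))]),
    Chunk([Choice(Delta(", 3, 4, 5"))]),
    Chunk([Choice(Delta(None))]),
]
print(accumulate(fake_stream))  # 1, 2, 3, 4, 5
```

The same loop works unchanged against a real stream from any of the SDKs above, since they all yield chunks with the `choices[0].delta.content` shape.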
Native SDKs
For typed access to Ferro-specific features (trace IDs, cost tracking, admin API, prompt templates), use a native SDK:
- Python SDK (ferrolabsai): sync and async clients with streaming, embeddings, images, models, and the admin API
- Go SDK: embed the gateway as a library, write custom plugins, extract trace IDs