MCP integration

Model Context Protocol (MCP) integration was added in v0.8.0 and extended with streaming support in v1.0.0. When configured, the gateway connects to your MCP tool servers, injects available tools into chat completion requests, and runs the full agentic loop, so clients receive a final text answer without implementing tool-calling logic themselves.

How it works

The agentic loop runs inside the gateway. Your client sends a standard chat completion request and receives the final text answer; the intermediate tool calls are resolved by the gateway and never appear in the response.
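The loop described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's implementation: `call_model`, `call_tool`, and the message shapes are hypothetical stand-ins, and only `max_call_depth` comes from the configuration on this page. What the gateway does when the limit is reached is not stated here; this sketch simply raises an error.

```python
def run_agentic_loop(messages, call_model, call_tool, max_call_depth=3):
    """Resolve tool calls until the model returns plain text.

    call_model(history) -> dict with either "content" or "tool_calls" (hypothetical shape).
    call_tool(name, arguments) -> str result of one tool invocation (hypothetical).
    """
    history = list(messages)
    rounds = 0
    while True:
        reply = call_model(history)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            # No more tools requested: this is the final answer for the client.
            return reply.get("content") or ""
        if rounds >= max_call_depth:
            # Assumption: fail loudly once the configured number of rounds is used up.
            raise RuntimeError(f"exceeded max_call_depth={max_call_depth}")
        rounds += 1
        history.append(reply)
        for call in tool_calls:
            result = call_tool(call["name"], call["arguments"])
            history.append({"role": "tool", "name": call["name"], "content": result})


# Toy model: asks for one tool call, then answers using the tool result.
def fake_model(history):
    if not any(m["role"] == "tool" for m in history):
        return {"role": "assistant",
                "tool_calls": [{"name": "list_tables", "arguments": {}}]}
    return {"role": "assistant",
            "content": f"The database has these tables: {history[-1]['content']}"}


def fake_tool(name, arguments):
    return "users, orders"


answer = run_agentic_loop(
    [{"role": "user", "content": "Which tables exist?"}], fake_model, fake_tool
)
print(answer)
```

The client only ever sees `answer`; the tool request and tool result stay in the loop's private `history`.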

Configuration

Add mcp_servers to your config.yaml:

mcp_servers:
  - name: filesystem
    url: "http://localhost:3001/mcp"
    timeout_seconds: 10
    max_call_depth: 3

  - name: database
    url: "https://mcp-db.internal/mcp"
    headers:
      Authorization: "Bearer ${MCP_DB_TOKEN}"
    allowed_tools:
      - query_readonly
      - list_tables
    timeout_seconds: 15
    max_call_depth: 5

Configuration fields

| Field | Required | Default | Description |
|---|---|---|---|
| name | Yes | - | Unique name for this MCP server |
| url | Yes | - | HTTP endpoint of the MCP server (Streamable HTTP transport) |
| headers | No | {} | HTTP headers to include (supports ${ENV_VAR} interpolation) |
| allowed_tools | No | all tools | If set, only these tool names are injected and callable |
| timeout_seconds | No | 10 | Per-tool-call HTTP timeout, in seconds |
| max_call_depth | No | 3 | Maximum number of tool call rounds per request |
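To make the required fields and defaults concrete, here is a small sketch of how an already-parsed `mcp_servers` list could be validated. The `MCPServerConfig` class and `parse_servers` function are hypothetical names for illustration; only the field names and default values are taken from the table above, and the duplicate-name check is an assumption based on the "unique name" requirement.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MCPServerConfig:
    name: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)
    allowed_tools: Optional[List[str]] = None  # None means "all tools"
    timeout_seconds: int = 10
    max_call_depth: int = 3


def parse_servers(raw: List[dict]) -> List[MCPServerConfig]:
    """Apply defaults and reject entries that lack name/url or reuse a name."""
    servers, seen = [], set()
    for entry in raw:
        missing = [key for key in ("name", "url") if not entry.get(key)]
        if missing:
            raise ValueError(f"mcp_servers entry is missing required field(s): {missing}")
        if entry["name"] in seen:
            raise ValueError(f"duplicate MCP server name: {entry['name']}")
        seen.add(entry["name"])
        servers.append(MCPServerConfig(**entry))
    return servers


servers = parse_servers([
    {"name": "filesystem", "url": "http://localhost:3001/mcp"},
    {"name": "database", "url": "https://mcp-db.internal/mcp",
     "allowed_tools": ["query_readonly", "list_tables"],
     "timeout_seconds": 15, "max_call_depth": 5},
])
for server in servers:
    print(server.name, server.timeout_seconds, server.max_call_depth, server.allowed_tools)
```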

Startup behaviour

On gateway.New(), MCP connections are initialised in a background goroutine with a 60-second timeout. The gateway is ready to serve requests immediately; MCP tool injection begins once the background init completes. You can call gateway.MCPInitDone() to get a channel that closes when initialisation is finished.
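The startup pattern above (serve immediately, load tools in the background, expose a "done" signal) can be illustrated with a toy analogue. The sketch below is written in Python for consistency with the client examples on this page and does not use the gateway's Go API: `ToyGateway`, `mcp_init_done`, and `tools_for_request` are hypothetical names, a `threading.Event` stands in for the channel returned by `gateway.MCPInitDone()`, and the 60-second init timeout is left out for brevity.

```python
import threading


class ToyGateway:
    """Hypothetical stand-in: usable at once, MCP tools appear when init finishes."""

    def __init__(self, discover_tools):
        self._tools = []
        self._init_done = threading.Event()
        worker = threading.Thread(
            target=self._init_mcp, args=(discover_tools,), daemon=True
        )
        worker.start()

    def _init_mcp(self, discover_tools):
        try:
            self._tools = discover_tools()
        finally:
            # Signal completion even if discovery failed, so waiters never hang.
            self._init_done.set()

    def mcp_init_done(self):
        return self._init_done

    def tools_for_request(self):
        # Requests that arrive before init completes are served without MCP tools.
        return list(self._tools) if self._init_done.is_set() else []


release = threading.Event()


def discover():
    release.wait(timeout=5)  # simulates a slow MCP handshake
    return ["read_file", "list_directory"]


gw = ToyGateway(discover)
early_tools = gw.tools_for_request()   # init is still blocked, so no tools yet
release.set()
finished = gw.mcp_init_done().wait(timeout=5)
late_tools = gw.tools_for_request()
print(early_tools, finished, late_tools)
```

Callers that need tools on the very first request would wait on the done signal before sending traffic; callers that can tolerate a tool-less answer do not need to wait.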

Authentication

MCP servers that require auth can receive credentials via the headers field. Environment variable interpolation (${VAR}) is supported so secrets are never hardcoded in config files:

mcp_servers:
  - name: secure-tools
    url: "https://tools.internal/mcp"
    headers:
      Authorization: "Bearer ${MCP_TOOLS_TOKEN}"
      X-Tenant-ID: "acme-corp"
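As an illustration of what `${VAR}` interpolation amounts to, the sketch below expands placeholders in header values from an environment mapping. The `interpolate` function is hypothetical, and raising an error for an unset variable is an assumption; this page does not say how the gateway treats missing variables.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value, env=None):
    """Replace each ${NAME} in value with env[NAME]; fail if NAME is unset (assumption)."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, value)


# Example environment; in practice the values would come from os.environ.
example_env = {"MCP_TOOLS_TOKEN": "s3cret"}
headers = {
    "Authorization": "Bearer ${MCP_TOOLS_TOKEN}",
    "X-Tenant-ID": "acme-corp",
}
resolved = {key: interpolate(value, example_env) for key, value in headers.items()}
print(resolved)
```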

Tool access control

Use allowed_tools to restrict which tools from an MCP server are exposed to the model. This is useful for read-only server access or capability scoping:

mcp_servers:
  - name: database
    url: "https://mcp-db.internal/mcp"
    allowed_tools:
      - query_readonly
      - list_tables
    # write/delete tools from this server are NOT injected
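The effect of `allowed_tools` can be pictured as a filter applied both when tools are injected and when a call is dispatched. The following is a sketch under assumptions: `filter_tools` and `is_callable` are hypothetical helpers, and `execute_write` and `drop_table` are made-up tool names used only to show what gets excluded.

```python
def filter_tools(discovered, allowed_tools=None):
    """Return the tools to inject; None means no restriction (all tools)."""
    if allowed_tools is None:
        return list(discovered)
    allowed = set(allowed_tools)
    return [tool for tool in discovered if tool["name"] in allowed]


def is_callable(tool_name, allowed_tools=None):
    """Re-check at call time so a tool that was never injected cannot be invoked."""
    return allowed_tools is None or tool_name in set(allowed_tools)


# Hypothetical discovery result from the database server.
discovered = [
    {"name": "query_readonly"},
    {"name": "list_tables"},
    {"name": "execute_write"},
    {"name": "drop_table"},
]
allowed = ["query_readonly", "list_tables"]

injected = [tool["name"] for tool in filter_tools(discovered, allowed)]
print(injected)
print(is_callable("drop_table", allowed))
```

Checking at call time as well as at injection time matters because a model can emit a tool name it was never offered.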

Prompt injection risk

Security note

Allowing the model to execute arbitrary tool calls introduces risk. Always:

  • Use allowed_tools to whitelist only the tools the model needs
  • Prefer read-only tools where possible
  • Set max_call_depth conservatively (3 to 5 is usually sufficient)
  • Validate and sanitise all data before it reaches write-capable tools
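As one example of the last point, a write-capable filesystem tool could refuse any path argument that escapes its workspace. This is a sketch of the general idea only, not something the gateway is documented to do; `resolve_inside` and the `/tmp/workspace` root are hypothetical.

```python
from pathlib import Path


def resolve_inside(root, candidate):
    """Resolve candidate relative to root and reject anything outside root."""
    root_path = Path(root).resolve()
    target = (root_path / candidate).resolve()
    if target != root_path and root_path not in target.parents:
        raise ValueError(f"path escapes workspace: {candidate}")
    return target


safe_path = resolve_inside("/tmp/workspace", "notes/todo.txt")
print(safe_path)

for suspicious in ("../secrets.txt", "/etc/passwd"):
    try:
        resolve_inside("/tmp/workspace", suspicious)
    except ValueError as error:
        print("rejected:", error)
```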

Compatible MCP servers

The gateway supports MCP servers that implement the 2025-11-25 Streamable HTTP transport. Popular compatible servers include:

Testing the connection

After starting the gateway with mcp_servers configured, verify tools are loaded:

# The gateway logs MCP tool discovery on startup
docker logs ferrogw 2>&1 | grep mcp

# Or check the full tool list via the admin API
curl -H "Authorization: Bearer $ADMIN_API_KEY" \
  http://localhost:8080/admin/mcp/tools

Streaming requests

Since v1.0.0-rc.1, clients may send stream: true when MCP servers are configured. The gateway transparently redirects streaming requests through the full agentic loop: all tool calls are resolved inside the gateway, and the final text answer is returned as a single-chunk stream response. Clients receive correct SSE output without needing to handle intermediate tool-call messages.

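# Assumption: the OpenAI Python SDK pointed at the gateway; base_url and api_key are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="YOUR_GATEWAY_KEY")

# stream: true works: the final answer is delivered via SSE after the tool loop completes
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    stream=True,
    messages=[{"role": "user", "content": "List the failing tests in gateway_test.go"}],
)
for chunk in response:
    # With MCP enabled the whole answer arrives in one chunk; the loop still handles it.
    print(chunk.choices[0].delta.content or "", end="")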
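To show what a "single-chunk stream response" could look like on the wire, the sketch below wraps a finished answer in one SSE event followed by a terminator. The exact payload the gateway emits is not documented on this page; the `chat.completion.chunk` layout and the `[DONE]` sentinel follow the OpenAI-style streaming convention as an assumption, and `single_chunk_sse` and the `chatcmpl-example` id are hypothetical.

```python
import json


def single_chunk_sse(answer, model, completion_id="chatcmpl-example"):
    """Serialise a complete answer as one SSE data event plus a [DONE] terminator."""
    chunk = {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{
            "index": 0,
            "delta": {"role": "assistant", "content": answer},
            "finish_reason": "stop",
        }],
    }
    return f"data: {json.dumps(chunk)}\n\ndata: [DONE]\n\n"


body = single_chunk_sse("Two tests are failing.", model="claude-3-5-sonnet-20241022")

# A client-side view: split the body into events and read the only content chunk.
events = [event[len("data: "):] for event in body.split("\n\n") if event]
first = json.loads(events[0])
print(first["choices"][0]["delta"]["content"])
print(events[1])
```

Because the answer arrives in one event, streaming here preserves client compatibility rather than reducing time to first token.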

Example: filesystem tools

Start a local filesystem MCP server and connect it to the gateway:

# 1. Start the MCP filesystem server
npx @modelcontextprotocol/server-filesystem /path/to/workspace

# 2. config.yaml
mcp_servers:
  - name: filesystem
    url: "http://localhost:3000/mcp"
    allowed_tools: [read_file, list_directory, search_files]
    max_call_depth: 4

Then ask the model a question that requires reading files:

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{
        "role": "user",
        "content": "What tests are failing in the src/gateway_test.go file?"
    }],
)
# The gateway reads the file via MCP, sends content to Claude, returns the answer
print(response.choices[0].message.content)