AI Gateway
OpenAI-compatible AI gateway built into Temps. Use any OpenAI SDK to call OpenAI, Anthropic, Gemini, xAI, Mistral, and DeepSeek — just change the base URL.
Overview
The AI Gateway provides a unified, OpenAI-compatible API for all major LLM providers. Your existing OpenAI SDK code works unchanged — just point it at your Temps instance.
What's Included
- OpenAI API wire-compatible endpoints
- Multi-provider routing (OpenAI, Anthropic, Gemini, xAI, Mistral, DeepSeek)
- Tool calling / function calling across all providers
- Vision / image attachments (base64 and URL)
- SSE streaming with consistent format
- Centralized API key management
- Per-user usage tracking and analytics
- JSON mode / structured output
Why It Matters
- Zero code changes — works with any OpenAI SDK
- Centralized key management (devs never see API keys)
- Switch providers without changing client code
- Built-in usage analytics and cost tracking
- Self-hosted — your data stays on your infrastructure
- No per-request fees beyond provider costs
- Transparent translation for Anthropic and Gemini
Key Features
- **Drop-In Replacement**: 100% OpenAI SDK compatible. Works with Python, Node.js, Go, Rust, and any other OpenAI client library.
- **Transparent Translation**: Automatically translates requests and responses for the Anthropic Messages API and the Gemini generateContent API.
- **Vision Support**: Send images as base64 data URIs or HTTP URLs; they are translated to each provider's native format automatically.
- **Tool Calling**: OpenAI-format tools work across all providers. Function definitions, `tool_choice`, and tool results are translated transparently.
- **Streaming**: Server-Sent Events (SSE) streaming with a consistent `data: {...}\n\n` format across all providers.
- **Usage Analytics**: Track token usage, costs, and request counts per user, model, and provider.
Quick Start
Point any OpenAI SDK at your Temps instance. The only changes are `base_url` and `api_key`.
Setup
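A minimal setup is sketched below. The gateway path (`/ai/v1`) and the key values are placeholders, not confirmed values; use the base URL and Temps API key for your own instance:

```python
from openai import OpenAI

# Only base_url and api_key differ from stock OpenAI usage.
# The /ai/v1 path is an assumed mount point; adjust it to match
# the gateway path exposed by your Temps instance.
client = OpenAI(
    base_url="https://your-temps-instance.com/ai/v1",
    api_key="your-temps-api-key",  # a Temps key, not a provider key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```

The examples below reuse this `client`.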
Supported Providers
The gateway routes requests to the correct provider based on the model name:
| Model Prefix | Provider | Examples |
|---|---|---|
| `gpt-*`, `o1-*`, `o3-*` | OpenAI | `gpt-4o`, `gpt-4o-mini`, `o3-mini` |
| `claude-*` | Anthropic | `claude-sonnet-4-20250514`, `claude-haiku-4-5-20251001` |
| `gemini-*` | Google Gemini | `gemini-2.5-pro`, `gemini-2.5-flash` |
| `grok-*` | xAI | `grok-3`, `grok-3-mini` |
| `mistral-*`, `codestral-*` | Mistral | `mistral-large-latest`, `codestral-latest` |
| `deepseek-*` | DeepSeek | `deepseek-chat`, `deepseek-reasoner` |
All providers use the same OpenAI-compatible API format. The gateway handles protocol translation automatically for Anthropic and Gemini.
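Because routing keys off the model name alone, switching providers is a one-line change. A quick sketch, reusing the `client` from Setup (the model names are examples from the table above):

```python
# Identical request sent to three different providers;
# only the model string changes.
for model in ["gpt-4o", "claude-sonnet-4-20250514", "gemini-2.5-flash"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Summarize HTTP/2 in one sentence."}],
    )
    print(f"{model}: {response.choices[0].message.content}")
```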
Chat Completions
Standard chat completions work identically to the OpenAI API.
Multi-turn conversation
response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."},
        {"role": "assistant", "content": "Here's a fibonacci function:\n\n```python\ndef fib(n):\n    if n <= 1:\n        return n\n    return fib(n-1) + fib(n-2)\n```"},
        {"role": "user", "content": "Can you make it iterative for better performance?"},
    ],
    temperature=0.7,
    max_tokens=500,
)
Vision & Image Attachments
Send images to vision-capable models. The gateway translates image formats automatically for each provider.
Base64 Image
Send base64 image
import base64

# Read image and encode to base64
with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # or claude-sonnet-4-20250514, gemini-2.5-pro
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image? Describe it in detail."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{image_data}"
                    },
                },
            ],
        }
    ],
    max_tokens=1000,
)
print(response.choices[0].message.content)
Image from URL
Image URL
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what you see in this image."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/photo.jpg"
                    },
                },
            ],
        }
    ],
)
Multiple Images
Multiple images
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two screenshots. What changed?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{before_image}"},
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{after_image}"},
                },
            ],
        }
    ],
)
Supported image formats: PNG, JPEG, GIF, WebP. Base64 data URIs work with all providers. HTTP URLs work with OpenAI and Gemini; Anthropic requires base64.
Tool Calling (Function Calling)
Define tools using the OpenAI format. The gateway translates them to each provider's native format.
Tool calling
import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. 'London'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

# Step 1: Send message with tools
response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",  # works with any provider
    messages=[
        {"role": "user", "content": "What's the weather like in London?"}
    ],
    tools=tools,
)

message = response.choices[0].message

# Step 2: Check if the model wants to call a tool
if message.tool_calls:
    tool_call = message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)
    print(f"Model wants to call: {tool_call.function.name}({args})")

    # Step 3: Execute the tool and send the result back
    weather_result = get_weather(args["location"])  # your function

    follow_up = client.chat.completions.create(
        model="claude-sonnet-4-20250514",
        messages=[
            {"role": "user", "content": "What's the weather like in London?"},
            message,  # assistant message with tool_calls
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(weather_result),
            },
        ],
        tools=tools,
    )
    print(follow_up.choices[0].message.content)
Tool Choice
Control whether the model should call tools:
Tool choice options
# Let the model decide (default)
tool_choice="auto"
# Force the model to call a tool
tool_choice="required"
# Force a specific tool
tool_choice={"type": "function", "function": {"name": "get_weather"}}
# Don't use tools (even if provided)
tool_choice="none"
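For example, forcing a call to the `get_weather` tool defined earlier (a brief sketch reusing that `tools` list):

```python
# The model must emit a get_weather tool call, regardless of the prompt.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "How's the weather in Paris?"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
print(response.choices[0].message.tool_calls[0].function.arguments)
```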
Streaming
SSE streaming works identically to the OpenAI API. All providers produce the same `data: {...}\n\n` format.
Streaming
stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Write a haiku about deployment automation."}
    ],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()  # newline at end
Embeddings
Generate embeddings using OpenAI-compatible models.
Embeddings
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog",
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
Embeddings are currently supported for OpenAI models only. Anthropic and Gemini embedding support is planned.
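As a usage sketch, gateway embeddings behave like any other OpenAI embeddings and can be compared with cosine similarity. The helper below is illustrative, not part of Temps:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

docs = ["Deploy with zero downtime", "The cat sat on the mat"]
vectors = [
    client.embeddings.create(model="text-embedding-3-small", input=d).data[0].embedding
    for d in docs
]
query = client.embeddings.create(
    model="text-embedding-3-small",
    input="blue-green deployments",
).data[0].embedding

for doc, vec in zip(docs, vectors):
    print(f"{cosine_similarity(query, vec):.3f}  {doc}")
```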
Provider Key Management
Manage provider API keys through the Temps admin panel or API. Keys are encrypted at rest using AES-256-GCM.
Add a provider key via API
# Add an OpenAI key
curl -X POST https://your-temps-instance.com/ai/providers/keys \
  -H "Authorization: Bearer your-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "openai",
    "api_key": "sk-...",
    "label": "Production OpenAI Key"
  }'

# Add an Anthropic key
curl -X POST https://your-temps-instance.com/ai/providers/keys \
  -H "Authorization: Bearer your-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "anthropic",
    "api_key": "sk-ant-...",
    "label": "Production Anthropic Key"
  }'
Developers only need a Temps API key. Provider API keys are managed centrally by administrators and never exposed to end users.
Usage & Analytics
Track token usage and costs per user, model, and provider through the Temps dashboard or API.
Query usage
# Get usage summary for a project
curl "https://your-temps-instance.com/ai/usage/summary?period=7d" \
  -H "Authorization: Bearer your-admin-api-key"
Response:
{
  "total_requests": 1523,
  "total_tokens": 2847392,
  "prompt_tokens": 1923847,
  "completion_tokens": 923545,
  "by_model": [
    {"model": "gpt-4o", "requests": 842, "total_tokens": 1523000},
    {"model": "claude-sonnet-4-20250514", "requests": 481, "total_tokens": 924392},
    {"model": "gemini-2.5-pro", "requests": 200, "total_tokens": 400000}
  ]
}
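The same summary can be fetched programmatically. A sketch using the `requests` library against the endpoint documented above (the URL and key are placeholders):

```python
import requests

resp = requests.get(
    "https://your-temps-instance.com/ai/usage/summary",
    params={"period": "7d"},
    headers={"Authorization": "Bearer your-admin-api-key"},
    timeout=30,
)
resp.raise_for_status()
summary = resp.json()

print(f"Requests: {summary['total_requests']}, tokens: {summary['total_tokens']}")
for row in summary["by_model"]:
    print(f"  {row['model']}: {row['requests']} requests, {row['total_tokens']} tokens")
```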