AI Gateway

OpenAI-compatible AI gateway built into Temps. Use any OpenAI SDK to call OpenAI, Anthropic, Gemini, xAI, Mistral, and DeepSeek — just change the base URL.


Overview

The AI Gateway provides a unified, OpenAI-compatible API for all major LLM providers. Your existing OpenAI SDK code works unchanged — just point it at your Temps instance.

What's Included

  • OpenAI API wire-compatible endpoints
  • Multi-provider routing (OpenAI, Anthropic, Gemini, xAI, Mistral, DeepSeek)
  • Tool calling / function calling across all providers
  • Vision / image attachments (base64 and URL)
  • SSE streaming with consistent format
  • Centralized API key management
  • Per-user usage tracking and analytics
  • JSON mode / structured output

Why It Matters

  • Zero code changes — works with any OpenAI SDK
  • Centralized key management (devs never see API keys)
  • Switch providers without changing client code
  • Built-in usage analytics and cost tracking
  • Self-hosted — your data stays on your infrastructure
  • No per-request fees beyond provider costs
  • Transparent translation for Anthropic and Gemini

Key Features

  • Drop-In Replacement: 100% OpenAI SDK compatible. Works with Python, Node.js, Go, Rust, and any other OpenAI client library.

  • Transparent Translation: Automatically translates requests and responses for the Anthropic Messages API and the Gemini generateContent API.

  • Vision Support: Send images as base64 data URIs or HTTP URLs. Translated to each provider's native format automatically.

  • Tool Calling: OpenAI-format tools work across all providers. Function definitions, tool_choice, and tool results are translated transparently.

  • Streaming: Server-Sent Events (SSE) streaming with a consistent data: {...}\n\n format across all providers.

  • Usage Analytics: Track token usage, costs, and request counts per user, model, and provider.


Quick Start

Point any OpenAI SDK at your Temps instance. The only changes are base_url and api_key.

Setup

from openai import OpenAI

client = OpenAI(
    base_url="https://your-temps-instance.com/ai/v1",
    api_key="your-temps-api-key",
)

response = client.chat.completions.create(
    model="gpt-4o",  # or "claude-sonnet-4-20250514", "gemini-2.5-pro", etc.
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(response.choices[0].message.content)

Supported Providers

The gateway routes requests to the correct provider based on the model name:

  Model Prefix             Provider        Examples
  gpt-*, o1-*, o3-*        OpenAI          gpt-4o, gpt-4o-mini, o3-mini
  claude-*                 Anthropic       claude-sonnet-4-20250514, claude-haiku-4-5-20251001
  gemini-*                 Google Gemini   gemini-2.5-pro, gemini-2.5-flash
  grok-*                   xAI             grok-3, grok-3-mini
  mistral-*, codestral-*   Mistral         mistral-large-latest, codestral-latest
  deepseek-*               DeepSeek        deepseek-chat, deepseek-reasoner
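The routing rule above is a plain prefix match on the model name. A small illustrative sketch (the provider identifiers here are just labels for illustration; the actual routing is internal to Temps):

```python
# Mirror of the prefix table above. Illustrative only — the real
# routing logic lives inside the Temps gateway.
ROUTES = [
    (("gpt-", "o1-", "o3-"), "openai"),
    (("claude-",), "anthropic"),
    (("gemini-",), "gemini"),
    (("grok-",), "xai"),
    (("mistral-", "codestral-"), "mistral"),
    (("deepseek-",), "deepseek"),
]

def route(model: str) -> str:
    """Return the provider label for a model, per the prefix table."""
    for prefixes, provider in ROUTES:
        if model.startswith(prefixes):
            return provider
    raise ValueError(f"no provider for model {model!r}")

print(route("gpt-4o"))                    # openai
print(route("claude-sonnet-4-20250514"))  # anthropic
```

Because routing keys off the model name alone, switching providers really is a one-string change in your client code.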

Chat Completions

Standard chat completions work identically to the OpenAI API.

Multi-turn conversation

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."},
        {"role": "assistant", "content": "Here's a fibonacci function:\n\n```python\ndef fib(n):\n    if n <= 1:\n        return n\n    return fib(n-1) + fib(n-2)\n```"},
        {"role": "user", "content": "Can you make it iterative for better performance?"},
    ],
    temperature=0.7,
    max_tokens=500,
)

Vision & Image Attachments

Send images to vision-capable models. The gateway translates image formats automatically for each provider.

Base64 Image

Send base64 image

import base64

# Read image and encode to base64
with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # or claude-sonnet-4-20250514, gemini-2.5-pro
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image? Describe it in detail."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{image_data}"
                    },
                },
            ],
        }
    ],
    max_tokens=1000,
)

print(response.choices[0].message.content)
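Behind the scenes, the gateway converts this OpenAI-style image_url part into each provider's native shape. As a rough sketch of the Anthropic direction — the Messages API expects an image block with a media_type and raw base64 data — here is a simplified, hypothetical converter (the real translation is internal to Temps and also handles HTTP URLs):

```python
def openai_image_to_anthropic(part: dict) -> dict:
    """Translate an OpenAI `image_url` content part into an Anthropic
    Messages API image block. This sketch handles base64 data URIs only."""
    url = part["image_url"]["url"]
    if not url.startswith("data:"):
        raise ValueError("only data: URIs handled in this sketch")
    # "data:image/png;base64,<payload>" -> media type + raw base64 payload
    header, data = url.split(",", 1)
    media_type = header[len("data:"):].split(";", 1)[0]
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }

part = {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}}
print(openai_image_to_anthropic(part)["source"]["media_type"])  # image/png
```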

Image from URL

Image URL

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what you see in this image."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/photo.jpg"
                    },
                },
            ],
        }
    ],
)

Multiple Images

Multiple images

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two screenshots. What changed?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{before_image}"},
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{after_image}"},
                },
            ],
        }
    ],
)

Tool Calling (Function Calling)

Define tools using the OpenAI format. The gateway translates them to each provider's native format.

Tool calling

import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. 'London'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

# Step 1: Send message with tools
response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",  # works with any provider
    messages=[
        {"role": "user", "content": "What's the weather like in London?"}
    ],
    tools=tools,
)

message = response.choices[0].message

# Step 2: Check if the model wants to call a tool
if message.tool_calls:
    tool_call = message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)
    print(f"Model wants to call: {tool_call.function.name}({args})")

    # Step 3: Execute the tool and send the result back
    weather_result = get_weather(args["location"])  # your function

    follow_up = client.chat.completions.create(
        model="claude-sonnet-4-20250514",
        messages=[
            {"role": "user", "content": "What's the weather like in London?"},
            message,  # assistant message with tool_calls
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(weather_result),
            },
        ],
        tools=tools,
    )

    print(follow_up.choices[0].message.content)
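The gateway handles the cross-provider translation of these tool definitions for you. As a rough sketch of what that involves, Anthropic's Messages API expects tools as flat name/description/input_schema objects rather than OpenAI's nested function wrapper — a simplified, hypothetical converter:

```python
# Illustrative only: the Temps gateway performs this translation internally.
openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}

def openai_tool_to_anthropic(tool: dict) -> dict:
    """Flatten an OpenAI-format tool: the nested `function` wrapper becomes
    top-level fields, and `parameters` is renamed to `input_schema`."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],
    }

print(openai_tool_to_anthropic(openai_tool)["name"])  # get_weather
```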

Tool Choice

Control whether the model should call tools:

Tool choice options

# Pass tool_choice as a parameter to client.chat.completions.create(...):

# Let the model decide (default)
tool_choice="auto"

# Force the model to call a tool
tool_choice="required"

# Force a specific tool
tool_choice={"type": "function", "function": {"name": "get_weather"}}

# Don't use tools (even if provided)
tool_choice="none"
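These OpenAI values have provider-side equivalents that the gateway maps for you; for example, Anthropic's Messages API uses "any" where OpenAI uses "required", and a {"type": "tool", "name": ...} object for a specific tool. A simplified sketch of that mapping (the treatment of "none" here — dropping tools entirely — is an assumption, not the gateway's documented behavior):

```python
def openai_tool_choice_to_anthropic(choice):
    """Map OpenAI tool_choice values to Anthropic's tool_choice shape.
    Illustrative sketch; "none" is modeled as omitting tools (returns None)."""
    if choice == "auto":
        return {"type": "auto"}
    if choice == "required":
        return {"type": "any"}  # Anthropic's "must call some tool"
    if choice == "none":
        return None             # assumption: translate by omitting tools
    if isinstance(choice, dict):
        return {"type": "tool", "name": choice["function"]["name"]}
    raise ValueError(f"unsupported tool_choice: {choice!r}")

print(openai_tool_choice_to_anthropic("required"))  # {'type': 'any'}
```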

Streaming

SSE streaming works identically to the OpenAI API. All providers produce the same data: {...}\n\n format.

Streaming

stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[
        {"role": "user", "content": "Write a haiku about deployment automation."}
    ],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

print()  # newline at end
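If you consume the stream without an SDK, each event is a data: {...} line followed by a blank line, and — assuming the OpenAI-style terminator — the stream ends with data: [DONE]. A minimal parser sketch:

```python
import json

def parse_sse(raw: str):
    """Yield the JSON payload of each `data: {...}` event, stopping at the
    `[DONE]` sentinel. Assumes the whole stream is in memory; real clients
    parse incrementally as bytes arrive."""
    for line in raw.split("\n"):
        line = line.strip()
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)

raw = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n\n'
    "data: [DONE]\n\n"
)
text = "".join(c["choices"][0]["delta"]["content"] for c in parse_sse(raw))
print(text)  # Hello
```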

Embeddings

Generate embeddings using OpenAI-compatible models.

Embeddings

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog",
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
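A common next step is comparing two embeddings by cosine similarity — a sketch in plain Python with toy vectors standing in for real embeddings (in practice you would use numpy or a vector database):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors: 1.0 means same
    direction, 0.0 means orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```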

Provider Key Management

Manage provider API keys through the Temps admin panel or API. Keys are encrypted at rest using AES-256-GCM.

Add a provider key via API

# Add an OpenAI key
curl -X POST https://your-temps-instance.com/ai/providers/keys \
  -H "Authorization: Bearer your-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "openai",
    "api_key": "sk-...",
    "label": "Production OpenAI Key"
  }'

# Add an Anthropic key
curl -X POST https://your-temps-instance.com/ai/providers/keys \
  -H "Authorization: Bearer your-admin-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "anthropic",
    "api_key": "sk-ant-...",
    "label": "Production Anthropic Key"
  }'

Usage & Analytics

Track token usage and costs per user, model, and provider through the Temps dashboard or API.

Query usage

# Get usage summary for a project
curl https://your-temps-instance.com/ai/usage/summary?period=7d \
  -H "Authorization: Bearer your-admin-api-key"

Response:

{
  "total_requests": 1523,
  "total_tokens": 2847392,
  "prompt_tokens": 1923847,
  "completion_tokens": 923545,
  "by_model": [
    {"model": "gpt-4o", "requests": 842, "total_tokens": 1523000},
    {"model": "claude-sonnet-4-20250514", "requests": 481, "total_tokens": 924392},
    {"model": "gemini-2.5-pro", "requests": 200, "total_tokens": 400000}
  ]
}
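The per-model breakdown sums to the top-level totals, so per-model shares are easy to compute client-side. A sketch over the sample response above:

```python
# Sample summary copied from the response above.
summary = {
    "total_requests": 1523,
    "total_tokens": 2847392,
    "by_model": [
        {"model": "gpt-4o", "requests": 842, "total_tokens": 1523000},
        {"model": "claude-sonnet-4-20250514", "requests": 481, "total_tokens": 924392},
        {"model": "gemini-2.5-pro", "requests": 200, "total_tokens": 400000},
    ],
}

def token_share(summary: dict) -> dict:
    """Fraction of total tokens consumed by each model."""
    total = summary["total_tokens"]
    return {m["model"]: m["total_tokens"] / total for m in summary["by_model"]}

shares = token_share(summary)
print(f"{shares['gpt-4o']:.1%}")  # 53.5%
```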
