Multi-Node Deployment
By default Temps runs everything on a single server. Multi-node mode lets you add worker nodes — additional servers that receive container deployments from the control plane — so you can distribute workloads across multiple machines.
Overview
Multi-node is useful when you:
- Outgrow a single server — you need more CPU, memory, or disk than one machine provides
- Want geographic distribution — deploy containers closer to your users by placing workers in different regions
- Need workload isolation — run production and staging on separate physical machines
When no worker nodes are registered, Temps behaves exactly as it does in single-server mode: all containers run locally on the control plane server.
Architecture
Control plane (your Temps server)
- Runs the API, dashboard, proxy, and database
- Accepts temps join registrations from workers
- Schedules container deployments across nodes
- Routes traffic to containers via private addresses
Worker nodes
- Run the temps agent HTTP server on port 3100
- Accept container lifecycle commands from the control plane
- Send heartbeats every 30 seconds with capacity metrics
- Have Docker installed for running containers
Traffic from the internet still enters through the control plane's reverse proxy (Pingora). The proxy routes requests to the correct container — whether it runs locally or on a remote worker — using the node's private address.
Prerequisites
On each worker node:
- Docker installed and running
- The temps binary available (same version as the control plane)
- Network connectivity to the control plane (see direct vs WireGuard modes below)
- Port 3100 accessible from the control plane (or a WireGuard tunnel)
On the control plane:
- Temps server running with temps serve
- The /api/internal/nodes/register endpoint reachable from workers
- A join token generated from Settings > Worker Nodes (required for secure node registration)
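A quick preflight check on a worker can catch a missing prerequisite before you run temps join. The sketch below is not part of Temps: missing_tools is a hypothetical helper that reports which of the named commands cannot be found on PATH.

```sh
# Hypothetical helper (not part of Temps): print each command name
# from the argument list that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example, run on a worker. Empty output means nothing is missing.
#   missing_tools docker temps
```

A Docker binary on PATH does not prove the daemon is running. Running docker info on the worker is a simple follow-up check, since it exits non-zero when it cannot reach the daemon.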
Adding a worker node (direct mode)
Add a worker node (direct mode)
1. In the Temps dashboard, go to Settings > Worker Nodes and click Generate Join Token, then copy the token (it is shown once; the control plane only stores a SHA-256 hash).
Checkpoint: Save the token somewhere secure before leaving the page; you cannot retrieve it again afterward.
2. On the worker machine, install Temps: curl -fsSL https://temps.sh/install.sh | bash
3. Join the cluster with the control plane URL and token, passing --private-address <worker-ip> (e.g. 10.0.0.5) to enable direct mode so traffic routes straight to the worker's private address.
4. Optionally add --name worker-eu-1, --labels region=eu-west,gpu=true, and --agent-address 0.0.0.0:3100 to label and configure the node (the default listen address is 127.0.0.1:3100; pass 0.0.0.0:3100 to listen on all interfaces).
5. After joining, the agent config is saved to ~/.temps/agent.json. Start the agent with temps agent.
Checkpoint: Open Settings > Worker Nodes and confirm the new node appears with status active and a recent heartbeat (within the last 90 seconds).
Direct mode is the simplest option when nodes can reach each other over a private network (e.g., same VPC, same data center, or a VPN).
1. Generate a join token:
Go to Settings > Worker Nodes in the Temps dashboard and click Generate Join Token. The token is shown once — copy and save it securely. The control plane only stores a SHA-256 hash.
2. Install Temps on the worker machine:
curl -fsSL https://temps.sh/install.sh | bash
3. Join the cluster:
temps join <control-plane-url> <join-token> --private-address <worker-ip>
- control-plane-url (string): The URL of your Temps control plane, e.g. https://temps.example.com.
- join-token (string): The join token generated in step 1. Required when the control plane has a token configured. The token is hashed (SHA-256) before storage; the plaintext is never persisted on the control plane.
- --private-address (string): The IP address the control plane should use to reach this worker's containers. Typically a private/internal IP like 10.0.0.5 or 192.168.1.50.
The --private-address flag enables direct mode — no WireGuard tunnel is created. The control plane routes traffic directly to the worker's private address.
Optional flags:
temps join <url> <token> --private-address 10.0.0.5 \
--name worker-eu-1 \
--labels region=eu-west,gpu=true \
--agent-address 0.0.0.0:3100
- --name: A friendly name for this node (defaults to the hostname)
- --labels: Key-value pairs for scheduling hints (e.g., region=us-east,tier=production)
- --agent-address: The address the agent server listens on (default: 127.0.0.1:3100)
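Because --labels takes a comma-separated list of key=value pairs, it can help to sanity-check the string before passing it to temps join. The helper below is a hypothetical sketch, not Temps's own parser; the rule that every entry must have a non-empty key is an assumption.

```sh
# Hypothetical helper: split a --labels value such as
# "region=eu-west,gpu=true" into one "key value" pair per line.
# Returns non-zero if any entry is not of the form key=value.
split_labels() {
  labels=$1
  old_ifs=$IFS
  IFS=,
  set -f                 # disable globbing while splitting on commas
  set -- $labels
  set +f
  IFS=$old_ifs
  for pair in "$@"; do
    case $pair in
      ?*=*) printf '%s %s\n' "${pair%%=*}" "${pair#*=}" ;;
      *) return 1 ;;
    esac
  done
}

split_labels 'region=eu-west,gpu=true'
```

The call above prints one line per label: region eu-west, then gpu true.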
After joining, the node is registered and the agent config is saved to ~/.temps/agent.json. Start the agent separately:
temps agent
The agent reads its config from ~/.temps/agent.json and begins sending heartbeats to the control plane every 30 seconds.
Adding a worker node (WireGuard mode)
Add a worker node (WireGuard mode)
1. Generate a join token in Settings > Worker Nodes and copy it.
2. On the worker, run temps join <control-plane-url> <join-token> without --private-address. This generates a WireGuard keypair, contacts the relay at api.temps.sh for key exchange, brings up the wg0 interface on a 10.100.0.x address, and registers using that address. WireGuard runs in-process (boringtun), so no wireguard-tools package is needed, only root or CAP_NET_ADMIN to create the wg0 interface.
3. Start the agent with temps agent. The config was saved to ~/.temps/agent.json during join.
Checkpoint: Confirm the node shows status active in Settings > Worker Nodes with its 10.100.0.x WireGuard address.
WireGuard mode creates an encrypted tunnel between the worker and the control plane. Use this when nodes are on different networks without direct connectivity.
temps join <control-plane-url> <join-token>
Without --private-address, the join command:
- Generates a WireGuard keypair
- Contacts the relay at api.temps.sh for key exchange
- Sets up a WireGuard tunnel (wg0 interface) with a 10.100.0.x address
- Registers with the control plane using the WireGuard address
- Saves the agent config to ~/.temps/agent.json
Then start the agent:
temps agent
WireGuard runs in-process via an embedded userspace implementation (boringtun) — no wireguard-tools package, wg, or wg-quick binaries are required on either machine. The only requirement is permission to create the wg0 interface: run temps join as root or grant the binary CAP_NET_ADMIN.
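On Linux, one common way to grant a binary CAP_NET_ADMIN without running it as root is setcap from the libcap tools. The sketch below only prints the command so you can review it first. The helper name net_admin_cmd is hypothetical, and whether file capabilities suit your environment is an assumption to verify.

```sh
# Print (do not run) the setcap command that would grant CAP_NET_ADMIN
# to the given binary. Running the printed command requires root and
# the libcap tools to be installed.
net_admin_cmd() {
  printf 'setcap cap_net_admin+ep %s\n' "$1"
}

# Example: show the command for the installed temps binary.
#   net_admin_cmd "$(command -v temps)"
```

File capabilities are attached to the file itself, so they are typically lost when the binary is replaced, for example after an upgrade, and need to be applied again.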
Running the agent
Run the worker agent
1. Make sure temps join has already run on this machine so ~/.temps/agent.json exists with the node ID, control plane URL, and bearer token.
2. Run temps agent. To override the saved config, set TEMPS_AGENT_ADDRESS/--listen-address (default 127.0.0.1:3100), TEMPS_AGENT_TOKEN/--token, or TEMPS_CONTROL_PLANE_URL/--control-plane-url.
3. The agent sends a heartbeat every 30 seconds; if heartbeats stop, the control plane marks the node offline after 90 seconds and excludes it from scheduling.
Checkpoint: In Settings > Worker Nodes confirm the node's last heartbeat updates and status stays active while the agent is running.
After temps join completes, run the agent server:
temps agent
The agent loads its configuration from ~/.temps/agent.json (saved by temps join). CLI flags and environment variables override the saved config:
| Variable / Flag | Default | Description |
|---|---|---|
| TEMPS_AGENT_ADDRESS / --listen-address | 127.0.0.1:3100 | Address the agent listens on |
| TEMPS_AGENT_TOKEN / --token | (none) | Bearer token for authenticating control plane requests |
| TEMPS_NODE_NAME / --node-name | (none) | Name for this node |
| TEMPS_CONTROL_PLANE_URL / --control-plane-url | (none) | URL of the control plane |
| TEMPS_NODE_ID / --node-id | (none) | Node ID assigned during registration |
If no ~/.temps/agent.json exists and required fields are missing, the agent exits with an error suggesting you run temps join first.
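If you start the agent from a script, a small guard can surface the missing-config case before the agent runs. The helper below is a hypothetical sketch that only checks whether the file exists. It is stricter than the agent itself, which can also start from flags and environment variables alone.

```sh
# Hypothetical guard: succeed only if the config written by temps join
# exists. Defaults to ~/.temps/agent.json; a different path can be
# passed as the first argument.
agent_config_present() {
  [ -f "${1:-$HOME/.temps/agent.json}" ]
}

# Example wrapper:
#   if agent_config_present; then
#     temps agent
#   else
#     echo "no agent config found; run temps join first" >&2
#   fi
```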
The agent sends a heartbeat to the control plane every 30 seconds. If heartbeats stop (e.g., the agent is stopped or the machine goes down), the control plane marks the node as offline after 90 seconds and excludes it from scheduling.
The agent exposes these endpoints (all authenticated with the bearer token):
| Endpoint | Method | Description |
|---|---|---|
| /agent/containers/deploy | POST | Deploy a container |
| /agent/containers/{id}/stop | POST | Stop a container |
| /agent/containers/{id} | DELETE | Remove a container |
| /agent/containers/{id}/logs | GET | Stream container logs |
| /agent/containers/{id}/info | GET | Get container info |
| /agent/images/{name}/exists | GET | Check if an image exists |
| /agent/health | GET | Health check with system metrics |
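To probe an agent by hand, for example while debugging connectivity from the control plane, a small curl wrapper is enough. This is a sketch under two assumptions: the token is sent as a standard Authorization: Bearer header, and it is available in TEMPS_AGENT_TOKEN. The helper name agent_get is hypothetical.

```sh
# Hypothetical helper: GET an agent endpoint with the bearer token.
# Assumes the token is sent as a standard "Authorization: Bearer" header.
agent_get() {
  base_url=$1   # agent address as registered, e.g. https://10.0.0.5:3100
  path=$2       # endpoint path, e.g. /agent/health
  curl -fsS -H "Authorization: Bearer ${TEMPS_AGENT_TOKEN}" "${base_url}${path}"
}

# Example:
#   TEMPS_AGENT_TOKEN=<token> agent_get https://10.0.0.5:3100 /agent/health
```

The -f flag makes curl exit non-zero on HTTP errors such as a rejected token, which keeps the helper usable in scripts.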
Verifying nodes
From the dashboard:
Go to Settings > Worker Nodes to see all registered nodes, their status, and last heartbeat time.
From the API:
List all nodes
curl https://your-temps-instance/api/internal/nodes
Response
{
  "nodes": [
    {
      "id": 1,
      "name": "worker-eu-1",
      "address": "https://10.0.0.5:3100",
      "private_address": "10.0.0.5",
      "role": "worker",
      "status": "active",
      "labels": { "region": "eu-west" },
      "last_heartbeat": "2026-03-04T12:00:00Z"
    }
  ],
  "total": 1
}
Get a specific node
curl https://your-temps-instance/api/internal/nodes/1
A node is considered active if it has sent a heartbeat within the last 90 seconds. Nodes that miss heartbeats are automatically marked offline and excluded from scheduling.
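The 90-second rule can be expressed as a simple comparison between the current time and the last heartbeat. The sketch below uses Unix timestamps in seconds rather than the ISO 8601 strings the API returns. Treating a gap of exactly 90 seconds as still active is an assumption, and node_status is a hypothetical name.

```sh
# Sketch of the 90-second rule using Unix timestamps (seconds).
# Treating exactly 90 seconds as still active is an assumption.
HEARTBEAT_TIMEOUT=90

node_status() {
  last_heartbeat=$1
  now=$2
  if [ $(( now - last_heartbeat )) -le "$HEARTBEAT_TIMEOUT" ]; then
    echo active
  else
    echo offline
  fi
}

node_status 1000 1030   # 30 seconds since the last heartbeat: prints "active"
node_status 1000 1091   # 91 seconds since the last heartbeat: prints "offline"
```

With a 30-second heartbeat interval, a node has to miss three consecutive heartbeats before it crosses the 90-second threshold.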
How scheduling works
When you deploy an application with multiple replicas, the node scheduler distributes containers across available nodes using round-robin scheduling:
- The scheduler queries all nodes with status = "active" and a heartbeat within the last 90 seconds
- If no active worker nodes exist, all replicas run locally on the control plane
- If active workers exist, replicas are distributed evenly across all nodes (including local) in round-robin order
- Each container records which node it was assigned to in the database
The deployment job creates a RemoteNodeDeployer for each remote assignment, which communicates with the worker's agent API to deploy, stop, and manage containers.
The scheduler currently uses simple round-robin. Label-based scheduling (e.g., "only deploy to nodes with gpu=true") is planned but not yet implemented — labels are stored but not used for filtering.
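The placement rule described above can be sketched in a few lines. This is an illustration of round-robin assignment, not the Temps scheduler: the name assign_replicas, the replica naming, and the choice to list the local node first are all assumptions.

```sh
# Sketch of round-robin placement. "local" stands for the control plane;
# listing it first is an assumption about ordering.
# Usage: assign_replicas <replica-count> [active-worker ...]
assign_replicas() {
  replicas=$1
  shift
  set -- local "$@"          # candidate nodes: local plus active workers
  i=0
  while [ "$i" -lt "$replicas" ]; do
    node=$1
    printf 'replica-%s %s\n' "$i" "$node"
    shift
    set -- "$@" "$node"      # rotate: move the node just used to the back
    i=$(( i + 1 ))
  done
}

assign_replicas 3 worker-eu-1
```

With one active worker and three replicas, the call above places replica-0 on local, replica-1 on worker-eu-1, and replica-2 on local again. With no workers, every replica lands on local, matching the single-server behavior.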
Current limitations
Multi-node is functional but still evolving. Known limitations:
- No label-based scheduling: labels are stored on nodes but the scheduler does not filter by them yet. All active nodes receive deployments equally.
- No remote image transfer: the worker node must be able to pull the Docker image independently. If you are using a local registry on the control plane, the worker needs network access to it.
- No log streaming from remote nodes: deployment logs from containers on worker nodes are not yet streamed back to the control plane dashboard in real time.
- No remote node removal from UI: nodes can be removed via the API (DELETE is not yet exposed) or by stopping the agent and waiting for the heartbeat to expire.
- Round-robin only: no resource-aware scheduling, no affinity/anti-affinity rules, no bin-packing.
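Because images are not transferred to workers, it is worth confirming on each worker that Docker can pull the image before you deploy. The check below is a hypothetical sketch, and the image name in the example is a placeholder.

```sh
# Hypothetical check, run on the worker: succeeds only if Docker on this
# machine can pull the given image.
can_pull() {
  docker pull "$1" >/dev/null 2>&1
}

# Example (placeholder image name):
#   can_pull registry.example.com/my-app:latest \
#     || echo "this worker cannot pull the image" >&2
```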