Resources & Scaling
Every container deployed by Temps runs with configurable CPU and memory limits. You control how much of your server's resources each application gets, and how many instances (replicas) run per environment.
Resource model
Each environment has a deployment configuration with four resource fields:
| Field | Unit | Default | Purpose |
|---|---|---|---|
| cpu_request | Millicores | Not set | Minimum CPU guaranteed |
| cpu_limit | Millicores | Not set | Maximum CPU allowed |
| memory_request | MB | Not set | Minimum memory guaranteed |
| memory_limit | MB | Not set | Maximum memory allowed |
| replicas | Count | 1 | Number of container instances |
| exposed_port | Port number | 3000 | The port your application listens on |
When resource limits are not set, Docker allocates resources dynamically — the container can use whatever is available on the server. Setting explicit limits prevents one application from starving others.
CPU
CPU is measured in millicores. 1000 millicores = 1 CPU core.
| Millicores | CPU cores | Typical use case |
|---|---|---|
| 100 | 0.1 | Static sites, minimal APIs |
| 250 | 0.25 | Low-traffic web applications |
| 500 | 0.5 | Standard web applications |
| 1000 | 1.0 | Medium traffic, server-side rendering |
| 2000 | 2.0 | High traffic, CPU-intensive processing |
Request vs Limit:
- cpu_request — The minimum CPU guaranteed to the container. Docker reserves this capacity.
- cpu_limit — The maximum CPU the container can use. If the container tries to exceed this, it is throttled (not killed).
When the server has spare capacity, a container can burst above its request up to its limit. When the server is under load, each container is guaranteed at least its requested amount.
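The guarantee-and-burst behavior can be sketched as a small function. This is an illustration of the semantics described above, not Temps code; the function name and the spare-capacity parameter are invented for the example.

```python
def usable_cpu(request_m: int, limit_m: int, spare_m: int) -> int:
    """CPU (in millicores) a container can use at a given moment.

    The container is always guaranteed its request. With spare capacity on
    the server it can burst above the request, but never past its limit --
    demand beyond the limit is throttled, not killed.
    """
    return min(limit_m, request_m + max(spare_m, 0))

# A 250m-request / 1000m-limit container bursts to its limit when the
# server has 2000m to spare:
assert usable_cpu(250, 1000, 2000) == 1000
# Under full load (no spare capacity) it still gets its request:
assert usable_cpu(250, 1000, 0) == 250
```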
Memory
Memory is measured in megabytes (MB).
| MB | Typical use case |
|---|---|
| 128 | Static file servers |
| 256 | Small APIs, microservices |
| 512 | Standard web applications |
| 1024 | Server-side rendering, large datasets |
| 2048+ | Memory-intensive applications |
Request vs Limit:
- memory_request — The minimum memory guaranteed.
- memory_limit — The maximum memory allowed. If the container exceeds this, Docker kills it (OOM kill) and Temps restarts it.
OOM kills are the most common cause of container restarts. If your application is being killed, check its memory usage in the monitoring dashboard and increase the memory limit. Memory leaks (usage growing continuously over time) should be fixed in application code.
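One way to tell a leak from a spiky workload is to look at the shape of memory usage over time: a leak grows continuously, while normal load rises and falls. A minimal heuristic sketch (the function name, sample format, and growth threshold are assumptions for illustration, not part of Temps):

```python
def looks_like_leak(samples_mb: list[float], min_growth_mb: float = 50.0) -> bool:
    """Flag memory usage that only ever grows across successive samples.

    Monotonic growth plus a meaningful total increase suggests a leak;
    usage that rises and falls is more likely ordinary load.
    """
    monotonic = all(b >= a for a, b in zip(samples_mb, samples_mb[1:]))
    return monotonic and samples_mb[-1] - samples_mb[0] >= min_growth_mb

# Steady climb across samples -- worth investigating in application code:
assert looks_like_leak([100, 160, 230, 310]) is True
# Spiky but recovering -- probably just traffic:
assert looks_like_leak([100, 400, 120, 390]) is False
```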
Replicas
Replicas are the number of container instances running for a given environment. The reverse proxy load-balances requests across all healthy replicas.
| Replicas | Use case |
|---|---|
| 1 | Development, staging, low-traffic production |
| 2-3 | Standard production with some redundancy |
| 5+ | High-traffic applications |
What replicas provide
- Higher throughput — More containers handle more concurrent requests
- Fault tolerance — If one replica crashes, others continue serving traffic while it restarts
What replicas require
- Stateless application design — Each replica has its own memory space. Session data, caches, and temporary files are not shared. Use Redis or a database for shared state.
- Sufficient server resources — Each replica consumes its own CPU and memory allocation. Three replicas with 512 MB each need 1.5 GB total.
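The resource arithmetic for replicas is simple multiplication, since each replica gets its own full allocation. A quick sketch (function name invented for the example):

```python
def total_footprint(replicas: int, cpu_limit_m: int, memory_limit_mb: int) -> tuple[float, int]:
    """Worst-case server footprint: (CPU cores at full burst, total MB)."""
    return replicas * cpu_limit_m / 1000, replicas * memory_limit_mb

# Three replicas at 500 millicores / 512 MB each:
cores, mb = total_footprint(3, 500, 512)
assert (cores, mb) == (1.5, 1536)  # 1536 MB, i.e. the 1.5 GB mentioned above
```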
Configuration hierarchy
Resource settings are configured at two levels:
- Project level — Default deployment config that applies to new environments
- Environment level — Per-environment overrides that take precedence
| Setting | Project default | Production override | Staging |
|---|---|---|---|
| cpu_limit | 1000 | 2000 | (uses project default: 1000) |
| memory_limit | 512 | 1024 | (uses project default: 512) |
| replicas | 1 | 3 | (uses project default: 1) |
This lets you allocate more resources to production while keeping staging lightweight, without configuring every setting on every environment.
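The override behavior amounts to a simple merge: environment-level settings win, and anything unset inherits the project default. A sketch of that resolution using the values from the table above (the dict representation is illustrative, not Temps' internal format):

```python
PROJECT_DEFAULTS = {"cpu_limit": 1000, "memory_limit": 512, "replicas": 1}
PRODUCTION_OVERRIDES = {"cpu_limit": 2000, "memory_limit": 1024, "replicas": 3}
STAGING_OVERRIDES: dict = {}  # nothing set: everything falls back to the project

def effective_config(project: dict, environment: dict) -> dict:
    """Environment-level settings take precedence over project defaults."""
    return {**project, **environment}

assert effective_config(PROJECT_DEFAULTS, PRODUCTION_OVERRIDES)["replicas"] == 3
assert effective_config(PROJECT_DEFAULTS, STAGING_OVERRIDES)["cpu_limit"] == 1000
```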
Port resolution
The exposed port determines which port Temps routes traffic to inside the container. It is resolved in priority order:
1. EXPOSE directive in your Dockerfile (highest priority)
2. exposed_port on the environment
3. exposed_port on the project
4. Default: 3000
Monitoring resource usage
The monitoring dashboard shows real-time resource consumption for each container:
- CPU usage (%) — Current CPU usage as a percentage of the limit
- Memory usage (bytes and %) — Current memory consumption and percentage of the limit
- Network traffic (bytes/sec) — Inbound and outbound data transfer rates
Metrics are delivered in real time via Server-Sent Events (SSE), so the dashboard updates continuously without polling.
Access metrics programmatically:
curl "https://your-temps-instance/api/projects/{id}/environments/{id}/containers/{id}/metrics" \
-H "Authorization: Bearer YOUR_TOKEN"
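Once you have a metrics payload, the dashboard percentages are just usage divided by the configured limit. A sketch of that calculation; the payload shape below is an assumption for illustration, so check your instance's actual response fields:

```python
# Hypothetical payload shape -- not the documented Temps response format.
sample = {
    "cpu_usage_millicores": 400,
    "memory_usage_bytes": 402_653_184,  # 384 MB
    "limits": {"cpu_millicores": 1000, "memory_bytes": 536_870_912},  # 512 MB
}

def usage_percent(used: float, limit: float) -> float:
    """Usage as a percentage of the configured limit, rounded to one decimal."""
    return round(100 * used / limit, 1)

cpu_pct = usage_percent(sample["cpu_usage_millicores"], sample["limits"]["cpu_millicores"])
mem_pct = usage_percent(sample["memory_usage_bytes"], sample["limits"]["memory_bytes"])
assert (cpu_pct, mem_pct) == (40.0, 75.0)
```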
Scaling strategies
Vertical scaling (more resources per container)
Increase CPU and memory limits when:
- CPU usage is consistently above 80%
- Memory usage is approaching the limit
- Response times are increasing due to resource contention
This is the simplest approach and works until you hit your server's physical limits.
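The criteria above can be condensed into a simple check against your metrics. The thresholds here are assumptions for illustration (the doc only specifies the 80% CPU figure), so tune them to your workload:

```python
def recommend_vertical_scale(cpu_pct: float, mem_pct: float,
                             cpu_threshold: float = 80.0,
                             mem_threshold: float = 90.0) -> bool:
    """Suggest raising limits when a container runs hot on CPU or memory.

    The 80% CPU threshold follows the guidance above; 90% for memory
    ("approaching the limit") is an assumed value.
    """
    return cpu_pct > cpu_threshold or mem_pct > mem_threshold

assert recommend_vertical_scale(85.0, 50.0) is True   # CPU consistently hot
assert recommend_vertical_scale(40.0, 95.0) is True   # memory near the limit
assert recommend_vertical_scale(40.0, 50.0) is False  # headroom on both
```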
Horizontal scaling (more replicas)
Increase the replica count when:
- Single-container throughput is maxed out despite adequate CPU/memory
- You need fault tolerance (at least 2 replicas for redundancy)
- Your application handles many concurrent requests
Requirements:
- Your application must be stateless (use Redis for sessions, database for persistence)
- All replicas share the same linked services and environment variables
Scaling the server
If both vertical and horizontal scaling within the current server are insufficient:
- Move to a larger VPS (more CPU cores, more RAM)
- Move the PostgreSQL database to a dedicated server or managed service
- Use the Temps backup system to migrate: back up on the old server, restore on the new one
Temps does not currently support auto-scaling. Resource allocation and replica count are manual settings. Monitor your metrics and adjust proactively as traffic patterns change.