Resources & Scaling

Every container deployed by Temps runs with configurable CPU and memory limits. You control how much of your server's resources each application gets, and how many instances (replicas) run per environment.


Resource model

Each environment has a deployment configuration with four resource fields, plus replica count and exposed port:

| Field | Unit | Default | Purpose |
| --- | --- | --- | --- |
| cpu_request | Millicores | Not set | Minimum CPU guaranteed |
| cpu_limit | Millicores | Not set | Maximum CPU allowed |
| memory_request | MB | Not set | Minimum memory guaranteed |
| memory_limit | MB | Not set | Maximum memory allowed |
| replicas | Count | 1 | Number of container instances |
| exposed_port | Port number | 3000 | The port your application listens on |

When resource limits are not set, Docker allocates resources dynamically — the container can use whatever is available on the server. Setting explicit limits prevents one application from starving others.
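Conceptually, these fields correspond to Docker's standard resource flags. The exact flags Temps passes internally are not documented here, so the mapping below is an illustrative sketch using flags Docker itself supports:

```python
# Illustrative sketch: how millicore/MB settings could translate to
# `docker run` flags. The mapping is an assumption about Temps internals;
# the flags themselves (--cpus, --memory, --memory-reservation) are real
# Docker options.

def docker_flags(config: dict) -> list[str]:
    """Build resource flags from a deployment config (units per the table above)."""
    flags = []
    if config.get("cpu_limit") is not None:
        # 1000 millicores = 1 core; Docker's --cpus accepts fractional cores
        flags.append(f"--cpus={config['cpu_limit'] / 1000}")
    if config.get("memory_limit") is not None:
        flags.append(f"--memory={config['memory_limit']}m")  # hard cap (OOM kill above this)
    if config.get("memory_request") is not None:
        flags.append(f"--memory-reservation={config['memory_request']}m")  # soft guarantee
    return flags

print(docker_flags({"cpu_limit": 500, "memory_limit": 512, "memory_request": 256}))
# ['--cpus=0.5', '--memory=512m', '--memory-reservation=256m']
```

With no fields set, the list is empty, matching the "allocate dynamically" behavior described above.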


CPU

CPU is measured in millicores. 1000 millicores = 1 CPU core.

| Millicores | CPU cores | Typical use case |
| --- | --- | --- |
| 100 | 0.1 | Static sites, minimal APIs |
| 250 | 0.25 | Low-traffic web applications |
| 500 | 0.5 | Standard web applications |
| 1000 | 1.0 | Medium traffic, server-side rendering |
| 2000 | 2.0 | High traffic, CPU-intensive processing |

Request vs Limit:

  • cpu_request — The minimum CPU guaranteed to the container. Docker reserves this capacity.
  • cpu_limit — The maximum CPU the container can use. If the container tries to exceed this, it is throttled (not killed).

When the server has spare capacity, a container can burst above its request up to its limit. When the server is under load, each container is guaranteed at least its requested amount.


Memory

Memory is measured in megabytes (MB).

| MB | Typical use case |
| --- | --- |
| 128 | Static file servers |
| 256 | Small APIs, microservices |
| 512 | Standard web applications |
| 1024 | Server-side rendering, large datasets |
| 2048+ | Memory-intensive applications |

Request vs Limit:

  • memory_request — The minimum memory guaranteed.
  • memory_limit — The maximum memory allowed. If the container exceeds this, Docker kills it (OOM kill) and Temps restarts it.

Replicas

Replicas are the number of container instances running for a given environment. The reverse proxy load-balances requests across all healthy replicas.

| Replicas | Use case |
| --- | --- |
| 1 | Development, staging, low-traffic production |
| 2-3 | Standard production with some redundancy |
| 5+ | High-traffic applications |

What replicas provide

  • Higher throughput — More containers handle more concurrent requests
  • Fault tolerance — If one replica crashes, others continue serving traffic while it restarts

What replicas require

  • Stateless application design — Each replica has its own memory space. Session data, caches, and temporary files are not shared. Use Redis or a database for shared state.
  • Sufficient server resources — Each replica consumes its own CPU and memory allocation. Three replicas with 512 MB each need 1.5 GB total.
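The statelessness requirement can be made concrete with a small simulation. Each `Replica` below stands in for a container process with its own memory; `shared_store` stands in for Redis or a database (all names here are illustrative, not Temps APIs):

```python
# Why replicas require stateless design: each replica has its own memory,
# so a session written into one replica's process is invisible to the others.

class Replica:
    def __init__(self, shared_store: dict):
        self.local_sessions = {}          # private to this replica's process
        self.shared_store = shared_store  # shared backend (e.g. Redis, database)

    def login_local(self, user: str) -> None:
        self.local_sessions[user] = "token"   # breaks under load balancing

    def login_shared(self, user: str) -> None:
        self.shared_store[user] = "token"     # visible to every replica

shared = {}
a, b = Replica(shared), Replica(shared)

a.login_local("alice")
print("alice" in b.local_sessions)  # False — lost if the proxy routes her to replica b

a.login_shared("alice")
print("alice" in b.shared_store)    # True — any replica can serve her
```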

Configuration hierarchy

Resource settings are configured at two levels:

  1. Project level — Default deployment config that applies to new environments
  2. Environment level — Per-environment overrides that take precedence

| Setting | Project default | Production override | Staging |
| --- | --- | --- | --- |
| cpu_limit | 1000 | 2000 | (uses project default: 1000) |
| memory_limit | 512 | 1024 | (uses project default: 512) |
| replicas | 1 | 3 | (uses project default: 1) |

This lets you allocate more resources to production while keeping staging lightweight, without configuring every setting on every environment.
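The override behavior can be sketched as a simple merge, where unset (`None`) environment values fall through to the project default. The merge logic is an assumption about how Temps resolves settings, but it matches the precedence described above:

```python
# Two-level hierarchy: environment values override project defaults;
# unset (None) fields inherit the project default.

def effective_config(project: dict, environment: dict) -> dict:
    return {
        key: environment.get(key) if environment.get(key) is not None else value
        for key, value in project.items()
    }

project_defaults = {"cpu_limit": 1000, "memory_limit": 512, "replicas": 1}
production = {"cpu_limit": 2000, "memory_limit": 1024, "replicas": 3}
staging = {}  # no overrides — inherits every project default

print(effective_config(project_defaults, production))
# {'cpu_limit': 2000, 'memory_limit': 1024, 'replicas': 3}
print(effective_config(project_defaults, staging))
# {'cpu_limit': 1000, 'memory_limit': 512, 'replicas': 1}
```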

Port resolution

The exposed port determines which port Temps routes traffic to inside the container. It is resolved in priority order:

  1. EXPOSE directive in your Dockerfile (highest priority)
  2. exposed_port on the environment
  3. exposed_port on the project
  4. Default: 3000
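The four-step priority order amounts to taking the first value that is set. A resolver sketch (argument names are illustrative; `dockerfile_expose` would come from parsing the `EXPOSE` directive):

```python
# Port resolution in priority order: Dockerfile EXPOSE, then the
# environment's exposed_port, then the project's, then the default.

def resolve_port(dockerfile_expose=None, env_port=None, project_port=None) -> int:
    for candidate in (dockerfile_expose, env_port, project_port):
        if candidate is not None:
            return candidate
    return 3000  # platform default

print(resolve_port(dockerfile_expose=8080, env_port=4000))  # 8080 — Dockerfile wins
print(resolve_port(project_port=5000))                      # 5000
print(resolve_port())                                       # 3000
```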

Monitoring resource usage

The monitoring dashboard shows real-time resource consumption for each container:

  • CPU usage (%) — Current CPU usage as a percentage of the limit
  • Memory usage (bytes and %) — Current memory consumption and percentage of the limit
  • Network traffic (bytes/sec) — Inbound and outbound data transfer rates

Metrics are delivered in real-time via Server-Sent Events (SSE) so the dashboard updates continuously without polling.
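The percentages shown on the dashboard are derived from raw values like these. The field names in `sample` are hypothetical (the actual metrics payload is not documented here), but the arithmetic is standard:

```python
# Computing a usage percentage from a raw value and its configured limit.
# Field names in `sample` are illustrative, not the documented API schema.

def usage_percent(current: float, limit: float) -> float:
    """Current consumption as a percentage of the configured limit."""
    return round(current / limit * 100, 1)

sample = {"memory_bytes": 402_653_184, "memory_limit_bytes": 536_870_912}  # 384 MB of 512 MB
print(usage_percent(sample["memory_bytes"], sample["memory_limit_bytes"]))  # 75.0
```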

Access metrics programmatically:

```bash
curl "https://your-temps-instance/api/projects/{id}/environments/{id}/containers/{id}/metrics" \
  -H "Authorization: Bearer YOUR_TOKEN"
```

Scaling strategies

Vertical scaling (more resources per container)

Increase CPU and memory limits when:

  • CPU usage is consistently above 80%
  • Memory usage is approaching the limit
  • Response times are increasing due to resource contention

This is the simplest approach and works until you hit your server's physical limits.

Horizontal scaling (more replicas)

Increase the replica count when:

  • Single-container throughput is maxed out despite adequate CPU/memory
  • You need fault tolerance (at least 2 replicas for redundancy)
  • Your application handles many concurrent requests

Requirements:

  • Your application must be stateless (use Redis for sessions, database for persistence)
  • All replicas share the same linked services and environment variables
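The two strategies above can be summarized as a rule of thumb. The thresholds here are illustrative, not Temps policy, but they encode the guidance from both sections:

```python
# Rule-of-thumb scaling decision (thresholds are illustrative):
# sustained high CPU or memory pressure suggests scaling up (vertical);
# maxed-out throughput with resource headroom suggests scaling out (horizontal).

def scaling_advice(cpu_pct: float, mem_pct: float, throughput_maxed: bool) -> str:
    if cpu_pct > 80 or mem_pct > 90:
        return "vertical: raise cpu_limit / memory_limit"
    if throughput_maxed:
        return "horizontal: add replicas (app must be stateless)"
    return "no change needed"

print(scaling_advice(85, 60, False))  # vertical: raise cpu_limit / memory_limit
print(scaling_advice(50, 40, True))   # horizontal: add replicas (app must be stateless)
```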

Scaling the server

If both vertical and horizontal scaling within the current server are insufficient:

  • Move to a larger VPS (more CPU cores, more RAM)
  • Move the PostgreSQL database to a dedicated server or managed service
  • Use the Temps backup system to migrate: back up on the old server, restore on the new one
