Resource Management & Scaling
Temps gives you fine-grained control over CPU, memory, and replica allocation for each environment. Scale your applications vertically (more resources per instance) or horizontally (more instances) with zero downtime.
Self-Hosted Advantage: Since Temps runs on your infrastructure, you control resource allocation without paying per-GB or per-vCPU pricing. Allocate as much as your hardware allows.
Resource Types
Temps manages three primary resources:
- CPU — Measured in millicores (1000m = 1 CPU core)
- Memory — RAM allocation in MB or GB
- Replicas — Number of running instances for horizontal scaling
Resources can be configured globally (all environments) or per-environment (staging vs production).
Quick resource configuration:

```bash
# Set resources via CLI
temps resources set \
  --project my-app \
  --environment production \
  --cpu 1000 \
  --memory 512 \
  --replicas 3

# View current allocation
temps resources get \
  --project my-app \
  --environment production
```
CPU Allocation
Understanding Millicores
CPU is measured in millicores (thousandths of a CPU core):
| Millicores | CPU Cores | Use Case |
|---|---|---|
| 100m | 0.1 cores | Minimal workloads, static sites |
| 250m | 0.25 cores | Small APIs, low traffic |
| 500m | 0.5 cores | Standard web applications |
| 1000m | 1 core | Medium traffic applications |
| 2000m | 2 cores | High traffic, CPU-intensive tasks |
| 4000m+ | 4+ cores | Very high traffic, data processing |
```bash
temps resources set --project my-app --environment production --cpu 2000
```
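Since `--cpu` takes millicores, it can help to sanity-check the unit; a trivial Python sketch of the conversion:

```python
def millicores_to_cores(millicores: int) -> float:
    """Convert a millicore value (as used by --cpu) to CPU cores."""
    return millicores / 1000

# 2000m is 2 full cores; 250m is a quarter of a core.
print(millicores_to_cores(2000))  # 2.0
print(millicores_to_cores(250))   # 0.25
```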
Setting CPU Limits
Configure CPU allocation per environment:
Via Dashboard:
1. Navigate to Project Settings → Resources
2. Select environment (production/staging)
3. Set CPU slider (100m - 16000m)
4. Click Save

Via CLI:

```bash
temps resources set \
  --project my-app \
  --environment production \
  --cpu 2000  # 2 CPU cores
```
CPU Behavior
Requests vs Limits:
- Request: Guaranteed CPU allocation
- Limit: Maximum CPU that can be used
Temps sets both to the same value by default for predictable performance.
Throttling: If your app exceeds the CPU limit, it will be throttled (not killed), which may cause slower response times.
Monitoring: View real-time CPU usage in the dashboard or via CLI:
```bash
temps metrics cpu --project my-app --tail
```
CPU Recommendations
- Static Sites (Next.js, React): Recommended 250-500m. Static sites have minimal CPU requirements since most work happens at build time.
- API Servers (Express, FastAPI): Recommended 500-1000m. APIs need more CPU for request processing, especially with database queries.
- Server-Side Rendering: Recommended 1000-2000m. SSR applications (Next.js App Router) render on each request, so they need more CPU.
- Background Jobs: Recommended 1000-4000m. CPU-intensive tasks like image processing, video encoding, or data analysis need a higher allocation.
Memory Allocation
Memory Limits
Memory is measured in megabytes (MB) or gigabytes (GB):
| Memory | Use Case |
|---|---|
| 128 MB | Minimal static sites, simple APIs |
| 256 MB | Small web applications |
| 512 MB | Standard web applications |
| 1 GB | Medium applications with caching |
| 2 GB | Large applications, in-memory processing |
| 4 GB+ | Very large datasets, heavy caching |
Setting Memory Limits
Via Dashboard:
1. Navigate to Project Settings → Resources
2. Select environment
3. Set Memory slider (128 MB - 32 GB)
4. Click Save

Via CLI:

```bash
temps resources set \
  --project my-app \
  --environment production \
  --memory 1024  # 1 GB
```
Memory Behavior
Out of Memory (OOM): If your application exceeds its memory limit, it will be killed and automatically restarted. This causes brief downtime for that replica.
Memory Leaks: Monitor memory usage over time to detect leaks:
```bash
temps metrics memory \
  --project my-app \
  --period 24h
```
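A leak usually shows up as usage that climbs steadily across samples instead of plateauing. A rough heuristic for spotting that trend in exported metrics (illustrative Python, not a Temps feature):

```python
def looks_like_leak(samples_mb: list[float], min_growth_mb: float = 50) -> bool:
    """Flag a possible leak: usage rises in nearly every interval
    and total growth over the window exceeds a threshold."""
    if len(samples_mb) < 2:
        return False
    rises = sum(1 for a, b in zip(samples_mb, samples_mb[1:]) if b >= a)
    mostly_rising = rises / (len(samples_mb) - 1) >= 0.9
    return mostly_rising and (samples_mb[-1] - samples_mb[0]) >= min_growth_mb

# Steady climb over 24 hourly samples: likely a leak
print(looks_like_leak([300 + 10 * i for i in range(24)]))  # True
# Flat with noise: fine
print(looks_like_leak([300, 305, 298, 302, 300, 301]))     # False
```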
Swap: Temps does not enable swap by default. Memory limits are hard limits.
Memory Recommendations
- Next.js Static Export: Recommended 256-512 MB. Pre-rendered pages have minimal memory needs.
- Next.js SSR/ISR: Recommended 512 MB - 1 GB. Server-side rendering and API routes need more memory.
- Node.js API (Express): Recommended 512 MB - 1 GB. Actual usage depends on request complexity and middleware.
- Python/Django: Recommended 512 MB - 2 GB. Python applications typically use more memory than Node.js.
- With Redis/Caching: Add 256-512 MB on top if you use in-memory caching.
Horizontal Scaling (Replicas)
Understanding Replicas
Replicas are multiple instances of your application running simultaneously. Incoming traffic is load-balanced across all replicas.
Benefits:
- High Availability: If one replica fails, others continue serving traffic
- Load Distribution: Requests spread across multiple instances
- Zero-Downtime Deploys: Old replicas stay running while new ones start
- Increased Throughput: Handle more concurrent requests
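The load-distribution idea can be sketched as a simple round-robin rotation over replicas (illustrative Python, not Temps internals):

```python
import itertools

class RoundRobinBalancer:
    """Hand each incoming request to the next replica in rotation."""
    def __init__(self, replicas: list[str]):
        self._cycle = itertools.cycle(replicas)

    def next_replica(self) -> str:
        return next(self._cycle)

lb = RoundRobinBalancer(["replica-1", "replica-2", "replica-3"])
print([lb.next_replica() for _ in range(5)])
# ['replica-1', 'replica-2', 'replica-3', 'replica-1', 'replica-2']
```

Because every replica sees an equal share of traffic, this only works well when any replica can serve any request, which is why stateless applications are recommended below.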
When to Scale Horizontally:
- High traffic volume
- Need for redundancy
- CPU/memory usage is high but per-request processing is efficient
Scale replicas:

```bash
# Set replica count
temps resources set \
  --project my-app \
  --environment production \
  --replicas 5
```
Replica Configuration
Configure replicas:

```bash
# Fixed replica count
temps resources set --replicas 3 --project my-app

# View current replicas
temps resources get --project my-app | grep replicas

# Scale up/down
temps scale --replicas 5 --project my-app
```
Replica Best Practices
- Minimum 2 Replicas for Production: Always run at least 2 replicas in production for high availability. If one fails, the other continues serving traffic.
- Odd Numbers for Quorum: If your replicas run a consensus algorithm (Raft, etc.), use odd replica counts (3, 5, 7) to avoid split-brain scenarios.
- Stateless Applications: Replicas work best with stateless applications. Keep state in external services (Redis, PostgreSQL) instead of in memory.
- Session Affinity: If you need sticky sessions, enable session affinity:

```bash
temps lb set --session-affinity cookie --project my-app
```
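The quorum guidance is simple arithmetic: a majority of n replicas is n // 2 + 1, so an even count tolerates no more failures than the odd count below it:

```python
def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority (quorum)."""
    return n // 2 + 1

def failures_tolerated(n: int) -> int:
    """How many replicas can fail while a quorum survives."""
    return n - majority(n)

for n in (2, 3, 4, 5):
    print(n, failures_tolerated(n))
# 2 replicas tolerate 0 failures; 3 tolerate 1; 4 still only 1; 5 tolerate 2.
```

This is why 4 replicas buy you no extra fault tolerance over 3 for quorum-based workloads.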
Resource Limits Per Environment
Configure different resource allocations for each environment:
Typical Setup
Production:
- Higher CPU and memory for performance
- Multiple replicas for high availability
Staging:
- Lower resources to save costs
- Single replica (sufficient for testing)
temps.json:

```json
{
  "resources": {
    "production": {
      "cpu": 2000,
      "memory": 2048,
      "replicas": 3
    },
    "staging": {
      "cpu": 500,
      "memory": 512,
      "replicas": 1
    }
  }
}
```
Monitoring Resource Usage
Real-Time Metrics
View current resource consumption:
Resource monitoring:

```bash
# Real-time CPU and memory
temps metrics live --project my-app --environment production

# Historical metrics (last 24 hours)
temps metrics history --project my-app --period 24h

# Specific metric
temps metrics cpu --project my-app --tail
temps metrics memory --project my-app --tail
```
Dashboard Metrics
The Temps dashboard shows:
- Current CPU %: Real-time CPU usage per replica
- Current Memory %: Real-time memory usage per replica
- Replica Health: Status of each replica (healthy/unhealthy)
- Request Rate: Requests per second across all replicas
- Response Time: Response-time percentiles (p50, p95, p99)
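p50, p95, and p99 are latency percentiles: p95 is the value 95% of requests come in under. A minimal way to compute them from raw samples (nearest-rank method, illustrative only):

```python
import math

def percentile(samples_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value >= p% of the sample."""
    ordered = sorted(samples_ms)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

latencies = list(range(1, 101))  # 1..100 ms
print(percentile(latencies, 50))  # 50
print(percentile(latencies, 95))  # 95
print(percentile(latencies, 99))  # 99
```

p95 and p99 matter more than the mean because a small fraction of slow requests can dominate user experience while barely moving the average.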
Alerts
Set up alerts for high resource usage:
Configure alerts:

```bash
# Alert when CPU > 90% for 5 minutes
temps alert create \
  --metric cpu \
  --threshold 90 \
  --duration 5m \
  --action email \
  --recipient team@company.com

# Alert when memory > 85%
temps alert create \
  --metric memory \
  --threshold 85 \
  --duration 2m \
  --action webhook \
  --url https://hooks.slack.com/services/xxx
```
Cost Optimization
Self-Hosted = No Per-Resource Pricing: Since Temps is self-hosted, you only pay for your infrastructure (EC2, VPS, bare metal), not per-GB or per-vCPU like cloud platforms.
Right-Sizing Resources
- Start Small: Begin with minimal resources (500m CPU, 512 MB memory) and scale up based on actual usage. Over-provisioning wastes infrastructure capacity.
- Monitor for 48 Hours: After deployment, monitor resource usage for 48 hours to identify patterns (peak hours, steady state).
- Adjust Incrementally: Increase resources by 25-50% if you're consistently hitting limits. Don't jump from 512 MB to 4 GB; try 768 MB first.
Staging Environment Savings
Reduce Staging Resources:
- 50% of production CPU
- 50% of production memory
- 1 replica
Staging is for testing, not production traffic. You don't need the same resources.
Optimized staging:

```jsonc
{
  "resources": {
    "production": {
      "cpu": 2000,
      "memory": 2048,
      "replicas": 3
    },
    "staging": {
      "cpu": 1000,    // 50% of prod
      "memory": 1024, // 50% of prod
      "replicas": 1   // Single instance
    }
  }
}
```
Troubleshooting
High CPU Usage (>80%)
Symptoms: Slow response times, request timeouts
Solutions:
- Scale Up CPU: Increase the CPU allocation per replica
- Scale Out: Add more replicas to distribute load
- Optimize Code: Profile your application to find CPU-intensive operations
- Enable Caching: Cache expensive computations or database queries

```bash
# Quick fix: add more replicas
temps scale --replicas 5 --project my-app
```

High Memory Usage (>85%)
Symptoms: Frequent OOM kills, application restarts
Solutions:
- Scale Up Memory: Increase the memory allocation
- Fix Memory Leaks: Use memory profiling tools (Node.js: --inspect, Python: memory_profiler)
- Reduce In-Memory Caching: Move caching to Redis
- Optimize Data Structures: Use streams instead of loading entire datasets into memory

```bash
# Quick fix: double memory
temps resources set --memory 2048 --project my-app
```

Uneven Load Distribution
Symptoms: Some replicas at 90% CPU while others sit at 20%
Solutions:
- Check Session Affinity: Disable sticky sessions if they aren't needed
- Verify Load Balancer: Ensure a round-robin or least-connections algorithm
- Review Long Requests: Long-running requests can "block" a replica

```bash
# Verify load balancer config
temps lb get --project my-app
```
Comparison with Other Platforms
| Feature | Temps | Vercel | Netlify | Railway |
|---|---|---|---|---|
| CPU Control | ✅ Configurable (100m - 16000m) | ❌ Fixed per plan | ❌ Fixed per plan | ✅ Configurable |
| Memory Control | ✅ Configurable (128 MB - 32 GB) | ❌ Fixed (1 GB) | ❌ Fixed (1 GB) | ✅ Configurable |
| Manual Scaling | ✅ Instant | ⚠️ Plan upgrade | ⚠️ Plan upgrade | ✅ Instant |
| Per-Environment | ✅ Yes | ❌ No | ❌ No | ✅ Yes |
| Replica Control | ✅ 1-100+ | ❌ Automatic | ❌ Automatic | ✅ 1-20 |
| Cost | ✅ Infrastructure only | ❌ $20-$500+/month | ❌ $19-$500+/month | ⚠️ Pay per resource |
Best Practices Summary
- Start with 500m CPU, 512 MB memory, 2 replicas for production
- Monitor for 48 hours before adjusting resources
- Right-size staging to 50% of production resources
- Set up alerts for >80% CPU or >85% memory usage
- Profile your application to identify resource bottlenecks
- Use external services (Redis, PostgreSQL) instead of in-memory state
- Enable health checks to detect unhealthy replicas
- Review metrics weekly to optimize resource allocation
- Document your resource decisions for future reference
Next Steps
- Set up Monitoring to track resource usage and configure health checks
- View Logs to debug performance issues