Resource Management & Scaling
Temps gives you fine-grained control over CPU, memory, and replica allocation for each environment. Scale your applications vertically (more resources per instance) or horizontally (more instances) with zero downtime.
Self-Hosted Advantage: Since Temps runs on your infrastructure, you control resource allocation without paying per-GB or per-vCPU pricing. Allocate as much as your hardware allows.
Resource Types
Temps manages three primary resources:
- CPU — Measured in millicores (1000m = 1 CPU core)
- Memory — RAM allocation in MB or GB
- Replicas — Number of running instances for horizontal scaling
Resources can be configured globally (all environments) or per-environment (staging vs production).
Quick resource configuration:

```bash
# Set resources via CLI
temps resources set \
  --project my-app \
  --environment production \
  --cpu 1000 \
  --memory 512 \
  --replicas 3

# View current allocation
temps resources get \
  --project my-app \
  --environment production
```
CPU Allocation
Understanding Millicores
CPU is measured in millicores (thousandths of a CPU core):
| Millicores | CPU Cores | Use Case |
|---|---|---|
| 100m | 0.1 cores | Minimal workloads, static sites |
| 250m | 0.25 cores | Small APIs, low traffic |
| 500m | 0.5 cores | Standard web applications |
| 1000m | 1 core | Medium traffic applications |
| 2000m | 2 cores | High traffic, CPU-intensive tasks |
| 4000m+ | 4+ cores | Very high traffic, data processing |
```bash
temps resources set --project my-app --environment production --cpu 2000
```
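Since `--cpu` takes millicores, it can help to sanity-check the unit; a trivial Python sketch of the conversion:

```python
def millicores_to_cores(millicores: int) -> float:
    """Convert a millicore value (as used by --cpu) to CPU cores."""
    return millicores / 1000

# 2000m is 2 full cores; 250m is a quarter of a core.
print(millicores_to_cores(2000))  # 2.0
print(millicores_to_cores(250))   # 0.25
```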
Setting CPU Limits
Configure CPU allocation per environment:
Via Dashboard:
1. Navigate to Project Settings → Resources
2. Select environment (production/staging)
3. Set CPU slider (100m - 16000m)
4. Click Save

Via CLI:

```bash
temps resources set \
  --project my-app \
  --environment production \
  --cpu 2000  # 2 CPU cores
```
CPU Behavior
Requests vs Limits:
- Request: Guaranteed CPU allocation
- Limit: Maximum CPU that can be used
Temps sets both to the same value by default for predictable performance.
Throttling: If your app exceeds the CPU limit, it will be throttled (not killed), which may cause slower response times.
Monitoring: View real-time CPU usage in the dashboard or via CLI:
```bash
temps metrics cpu --project my-app --tail
```
CPU Recommendations
- Static Sites (Next.js, React): Recommended 250-500m. Static sites have minimal CPU requirements since most work happens at build time.
- API Servers (Express, FastAPI): Recommended 500-1000m. APIs need more CPU for request processing, especially with database queries.
- Server-Side Rendering: Recommended 1000-2000m. SSR applications (Next.js App Router) render on each request, so they need more CPU.
- Background Jobs: Recommended 1000-4000m. CPU-intensive tasks like image processing, video encoding, or data analysis need a higher allocation.
Memory Allocation
Memory Limits
Memory is measured in megabytes (MB) or gigabytes (GB):
| Memory | Use Case |
|---|---|
| 128 MB | Minimal static sites, simple APIs |
| 256 MB | Small web applications |
| 512 MB | Standard web applications |
| 1 GB | Medium applications with caching |
| 2 GB | Large applications, in-memory processing |
| 4 GB+ | Very large datasets, heavy caching |
Setting Memory Limits
Via Dashboard:
1. Navigate to Project Settings → Resources
2. Select environment
3. Set Memory slider (128 MB - 32 GB)
4. Click Save

Via CLI:

```bash
temps resources set \
  --project my-app \
  --environment production \
  --memory 1024  # 1 GB
```
Memory Behavior
Out of Memory (OOM): If your application exceeds its memory limit, it will be killed and automatically restarted. This causes brief downtime for that replica.
Memory Leaks: Monitor memory usage over time to detect leaks:
```bash
temps metrics memory \
  --project my-app \
  --period 24h
```
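A leak usually shows up as usage that climbs steadily across samples instead of plateauing. A rough heuristic for spotting that trend in exported metrics (illustrative Python, not a Temps feature):

```python
def looks_like_leak(samples_mb: list[float], min_growth_mb: float = 50) -> bool:
    """Flag a possible leak: usage rises in nearly every interval
    and total growth over the window exceeds a threshold."""
    if len(samples_mb) < 2:
        return False
    rises = sum(1 for a, b in zip(samples_mb, samples_mb[1:]) if b >= a)
    mostly_rising = rises / (len(samples_mb) - 1) >= 0.9
    return mostly_rising and (samples_mb[-1] - samples_mb[0]) >= min_growth_mb

# Steady climb over 24 hourly samples: likely a leak
print(looks_like_leak([300 + 10 * i for i in range(24)]))  # True
# Flat with noise: fine
print(looks_like_leak([300, 305, 298, 302, 300, 301]))     # False
```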
Swap: Temps does not enable swap by default. Memory limits are hard limits.
Memory Recommendations
- Next.js Static Export: Recommended 256-512 MB. Pre-rendered pages have minimal memory needs.
- Next.js SSR/ISR: Recommended 512 MB - 1 GB. Server-side rendering and API routes need more memory.
- Node.js API (Express): Recommended 512 MB - 1 GB. Actual usage depends on request complexity and middleware.
- Python/Django: Recommended 512 MB - 2 GB. Python applications typically use more memory than Node.js.
- With Redis/Caching: Add 256-512 MB on top if you use in-memory caching.
Horizontal Scaling (Replicas)
Understanding Replicas
Replicas are multiple instances of your application running simultaneously. Incoming traffic is load-balanced across all replicas.
Benefits:
- High Availability: If one replica fails, others continue serving traffic
- Load Distribution: Requests spread across multiple instances
- Zero-Downtime Deploys: Old replicas stay running while new ones start
- Increased Throughput: Handle more concurrent requests
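The load-distribution idea can be sketched as a simple round-robin rotation over replicas (illustrative Python, not Temps internals):

```python
import itertools

class RoundRobinBalancer:
    """Hand each incoming request to the next replica in rotation."""
    def __init__(self, replicas: list[str]):
        self._cycle = itertools.cycle(replicas)

    def next_replica(self) -> str:
        return next(self._cycle)

lb = RoundRobinBalancer(["replica-1", "replica-2", "replica-3"])
print([lb.next_replica() for _ in range(5)])
# ['replica-1', 'replica-2', 'replica-3', 'replica-1', 'replica-2']
```

Because every replica sees an equal share of traffic, this only works well when any replica can serve any request, which is why stateless applications are recommended below.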
When to Scale Horizontally:
- High traffic volume
- Need for redundancy
- CPU/memory usage is high but per-request processing is efficient
Scale replicas:

```bash
# Set replica count
temps resources set \
  --project my-app \
  --environment production \
  --replicas 5
```
Replica Configuration
Configure replicas:

```bash
# Fixed replica count
temps resources set --replicas 3 --project my-app

# View current replicas
temps resources get --project my-app | grep replicas

# Scale up/down
temps scale --replicas 5 --project my-app
```
Replica Best Practices
- Minimum 2 Replicas for Production: Always run at least 2 replicas in production for high availability. If one fails, the other continues serving traffic.
- Odd Numbers for Quorum: If your replicas run a consensus algorithm (Raft, etc.), use odd replica counts (3, 5, 7) to avoid split-brain scenarios.
- Stateless Applications: Replicas work best with stateless applications. Keep state in external services (Redis, PostgreSQL) instead of in memory.
- Session Affinity: If you need sticky sessions, enable session affinity:

```bash
temps lb set --session-affinity cookie --project my-app
```
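The quorum guidance is simple arithmetic: a majority of n replicas is n // 2 + 1, so an even count tolerates no more failures than the odd count below it:

```python
def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority (quorum)."""
    return n // 2 + 1

def failures_tolerated(n: int) -> int:
    """How many replicas can fail while a quorum survives."""
    return n - majority(n)

for n in (2, 3, 4, 5):
    print(n, failures_tolerated(n))
# 2 replicas tolerate 0 failures; 3 tolerate 1; 4 still only 1; 5 tolerate 2.
```

This is why 4 replicas buy you no extra fault tolerance over 3 for quorum-based workloads.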
Resource Limits Per Environment
Configure different resource allocations for each environment:
Typical Setup
Production:
- Higher CPU and memory for performance
- Multiple replicas for high availability
Staging:
- Lower resources to save costs
- Single replica (sufficient for testing)
temps.json:

```json
{
  "resources": {
    "production": {
      "cpu": 2000,
      "memory": 2048,
      "replicas": 3
    },
    "staging": {
      "cpu": 500,
      "memory": 512,
      "replicas": 1
    }
  }
}
```
Monitoring Resource Usage
Real-Time Metrics
View current resource consumption:
Resource monitoring:

```bash
# Real-time CPU and memory
temps metrics live --project my-app --environment production

# Historical metrics (last 24 hours)
temps metrics history --project my-app --period 24h

# Specific metric
temps metrics cpu --project my-app --tail
temps metrics memory --project my-app --tail
```
Dashboard Metrics
The Temps dashboard shows:
- Current CPU %: Real-time CPU usage per replica
- Current Memory %: Real-time memory usage per replica
- Replica Health: Status of each replica (healthy/unhealthy)
- Request Rate: Requests per second across all replicas
- Response Time: Response-time percentiles (p50, p95, p99)
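p50, p95, and p99 are latency percentiles: p95 is the value 95% of requests come in under. A minimal way to compute them from raw samples (nearest-rank method, illustrative only):

```python
import math

def percentile(samples_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value >= p% of the sample."""
    ordered = sorted(samples_ms)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

latencies = list(range(1, 101))  # 1..100 ms
print(percentile(latencies, 50))  # 50
print(percentile(latencies, 95))  # 95
print(percentile(latencies, 99))  # 99
```

p95 and p99 matter more than the mean because a small fraction of slow requests can dominate user experience while barely moving the average.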
Alerts
Set up alerts for high resource usage:
Configure alerts:

```bash
# Alert when CPU > 90% for 5 minutes
temps alert create \
  --metric cpu \
  --threshold 90 \
  --duration 5m \
  --action email \
  --recipient team@company.com

# Alert when memory > 85%
temps alert create \
  --metric memory \
  --threshold 85 \
  --duration 2m \
  --action webhook \
  --url https://hooks.slack.com/services/xxx
```
Cost Optimization
Self-Hosted = No Per-Resource Pricing: Since Temps is self-hosted, you only pay for your infrastructure (EC2, VPS, bare metal), not per-GB or per-vCPU like cloud platforms.
Right-Sizing Resources
- Start Small: Begin with minimal resources (500m CPU, 512 MB memory) and scale up based on actual usage. Over-provisioning wastes infrastructure capacity.
- Monitor for 48 Hours: After deployment, monitor resource usage for 48 hours to identify patterns (peak hours, steady state).
- Adjust Incrementally: Increase resources by 25-50% if you're consistently hitting limits. Don't jump from 512 MB to 4 GB; try 768 MB first.
Staging Environment Savings
Reduce Staging Resources:
- 50% of production CPU
- 50% of production memory
- 1 replica
Staging is for testing, not production traffic. You don't need the same resources.
Optimized staging:

```jsonc
{
  "resources": {
    "production": {
      "cpu": 2000,
      "memory": 2048,
      "replicas": 3
    },
    "staging": {
      "cpu": 1000,    // 50% of prod
      "memory": 1024, // 50% of prod
      "replicas": 1   // Single instance
    }
  }
}
```
Troubleshooting
High CPU Usage (>80%)
Symptoms: Slow response times, request timeouts
Solutions:
- Scale Up CPU: Increase the CPU allocation per replica
- Scale Out: Add more replicas to distribute load
- Optimize Code: Profile your application to find CPU-intensive operations
- Enable Caching: Cache expensive computations or database queries

```bash
# Quick fix: add more replicas
temps scale --replicas 5 --project my-app
```

High Memory Usage (>85%)
Symptoms: Frequent OOM kills, application restarts
Solutions:
- Scale Up Memory: Increase the memory allocation
- Fix Memory Leaks: Use memory profiling tools (Node.js: --inspect, Python: memory_profiler)
- Reduce In-Memory Caching: Move caching to Redis
- Optimize Data Structures: Use streams instead of loading entire datasets into memory

```bash
# Quick fix: double memory
temps resources set --memory 2048 --project my-app
```

Uneven Load Distribution
Symptoms: Some replicas at 90% CPU while others sit at 20%
Solutions:
- Check Session Affinity: Disable sticky sessions if they aren't needed
- Verify Load Balancer: Ensure a round-robin or least-connections algorithm
- Review Long Requests: Long-running requests can "block" a replica

```bash
# Verify load balancer config
temps lb get --project my-app
```
Comparison with Other Platforms
| Feature | Temps | Vercel | Netlify | Railway |
|---|---|---|---|---|
| CPU Control | ✅ Configurable (100m - 16000m) | ❌ Fixed per plan | ❌ Fixed per plan | ✅ Configurable |
| Memory Control | ✅ Configurable (128 MB - 32 GB) | ❌ Fixed (1 GB) | ❌ Fixed (1 GB) | ✅ Configurable |
| Manual Scaling | ✅ Instant | ⚠️ Plan upgrade | ⚠️ Plan upgrade | ✅ Instant |
| Per-Environment | ✅ Yes | ❌ No | ❌ No | ✅ Yes |
| Replica Control | ✅ 1-100+ | ❌ Automatic | ❌ Automatic | ✅ 1-20 |
| Cost | ✅ Infrastructure only | ❌ $20-$500+/month | ❌ $19-$500+/month | ⚠️ Pay per resource |
Best Practices Summary
- Start with 500m CPU, 512 MB memory, 2 replicas for production
- Monitor for 48 hours before adjusting resources
- Right-size staging to 50% of production resources
- Set up alerts for >80% CPU or >85% memory usage
- Profile your application to identify resource bottlenecks
- Use external services (Redis, PostgreSQL) instead of in-memory state
- Enable health checks to detect unhealthy replicas
- Review metrics weekly to optimize resource allocation
- Document your resource decisions for future reference
Next Steps
- Set up Monitoring to track resource usage and configure health checks
- View Logs to debug performance issues