Resource Management & Scaling

Temps gives you fine-grained control over CPU, memory, and replica allocation for each environment. Scale your applications vertically (more resources per instance) or horizontally (more instances) with zero downtime.


Resource Types

Temps manages three primary resources:

  • CPU — Measured in millicores (1000m = 1 CPU core)
  • Memory — RAM allocation in MB or GB
  • Replicas — Number of running instances for horizontal scaling

Resources can be configured globally (all environments) or per-environment (staging vs production).

Quick resource configuration

# Set resources via CLI
temps resources set \
  --project my-app \
  --environment production \
  --cpu 1000 \
  --memory 512 \
  --replicas 3

# View current allocation
temps resources get \
  --project my-app \
  --environment production

CPU Allocation

Understanding Millicores

CPU is measured in millicores (thousandths of a CPU core):

Millicores   CPU Cores    Use Case
100m         0.1 cores    Minimal workloads, static sites
250m         0.25 cores   Small APIs, low traffic
500m         0.5 cores    Standard web applications
1000m        1 core       Medium traffic applications
2000m        2 cores      High traffic, CPU-intensive tasks
4000m+       4+ cores     Very high traffic, data processing

temps resources set --project my-app --environment production --cpu 2000
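
The millicore arithmetic is simple enough to sketch; these helper names are illustrative, not part of the Temps CLI:

```python
def millicores_to_cores(millicores: int) -> float:
    """1000 millicores equal one CPU core."""
    return millicores / 1000

def cores_to_millicores(cores: float) -> int:
    """Inverse conversion, rounded to the nearest millicore."""
    return round(cores * 1000)
```

So --cpu 2000 above requests the equivalent of 2 full cores.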

Setting CPU Limits

Configure CPU allocation per environment:

Via Dashboard:

  1. Navigate to Project Settings → Resources
  2. Select environment (production/staging)
  3. Set CPU slider (100m - 16000m)
  4. Click Save

Via CLI:

temps resources set \
  --project my-app \
  --environment production \
  --cpu 2000  # 2 CPU cores

CPU Behavior

Requests vs Limits:

  • Request: Guaranteed CPU allocation
  • Limit: Maximum CPU that can be used

Temps sets both to the same value by default for predictable performance.

Throttling: If your app exceeds the CPU limit, it will be throttled (not killed), which may cause slower response times.

Monitoring: View real-time CPU usage in the dashboard or via CLI:

temps metrics cpu --project my-app --tail

CPU Recommendations

  • Static Sites (Next.js, React)
    Recommended: 250-500m
    Static sites have minimal CPU requirements since most work happens at build time.

  • API Servers (Express, FastAPI)
    Recommended: 500-1000m
    APIs need more CPU for request processing, especially with database queries.

  • Server-Side Rendering
    Recommended: 1000-2000m
    SSR applications (Next.js App Router) need more CPU for rendering on each request.

  • Background Jobs
    Recommended: 1000-4000m
    CPU-intensive tasks like image processing, video encoding, or data analysis need higher allocation.


Memory Allocation

Memory Limits

Memory is measured in megabytes (MB) or gigabytes (GB):

Memory   Use Case
128 MB   Minimal static sites, simple APIs
256 MB   Small web applications
512 MB   Standard web applications
1 GB     Medium applications with caching
2 GB     Large applications, in-memory processing
4 GB+    Very large datasets, heavy caching

Setting Memory Limits

Via Dashboard:

  1. Navigate to Project Settings → Resources
  2. Select environment
  3. Set Memory slider (128MB - 32GB)
  4. Click Save

Via CLI:

temps resources set \
  --project my-app \
  --environment production \
  --memory 1024  # 1 GB

Memory Behavior

Out of Memory (OOM): If your application exceeds its memory limit, it will be killed and automatically restarted. This causes brief downtime for that replica.
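
The safe operating zone can be expressed as a simple headroom check (illustrative only; the 85% figure matches the alert threshold used later on this page):

```python
def has_headroom(usage_mb: float, limit_mb: float, threshold: float = 0.85) -> bool:
    """True while usage stays below the alert threshold of the hard limit."""
    return usage_mb <= limit_mb * threshold
```

At a 512 MB limit, 400 MB of usage still has headroom; 500 MB does not, and a spike from there risks an OOM kill.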

Memory Leaks: Monitor memory usage over time to detect leaks:

temps metrics memory \
  --project my-app \
  --period 24h

Swap: Temps does not enable swap by default. Memory limits are hard limits.

Memory Recommendations

  • Next.js Static Export
    Recommended: 256-512 MB
    Pre-rendered pages have minimal memory needs.

  • Next.js SSR/ISR
    Recommended: 512 MB - 1 GB
    Server-side rendering and API routes need more memory.

  • Node.js API (Express)
    Recommended: 512 MB - 1 GB
    Depends on request complexity and middleware usage.

  • Python/Django
    Recommended: 512 MB - 2 GB
    Python applications typically use more memory than Node.js.

  • With Redis/Caching
    Recommended: +256-512 MB
    Add extra memory if using in-memory caching.


Horizontal Scaling (Replicas)

Understanding Replicas

Replicas are multiple instances of your application running simultaneously. Incoming traffic is load-balanced across all replicas.

Benefits:

  • High Availability: If one replica fails, others continue serving traffic
  • Load Distribution: Requests spread across multiple instances
  • Zero-Downtime Deploys: Old replicas stay running while new ones start
  • Increased Throughput: Handle more concurrent requests

When to Scale Horizontally:

  • High traffic volume
  • Need for redundancy
  • CPU/memory usage is high but per-request processing is efficient
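
Conceptually, the load balancer hands each incoming request to the next replica in turn; a minimal round-robin sketch (not Temps internals):

```python
from itertools import cycle

def assign_round_robin(replicas: list[str], requests: list[str]) -> dict[str, str]:
    """Map each request to the next replica in rotation."""
    rotation = cycle(replicas)
    return {request: next(rotation) for request in requests}
```

With two replicas, three requests land on the first, second, and first replica again, which is why throughput grows roughly linearly with replica count for stateless apps.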

Scale replicas

# Set replica count
temps resources set \
  --project my-app \
  --environment production \
  --replicas 5

Replica Configuration

Configure replicas

# Fixed replica count
temps resources set --replicas 3

# View current replicas
temps resources get | grep replicas

# Scale up/down
temps scale --replicas 5

Replica Best Practices

  • Minimum 2 Replicas for Production
    Always run at least 2 replicas in production for high availability. If one fails, the other continues serving traffic.

  • Odd Numbers for Quorum
    If using consensus algorithms (Raft, etc.), use odd replica counts (3, 5, 7) to avoid split-brain scenarios.

  • Stateless Applications
    Replicas work best with stateless applications. Use external services (Redis, PostgreSQL) for state management instead of in-memory state.

  • Session Affinity
    If you need sticky sessions, enable session affinity:

    temps lb set --session-affinity cookie --project my-app
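
The quorum rule is worth spelling out: a majority is floor(n/2) + 1, so moving from 3 to 4 replicas buys no extra failure tolerance (an illustrative sketch, not a Temps feature):

```python
def quorum_size(replicas: int) -> int:
    """Smallest majority of a replica set."""
    return replicas // 2 + 1

def failures_tolerated(replicas: int) -> int:
    """How many replicas can fail while a majority survives."""
    return replicas - quorum_size(replicas)
```

failures_tolerated is 1 for both 3 and 4 replicas, but 2 for 5, which is why consensus clusters jump between odd sizes.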
    

Resource Limits Per Environment

Configure different resource allocations for each environment:

Typical Setup

Production:

  • Higher CPU and memory for performance
  • Multiple replicas for high availability

Staging:

  • Lower resources to save costs
  • Single replica (sufficient for testing)

temps.json

{
  "resources": {
    "production": {
      "cpu": 2000,
      "memory": 2048,
      "replicas": 3
    },
    "staging": {
      "cpu": 500,
      "memory": 512,
      "replicas": 1
    }
  }
}

Monitoring Resource Usage

Real-Time Metrics

View current resource consumption:

Resource monitoring

# Real-time CPU and memory
temps metrics live --project my-app --environment production

# Historical metrics (last 24 hours)
temps metrics history --project my-app --period 24h

# Specific metric
temps metrics cpu --project my-app --tail
temps metrics memory --project my-app --tail

Dashboard Metrics

The Temps dashboard shows:

  • Current CPU %: Real-time CPU usage per replica
  • Current Memory %: Real-time memory usage per replica
  • Replica Health: Status of each replica (healthy/unhealthy)
  • Request Rate: Requests per second across all replicas
  • Response Time: Response-time percentiles (p50, p95, p99)
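
A p95 of 200 ms means 95% of requests completed within 200 ms. A nearest-rank sketch of the computation (illustrative, not necessarily the dashboard's exact method):

```python
def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile for integer p in 1..100."""
    ranked = sorted(samples)
    rank = -(-p * len(ranked) // 100)  # integer ceil of p * n / 100
    return ranked[rank - 1]

latencies_ms = list(range(1, 101))  # 1 ms .. 100 ms, one sample each
```

For this uniform sample, p50 is 50 ms and p99 is 99 ms; on real traffic the tail percentiles are usually far above the median, which is why p95/p99 matter more than averages.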

Alerts

Set up alerts for high resource usage:

Configure alerts

# Alert when CPU > 90% for 5 minutes
temps alert create \
  --metric cpu \
  --threshold 90 \
  --duration 5m \
  --action email \
  --recipient team@company.com

# Alert when memory > 85%
temps alert create \
  --metric memory \
  --threshold 85 \
  --duration 2m \
  --action webhook \
  --url https://hooks.slack.com/services/xxx

Cost Optimization

Right-Sizing Resources

  • Start Small
    Begin with minimal resources (500m CPU, 512 MB memory) and scale up based on actual usage. Over-provisioning wastes infrastructure capacity.

  • Monitor for 48 Hours
    After deployment, monitor resource usage for 48 hours to identify patterns (peak hours, steady state).

  • Adjust Incrementally
    Increase resources by 25-50% if you're consistently hitting limits. Don't jump from 512 MB to 4 GB; try 768 MB first.
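
The incremental steps are just multiplication; a tiny helper to make the rule concrete (illustrative only, not a Temps command):

```python
def next_limit(current: int, growth: float = 0.5) -> int:
    """Grow a limit by a fraction; 25-50% per step is the suggested range."""
    return int(current * (1 + growth))
```

next_limit(512) gives the 768 MB step suggested above; next_limit(512, 0.25) gives a gentler 640 MB.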

Staging Environment Savings

Reduce Staging Resources:

  • 50% of production CPU
  • 50% of production memory
  • 1 replica

Staging is for testing, not production traffic. You don't need the same resources.
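
Derived mechanically, the staging block is a pure function of the production block (a sketch of the rule of thumb; Temps has no such built-in derivation as far as this page shows):

```python
def staging_from_production(prod: dict) -> dict:
    """Apply the rule of thumb: half the CPU and memory, one replica."""
    return {
        "cpu": prod["cpu"] // 2,
        "memory": prod["memory"] // 2,
        "replicas": 1,
    }
```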

Optimized staging

{
  "resources": {
    "production": {
      "cpu": 2000,
      "memory": 2048,
      "replicas": 3
    },
    "staging": {
      "cpu": 1000,    // 50% of prod
      "memory": 1024, // 50% of prod
      "replicas": 1   // Single instance
    }
  }
}

Troubleshooting

  • High CPU Usage (>80%)

    Symptoms: Slow response times, request timeouts

    Solutions:

      • Scale Up CPU: Increase CPU allocation per replica
      • Scale Out: Add more replicas to distribute load
      • Optimize Code: Profile your application to find CPU-intensive operations
      • Enable Caching: Cache expensive computations or database queries

    # Quick fix: Add more replicas
    temps scale --replicas 5 --project my-app

  • High Memory Usage (>85%)

    Symptoms: Frequent OOM kills, application restarts

    Solutions:

      • Scale Up Memory: Increase memory allocation
      • Fix Memory Leaks: Use memory profiling tools (Node.js: --inspect, Python: memory_profiler)
      • Reduce In-Memory Caching: Move caching to Redis
      • Optimize Data Structures: Use streams instead of loading entire datasets into memory

    # Quick fix: Double memory
    temps resources set --memory 2048 --project my-app

  • Uneven Load Distribution

    Symptoms: Some replicas at 90% CPU while others are at 20%

    Solutions:

      • Check Session Affinity: Disable sticky sessions if not needed
      • Verify Load Balancer: Ensure round-robin or least-connections algorithm
      • Review Long Requests: Long-running requests can "block" a replica

    # Verify load balancer config
    temps lb get --project my-app

Comparison with Other Platforms

Feature          Temps                             Vercel               Netlify              Railway
CPU Control      ✅ Configurable (100m - 16000m)   ❌ Fixed per plan    ❌ Fixed per plan    ✅ Configurable
Memory Control   ✅ Configurable (128 MB - 32 GB)  ❌ Fixed (1 GB)      ❌ Fixed (1 GB)      ✅ Configurable
Manual Scaling   ✅ Instant                        ⚠️ Plan upgrade      ⚠️ Plan upgrade      ✅ Instant
Per-Environment  ✅ Yes                            ❌ No                ❌ No                ✅ Yes
Replica Control  ✅ 1-100+                         ❌ Automatic         ❌ Automatic         ✅ 1-20
Cost             ❌ Your infrastructure            ✅ $20-$500+/month   ✅ $19-$500+/month   ✅ Pay per resource

Best Practices Summary

  1. Start with 500m CPU, 512 MB memory, 2 replicas for production
  2. Monitor for 48 hours before adjusting resources
  3. Right-size staging to 50% of production resources
  4. Set up alerts for >80% CPU or >85% memory usage
  5. Profile your application to identify resource bottlenecks
  6. Use external services (Redis, PostgreSQL) instead of in-memory state
  7. Enable health checks to detect unhealthy replicas
  8. Review metrics weekly to optimize resource allocation
  9. Document your resource decisions for future reference
