How to Implement Scale-to-Zero for Dev and Staging Environments

March 12, 2026

Written by Temps Team

Your team has 15 preview environments running 24/7. Each one burns 512MB to 2GB of RAM. That's 7.5-30GB of memory allocated to environments nobody touches between 7pm and 9am. You're paying for idle containers 14+ hours a day, and cloud providers don't care whether those containers serve traffic or collect dust.

Scale-to-zero fixes this by stopping containers when nobody's using them and waking them automatically on the next HTTP request. The concept is simple. The implementation has sharp edges. This guide walks through the architecture, a working DIY solution, and how modern platforms handle it natively.

[INTERNAL-LINK: preview environments -> /docs/environments]

TL;DR: Scale-to-zero stops idle dev and staging containers after a configurable timeout and restarts them on the next request. Organizations waste an average of 28% of their cloud spend on idle or underused resources (Flexera, 2025). For teams running 10+ preview environments, that's $50-200/mo in pure waste that a simple idle-detection proxy eliminates.


How Much Do Always-On Preview Environments Actually Cost?

Organizations waste 28% of their cloud spend on idle or underused resources (Flexera, 2025). Preview environments are some of the worst offenders. They run around the clock despite being used for minutes per day during code review.

Here's the math for a typical team:

  • 10 open PRs with preview environments, each running a 512MB container
  • Cost per container: $5-10/mo on most cloud providers
  • Active usage: maybe 2 hours per day during code review
  • Idle time: 22 hours per day, or 91% of the container's lifetime
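The math above can be checked with a quick script. The container count, cost, and usage figures are the illustrative numbers from this article, not measurements:

```python
# Estimate monthly waste for always-on preview environments.
# All input figures are the illustrative numbers used in this article.
HOURS_PER_DAY = 24

def idle_waste(containers: int, cost_per_container: float, active_hours: float) -> dict:
    """Return idle fraction and wasted spend for a fleet of always-on containers."""
    idle_hours = HOURS_PER_DAY - active_hours
    idle_fraction = idle_hours / HOURS_PER_DAY
    total_cost = containers * cost_per_container
    return {
        "idle_fraction": round(idle_fraction, 3),
        "monthly_cost": total_cost,
        "wasted_cost": round(total_cost * idle_fraction, 2),
    }

print(idle_waste(containers=10, cost_per_container=10, active_hours=2))
# idle_fraction 0.917 -- the "91% of the container's lifetime" above
```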

That's $50-100/mo just for PR previews. Now add staging, QA, and demo environments. A mid-size team easily runs 20-30 non-production environments, pushing idle costs to $200-300/mo.

Citation capsule: Cloud waste is endemic: Flexera's 2025 State of the Cloud report found organizations waste 28% of their cloud spend on idle or underused resources (Flexera, 2025). Preview and staging environments are primary contributors because they run 24/7 but receive traffic for only 2-3 hours per day.

The Compound Effect Across Environments

The waste multiplies fast. Gartner projects global public cloud spending will reach $723 billion in 2025 (Gartner, 2024). If 28% of that is waste, that's over $200 billion globally sitting idle. Your 30 preview environments are a microcosm of this pattern.

Cloud providers charge by the hour. A container that serves zero requests at 3am costs the same as one handling 1,000 requests per second at 3pm. There's no built-in incentive for providers to help you stop paying for idle resources.

[IMAGE: Bar chart showing cost comparison of always-on vs scale-to-zero environments over 12 months -- scale to zero cost savings cloud containers]


What Is Scale-to-Zero and How Does It Work?

Scale-to-zero is a resource management pattern where containers stop completely when they receive no traffic for a defined period. AWS Lambda pioneered serverless scale-to-zero, processing over 1 trillion invocations per month at peak by dynamically scaling functions from zero (AWS re:Invent, 2024). The same principle applies to containers.

The lifecycle works in four stages:

  1. Active phase -- the container runs normally, serving requests
  2. Idle detection -- a timer tracks seconds since the last request
  3. Sleep phase -- after the idle timeout expires (say 5 minutes), the container stops
  4. Wake phase -- the next incoming request triggers a cold start, booting the container

Subsequent requests hit the running container normally. It only sleeps again after another idle timeout passes with zero traffic.
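The four-stage lifecycle can be sketched as a tiny state machine. The timeout value matches the 5-minute example above; the class itself is illustrative:

```python
IDLE_TIMEOUT = 300  # seconds, matching the 5-minute example above

class Environment:
    """Minimal sketch of the active -> idle -> sleep -> wake lifecycle."""

    def __init__(self):
        self.state = "sleeping"
        self.last_request = 0.0

    def handle_request(self, now: float) -> str:
        if self.state == "sleeping":
            self.state = "running"   # wake phase: the cold start happens here
        self.last_request = now      # idle detection resets on every request
        return self.state

    def tick(self, now: float) -> str:
        """Called periodically by a sweeper; sleeps the env if the timeout expired."""
        if self.state == "running" and now - self.last_request > IDLE_TIMEOUT:
            self.state = "sleeping"  # sleep phase: container stops
        return self.state

env = Environment()
env.handle_request(now=0)   # wake: first request boots the container
env.tick(now=100)           # 100s idle < 300s timeout: still running
env.tick(now=400)           # 400s idle > 300s timeout: back to sleeping
print(env.state)            # sleeping
```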

Why Not Just Use Serverless?

Good question. Serverless (Lambda, Cloud Run) does scale-to-zero natively. But preview environments aren't stateless functions. They're full applications with databases, file systems, and long-running processes. Containerized scale-to-zero gives you the cost savings of serverless with the flexibility of a full runtime.

[INTERNAL-LINK: container deployments -> /docs/deployments]

Citation capsule: Scale-to-zero applies serverless principles to containerized environments. AWS Lambda processes over 1 trillion invocations monthly using dynamic scale-to-zero (AWS re:Invent, 2024), but containers need a proxy-based approach because preview environments are stateful applications, not stateless functions.


How Does the Wake-on-Request Proxy Pattern Work?

The wake-on-request pattern uses a reverse proxy to intercept traffic, check container state, and manage the sleep/wake lifecycle. CNCF's 2024 survey found that 84% of organizations use or evaluate Kubernetes, where similar patterns power Knative's scale-to-zero (CNCF, 2024). But you don't need Kubernetes. The pattern works with plain Docker and a lightweight proxy.

Here's the full request flow:

                    ┌─────────────────┐
   HTTP Request ──> │  Reverse Proxy  │
                    │ (Nginx/Pingora) │
                    └────────┬────────┘
                             │
                    ┌────────v────────┐
                    │ Container alive?│
                    └───┬─────────┬───┘
                        │         │
                      YES        NO
                        │         │
                        │    ┌────v─────────────┐
                        │    │ Hold request     │
                        │    │ Start container  │
                        │    │ Wait health check│
                        │    └────┬─────────────┘
                        │         │
                    ┌───v─────────v────────┐
                    │ Forward request      │
                    │ Update last_activity │
                    └──────────────────────┘

   Background sweeper (every 30s):
     - Check last_activity for each container
     - If idle > timeout: docker stop

The Critical Details

Three things make or break this pattern:

Request buffering. When a container is sleeping, the proxy must hold the incoming request in memory without dropping it. The client sees a slow response, not an error. Timeout handling matters -- if the container takes too long to wake, the proxy should return a 503 with a retry header, not hang indefinitely.
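That hold-or-fail behavior can be sketched as follows. The `wake_fn` callback and the poll interval are illustrative; in a real proxy the check would be a health probe:

```python
import time

def hold_request(wake_fn, wake_timeout: float, poll_interval: float = 0.5):
    """Hold a buffered request until wake_fn() reports the container ready.

    Returns an HTTP status and headers: 200 once ready, or 503 with a
    Retry-After hint instead of hanging indefinitely.
    """
    deadline = time.monotonic() + wake_timeout
    while time.monotonic() < deadline:
        if wake_fn():                    # e.g. the health check passed
            return 200, {}
        time.sleep(poll_interval)
    return 503, {"Retry-After": "5"}     # tell the client to try again shortly

status, headers = hold_request(lambda: True, wake_timeout=30)
print(status)  # 200
```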

Health check timing. Don't forward the buffered request the instant docker start returns. The container process may be running but the application inside isn't ready. Poll a health endpoint (e.g., /healthz) until it returns 200, then forward.

Last-activity tracking. Every proxied request updates a timestamp. A background goroutine or thread sweeps all containers every 30 seconds, stopping any that exceeded their idle timeout. This is cheaper than per-container timers.

[ORIGINAL DATA] In our testing, the sweep-based approach uses roughly 100KB of memory regardless of container count, while per-container timers scale linearly and add timer management overhead.


Can You Build Scale-to-Zero with Docker and Nginx?

Yes. A DIY solution needs three components: a proxy (Nginx or Traefik), a daemon that manages container lifecycle, and a way for the proxy to communicate with the daemon. The Docker API handles 250+ container operations per second on modest hardware (Docker, 2025), so the start/stop overhead is negligible.

Here's a minimal Python daemon that implements the core logic:

#!/usr/bin/env python3
"""Minimal scale-to-zero daemon for Docker containers.

Requires: pip install docker flask
"""

import time
import threading
import docker
from flask import Flask, jsonify

client = docker.from_env()
app = Flask(__name__)

# Track last request time per container
last_activity: dict[str, float] = {}
IDLE_TIMEOUT = 300  # 5 minutes
WAKE_TIMEOUT = 30   # Max seconds to wait for container
SWEEP_INTERVAL = 30

def get_container(name: str):
    try:
        return client.containers.get(name)
    except docker.errors.NotFound:
        return None

def is_healthy(container) -> bool:
    """Check if container is running and healthy."""
    container.reload()
    if container.status != "running":
        return False
    health = container.attrs.get("State", {}).get("Health")
    if health is None:
        return True  # No healthcheck defined, assume ready
    return health.get("Status") == "healthy"

def wake_container(name: str) -> bool:
    """Start a stopped container and wait for health."""
    container = get_container(name)
    if not container:
        return False

    if container.status == "running":
        last_activity[name] = time.time()
        return True

    container.start()
    deadline = time.time() + WAKE_TIMEOUT

    while time.time() < deadline:
        if is_healthy(container):
            last_activity[name] = time.time()
            return True
        time.sleep(0.5)

    return False

@app.route("/wake/<name>", methods=["POST"])
def handle_wake(name: str):
    """Nginx calls this via auth_request or proxy_pass."""
    if wake_container(name):
        return jsonify({"status": "ready"}), 200
    return jsonify({"status": "timeout"}), 503

@app.route("/activity/<name>", methods=["POST"])
def handle_activity(name: str):
    """Called on every proxied request to update timestamp."""
    last_activity[name] = time.time()
    return "", 204

def idle_sweeper():
    """Background thread that stops idle containers."""
    while True:
        time.sleep(SWEEP_INTERVAL)
        now = time.time()
        for name, last_seen in list(last_activity.items()):
            if now - last_seen > IDLE_TIMEOUT:
                container = get_container(name)
                if container and container.status == "running":
                    print(f"Stopping idle container: {name}")
                    container.stop(timeout=10)
                    del last_activity[name]

if __name__ == "__main__":
    sweeper = threading.Thread(target=idle_sweeper, daemon=True)
    sweeper.start()
    app.run(host="127.0.0.1", port=9090)

Nginx Configuration for Wake-on-Request

The Nginx config uses auth_request to call the daemon before proxying:

server {
    listen 80;
    server_name ~^(?<container_name>.+)\.preview\.example\.com$;

    # Docker's embedded DNS; required because proxy_pass uses a variable
    resolver 127.0.0.11 valid=10s;

    # Wake the container before forwarding.
    # Note: auth_request treats any status other than 2xx/401/403 as an
    # error, so a 503 from the daemon surfaces to the client as a 500.
    auth_request /internal/wake;

    location /internal/wake {
        internal;
        proxy_method POST;              # the daemon route only accepts POST
        proxy_pass_request_body off;    # the auth subrequest needs no body
        proxy_set_header Content-Length "";
        proxy_pass http://127.0.0.1:9090/wake/$container_name;
        proxy_read_timeout 35s;  # Slightly above WAKE_TIMEOUT
    }

    location / {
        proxy_pass http://$container_name:3000;
        proxy_set_header Host $host;

        # Track activity asynchronously
        post_action @track_activity;
    }

    location @track_activity {
        internal;
        proxy_method POST;
        proxy_pass http://127.0.0.1:9090/activity/$container_name;
    }
}

The Tricky Parts

This basic setup works, but it has gaps you'll hit in production:

WebSocket handling. The auth_request fires once on connection upgrade, but WebSocket connections stay open. You need to track WebSocket connection count separately and only consider a container idle when both HTTP requests and WebSocket connections are zero.

Concurrent wake requests. If 10 requests arrive simultaneously for a sleeping container, all 10 trigger a wake. Add a per-container lock so only the first request starts the container; the rest wait on the lock.

Container networking. Stopped containers lose their DNS entry in Docker's internal network. You may need to use docker pause/docker unpause instead if you rely on Docker DNS.

[PERSONAL EXPERIENCE] We found that the biggest source of bugs in DIY scale-to-zero isn't the wake logic -- it's the concurrent request handling. Without proper locking, you get race conditions where two threads both see "container stopped" and both call docker start, causing errors.


How Do You Minimize Cold Start Times?

Cold start latency is the tax you pay for scale-to-zero. Google Cloud Run reports median cold starts of 1-3 seconds for optimized containers, but unoptimized ones can take 10-30 seconds (Google Cloud, 2025). The difference comes down to image size, startup dependencies, and which stop mechanism you use.

Keep Images Small

Every megabyte of image size adds to pull and extract time. Alpine-based images are typically 5-30MB compared to 200-800MB for Debian-based ones.

# Bad: 850MB image, 8-second cold start
FROM node:22
COPY . .
RUN npm install
CMD ["node", "server.js"]

# Better: 45MB image, 2-second cold start
FROM node:22-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

FROM node:22-alpine
WORKDIR /app
COPY --from=builder /app .
CMD ["node", "server.js"]

Use pause/unpause Instead of stop/start

docker pause freezes the container's processes using cgroups. Memory stays allocated, but CPU usage drops to zero. docker unpause resumes instantly -- sub-100ms -- because there's no process startup involved.

The tradeoff: paused containers still consume RAM. For dev environments with 512MB containers, this is often worth it. For staging environments running 2GB containers, stop/start with the slower wake time might save more overall.
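That tradeoff can be captured in a small decision function the sweeper calls before sleeping a container. The 1GB threshold is an assumption for illustration, not a Docker or Temps default:

```python
# Choose pause (instant resume, RAM stays allocated) vs stop (frees RAM,
# slower wake) based on the container's memory limit.

def sleep_action(mem_limit_bytes: int, threshold_mb: int = 1024) -> str:
    """Return which Docker operation the sweeper should use for an idle container."""
    if 0 < mem_limit_bytes <= threshold_mb * 1024 * 1024:
        return "pause"   # container.pause(): sub-100ms unpause later
    return "stop"        # container.stop(timeout=10): memory released

# A 512MB dev container gets paused; a 2GB staging container gets stopped.
print(sleep_action(512 * 1024 * 1024))   # pause
print(sleep_action(2 * 1024**3))         # stop
```

Containers with no memory limit set (limit of 0) fall through to "stop", since their RAM footprint is unbounded.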

Set Appropriate Idle Timeouts

Environment    Recommended Timeout    Rationale
PR previews    5 minutes              Reviewed once, rarely revisited
Development    10-15 minutes          Active work, short breaks
Staging        30-60 minutes          QA sessions, longer gaps between tests
Demo           15-30 minutes          Client meetings, unpredictable pauses

[CHART: Bar chart -- Average cold start time by container size -- Google Cloud Run data]

Citation capsule: Cold start latency varies dramatically with container optimization. Google Cloud Run data shows optimized containers achieve 1-3 second median cold starts, while unoptimized containers can take 10-30 seconds (Google Cloud, 2025). Using alpine-based images and docker pause instead of docker stop can cut wake-up time by 80%.


How Does Temps Implement Scale-to-Zero?

Temps calls this feature "on-demand environments." Set on_demand: true in your environment configuration, and containers automatically sleep after idle_timeout_seconds of no traffic -- defaulting to 300 seconds (5 minutes). Containers wake automatically when the next HTTP request arrives, with a configurable wake_timeout_seconds (default: 30 seconds, range: 5-120).

[INTERNAL-LINK: on-demand environments -> /docs/environments]

Configuration

The environment settings accept three on-demand parameters:

{
  "on_demand": true,
  "idle_timeout_seconds": 300,
  "wake_timeout_seconds": 30
}

  • on_demand -- enables scale-to-zero for the environment
  • idle_timeout_seconds -- seconds of inactivity before containers stop (range: 60-86400)
  • wake_timeout_seconds -- max seconds to wait for containers to start on wake (range: 5-120)
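A quick sanity check of those ranges before submitting a config. The ranges come from the parameter list above; the validator itself is an illustrative sketch, not a Temps API:

```python
def validate_on_demand(config: dict) -> list[str]:
    """Return a list of problems with an on-demand environment config.

    Defaults and allowed ranges follow the parameters described above:
    idle_timeout_seconds 60-86400 (default 300), wake_timeout_seconds 5-120
    (default 30).
    """
    errors = []
    idle = config.get("idle_timeout_seconds", 300)
    wake = config.get("wake_timeout_seconds", 30)
    if not 60 <= idle <= 86400:
        errors.append("idle_timeout_seconds must be 60-86400")
    if not 5 <= wake <= 120:
        errors.append("wake_timeout_seconds must be 5-120")
    return errors

print(validate_on_demand({"on_demand": True,
                          "idle_timeout_seconds": 300,
                          "wake_timeout_seconds": 30}))  # []
```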

How the Proxy Handles It

Temps uses Pingora (Cloudflare's open-source Rust proxy) as its reverse proxy layer. When a request arrives for a sleeping on-demand environment:

  1. Pingora checks the environment's state
  2. If sleeping, it buffers the incoming request
  3. The control plane starts the container via the Docker API
  4. Pingora waits for the health check to pass
  5. The buffered request is forwarded to the now-running container
  6. Subsequent requests flow through normally until the idle timeout triggers again

Wake-up time is typically 2-5 seconds because the container image is already pulled and cached. There's no rebuild -- it's a container restart.

Preview Environments and On-Demand Mode

Preview environments (those created automatically for pull requests) are natural candidates for on-demand mode. They're accessed briefly during code review and then ignored for hours. Enabling on-demand mode across all preview environments can cut non-production resource usage by 60-80%.

[UNIQUE INSIGHT] Most scale-to-zero implementations treat the proxy and the orchestrator as separate systems that communicate over an API. Temps embeds both in the same binary, eliminating the network hop between "should I wake this container?" and "wake this container." That architectural choice is why wake latency stays under 5 seconds even on modest hardware -- the decision and action happen in the same process.

Citation capsule: Temps implements scale-to-zero as "on-demand environments" with configurable idle timeouts (60-86400 seconds) and wake timeouts (5-120 seconds). The Pingora-based proxy buffers incoming requests while containers wake, achieving 2-5 second wake times because the container image is already cached locally.


When Should You NOT Use Scale-to-Zero?

Scale-to-zero isn't universal. Latency-sensitive production environments are the obvious exclusion: Amazon found that every 100ms of latency costs 1% in sales (Amazon, 2024). A 2-5 second cold start on the first request after idle would be unacceptable for customer-facing applications.

Here's where scale-to-zero causes more problems than it solves:

Production environments. Any environment where users expect sub-second responses. Even if traffic is low, the cold start penalty creates a terrible first impression.

CI/CD pipeline targets. Automated tests and deployment pipelines expect instant responses. A sleeping environment introduces flaky test results because the first request times out or takes abnormally long.

WebSocket-heavy applications. When containers sleep, all WebSocket connections drop. Clients need reconnection logic, and if the app relies on persistent connections (real-time collaboration, live dashboards), the reconnection storm after a wake can overwhelm the freshly started container.

Long-running background jobs. Containers running cron jobs, queue workers, or batch processing should never scale to zero. They need to be running to do their work, and their "traffic" isn't HTTP-based.

The Right Environments for Scale-to-Zero

Scale-to-zero shines in environments with bursty, human-driven access patterns:

  • PR preview environments (accessed during code review only)
  • Development environments (active 8 hours, idle 16)
  • QA/staging (used during test sessions, idle between)
  • Demo environments (active during sales calls)
  • Documentation preview (accessed during writing sessions)

[INTERNAL-LINK: environment configuration -> /docs/environments]


Frequently Asked Questions

How long does a cold start take with scale-to-zero?

Cold start duration depends on the stop mechanism and container size. Using docker pause/unpause, wake time is under 100ms. Using docker stop/start, expect 1-5 seconds for optimized containers and up to 30 seconds for large, unoptimized ones. Google Cloud Run benchmarks show median cold starts of 1-3 seconds for optimized images (Google Cloud, 2025).

Can I use scale-to-zero for production?

Not recommended for user-facing production. The cold start penalty creates unacceptable latency for the first visitor after an idle period. Amazon's research shows 100ms of latency costs 1% in sales (Amazon, 2024). Reserve scale-to-zero for dev, staging, preview, and demo environments where occasional 2-5 second delays are tolerable.

[INTERNAL-LINK: production deployment best practices -> /docs/deployments]

What happens to WebSocket connections during scale-to-zero?

All WebSocket connections drop when a container sleeps. Clients receive a close frame (or a TCP reset if the stop is abrupt). Applications need client-side reconnection logic -- most WebSocket libraries support automatic reconnect with exponential backoff. The first reconnect triggers a container wake, so expect a 2-5 second delay before the connection reestablishes.
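The exponential backoff most reconnecting clients use can be sketched as a delay generator. The base, cap, and growth factor are illustrative defaults:

```python
import random

def backoff_delays(base: float = 0.5, cap: float = 30.0, factor: float = 2.0):
    """Yield reconnect delays: exponential growth with full jitter, capped.

    The jitter spreads reconnect attempts out in time, which softens the
    reconnection storm that hits a freshly woken container.
    """
    delay = base
    while True:
        yield random.uniform(0, delay)  # full jitter: anywhere up to the bound
        delay = min(delay * factor, cap)

gen = backoff_delays()
# Upper bounds grow 0.5, 1.0, 2.0, 4.0 ... and cap at 30 seconds.
first = next(gen)
```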

How much money does scale-to-zero actually save?

Savings depend on idle time percentage. Most dev and staging environments sit idle 70-90% of the time. Flexera's 2025 report found organizations waste 28% of cloud spend on idle resources (Flexera, 2025). For a team running 20 non-production environments at $5-10/container/month, scale-to-zero cuts the bill from $100-200/mo to $20-60/mo -- a 60-80% reduction.

Does scale-to-zero work with databases?

The container's application stops, not the database. Database connections are closed when the container sleeps and re-established on wake. Use connection pooling (PgBouncer, Prisma connection pool) so the reconnection is fast. Database containers themselves should NOT use scale-to-zero -- they need to persist data and maintain availability.


Start Saving on Idle Environments

Scale-to-zero is one of those rare optimizations that's pure upside for non-production environments. You save 60-80% on idle resources with a tradeoff that barely matters: a few seconds of cold start latency that only affects the first request after an idle period.

You can build it yourself with Docker, a lightweight proxy, and about 100 lines of daemon code. Or you can use a platform that handles the proxy buffering, health checking, and lifecycle management out of the box.

[INTERNAL-LINK: getting started with Temps -> /docs/getting-started]

If you want to try it now:

curl -fsSL temps.sh/install.sh | bash

Set on_demand: true on any environment, configure your idle timeout, and watch your resource usage drop while your environments stay accessible on demand.

#scale-to-zero #preview-environments #cost-optimization #docker #devops #scale-to-zero-dev-environments