February 12, 2026
Written by Temps Team
Last updated March 4, 2026
Unplanned downtime costs Global 2000 companies $400 billion per year — roughly $200 million per company. A significant portion of that downtime happens during deployments: the brief window where the old version shuts down and the new version boots up.
Zero-downtime deployment eliminates that window entirely. The old version keeps running until the new one is verified and ready. No dropped requests, no broken WebSocket connections, no users seeing error pages.
This guide breaks down exactly how zero-downtime deployment works, why it matters at every scale, and how to implement it without managing blue-green scripts or complex orchestration.
TL;DR: Zero-downtime deployment keeps the old application version running until the new one passes health checks. Rolling updates, health check gating, and connection draining work together to ensure users never see errors during a deploy. Elite engineering teams deploy multiple times per day with a 5% change failure rate.
Zero-downtime deployment is a release strategy that ensures your application remains fully available throughout every deploy. According to ITIC, 91% of mid-size and large enterprises report that a single hour of downtime costs over $300,000. Even brief deployment gaps — five seconds repeated ten times a day — compound into real revenue loss and eroded user trust.
The 2024 DORA report found that elite-performing teams deploy on-demand (multiple times per day), recover from failures in under one hour, and maintain a change failure rate of just 5%. Low performers, by contrast, deploy weekly to monthly and take one to six months to recover, according to DORA's research. The gap between these two groups comes down to deployment automation — and zero-downtime deployments are the foundation.
Most basic deployment workflows follow this pattern:

1. Stop the old version.
2. Start the new version.

The gap between steps 1 and 2 is downtime. During that window:

- In-flight requests are dropped.
- WebSocket connections break.
- Users see error pages or connection timeouts.
Zero-downtime deployment combines three techniques: rolling updates, health check gating, and connection draining. Together, they ensure the old version keeps serving traffic until the new version is verified and ready. No user ever hits a dead endpoint.
Rolling deployment upgrades one instance at a time while the others continue serving traffic:
Time 0: [v1] [v1] [v1] <- All instances running v1
Time 1: [v1] [v1] [v2...] <- One instance starts v2 (not yet ready)
Time 2: [v1] [v1] [v2] <- v2 passes health check, receives traffic
Time 3: [v1] [v2] [v2...] <- Next instance starts upgrading
Time 4: [v1] [v2] [v2] <- Second v2 ready
Time 5: [v2...] [v2] [v2] <- Final instance upgrading
Time 6: [v2] [v2] [v2] <- All instances on v2. Zero dropped requests.
At no point are zero instances available. Traffic always has somewhere to go.
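The timeline above can be sketched as a simple loop over instances. This is an illustrative sketch of the technique, not Temps's internals; `waitHealthy` is a hypothetical callback standing in for the platform's health-check gate:

```typescript
// One instance at a time: start v2, gate on health, only then retire v1.
type Instance = { id: number; version: string };

async function rollingUpdate(
  instances: Instance[],
  newVersion: string,
  waitHealthy: (inst: Instance) => Promise<boolean>,
): Promise<Instance[]> {
  const fleet = [...instances];
  for (let i = 0; i < fleet.length; i++) {
    const candidate: Instance = { id: fleet[i].id, version: newVersion };
    // The old instance keeps serving while the candidate boots and is probed.
    if (!(await waitHealthy(candidate))) {
      // Abort mid-rollout: remaining instances stay on the old version.
      throw new Error(`instance ${candidate.id} failed health check`);
    }
    // Swap only after the candidate is verified, so capacity never drops.
    fleet[i] = candidate;
  }
  return fleet;
}
```

Because the swap happens only after verification, a failed health check leaves the untouched instances on the old version, which is exactly the abort behavior a rolling strategy needs.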
A new container joins the load balancer only after it passes health checks:
Container starts -> Runs health check -> Passes -> Receives traffic
-> Fails -> Retry (up to timeout)
-> Roll back if persistent
This prevents traffic from reaching a container that's still initializing — loading config, warming caches, or establishing database connections.
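The gate itself is a polling loop. A minimal sketch, assuming a hypothetical `probe` function that returns the HTTP status of the health endpoint:

```typescript
// Poll the health endpoint until it returns 200, up to a retry limit.
async function waitUntilHealthy(
  probe: () => Promise<number>, // returns an HTTP status code
  { retries = 5, intervalMs = 1000 } = {},
): Promise<boolean> {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      if ((await probe()) === 200) return true; // ready: admit to load balancer
    } catch {
      // Connection refused while the app is still booting counts as a miss.
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false; // persistent failure: caller should roll back
}
```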
When removing an old container, the system doesn't kill it immediately. Instead:

1. The load balancer stops routing new requests to the old container.
2. In-flight requests are allowed to complete.
3. Once all connections drain (or the drain timeout expires), the container is removed.

This ensures that a user mid-checkout doesn't get an error because their server disappeared.
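On the application side, draining looks like a graceful-shutdown handler. A sketch in Node, not Temps's actual implementation; the 15-second drain window is an assumed value:

```typescript
import http, { type Server } from "node:http";

// Wait for in-flight connections to finish, or give up after drainTimeoutMs.
function drain(server: Server, drainTimeoutMs: number): Promise<"drained" | "timeout"> {
  return new Promise((resolve) => {
    // close() stops accepting new connections but lets active ones complete.
    server.close(() => resolve("drained"));
    // Hard deadline so one stuck connection can't block the deploy forever.
    setTimeout(() => resolve("timeout"), drainTimeoutMs).unref();
  });
}

const server = http.createServer((req, res) => res.end("ok"));

// Typical wiring: the orchestrator sends SIGTERM once traffic has shifted away.
process.on("SIGTERM", async () => {
  const result = await drain(server, 15_000); // assumed 15s drain window
  process.exit(result === "drained" ? 0 : 1);
});
```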
Setting up zero-downtime deployments typically requires configuring rolling update policies, health check endpoints, and drain timeouts across your container orchestrator. 82% of container users now run Kubernetes in production, but managing all that configuration is the hard part.
With a self-hosted platform like Temps, zero-downtime deployment is the default. No configuration needed:
git push origin main
# or
bunx @temps-sdk/cli deploy my-app -e production -y
Behind the scenes, every deploy follows this five-step pipeline:
1. Build the new container image.
2. Start the new container alongside the old one.
3. Poll the health check endpoint (default /) until a 200 response comes back.
4. Shift traffic to the new container.
5. Drain connections from the old container, then remove it.

Total user-visible downtime: zero.
| Setting | Default | Description |
|---|---|---|
| Path | / | HTTP endpoint to check |
| Interval | 5 seconds | Time between checks |
| Timeout | 3 seconds | Max time for a response |
| Healthy threshold | 2 | Consecutive successes needed |
| Unhealthy threshold | 3 | Consecutive failures before rollback |
| Start period | 30 seconds | Grace period for startup |
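The two threshold settings translate into a simple streak counter. A sketch of that logic, with the defaults of 2 and 3 taken from the table above:

```typescript
// A container is promoted after 2 consecutive passing checks and triggers
// rollback after 3 consecutive failures. A break in either streak resets
// the opposite counter.
type Verdict = "pending" | "healthy" | "unhealthy";

function makeHealthTracker(healthyThreshold = 2, unhealthyThreshold = 3) {
  let successes = 0;
  let failures = 0;
  return (checkPassed: boolean): Verdict => {
    if (checkPassed) {
      successes += 1;
      failures = 0; // a success breaks any failure streak
      if (successes >= healthyThreshold) return "healthy";
    } else {
      failures += 1;
      successes = 0; // a failure breaks any success streak
      if (failures >= unhealthyThreshold) return "unhealthy";
    }
    return "pending";
  };
}
```

Requiring consecutive results in both directions keeps one flaky probe from either admitting an unready container or rolling back a healthy one.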
Your health endpoint should verify real dependencies, not just return 200. According to New Relic's Observability Forecast, organizations with full-stack observability experience 71% fewer outages annually. Honest health checks are the first step toward that observability:
// app/api/health/route.ts (Next.js)
import { db } from "@/lib/database";
import { redis } from "@/lib/redis";
export async function GET() {
try {
await db.execute("SELECT 1");
await redis.ping();
return Response.json({
status: "healthy",
timestamp: new Date().toISOString(),
checks: { database: "ok", redis: "ok" },
});
  } catch (error) {
    // In TypeScript, a caught value is `unknown`, so narrow before reading .message
    const message = error instanceof Error ? error.message : String(error);
    return Response.json(
      { status: "unhealthy", error: message },
      { status: 503 },
    );
}
}
The new container won't receive traffic until the database and Redis connections are verified. Configure the health check path in your project settings.
Deployment failures are inevitable. The DORA report introduced a new metric — Deployment Rework Rate — specifically to track how often teams need to fix failed deployments. The difference between elite and low performers isn't that elite teams never fail. It's that they recover in under an hour while low performers take months.
Automatic rollback is the safety net that makes fast recovery possible.
The platform automatically rolls back in two cases: when the new container repeatedly fails its health checks, and when error rates spike after traffic has shifted:
Deploy v2 -> Health check fails 3x -> Automatic rollback
Result: v1 continues serving. Users never saw v2.
Deploy v2 -> Passes checks -> Error rate spikes 10x -> Automatic rollback
Result: v2 removed, v1 takes all traffic again.
In both cases, users experience zero downtime. The broken version never reaches them — or gets removed before it causes meaningful damage.
Sometimes you discover issues that automated checks miss — a visual bug, a wrong calculation, a feature that shouldn't have shipped:
# Roll back to the previous deployment
bunx @temps-sdk/cli deployments rollback -p my-app -e production
# Roll back to a specific deployment
bunx @temps-sdk/cli deployments rollback -p my-app --to 42
Rollback completes in seconds because the previous container image is cached locally. No rebuild required.
Database migrations are the trickiest part of zero-downtime deployment. Both GitHub's June 2025 outage and Cloudflare's November 2025 global outage were caused by database changes that cascaded into platform-wide failures. If your new code expects a column that doesn't exist yet — or your old code breaks when a column disappears — rolling deployment fails.
The safe approach splits every breaking database change into three deploys:
Phase 1: Expand (backward-compatible)
-- Add the new column without removing the old one
ALTER TABLE users ADD COLUMN full_name TEXT;
-- Backfill data
UPDATE users SET full_name = first_name || ' ' || last_name;
Deploy code that writes to both columns but reads from the new one.
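The dual-write step can be sketched as follows; `buildDualWrite` and the parameterized SQL are illustrative, not a Temps API:

```typescript
// Expand-phase sketch: write both the legacy name columns and the new
// full_name column, so v1 and v2 instances stay compatible mid-rollout.
interface UserUpdate {
  id: number;
  firstName: string;
  lastName: string;
}

function buildDualWrite(u: UserUpdate): { sql: string; params: (string | number)[] } {
  const fullName = `${u.firstName} ${u.lastName}`;
  return {
    // Old columns stay populated so v1 instances keep reading valid data.
    sql: "UPDATE users SET first_name = $1, last_name = $2, full_name = $3 WHERE id = $4",
    params: [u.firstName, u.lastName, fullName, u.id],
  };
}
```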
Phase 2: Migrate
Deploy code that only uses the new column. Both old and new application versions coexist safely because the old column still exists.
Phase 3: Contract (cleanup)
-- Safe to remove after all instances run the new version
ALTER TABLE users DROP COLUMN first_name;
ALTER TABLE users DROP COLUMN last_name;
During a rolling deployment, both v1 and v2 run simultaneously. The expand-and-contract pattern ensures both versions work with the same database schema at every step.
Choosing the right deployment strategy depends on your team's resources and risk tolerance. Here's how the four main approaches stack up:
| Strategy | Downtime | Complexity | Rollback Speed | Resource Cost |
|---|---|---|---|---|
| Stop-start | 5-60 seconds | None | Minutes (rebuild) | 1x |
| Rolling (default) | Zero | Low (automatic) | Seconds | 1.3x during deploy |
| Blue-green | Zero | Medium | Seconds | 2x always |
| Canary | Zero | High | Seconds | 1.1x during deploy |
Rolling deployment offers the best balance: zero downtime, automatic rollback in seconds, and a modest 1.3x resource overhead incurred only during the deploy itself. No duplicate infrastructure running 24/7.
Zero-downtime rolling deployments work across all major application types. Each framework has specific characteristics that affect the transition, but all benefit from health check gating and connection draining.
Server Components render on the new container once traffic shifts. Client-side React hydration handles the transition seamlessly. For a walkthrough, see our Next.js deployment guide.
Health checks verify the API responds before traffic shifts. In-flight requests complete on the old container via connection draining. If you're running FastAPI specifically, our FastAPI deployment tutorial covers the setup.
WebSocket connections to the old container are maintained during draining. New connections route to the new container. Most WebSocket libraries handle reconnection automatically:
const socket = io("wss://myapp.com", {
reconnection: true,
reconnectionDelay: 1000,
reconnectionAttempts: 5,
});
Long-running jobs on the old container get a grace period to complete. Configure the drain timeout based on your longest expected job through the deployment settings in the dashboard.
According to New Relic, organizations with full-stack observability experience 71% fewer annual outages and detect high-impact issues in a median of 37 minutes. Deployment monitoring is a critical piece of that observability.
Every deploy produces a detailed timeline:
14:30:00 Build started
14:30:45 Build completed (image: 142MB)
14:30:48 New container starting
14:31:02 Health check passed (attempt 3)
14:31:03 Traffic shifting to new container
14:31:03 Old container draining (12 active connections)
14:31:18 All connections drained
14:31:18 Old container removed
14:31:18 Deployment complete. Zero errors.
Don't assume a successful deploy means everything is fine. After each deployment, track your key metrics, error rates and response times at minimum, for at least 15 minutes.
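That watch can be as simple as comparing post-deploy error rates against the pre-deploy baseline. A sketch using the 10x spike multiplier from the automatic-rollback example above; the multiplier and window are assumptions, not platform defaults:

```typescript
// Flag a rollback when any observed post-deploy error rate exceeds
// spikeMultiplier times the pre-deploy baseline.
function shouldRollback(
  baselineErrorRate: number,
  observedErrorRates: number[],
  spikeMultiplier = 10,
): boolean {
  return observedErrorRates.some(
    (rate) => rate > baselineErrorRate * spikeMultiplier,
  );
}
```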
# Slack notifications for deployment events
bunx @temps-sdk/cli notifications add --type slack --name "Deploy Alerts" \
--webhook-url https://hooks.slack.com/... --channel "#deploys" -y
# Email notifications
bunx @temps-sdk/cli notifications add --type email --name "Deploy Email" \
--smtp-host smtp.gmail.com --smtp-port 587 \
--smtp-user user@gmail.com --smtp-pass apppassword \
--from alerts@example.com --to team@example.com -y
For a deeper look at infrastructure security — including the network layer that supports safe deployments — see our guide on securing your VPS with Tailscale.
According to the Uptime Institute, 80% of operators believe their most recent downtime event was preventable. Most zero-downtime failures come from a handful of common mistakes. Here's how to avoid them.
A health endpoint that always returns 200 defeats the purpose of health check gating. Always verify real dependencies:
// Bad: always returns healthy
app.get("/health", () => ({ status: "ok" }));
// Good: verifies actual readiness
app.get("/health", async () => {
await db.query("SELECT 1");
await cache.ping();
return { status: "ok" };
});
Never drop a column in the same deploy that stops using it. Always use the expand-and-contract pattern across multiple deploys. GitHub and Cloudflare both learned this lesson in 2025.
Small, frequent deploys are easier to roll back and less likely to cause cascading failures. The DORA data consistently shows that elite teams deploy multiple times per day — not once a week. If your deployment costs are a concern, self-hosting makes frequent deploys free.
A deploy that passes health checks can still have subtle issues: a visual regression, a slow database query, an edge case in a new feature. Watch metrics for 15 minutes after every deployment. Use preview environments to catch issues before production.
If you can't roll back in seconds, you don't have zero-downtime deployment — you have zero-downtime deployment with a single point of failure. Always keep the previous container image cached and test your rollback process regularly.
**What's the difference between zero-downtime and blue-green deployment?**

Blue-green deployment maintains two identical production environments and switches traffic between them. Zero-downtime deployment is the broader goal — blue-green is one strategy to achieve it. Rolling deployment achieves the same result with 1.3x resources instead of 2x, making it more cost-effective for most teams.

**Can you achieve zero-downtime deployments without Kubernetes?**

Yes. While 82% of container users run Kubernetes in production, you don't need to manage Kubernetes directly. Self-hosted deployment platforms handle rolling updates, health checks, and connection draining automatically — without requiring you to write YAML manifests or manage cluster state.

**How long does a zero-downtime deployment take?**

Deployment time depends on your build step and health check configuration. A typical Next.js application builds in 30-60 seconds, with health check verification adding another 10-15 seconds. The traffic shift itself is instantaneous. Total time from push to live is usually under two minutes.

**Can you run database migrations during a zero-downtime deployment?**

Yes, but you must follow the expand-and-contract pattern. Never make breaking schema changes in a single deploy. Add new columns first, migrate code, then remove old columns in a separate deploy. This ensures both old and new application versions work with the same schema during the rolling update window.

**What happens if the new version crashes on startup?**

The platform detects crash loops automatically and rolls back to the previous healthy version. The old container image is cached locally, so rollback completes in seconds — no rebuild needed. Users never see the broken version.
Zero-downtime deployment is the default. No configuration needed:
# Install
curl -fsSL https://temps.sh/deploy.sh | bash
# Login and deploy
bunx @temps-sdk/cli login
bunx @temps-sdk/cli deploy -p my-app -e production -y
Every deploy, every time, zero downtime. For teams evaluating self-hosted alternatives to Vercel or comparing deployment platform options, zero-downtime deployment comes built in — not as an add-on.
Want to learn more? Check our deployment documentation for advanced configuration, or explore the Temps CLI reference for the full command set.