March 12, 2026
Written by Temps Team
You push a new version and for five to thirty seconds, some users see errors. In-flight requests drop. WebSocket connections break. It happens every deploy, and most teams just accept it.
Here's the thing: zero-downtime deployment is achievable without Kubernetes. You don't need a container orchestration platform, a service mesh, or a dedicated SRE team. You need three things — health checks, connection draining, and an atomic route switch — layered on top of Docker containers you're already running.
This guide walks through why Docker deployments have downtime by default, how to build a zero-downtime pipeline from scratch with Docker Compose and Nginx, and how to skip the DIY work entirely if you'd rather not maintain deployment scripts forever.
TL;DR: Docker's default stop-start cycle creates a 5-30 second gap where requests fail. You can eliminate it with health check gating, connection draining, and blue-green container swaps. Elite engineering teams deploy multiple times per day with a 5% change failure rate. This guide shows both the DIY approach and a one-command alternative.
Docker's default lifecycle creates an unavoidable gap between stopping the old container and starting the new one. According to ITIC, 91% of mid-size and large enterprises report that a single hour of downtime costs over $300,000. Even brief deployment windows — repeated across multiple daily deploys — compound fast.
When you run docker compose up -d --build, Docker builds the new image, stops the old container, removes it, and starts a fresh one. That sequence has three gaps where requests fail:

- The stop gap: Docker sends SIGTERM (then SIGKILL after the grace period), cutting off any in-flight requests on the old container.
- The start gap: between the old container stopping and the new one starting, nothing is listening on the port — connections are refused outright.
- The boot gap: the new container is running, but your application hasn't finished initializing, so early requests hit a process that can't serve them yet.
During these gaps, any request hitting your server gets a 502 Bad Gateway or a connection refused error. That's downtime.
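You can see the gap for yourself with a throwaway probe. This sketch is ours, not part of the deploy tooling — the URL and duration are placeholders. Run it in one terminal while deploying in another; a default stop-start deploy shows up as a run of DOWN lines.

```shell
# Probe a URL once per second and report whether it answered.
# During a default `docker compose up -d --build` you'll see DOWN lines.
probe() {
  url=$1
  seconds=$2
  i=0
  while [ "$i" -lt "$seconds" ]; do
    if curl -sf -m 1 "$url" > /dev/null 2>&1; then
      echo "UP"
    else
      echo "DOWN"
    fi
    i=$((i + 1))
    sleep 1
  done
}

# Example: probe http://localhost:3000/ 60
```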
restart: always Doesn't Help

A common misconception: setting restart: always in your docker-compose.yml gives you zero-downtime deploys. It doesn't. This directive tells Docker to restart the same container when it crashes. It doesn't spin up a new version alongside the old one.
# This does NOT give you zero-downtime deployment
services:
web:
image: myapp:latest
restart: always # Only restarts the SAME container on crash
What you actually need is two containers running simultaneously — the old version serving traffic while the new version boots up and passes health checks. That's a fundamentally different pattern.
docker compose up -d Replaces In-Place

Running docker compose up -d with an updated image does a stop-then-start on the same service. It doesn't create a parallel instance. Even docker compose up -d --scale web=2 won't orchestrate a graceful handoff. You'd end up with two containers behind no load balancer, both receiving traffic with no health gating.
The core problem: Docker Compose is a development tool that happens to work in production. It wasn't designed for zero-downtime deployments. You need to layer your own orchestration on top.
Three strategies dominate zero-downtime deployment, each with different trade-offs in complexity, cost, and risk. According to the 2024 DORA report, elite-performing teams deploy on-demand with a change failure rate of just 5%, while low performers deploy monthly with a 64% failure rate. The strategy you pick affects how fast you can recover from that failure.
Blue-green keeps two identical environments running. "Blue" serves production traffic. "Green" gets the new version. Once green passes health checks, the load balancer switches all traffic in one atomic step.
How it works:

1. Blue serves all production traffic while green sits idle or holds the previous release.
2. Deploy the new version to green and wait for it to pass health checks.
3. Switch the load balancer from blue to green in one atomic step.
4. Keep blue running briefly as an instant rollback target before recycling it.
The advantage is simplicity: one clean switch, one clean rollback. The downside is cost — you're running two full environments at all times. For a single-server Docker setup, this means doubling your container resources permanently.
Rolling updates replace instances one at a time. The old version keeps serving while new instances spin up and pass health checks.
Time 0: [v1] [v1] [v1] ← All running v1
Time 1: [v1] [v1] [v2...] ← One instance boots v2
Time 2: [v1] [v1] [v2 ✓] ← v2 passes check, takes traffic
Time 3: [v1] [v2 ✓] [v2 ✓] ← Second instance upgraded
Time 4: [v2 ✓] [v2 ✓] [v2 ✓] ← Complete, zero dropped requests
Rolling deployment uses about 1.3x resources during the deploy — not 2x permanently. But it requires multiple instances, which makes it less practical on a single server with one container.
Canary sends a small percentage of traffic (say 5%) to the new version first. If error rates stay flat, traffic gradually shifts — 5%, 25%, 50%, 100%. If anything goes wrong, only that small slice of users was affected.
This is the safest approach for high-traffic applications. But it's also the most complex to implement. You need traffic splitting at the load balancer level, per-version metrics collection, and automated promotion logic.
| Strategy | Resource Cost | Rollback Speed | Complexity | Best For |
|---|---|---|---|---|
| Blue-green | 2x always | Instant | Low | Single-server Docker apps |
| Rolling | 1.3x during deploy | Seconds | Medium | Multi-instance clusters |
| Canary | 1.1x during deploy | Seconds | High | High-traffic production |
For most Docker apps running on a single server, blue-green is the practical choice. You can implement it with two containers and an Nginx reload. No cluster required.
Every zero-downtime strategy relies on three mechanisms working together: health check gating, connection draining, and atomic routing. According to New Relic, organizations with full-stack observability experience 71% fewer annual outages. These three ingredients are the foundation of that observability at the deployment layer.
The new container should never receive traffic until it's genuinely ready. A health check endpoint verifies that your application booted, connected to its database, and can serve requests.
// Express.js health check that verifies real dependencies
app.get('/health', async (req, res) => {
try {
await db.query('SELECT 1');
await redis.ping();
res.status(200).json({ status: 'healthy' });
} catch (err) {
res.status(503).json({ status: 'unhealthy', error: err.message });
}
});
A health endpoint that blindly returns 200 defeats the entire purpose. If your app returns "healthy" before the database connection pool is established, the load balancer will route traffic to a container that immediately throws 500 errors.
Docker's built-in HEALTHCHECK instruction helps, but it's not enough on its own. Docker health checks only affect container status — they don't control your load balancer. You need your deployment script to check health before switching traffic.
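The gating loop itself is small. Here's a minimal sketch — the function name and arguments are ours — that polls an arbitrary check command until it succeeds or the retry budget runs out, which is exactly what the deploy script later in this guide does inline:

```shell
# Poll a check command until it succeeds; give up after max_retries attempts.
# Returns 0 once healthy, 1 if the budget is exhausted.
wait_for_health() {
  check_cmd=$1
  max_retries=$2
  interval=$3
  attempt=0
  until eval "$check_cmd" > /dev/null 2>&1; do
    attempt=$((attempt + 1))
    if [ "$attempt" -ge "$max_retries" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}

# Example gate before switching traffic:
#   wait_for_health "curl -sf http://localhost:8002/health" 30 2 || exit 1
```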
When you remove the old container from the load balancer, don't kill it immediately. In-flight requests — a user mid-checkout, a file upload at 90%, a long-polling connection — need time to complete.
Connection draining means:

- Stop routing new requests to the old container.
- Let in-flight requests finish, up to a reasonable grace period.
- Only then send SIGTERM and stop the container.
Without draining, you'll randomly drop requests during every deploy. Users won't see a full outage, but they'll get sporadic 502 errors that are hard to reproduce and diagnose.
The load balancer needs to flip from old to new in one step. Not gradually, not with a gap — atomically. For Nginx, this is a config reload:
nginx -s reload
Nginx's reload is graceful: it starts new worker processes with the updated config, and old worker processes finish their current requests before exiting. That's atomic routing and connection draining in one operation — but only for the Nginx layer. Your application containers still need their own draining logic.
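One refinement worth borrowing from battle-tested deploy tooling: write the upstream file atomically. A plain `echo >` can in rare cases be observed half-written; writing a temp file and renaming it over the old one makes the swap a single atomic rename. A sketch, using the upstream-file path this guide uses later:

```shell
# Replace the active-upstream file atomically: write a temp file on the same
# filesystem, then mv it over the old one. rename(2) is atomic, so a
# concurrently-reloading Nginx never sees a partial config.
switch_upstream() {
  conf_file=$1
  port=$2
  tmp="${conf_file}.tmp"
  printf 'server 127.0.0.1:%s;\n' "$port" > "$tmp"
  mv "$tmp" "$conf_file"
}

# Example: switch_upstream /etc/nginx/conf.d/active-upstream.conf 8002
```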
Here's a working blue-green deployment pipeline you can implement today on any single server. According to the CNCF annual survey, 82% of container users now run Kubernetes in production, but you don't need to be one of them. This approach uses Docker Compose, Nginx, and a 60-line bash script.
Define two services — web-blue and web-green — so both can run simultaneously during the transition.
# docker-compose.yml
services:
web-blue:
build: .
container_name: app-blue
ports:
- "8001:3000"
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
interval: 5s
timeout: 3s
retries: 3
start_period: 10s
restart: unless-stopped
web-green:
build: .
container_name: app-green
ports:
- "8002:3000"
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
interval: 5s
timeout: 3s
retries: 3
start_period: 10s
restart: unless-stopped
Port 8001 maps to the blue container. Port 8002 maps to green. Nginx sits in front and routes to whichever is currently active.
Create two upstream configs that Nginx can switch between:
# /etc/nginx/conf.d/app.conf
upstream app_backend {
# This file gets overwritten by the deploy script
include /etc/nginx/conf.d/active-upstream.conf;
}
server {
listen 80;
server_name myapp.com;
location / {
proxy_pass http://app_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_connect_timeout 5s;
proxy_read_timeout 30s;
}
}
# /etc/nginx/conf.d/active-upstream.conf
# Points to blue by default
server 127.0.0.1:8001;
This is where the magic happens. The script determines which slot is active, deploys to the idle slot, waits for health checks, switches Nginx, and drains the old container.
#!/bin/bash
set -euo pipefail
# Configuration
HEALTH_ENDPOINT="http://localhost:PORT/health"
MAX_RETRIES=30
RETRY_INTERVAL=2
DRAIN_WAIT=10
UPSTREAM_CONF="/etc/nginx/conf.d/active-upstream.conf"
# Determine which slot is currently active
CURRENT=$(grep -oP ':\K[0-9]+' "$UPSTREAM_CONF")
if [ "$CURRENT" = "8001" ]; then
ACTIVE="blue"
TARGET="green"
TARGET_PORT="8002"
else
ACTIVE="green"
TARGET="blue"
TARGET_PORT="8001"
fi
echo "Active: $ACTIVE | Deploying to: $TARGET (port $TARGET_PORT)"
# Step 1: Build and start the target container
echo "Building and starting $TARGET..."
docker compose up -d --build "web-$TARGET"
# Step 2: Wait for health check
echo "Waiting for health check on port $TARGET_PORT..."
HEALTH_URL="${HEALTH_ENDPOINT/PORT/$TARGET_PORT}"
RETRIES=0
until curl -sf "$HEALTH_URL" > /dev/null 2>&1; do
RETRIES=$((RETRIES + 1))
if [ "$RETRIES" -ge "$MAX_RETRIES" ]; then
echo "ERROR: Health check failed after $MAX_RETRIES attempts"
echo "Rolling back: stopping $TARGET"
docker compose stop "web-$TARGET"
exit 1
fi
echo " Attempt $RETRIES/$MAX_RETRIES..."
sleep "$RETRY_INTERVAL"
done
echo "Health check passed!"
# Step 3: Switch Nginx to the new container
echo "Switching traffic to $TARGET..."
echo "server 127.0.0.1:$TARGET_PORT;" > "$UPSTREAM_CONF"
nginx -t        # validate the config first so a bad switch never goes live
nginx -s reload
# Step 4: Drain connections from old container
echo "Draining connections from $ACTIVE ($DRAIN_WAIT seconds)..."
sleep "$DRAIN_WAIT"
# Step 5: Stop the old container
echo "Stopping $ACTIVE..."
docker compose stop "web-$ACTIVE"
echo "Deploy complete! Active slot: $TARGET"
Save this as deploy.sh, make it executable with chmod +x deploy.sh, and run it every time you push a new version.
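One hardening step worth adding before this script sees a whole team: serialize runs with flock(1) so two simultaneous deploys can't interleave. This is a sketch of ours, not part of deploy.sh — the lock path is arbitrary:

```shell
# Run a command under an exclusive lock; a second concurrent invocation
# aborts instead of interleaving with the first. File descriptor 9 holds
# the lock for the lifetime of the subshell.
LOCK_FILE="${LOCK_FILE:-/tmp/deploy.lock}"

run_locked() {
  (
    flock -n 9 || { echo "Another deploy is in progress; aborting." >&2; exit 1; }
    "$@"
  ) 9>"$LOCK_FILE"
}

# Example: run_locked ./deploy.sh
```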
This DIY approach works, but it has sharp edges you'll discover in production:
- Nginx DNS caching: Nginx resolves upstream hostnames once at startup. If you route to Docker service names instead of fixed localhost ports, use the resolver directive and variables in proxy_pass to force re-resolution.
- Graceful shutdown: docker stop sends SIGTERM, and your app has to handle it. Node needs a process.on('SIGTERM') handler. Python needs signal trapping.
- Image accumulation: every deploy builds a new image, and old ones pile up until the disk fills. Add docker image prune -f to the end of your deploy script.
- Concurrent deploys: if two people run deploy.sh at the same time, you'll corrupt state. Add a lock file.

Every one of these gotchas burns you exactly once. Then you add a fix, the script grows to 150 lines, and you've built yourself a deployment system to maintain forever. Which brings us to the question: is this script worth maintaining?
Database migrations are the trickiest part of zero-downtime deployment. Both GitHub's June 2025 outage and Cloudflare's November 2025 global outage were caused by database changes that cascaded into platform-wide failures. If your new code expects a column that doesn't exist yet — or your old code breaks when a column disappears — your rolling deployment fails.
The safe approach splits every breaking database change into three deploys:
Phase 1: Expand (backward-compatible)
-- Add the new column without removing the old one
ALTER TABLE users ADD COLUMN full_name TEXT;
-- Backfill data
UPDATE users SET full_name = first_name || ' ' || last_name;
Deploy code that writes to both columns but reads from the new one.
Phase 2: Migrate
Deploy code that only uses the new column. Both old and new application versions coexist safely because the old column still exists.
Phase 3: Contract (cleanup)
-- Safe to remove after all instances run the new version
ALTER TABLE users DROP COLUMN first_name;
ALTER TABLE users DROP COLUMN last_name;
During a blue-green or rolling deployment, both v1 and v2 run simultaneously. The expand-and-contract pattern ensures both versions work with the same database schema at every step. Never drop a column in the same deploy that stops using it.
Kubernetes handles zero-downtime deployment beautifully — rolling updates, readiness probes, and preStop hooks are all built in. But 82% of container users running Kubernetes doesn't mean 82% of them need it. For a single Docker app on one server, the overhead is significant.
What Kubernetes requires for zero-downtime deploys:

- A cluster — control plane plus worker nodes — to install, upgrade, and secure
- Deployment manifests with a rolling update strategy configured
- Readiness and liveness probes defined on every pod
- preStop hooks and termination grace periods for connection draining
- An ingress controller to route external traffic
That's a lot of infrastructure for deploying a single application. Kubernetes shines when you're running dozens of services across multiple nodes. For one to five Docker apps on a single server? It's like hiring a crane to hang a picture frame.
The real trap is incremental complexity. You start with a simple deployment, add Kubernetes for rolling updates, then spend weeks learning about PodDisruptionBudgets, NetworkPolicies, and resource quotas. Each piece makes sense individually. Together, they form a system that requires dedicated infrastructure expertise to operate.
Temps runs a Pingora-based reverse proxy that handles health checks, connection draining, and atomic traffic switching out of the box. According to Splunk and Oxford Economics, unplanned downtime costs Global 2000 companies $400 billion per year, and most of that is preventable with proper deployment tooling.
Every git push triggers this pipeline automatically:

1. Build a new container image from your repository.
2. Start the new container alongside the one currently serving traffic.
3. Poll the health check endpoint until it passes.
4. Switch traffic to the new container atomically.
5. Drain connections from the old container, then stop it.
No deploy script. No Nginx config. No blue-green orchestration logic. The same three ingredients — health checks, draining, atomic switching — but managed by the platform instead of your bash scripts.
# That's the entire deploy workflow
git push temps main
If the new container fails health checks after three consecutive attempts, Temps rolls back automatically. The old version keeps running. Users never see the broken version. You get a notification with logs explaining what went wrong.
Remember the sharp edges from the DIY section? Temps absorbs each of them at the platform level: upstream resolution, graceful SIGTERM shutdown with a drain window, cleanup of old images, and locking so concurrent deploys can't collide.
The deploy script you'd maintain? Temps replaces it with a single binary that handles all of this, plus SSL certificates, log aggregation, and error tracking.
Don't assume a successful deploy means everything is fine. Watch your core signals — error rate, request latency, throughput, CPU, and memory — for at least 15 minutes after each deployment:
Whether you're using the DIY approach or a platform, a deploy that passes health checks can still have subtle issues: a slow query, a visual regression, an edge case in a new feature.
According to the Uptime Institute, 80% of operators believe their most recent downtime event was preventable. Most zero-downtime failures come from a handful of common mistakes — and they're all avoidable.
A health endpoint that always returns 200 defeats the purpose of health check gating. Always verify real dependencies:
// Bad: always reports healthy, even when dependencies are down
app.get('/health', (req, res) => res.status(200).json({ status: 'ok' }));
// Good: verifies actual readiness
app.get('/health', async (req, res) => {
  try {
    await db.query('SELECT 1');
    await cache.ping();
    res.status(200).json({ status: 'ok' });
  } catch (err) {
    res.status(503).json({ status: 'unhealthy' });
  }
});
Never drop a column in the same deploy that stops using it. Always use the expand-and-contract pattern across multiple deploys. GitHub and Cloudflare both learned this lesson in 2025.
Small, frequent deploys are easier to roll back and less likely to cause cascading failures. The DORA data consistently shows that elite teams deploy multiple times per day — not once a week.
A deploy that passes health checks can still have subtle issues: a visual regression, a slow database query, an edge case in a new feature. Watch metrics for 15 minutes after every deployment.
If you can't roll back in seconds, you don't have zero-downtime deployment — you have zero-downtime deployment with a single point of failure. Always keep the previous container image cached and test your rollback process regularly.
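In the DIY setup from earlier, rollback is the traffic switch run in reverse — which is exactly why the old container has to stay alive. A sketch using the same upstream-file convention as the deploy script (function name is ours):

```shell
# Point Nginx back at the previous slot's port and reload. Only valid while
# the previous container is still running, so don't stop or prune it until
# you trust the new version.
rollback() {
  conf_file=$1
  previous_port=$2
  printf 'server 127.0.0.1:%s;\n' "$previous_port" > "$conf_file"
  nginx -s reload
}

# Example: rollback /etc/nginx/conf.d/active-upstream.conf 8001
```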
Trust but verify. Don't assume your zero-downtime setup works — prove it with a load test. The 2024 DORA report introduced Deployment Rework Rate as a metric because teams often discover failures only after users report them. A load test during deployment catches problems before users do.
hey is a lightweight HTTP load generator. Install it and run continuous requests while deploying:
# Install hey
go install github.com/rakyll/hey@latest
# In terminal 1: start continuous load test
hey -z 120s -c 10 -q 50 http://myapp.com/health
This runs 10 concurrent workers, each rate-limited to 50 requests per second — about 500 requests per second total — for 120 seconds. Now, in another terminal, trigger your deploy:
# In terminal 2: deploy while load test is running
./deploy.sh # DIY approach
# or
git push temps main # Temps approach
When hey finishes, check the output:
Summary:
Total: 120.0034 secs
Slowest: 0.2345 secs
Fastest: 0.0012 secs
Average: 0.0089 secs
Requests/sec: 499.85
Status code distribution:
[200] 59982 responses
# Zero-downtime: CONFIRMED
# If you see ANY non-200 responses, something is wrong.
If the status code distribution shows only 200 responses, you've achieved zero-downtime deployment. Any 502, 503, or connection errors mean requests were dropped during the switch.
What to investigate if you see dropped requests:

- A health check that passes before the application is truly ready to serve traffic.
- A drain window too short for your longest requests — increase DRAIN_WAIT in your deploy script.
- The old container being stopped before Nginx finished its graceful reload.

Run this test after every change to your deployment pipeline. What works in staging can break in production under different load patterns.
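If the load test runs in CI, you can parse hey's summary and fail the pipeline automatically. A sketch — it assumes hey's status lines look like `[200] 59982 responses`, as in the output above:

```shell
# Fail if hey's status-code distribution contains anything other than 200s.
check_no_errors() {
  if grep -E '^[[:space:]]*\[[0-9]{3}\]' "$1" | grep -vq '\[200\]'; then
    echo "FAIL: non-200 responses seen during deploy"
    return 1
  fi
  echo "PASS: every response was a 200"
}

# Example:
#   hey -z 120s -c 10 -q 50 http://myapp.com/health > hey.log && check_no_errors hey.log
```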
How long does a zero-downtime deploy take?

Deployment time depends primarily on your Docker image build step and application boot time. A typical Node.js or Python app builds in 30-90 seconds with layer caching. Health check verification adds 10-20 seconds. The traffic switch itself is instantaneous. Total time from push to live: usually under two minutes for most Docker applications.

What happens if the new version fails its health checks?

The old container keeps running and serving all traffic — users experience nothing unusual. In the DIY approach, the deploy script stops the new container and exits with an error. With Temps, the platform automatically rolls back after three consecutive health check failures and sends a notification with container logs. Either way, the broken version never receives production traffic.

Can I do blue-green deployment on a single server?

Yes. Blue-green deployment works on a single server by running two containers mapped to different ports, with Nginx routing to the active one. You need enough RAM and CPU for two instances of your app during the brief overlap period. For most web applications, that's an extra 256-512 MB of RAM for 30-60 seconds. No cluster or orchestrator required.

What's the difference between zero-downtime deployment and blue-green deployment?

Zero-downtime deployment is the goal — ensuring users never see errors during a deploy. Blue-green is one strategy for achieving that goal. Rolling deployment and canary deployment are alternative strategies. Blue-green maintains two full environments and switches traffic atomically. Rolling replaces instances one by one. Both achieve zero downtime, but with different resource and complexity trade-offs.

Can I run database migrations with zero downtime?

Yes, but you must follow the expand-and-contract pattern. Never make breaking schema changes in a single deploy. Add new columns first, migrate code, then remove old columns in a separate deploy. This ensures both old and new application versions work with the same schema during the rolling update window.

What does my application need to support this?

The only requirement is a health check endpoint — an HTTP route that returns 200 when your app is ready to serve traffic. Most web frameworks make this trivial (a /health route that checks database connectivity). Beyond that, your app should handle SIGTERM for graceful shutdown. No other changes to your Dockerfile or application code are needed.
Zero-downtime deployment boils down to three principles: don't send traffic to unready containers, let in-flight requests finish, and switch routes atomically. You can implement these yourself with Docker Compose, Nginx, and a bash script. You'll spend a day building it and ongoing time maintaining it as edge cases surface.
Or you can skip the plumbing entirely. Temps wraps all three principles into a single binary — Pingora proxy, health check gating, connection draining, automatic rollback — so that every git push produces a zero-downtime deploy without scripts to maintain.
The DIY approach teaches you exactly what's happening. The platform approach lets you stop thinking about it. Both are valid. The worst option is accepting 502 errors during deploys as normal.
# Install Temps and get zero-downtime deploys by default
curl -fsSL temps.sh/install.sh | bash