How to Add Zero-Downtime Deployments to Any Docker App
March 12, 2026
Written by Temps Team
Last updated March 12, 2026
You push a new version and for five to thirty seconds, some users see errors. In-flight requests drop. WebSocket connections break. It happens every deploy, and most teams just accept it.
Here's the thing: zero-downtime deployment is achievable without Kubernetes. You don't need a container orchestration platform, a service mesh, or a dedicated SRE team. You need three things — health checks, connection draining, and an atomic route switch — layered on top of Docker containers you're already running.
This guide walks through why Docker deployments have downtime by default, how to build a zero-downtime pipeline from scratch with Docker Compose and Nginx, and how to skip the DIY work entirely if you'd rather not maintain deployment scripts forever.
[INTERNAL-LINK: zero-downtime deployment concepts -> /blog/zero-downtime-deployments-temps]
TL;DR: Docker's default stop-start cycle creates a 5-30 second gap where requests fail. You can eliminate it with health check gating, connection draining, and blue-green container swaps. Elite engineering teams deploy multiple times per day with a 5% change failure rate (DORA, 2024). This guide shows both the DIY approach and a one-command alternative.
Why Do Docker Deployments Have Downtime by Default?
Docker's default lifecycle creates an unavoidable gap between stopping the old container and starting the new one. 91% of mid-size and large enterprises report that a single hour of downtime costs over $300,000 (ITIC, 2024). Even brief deployment windows — repeated across multiple daily deploys — compound fast.
Citation capsule: Docker's stop-start deployment model creates 5-30 seconds of downtime per deploy. 91% of mid-size and large enterprises report hourly downtime costs exceeding $300,000 (ITIC, 2024). Health check gating and connection draining eliminate this gap entirely.
The Stop-Start Gap
When you run docker compose up -d --build, Docker stops the old container, removes it, builds the new image, and starts a fresh container. That sequence has three gaps where requests fail:
- Container shutdown — The old process receives SIGTERM. If your app doesn't handle graceful shutdown, connections drop immediately.
- Image build — Even with layer caching, builds take seconds to minutes. No container is running during this window.
- Application boot — The new container starts, but your app needs time to load config, establish database connections, and warm caches.
During these gaps, any request hitting your server gets a 502 Bad Gateway or a connection refused error. That's downtime.
Why restart: always Doesn't Help
A common misconception: setting restart: always in your docker-compose.yml gives you zero-downtime deploys. It doesn't. This directive tells Docker to restart the same container when it crashes. It doesn't spin up a new version alongside the old one.
```yaml
# This does NOT give you zero-downtime deployment
services:
  web:
    image: myapp:latest
    restart: always  # Only restarts the SAME container on crash
```
What you actually need is two containers running simultaneously — the old version serving traffic while the new version boots up and passes health checks. That's a fundamentally different pattern.
Why docker compose up -d Replaces In-Place
Running docker compose up -d with an updated image does a stop-then-start on the same service. It doesn't create a parallel instance. Even docker compose up -d --scale web=2 won't orchestrate a graceful handoff. You'd end up with two containers behind no load balancer, both receiving traffic with no health gating.
The core problem: Docker Compose is a development tool that happens to work in production. It wasn't designed for zero-downtime deployments. You need to layer your own orchestration on top.
[INTERNAL-LINK: Docker deployment basics -> /docs/deployments]
What's the Difference Between Blue-Green, Rolling, and Canary Deployments?
Three strategies dominate zero-downtime deployment, each with different trade-offs in complexity, cost, and risk. The 2024 DORA report found that elite-performing teams deploy on-demand with a change failure rate of just 5%, while low performers deploy monthly with a 64% failure rate (DORA / Google, 2024). The strategy you pick affects how fast you can recover from that failure.
Citation capsule: Blue-green, rolling, and canary are the three primary zero-downtime deployment strategies. Elite teams maintain a 5% change failure rate while deploying multiple times per day (DORA / Google, 2024). Blue-green doubles resources but offers instant rollback. Rolling is the most resource-efficient.
Blue-Green Deployment
Blue-green keeps two identical environments running. "Blue" serves production traffic. "Green" gets the new version. Once green passes health checks, the load balancer switches all traffic in one atomic step.
How it works:
- Deploy new version to the idle environment (green)
- Run health checks against green
- Switch the load balancer from blue to green
- Blue becomes the idle environment (instant rollback target)
The advantage is simplicity: one clean switch, one clean rollback. The downside is cost — you're running two full environments at all times. For a single-server Docker setup, this means doubling your container resources permanently.
Rolling Deployment
Rolling updates replace instances one at a time. The old version keeps serving while new instances spin up and pass health checks.
Time 0: [v1] [v1] [v1] ← All running v1
Time 1: [v1] [v1] [v2...] ← One instance boots v2
Time 2: [v1] [v1] [v2 ✓] ← v2 passes check, takes traffic
Time 3: [v1] [v2 ✓] [v2 ✓] ← Second instance upgraded
Time 4: [v2 ✓] [v2 ✓] [v2 ✓] ← Complete, zero dropped requests
Rolling deployment uses about 1.3x resources during the deploy — not 2x permanently. But it requires multiple instances, which makes it less practical on a single server with one container.
Canary Deployment
Canary sends a small percentage of traffic (say 5%) to the new version first. If error rates stay flat, traffic gradually shifts — 5%, 25%, 50%, 100%. If anything goes wrong, only that small slice of users was affected.
This is the safest approach for high-traffic applications. But it's also the most complex to implement. You need traffic splitting at the load balancer level, per-version metrics collection, and automated promotion logic.
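To make the traffic-splitting piece concrete, here is a hedged sketch of a weighted canary at the Nginx layer using the split_clients module. It is not a full canary system (no metrics collection or automated promotion), and the ports and 5% weight are illustrative assumptions that mirror the two-slot layout used later in this guide:

```nginx
# Sketch only: hash each client to a bucket; ~5% of clients hit the canary.
# This block lives in the http{} context. Ports are assumptions.
split_clients "${remote_addr}" $app_backend {
    5%    127.0.0.1:8002;   # canary (new version)
    *     127.0.0.1:8001;   # stable (current version)
}

server {
    listen 80;
    location / {
        proxy_pass http://$app_backend;
    }
}
```

Promotion then means editing the percentage (5, 25, 50, 100) and reloading Nginx. The per-version metrics and automated rollback are the parts you would still have to build yourself.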
Which Strategy Should You Use?
| Strategy | Resource Cost | Rollback Speed | Complexity | Best For |
|---|---|---|---|---|
| Blue-green | 2x always | Instant | Low | Single-server Docker apps |
| Rolling | 1.3x during deploy | Seconds | Medium | Multi-instance clusters |
| Canary | 1.1x during deploy | Seconds | High | High-traffic production |
For most Docker apps running on a single server, blue-green is the practical choice. You can implement it with two containers and an Nginx reload. No cluster required.
What Are the Three Ingredients for Zero-Downtime Docker Deployment?
Every zero-downtime strategy relies on three mechanisms working together: health check gating, connection draining, and atomic routing. Organizations with full-stack observability experience 71% fewer annual outages (New Relic, 2024). These three ingredients are the foundation of that observability at the deployment layer.
Citation capsule: Zero-downtime Docker deployment requires three mechanisms: health check gating prevents traffic to unready containers, connection draining lets in-flight requests finish, and atomic routing switches traffic in one step. Organizations with proper observability see 71% fewer outages (New Relic, 2024).
Health Check Gating
The new container should never receive traffic until it's genuinely ready. A health check endpoint verifies that your application booted, connected to its database, and can serve requests.
```javascript
// Express.js health check that verifies real dependencies
app.get('/health', async (req, res) => {
  try {
    await db.query('SELECT 1');
    await redis.ping();
    res.status(200).json({ status: 'healthy' });
  } catch (err) {
    res.status(503).json({ status: 'unhealthy', error: err.message });
  }
});
```
A health endpoint that blindly returns 200 defeats the entire purpose. If your app returns "healthy" before the database connection pool is established, the load balancer will route traffic to a container that immediately throws 500 errors.
Docker's built-in HEALTHCHECK instruction helps, but it's not enough on its own. Docker health checks only affect container status — they don't control your load balancer. You need your deployment script to check health before switching traffic.
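For reference, a minimal sketch of that HEALTHCHECK instruction, assuming your app listens on port 3000 and curl exists in the image:

```dockerfile
# Marks the container healthy/unhealthy in `docker ps`. It does NOT
# gate traffic: your deploy script still polls /health before switching.
HEALTHCHECK --interval=5s --timeout=3s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
```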
Connection Draining
When you remove the old container from the load balancer, don't kill it immediately. In-flight requests — a user mid-checkout, a file upload at 90%, a long-polling connection — need time to complete.
Connection draining means:
- Stop sending new requests to the old container
- Let existing requests finish (with a timeout, typically 30 seconds)
- Kill the container only after all connections close or the timeout expires
Without draining, you'll randomly drop requests during every deploy. Users won't see a full outage, but they'll get sporadic 502 errors that are hard to reproduce and diagnose.
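The drain step can also be adaptive: instead of sleeping a fixed interval, poll until no connections remain. A minimal sketch, assuming the iproute2 ss tool is available and the old container publishes the given host port; wait_for_drain is our name for illustration, not a standard command:

```shell
# Wait until no established TCP connections remain on a port, or time out.
# Returns 0 when drained, 1 on timeout (the caller stops the container anyway).
wait_for_drain() {
  local port="$1" timeout="${2:-30}" elapsed=0 conns
  while [ "$elapsed" -lt "$timeout" ]; do
    # Count established connections whose source port is the old container's
    conns=$(ss -Htn state established "( sport = :$port )" 2>/dev/null | wc -l)
    if [ "${conns:-0}" -eq 0 ]; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}
```

Used in a deploy script: wait_for_drain 8001 30 before docker compose stop, so quiet deploys finish fast and busy ones get the full timeout.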
Atomic Route Switch
The load balancer needs to flip from old to new in one step. Not gradually, not with a gap — atomically. For Nginx, this is a config reload:
nginx -s reload
Nginx's reload is graceful: it starts new worker processes with the updated config, and old worker processes finish their current requests before exiting. That's atomic routing and connection draining in one operation — but only for the Nginx layer. Your application containers still need their own draining logic.
How Do You Build Zero-Downtime Deployment with Docker Compose and Nginx?
Here's a working blue-green deployment pipeline you can implement today on any single server. 82% of container users now run Kubernetes in production (CNCF, 2025), but you don't need to be one of them. This approach uses Docker Compose, Nginx, and a 60-line bash script.
Citation capsule: A blue-green deployment pipeline with Docker Compose and Nginx requires about 60 lines of bash scripting. 82% of container users run Kubernetes (CNCF, 2025), but a single-server Docker setup can achieve identical zero-downtime results without cluster orchestration.
Step 1: Docker Compose with Two Service Slots
Define two services — web-blue and web-green — so both can run simultaneously during the transition.
```yaml
# docker-compose.yml
services:
  web-blue:
    build: .
    container_name: app-blue
    ports:
      - "8001:3000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 5s
      timeout: 3s
      retries: 3
      start_period: 10s
    restart: unless-stopped

  web-green:
    build: .
    container_name: app-green
    ports:
      - "8002:3000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 5s
      timeout: 3s
      retries: 3
      start_period: 10s
    restart: unless-stopped
```
Port 8001 maps to the blue container. Port 8002 maps to green. Nginx sits in front and routes to whichever is currently active.
Step 2: Nginx Upstream Configuration
Create two upstream configs that Nginx can switch between:
```nginx
# /etc/nginx/conf.d/app.conf
upstream app_backend {
    # This file gets overwritten by the deploy script
    include /etc/nginx/conf.d/active-upstream.conf;
}

server {
    listen 80;
    server_name myapp.com;

    location / {
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
    }
}
```

```nginx
# /etc/nginx/conf.d/active-upstream.conf
# Points to blue by default
server 127.0.0.1:8001;
```
Step 3: The Deploy Script
This is where the magic happens. The script determines which slot is active, deploys to the idle slot, waits for health checks, switches Nginx, and drains the old container.
```bash
#!/bin/bash
set -euo pipefail

# Configuration
HEALTH_ENDPOINT="http://localhost:PORT/health"
MAX_RETRIES=30
RETRY_INTERVAL=2
DRAIN_WAIT=10
UPSTREAM_CONF="/etc/nginx/conf.d/active-upstream.conf"

# Determine which slot is currently active
CURRENT=$(grep -oP ':\K[0-9]+' "$UPSTREAM_CONF")
if [ "$CURRENT" = "8001" ]; then
  ACTIVE="blue"
  TARGET="green"
  TARGET_PORT="8002"
else
  ACTIVE="green"
  TARGET="blue"
  TARGET_PORT="8001"
fi
echo "Active: $ACTIVE | Deploying to: $TARGET (port $TARGET_PORT)"

# Step 1: Build and start the target container
echo "Building and starting $TARGET..."
docker compose up -d --build "web-$TARGET"

# Step 2: Wait for health check
echo "Waiting for health check on port $TARGET_PORT..."
HEALTH_URL="${HEALTH_ENDPOINT/PORT/$TARGET_PORT}"
RETRIES=0
until curl -sf "$HEALTH_URL" > /dev/null 2>&1; do
  RETRIES=$((RETRIES + 1))
  if [ "$RETRIES" -ge "$MAX_RETRIES" ]; then
    echo "ERROR: Health check failed after $MAX_RETRIES attempts"
    echo "Rolling back: stopping $TARGET"
    docker compose stop "web-$TARGET"
    exit 1
  fi
  echo "  Attempt $RETRIES/$MAX_RETRIES..."
  sleep "$RETRY_INTERVAL"
done
echo "Health check passed!"

# Step 3: Switch Nginx to the new container
echo "Switching traffic to $TARGET..."
echo "server 127.0.0.1:$TARGET_PORT;" > "$UPSTREAM_CONF"
nginx -s reload

# Step 4: Drain connections from old container
echo "Draining connections from $ACTIVE ($DRAIN_WAIT seconds)..."
sleep "$DRAIN_WAIT"

# Step 5: Stop the old container
echo "Stopping $ACTIVE..."
docker compose stop "web-$ACTIVE"

echo "Deploy complete! Active slot: $TARGET"
```
Save this as deploy.sh, make it executable with chmod +x deploy.sh, and run it every time you push a new version.
The Gotchas You'll Hit
This DIY approach works, but it has sharp edges you'll discover in production:
- DNS caching — If you're using a DNS-based load balancer upstream, Nginx caches DNS resolution at startup. You'll need a resolver directive and variables in proxy_pass to force re-resolution.
- Connection pools — Database connection pools in the old container may hold connections open past the drain timeout. Set your app's shutdown handler to close pools explicitly.
- SIGTERM handling — Many frameworks don't handle SIGTERM gracefully by default. Node.js needs an explicit process.on('SIGTERM') handler. Python needs signal trapping.
- Disk space — Old Docker images pile up. Add docker image prune -f to the end of your deploy script.
- Concurrent deploys — If two people run deploy.sh at the same time, you'll corrupt state. Add a lock file.
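The lock-file fix from that last bullet takes only a few lines at the top of deploy.sh. A sketch using an atomic mkdir, which works in any POSIX shell; the lock path is an arbitrary choice, and util-linux's flock is a common alternative:

```shell
# Refuse to start if another deploy holds the lock; exit instead of queueing.
# mkdir either creates the directory or fails atomically, so two concurrent
# runs can never both get past this check.
LOCKDIR="/tmp/deploy.lock"
if ! mkdir "$LOCKDIR" 2>/dev/null; then
  echo "Another deploy is already running. Aborting." >&2
  exit 1
fi
# Release the lock when the script exits, whether it succeeds or fails
trap 'rmdir "$LOCKDIR"' EXIT
```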
Every one of these gotchas burns you exactly once. Then you add a fix, the script grows to 150 lines, and you've built yourself a deployment system to maintain forever. Which brings us to the question: is this script worth maintaining?
[INTERNAL-LINK: deployment automation -> /docs/cli]
Why Is Kubernetes Overkill for Most Docker Apps?
Kubernetes handles zero-downtime deployment beautifully — rolling updates, readiness probes, and preStop hooks are all built in. But 82% of container users running Kubernetes (CNCF, 2025) doesn't mean 82% of them need it. For a single Docker app on one server, the overhead is significant.
What Kubernetes requires for zero-downtime deploys:
- Cluster setup — At minimum, a control plane node and a worker node. Managed Kubernetes (EKS, GKE, AKS) costs $70-150/month just for the control plane.
- Deployment manifests — YAML files defining replicas, update strategy, readiness probes, resource limits.
- Ingress controller — Nginx Ingress, Traefik, or similar. Another component to configure and maintain.
- Cert-manager — For automatic HTTPS. More YAML, more CRDs, more things to break.
- kubectl and CI/CD — Your team needs to learn Kubernetes tooling and integrate it into your pipeline.
That's a lot of infrastructure for deploying a single application. Kubernetes shines when you're running dozens of services across multiple nodes. For one to five Docker apps on a single server? It's like hiring a crane to hang a picture frame.
The real trap is incremental complexity. You start with a simple deployment, add Kubernetes for rolling updates, then spend weeks learning about PodDisruptionBudgets, NetworkPolicies, and resource quotas. Each piece makes sense individually. Together, they form a system that requires dedicated infrastructure expertise to operate.
How Does Temps Handle Zero-Downtime Deployment Automatically?
Temps runs a Pingora-based reverse proxy that handles health checks, connection draining, and atomic traffic switching out of the box. Unplanned downtime costs Global 2000 companies $400 billion per year (Splunk / Oxford Economics, 2024), and most of that is preventable with proper deployment tooling.
Citation capsule: Temps uses a Pingora-based reverse proxy for automatic zero-downtime Docker deployment — health check gating, blue-green container swaps, and connection draining without custom scripts. Unplanned downtime costs Global 2000 companies $400 billion annually (Splunk / Oxford Economics, 2024).
Every git push triggers this pipeline automatically:
- Build — New Docker image builds while the old container keeps serving traffic
- Health check — New container starts, Temps polls the health endpoint until it returns 200
- Traffic shift — Pingora atomically routes new requests to the new container
- Drain — Old container finishes in-flight requests before shutdown
- Cleanup — Old container and dangling images are removed
No deploy script. No Nginx config. No blue-green orchestration logic. The same three ingredients — health checks, draining, atomic switching — but managed by the platform instead of your bash scripts.
# That's the entire deploy workflow
git push temps main
If the new container fails health checks after three consecutive attempts, Temps rolls back automatically. The old version keeps running. Users never see the broken version. You get a notification with logs explaining what went wrong.
What About the Gotchas?
Remember the sharp edges from the DIY section? Here's how Temps handles each one:
- DNS caching — Pingora resolves upstreams dynamically. No stale DNS.
- Connection pools — Drain timeout is configurable per project. Defaults to 30 seconds.
- SIGTERM handling — Temps sends SIGTERM, waits for the drain timeout, then sends SIGKILL. Your app gets a fair chance to shut down cleanly.
- Disk space — Old images are pruned automatically after successful deploys.
- Concurrent deploys — Deploy queue ensures deploys run sequentially per project.
The deploy script you'd maintain? Temps replaces it with a single binary that handles all of this, plus SSL certificates, log aggregation, and error tracking.
[INTERNAL-LINK: Temps vs Coolify comparison -> /blog/temps-vs-coolify-vs-netlify]
How Do You Verify Zero Requests Are Dropped During Deployment?
Trust but verify. Don't assume your zero-downtime setup works — prove it with a load test. The 2024 DORA report introduced Deployment Rework Rate as a metric because teams often discover failures only after users report them (DORA, 2024). A load test during deployment catches problems before users do.
Citation capsule: To verify zero-downtime deployment, run continuous load testing during a deploy and check for non-200 responses. The DORA 2024 report introduced Deployment Rework Rate to track post-deploy failures (DORA, 2024). Zero non-200 responses during a load test confirms true zero-downtime.
Load Test with hey
hey is a lightweight HTTP load generator. Install it and run continuous requests while deploying:
# Install hey
go install github.com/rakyll/hey@latest
# In terminal 1: start continuous load test
hey -z 120s -c 10 -q 50 http://myapp.com/health
This rate-limits each of the 10 concurrent workers to 50 requests per second (roughly 500 requests per second total) for 120 seconds. Now, in another terminal, trigger your deploy:
# In terminal 2: deploy while load test is running
./deploy.sh # DIY approach
# or
git push temps main # Temps approach
Reading the Results
When hey finishes, check the output:
Summary:
  Total:        120.0034 secs
  Slowest:      0.2345 secs
  Fastest:      0.0012 secs
  Average:      0.0089 secs
  Requests/sec: 499.85

Status code distribution:
  [200] 59982 responses
# Zero-downtime: CONFIRMED
# If you see ANY non-200 responses, something is wrong.
If the status code distribution shows only 200 responses, you've achieved zero-downtime deployment. Any 502, 503, or connection errors mean requests were dropped during the switch.
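You can automate that check so CI fails on any dropped request. A small sketch (check_zero_downtime is our name for illustration) that parses hey's status code distribution from stdin:

```shell
# Succeed only if every "[NNN] ... responses" line in hey's output is [200].
check_zero_downtime() {
  # Keep status-code lines, drop the [200] ones; if anything remains, fail.
  ! grep -E '^[[:space:]]*\[[0-9]{3}\]' | grep -v '\[200\]' | grep -q .
}

# Usage:
#   hey -z 120s -c 10 -q 50 http://myapp.com/health | tee hey.out
#   check_zero_downtime < hey.out || { echo "Dropped requests!"; exit 1; }
```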
Common Load Test Failures
What to investigate if you see dropped requests:
- A few 502s at the switch point — Nginx reload didn't drain cleanly. Increase DRAIN_WAIT in your deploy script.
- Burst of 503s early in the deploy — Health check passed too early. Your app reported healthy before it was ready. Make your health endpoint check real dependencies.
- Timeout errors — Drain timeout is too short. Long-running requests were killed before completing.
Run this test after every change to your deployment pipeline. What works in staging can break in production under different load patterns.
Frequently Asked Questions
How long does a zero-downtime Docker deployment take?
Deployment time depends primarily on your Docker image build step and application boot time. A typical Node.js or Python app builds in 30-90 seconds with layer caching. Health check verification adds 10-20 seconds. The traffic switch itself is instantaneous. Total time from push to live: usually under two minutes for most Docker applications.
What happens if the new version fails health checks?
The old container keeps running and serving all traffic — users experience nothing unusual. In the DIY approach, the deploy script stops the new container and exits with an error. With Temps, the platform automatically rolls back after three consecutive health check failures and sends a notification with container logs. Either way, the broken version never receives production traffic.
Can I do zero-downtime deployments on a single server?
Yes. Blue-green deployment works on a single server by running two containers mapped to different ports, with Nginx routing to the active one. You need enough RAM and CPU for two instances of your app during the brief overlap period. For most web applications, that's an extra 256-512 MB of RAM for 30-60 seconds. No cluster or orchestrator required.
What's the difference between zero-downtime and blue-green deployment?
Zero-downtime deployment is the goal — ensuring users never see errors during a deploy. Blue-green is one strategy for achieving that goal. Rolling deployment and canary deployment are alternative strategies. Blue-green maintains two full environments and switches traffic atomically. Rolling replaces instances one by one. Both achieve zero downtime, but with different resource and complexity trade-offs.
Do I need to change my Docker image for zero-downtime deploys?
The only requirement is a health check endpoint — an HTTP route that returns 200 when your app is ready to serve traffic. Most web frameworks make this trivial (a /health route that checks database connectivity). Beyond that, your app should handle SIGTERM for graceful shutdown. No other changes to your Dockerfile or application code are needed.
What's Next?
Zero-downtime deployment boils down to three principles: don't send traffic to unready containers, let in-flight requests finish, and switch routes atomically. You can implement these yourself with Docker Compose, Nginx, and a bash script. You'll spend a day building it and ongoing time maintaining it as edge cases surface.
Or you can skip the plumbing entirely. Temps wraps all three principles into a single binary — Pingora proxy, health check gating, connection draining, automatic rollback — so that every git push produces a zero-downtime deploy without scripts to maintain.
The DIY approach teaches you exactly what's happening. The platform approach lets you stop thinking about it. Both are valid. The worst option is accepting 502 errors during deploys as normal.
# Install Temps and get zero-downtime deploys by default
curl -fsSL temps.sh/install.sh | bash
[INTERNAL-LINK: getting started guide -> /docs/getting-started] [INTERNAL-LINK: migrating from Vercel -> /blog/migrate-from-vercel-to-self-hosted]