How to Back Up PostgreSQL in Docker Automatically

March 12, 2026

Written by Temps Team

Last updated March 12, 2026

You're running PostgreSQL in Docker. Your data lives in a volume. That volume lives on a single disk, on a single machine. One hardware failure, one daemon crash, one accidental docker volume rm -- and it's gone. No warning, no undo, no recovery.

Most Docker PostgreSQL setups have zero backup strategy. The volume is the database and the "backup" at the same time. That's not redundancy -- it's a single point of failure dressed up as infrastructure.

According to Percona's State of Open Source Databases survey (2024), PostgreSQL is now the most popular database among developers, used by 48% of respondents. Yet a Kaseya/Unitrends survey (2024) found that 36% of organizations have experienced data loss from containers or cloud-native workloads in the past year. The database may be rock-solid. The container it's running in is not.

This guide covers why Docker volumes aren't backups, the three PostgreSQL backup strategies you should know, the sidecar pattern for automated dumps, and a step-by-step Docker Compose setup that runs backups on a schedule with retention and off-site upload.

[INTERNAL-LINK: Docker deployment fundamentals -> /blog/how-to-add-zero-downtime-deployments-docker]

TL;DR: Docker volumes are not backups -- they're the same disk, same machine, same failure domain. Automate PostgreSQL backups with a sidecar container running pg_dump on a cron schedule, compressed and uploaded to S3-compatible storage. 36% of organizations report container-related data loss (Kaseya/Unitrends, 2024). This guide shows you how to set it up in under 30 minutes.


Why Aren't Docker Volumes Backups?

Docker volumes keep data on the host filesystem, but they share the same disk, same machine, and same failure domain as everything else. According to Backblaze's Drive Stats report (2024), the annualized failure rate for hard drives in production was 1.7% across their fleet. One drive failure takes your volume and your database with it.

Same Disk, Same Machine

Your PostgreSQL container writes to a Docker volume. That volume is a directory on the host's filesystem -- typically /var/lib/docker/volumes/. If the disk fails, both your running database and your "backup" volume are gone. There's no geographic separation, no redundancy, no fault isolation.

Even if you're running on a cloud VM with block storage, a single EBS or Cinder volume is still one device. Cloud providers replicate within their infrastructure, but accidental deletion, corruption from a bad write, or a misconfigured teardown script will destroy both your data and any snapshots stored on the same volume.

The docker-compose down -v Problem

This one command deletes every named volume in your Compose stack:

# This deletes ALL your data. Permanently.
docker-compose down -v

The -v flag removes volumes. It's one character. There's no confirmation prompt. In a 2023 post-mortem published by GitLab on their incident review process, accidental deletion of production data ranked among the top causes of extended outages -- a category that includes rm -rf and volume teardown commands.

Every developer has muscle memory for docker-compose down. Adding -v is a habit that works fine in development and destroys production.
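
One cheap safeguard is a wrapper function that refuses the -v flag unless you explicitly confirm. This is a hypothetical sketch (the dc_down name and CONFIRM_VOLUME_DELETE variable are our inventions, not Compose features); it prints the command instead of executing it so the guard logic is visible:

```shell
#!/bin/bash
# dc_down: a guard around "docker-compose down" that refuses -v/--volumes
# unless CONFIRM_VOLUME_DELETE=yes is set. Prints the command it would run
# instead of executing it, so the guard logic is easy to see.
dc_down() {
  local arg
  for arg in "$@"; do
    if [ "$arg" = "-v" ] || [ "$arg" = "--volumes" ]; then
      if [ "${CONFIRM_VOLUME_DELETE:-}" != "yes" ]; then
        echo "refusing to delete volumes; set CONFIRM_VOLUME_DELETE=yes to override" >&2
        return 1
      fi
    fi
  done
  echo "docker-compose down $*"   # real version: exec docker-compose down "$@"
}
```

In real use, replace the final echo with an exec of docker-compose and alias your muscle-memory command to the wrapper.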

No Point-in-Time Recovery

A Docker volume gives you the current state of the database. That's it. If a bad migration ran two hours ago and corrupted your user table, the volume already contains the corrupted data. You can't rewind to the state before the migration.

Point-in-time recovery (PITR) requires continuous WAL archiving -- a PostgreSQL feature that captures every write operation. Docker volumes don't enable this automatically. Without PITR, your recovery point objective (RPO) equals the time since your last manual backup, which for most Docker setups is "never."
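
For reference, enabling WAL archiving is a handful of server settings. A minimal sketch of the relevant postgresql.conf lines, assuming /wal-archive is a mounted volume (the path and timeout are example values, not defaults):

```ini
# postgresql.conf -- minimal WAL archiving sketch. /wal-archive is an
# assumed mount point; adjust the archive destination to your storage.
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /wal-archive/%f && cp %p /wal-archive/%f'
archive_timeout = 300   # force a WAL segment switch at least every 5 minutes
```

Production setups typically replace the cp-based archive_command with a tool like pgBackRest or Barman, covered below.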

[INTERNAL-LINK: production database management -> /docs/databases]


What Are the PostgreSQL Backup Strategies?

PostgreSQL offers three backup methods, each with different trade-offs in speed, flexibility, and recovery granularity. The PostgreSQL documentation recommends combining logical backups for portability with continuous archiving for minimal data loss. Here's how they compare for Docker environments.

pg_dump: Logical Backups

pg_dump creates a logical representation of your database -- SQL statements or a custom-format archive that can recreate tables, indexes, and data. It's the most portable option.

Strengths:

  • Works across PostgreSQL major versions (dump from 15, restore to 16)
  • Can back up individual databases or tables
  • Output is human-readable (in SQL format) or compressed (custom format)
  • Doesn't require filesystem access -- connects over the network

Limitations:

  • Slower than physical backups for large databases
  • Creates a snapshot at one point in time -- no continuous protection
  • Holds a transaction open for the duration of the dump, which can delay autovacuum on busy databases (parallel dumps with --jobs shorten it, but require directory format)

For databases under 100GB -- which covers the vast majority of Docker-hosted PostgreSQL instances -- pg_dump is fast enough and simple enough to be the default choice.

pg_basebackup: Physical Backups

pg_basebackup copies the entire data directory at the filesystem level. It's faster than pg_dump for large databases because it copies raw files instead of serializing SQL.

Strengths:

  • Significantly faster for databases over 50GB
  • Captures the complete cluster, including all databases and roles
  • Foundation for setting up streaming replication

Limitations:

  • Can only restore to the exact same PostgreSQL major version
  • Requires filesystem-level access or replication protocol
  • Produces larger backup files (raw data files vs. compressed SQL)

In Docker, pg_basebackup is trickier because you need access to the PostgreSQL data directory or a replication slot. For most container setups, the complexity isn't worth it unless your database exceeds 50-100GB.

WAL Archiving: Continuous Protection

Write-Ahead Log (WAL) archiving captures every change to the database in real time. Combined with a base backup, it enables PITR -- restoring the database to any second in the past.

Strengths:

  • Near-zero RPO (you lose at most the last few seconds)
  • Can recover to any point in time, not just the last backup
  • Foundation for production-grade disaster recovery

Limitations:

  • More complex to configure and maintain
  • Requires continuous archive storage that grows with write volume
  • Restore process is slower and more involved

Tools like pgBackRest and Barman automate WAL archiving with compression, encryption, and S3 upload. They're the right choice for databases where losing even minutes of data is unacceptable.

Comparison Table

| Method        | Speed      | Portability                   | RPO                  | Complexity | Best For                           |
|---------------|------------|-------------------------------|----------------------|------------|------------------------------------|
| pg_dump       | Moderate   | Cross-version                 | Last backup interval | Low        | Databases under 100GB              |
| pg_basebackup | Fast       | Same major version only       | Last backup interval | Medium     | Large databases, replication setup |
| WAL archiving | Continuous | Same major version, with PITR | Seconds              | High       | Production databases needing PITR  |

For Docker setups, pg_dump on a schedule covers 90% of use cases. Start there. Add WAL archiving if your RPO requirement is under an hour.

[INTERNAL-LINK: database setup guide -> /docs/databases]


What Is the Sidecar Pattern for Docker Backups?

The sidecar pattern runs a second container alongside your PostgreSQL container, connected via the same Docker network, with the sole job of performing and managing backups. According to Microsoft's cloud architecture patterns documentation, the sidecar pattern "deploys components of an application into a separate process or container to provide isolation and encapsulation."

Here's the architecture:

┌─────────────────────────────────────────────┐
│              Docker Network                 │
│                                             │
│  ┌──────────────┐    ┌──────────────────┐   │
│  │  PostgreSQL  │    │  Backup Sidecar  │   │
│  │  Container   │◄───│  Container       │   │
│  │              │    │                  │   │
│  │  Port 5432   │    │  - pg_dump       │   │
│  │  (internal)  │    │  - gzip          │   │
│  │              │    │  - S3 upload     │   │
│  │  Volume:     │    │  - cron schedule │   │
│  │  pgdata      │    │  - retention     │   │
│  └──────────────┘    └──────────────────┘   │
│                                             │
│  No ports exposed to host                   │
└─────────────────────────────────────────────┘
         │                      │
         ▼                      ▼
   Docker Volume          S3 / Object Storage
   (live data)            (backup files)

Why Not Back Up from the Same Container?

Running pg_dump inside the PostgreSQL container violates the one-process-per-container principle. If the backup process consumes too much CPU or memory, it starves the database. If the database crashes mid-backup, the dump is incomplete and there's no process left to detect the failure.

A sidecar keeps concerns separated. The PostgreSQL container runs the database. The sidecar handles backups. Each has its own resource limits, restart policy, and health checks.

Network-Only Access

The sidecar connects to PostgreSQL over the internal Docker network. No ports are exposed to the host. The connection string uses the service name:

postgresql://backup_user:password@postgres:5432/mydb

This is more secure than exposing port 5432 to the host or running SSH into the database container. The backup user can be granted read-only access with pg_read_all_data (PostgreSQL 14+), so even a compromised sidecar can't modify data.
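
A sketch of that role setup, run once as a superuser (backup_user, mydb, and the password are placeholder names, not values from the Compose file above):

```sql
-- Read-only backup role using the pg_read_all_data predefined role (PG 14+).
-- Names and password are placeholders.
CREATE ROLE backup_user LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE mydb TO backup_user;
GRANT pg_read_all_data TO backup_user;
```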

Why would you expose your database port to the host when you don't have to?


How Do You Set Up Automated pg_dump with Docker Compose?

Setting up automated PostgreSQL backups requires four pieces: the Compose file defining both containers, a backup script, a cron schedule, and a retention policy. The 2024 Stack Overflow Developer Survey found PostgreSQL used by 49% of professional developers -- making this the single most common database you'll find running in Docker.

Step 1: Docker Compose File

version: "3.8"

services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: myapp
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
    networks:
      - backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myapp"]
      interval: 10s
      timeout: 5s
      retries: 5

  backup:
    build:
      context: ./backup
      dockerfile: Dockerfile
    environment:
      PGHOST: postgres
      PGPORT: 5432
      PGUSER: myapp
      PGPASSWORD: ${POSTGRES_PASSWORD}
      PGDATABASE: myapp
      BACKUP_SCHEDULE: "0 */6 * * *"   # Every 6 hours
      BACKUP_RETENTION_DAYS: 30
      S3_BUCKET: ${S3_BUCKET}
      S3_ENDPOINT: ${S3_ENDPOINT}
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
    volumes:
      - backups:/backups
    networks:
      - backend
    depends_on:
      postgres:
        condition: service_healthy

volumes:
  pgdata:
  backups:

networks:
  backend:

Step 2: Backup Sidecar Dockerfile

FROM postgres:16-alpine

RUN apk add --no-cache \
    bash \
    curl \
    aws-cli \
    supercronic

COPY backup.sh /usr/local/bin/backup.sh
COPY entrypoint.sh /usr/local/bin/entrypoint.sh

RUN chmod +x /usr/local/bin/backup.sh /usr/local/bin/entrypoint.sh

ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]

We use the same postgres:16-alpine base image so that the pg_dump client version matches the server version exactly. Client/server version mismatches -- particularly an older pg_dump against a newer server -- are a common cause of failed dumps.

We've found that using supercronic instead of system cron eliminates the PID 1 and environment variable problems that plague cron in containers. It runs in the foreground, inherits all environment variables, and outputs to stdout -- exactly what Docker expects.

Step 3: Entrypoint Script

#!/bin/bash
set -euo pipefail

# Generate crontab from environment variable
echo "${BACKUP_SCHEDULE} /usr/local/bin/backup.sh" > /etc/crontab

echo "Backup sidecar started"
echo "Schedule: ${BACKUP_SCHEDULE}"
echo "Retention: ${BACKUP_RETENTION_DAYS} days"
echo "Target: ${PGHOST}:${PGPORT}/${PGDATABASE}"

# Run supercronic with the generated crontab
exec supercronic /etc/crontab

Step 4: Backup Script

#!/bin/bash
set -euo pipefail

TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups"
FILENAME="${PGDATABASE}_${TIMESTAMP}.sql.gz"
FILEPATH="${BACKUP_DIR}/${FILENAME}"

echo "=== Starting backup: ${FILENAME} ==="

# Run pg_dump with custom format for faster restore,
# or use plain SQL + gzip for portability
pg_dump \
  --no-owner \
  --no-privileges \
  --clean \
  --if-exists \
  --format=plain \
  | gzip > "${FILEPATH}"

FILESIZE=$(du -h "${FILEPATH}" | cut -f1)
echo "Backup created: ${FILENAME} (${FILESIZE})"

# Upload to S3-compatible storage
# (--endpoint-url is only passed when S3_ENDPOINT is set, so the same
# script works against plain AWS S3)
if [ -n "${S3_BUCKET:-}" ]; then
  aws s3 cp "${FILEPATH}" "s3://${S3_BUCKET}/pg-backups/${FILENAME}" \
    ${S3_ENDPOINT:+--endpoint-url "${S3_ENDPOINT}"} \
    --quiet
  echo "Uploaded to s3://${S3_BUCKET}/pg-backups/${FILENAME}"
fi

# Apply retention policy -- delete local backups older than N days
find "${BACKUP_DIR}" -name "*.sql.gz" -mtime +${BACKUP_RETENTION_DAYS} -delete
REMAINING=$(find "${BACKUP_DIR}" -name "*.sql.gz" | wc -l)
echo "Retention applied: ${REMAINING} backups remaining locally"

echo "=== Backup complete ==="

Key Decisions in This Script

--no-owner --no-privileges -- These flags make the backup portable. Without them, the restore will fail if the target database has different roles. This is the most common mistake in pg_dump workflows.

--clean --if-exists -- These flags add DROP ... IF EXISTS statements before each CREATE, making the dump idempotent. You can restore over an existing database without manual cleanup.

Plain format + gzip vs. custom format -- Plain SQL is human-readable and works with psql. Custom format (-Fc) supports parallel restore and selective table restore. For databases under 10GB, plain + gzip is simpler. Above that, switch to custom format.
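
That decision can be encoded directly in a backup script. A sketch of the selection logic (build_dump_cmd is a hypothetical helper; it prints the command rather than executing it, and the 10GB threshold mirrors the guideline above):

```shell
#!/bin/bash
# Pick a dump command based on database size in GB. Prints rather than
# executes, so the decision logic is visible without a live server.
build_dump_cmd() {
  local db="$1" size_gb="$2"
  if [ "$size_gb" -ge 10 ]; then
    # custom format: compressed archive, restorable in parallel with
    # pg_restore --jobs, supports selective table restore
    echo "pg_dump --format=custom --file=${db}.dump ${db}"
  else
    # plain SQL piped through gzip: human-readable, restores with psql
    echo "pg_dump --format=plain ${db} | gzip > ${db}.sql.gz"
  fi
}
```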

S3-compatible storage -- The aws CLI works with any S3-compatible service: AWS S3, Backblaze B2, Cloudflare R2, MinIO, Hetzner Object Storage. Set the endpoint URL and it just works.

[INTERNAL-LINK: environment variable management -> /blog/how-to-encrypt-environment-variables-at-rest]


How Do You Test Your Backups?

An untested backup isn't a backup -- it's a file that might contain your data. According to Veeam's Data Protection Trends report (2024), 58% of backup restores fail on the first attempt due to corruption, version mismatches, or configuration errors. Testing restores regularly is the only way to know your backups actually work.

Restore to a Test Container

Spin up a temporary PostgreSQL container, restore your latest backup, and verify the data:

#!/bin/bash
set -euo pipefail

LATEST_BACKUP=$(ls -t /backups/*.sql.gz | head -1)
echo "Testing restore of: ${LATEST_BACKUP}"

# Start a temporary PostgreSQL container
docker run -d \
  --name pg_restore_test \
  -e POSTGRES_DB=restore_test \
  -e POSTGRES_USER=test \
  -e POSTGRES_PASSWORD=test \
  postgres:16-alpine

# Wait for it to be ready
until docker exec pg_restore_test pg_isready -U test; do
  sleep 1
done

# Restore the backup
gunzip -c "${LATEST_BACKUP}" | \
  docker exec -i pg_restore_test \
  psql -U test -d restore_test --quiet

# Verify: check table counts
TABLE_COUNT=$(docker exec pg_restore_test \
  psql -U test -d restore_test -t -c \
  "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public'")

echo "Tables restored: ${TABLE_COUNT}"

# Verify: check row count of a critical table
ROW_COUNT=$(docker exec pg_restore_test \
  psql -U test -d restore_test -t -c \
  "SELECT count(*) FROM users" 2>/dev/null || echo "N/A")

echo "Users table rows: ${ROW_COUNT}"

# Cleanup
docker rm -f pg_restore_test
echo "Restore test complete"

Automate the Restore Test

Add a weekly restore test to your backup sidecar's crontab. If the restore fails, send an alert:

# In entrypoint.sh, add a weekly test schedule
echo "${BACKUP_SCHEDULE} /usr/local/bin/backup.sh" > /etc/crontab
echo "0 3 * * 0 /usr/local/bin/test-restore.sh" >> /etc/crontab

The test runs every Sunday at 3 AM. It restores the latest backup to a temporary container, runs verification queries, and exits with a non-zero code if anything fails. Your container monitoring picks up the failure.

Measure Your RTO

Recovery Time Objective (RTO) is how long it takes to restore from backup and resume serving traffic. Measure it:

  1. Time the full restore process -- download from S3, decompress, restore, verify
  2. Add application startup time
  3. Add DNS propagation if you're switching hosts

For a 5GB database on a decent VPS, expect roughly:

  • S3 download: 30-60 seconds
  • Decompression: 10-20 seconds
  • Restore with psql: 2-5 minutes
  • Application startup: 30-60 seconds
  • Total RTO: approximately 3-8 minutes

Document this number. Share it with your team. If it's too high, consider WAL archiving for faster incremental recovery.


What Are the Most Common Backup Mistakes?

These mistakes show up repeatedly in post-mortems and forum threads. Every one of them has caused real data loss for real teams. Avoid all seven.

The pattern we've noticed across dozens of Docker PostgreSQL setups is that backup creation rarely fails. The failures happen at restore time -- weeks or months later, when the backup turns out to be incomplete, incompatible, or unreachable. That's why testing matters more than scheduling.

Mistake 1: Backing Up to the Same Disk

Your backup file sits in a Docker volume on the same disk as your database. Disk dies, you lose both. This is the most common mistake and the easiest to fix -- upload to S3 or any off-site storage after every dump.

Mistake 2: Never Testing Restores

You've been running pg_dump for six months. The backups are 0 bytes because PGPASSWORD wasn't set. Or the custom-format archive was written by a newer pg_dump than the pg_restore on your recovery machine, which refuses to read it. You don't find out until the moment you need the backup most.
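
A cheap guard against the zero-byte failure mode is to validate every dump right after it's written. A sketch (check_dump is our name for the helper, not a standard tool):

```shell
#!/bin/bash
# Validate a freshly written dump: must be non-empty and readable gzip.
check_dump() {
  local f="$1"
  if [ ! -s "$f" ]; then
    echo "ERROR: backup $f is empty or missing" >&2
    return 1
  fi
  if ! gzip -t "$f" 2>/dev/null; then
    echo "ERROR: backup $f is not valid gzip" >&2
    return 1
  fi
  echo "OK: $f"
}
```

Call it at the end of backup.sh so a bad dump fails the cron job loudly instead of silently accumulating garbage.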

Mistake 3: Missing --no-owner

Without --no-owner, the restore tries to set ownership to the original database roles. If those roles don't exist on the target, every statement produces a warning and some objects may not be created correctly. Always use --no-owner --no-privileges for portable backups.

Mistake 4: Forgetting PGPASSWORD

pg_dump prompts for a password interactively. In a cron job or Docker container, there's no TTY. The dump hangs silently or fails immediately. Set PGPASSWORD as an environment variable or use a .pgpass file. Always test that the backup actually runs without interaction.
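
If you prefer a file over an environment variable, libpq reads ~/.pgpass. A sketch with placeholder values (the file must be chmod 600 or libpq ignores it):

```
# ~/.pgpass -- one connection per line: hostname:port:database:username:password
postgres:5432:myapp:myapp:change-me
```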

Mistake 5: Running Backups During Peak Hours

pg_dump takes a snapshot, which means it holds a transaction open for the duration of the dump. On a busy database, this can block autovacuum, increase table bloat, and slow down queries. Schedule backups during your lowest-traffic window.

Mistake 6: No Retention Policy

Backups accumulate. A 2GB daily dump produces 60GB per month. Without a retention policy, you'll fill your disk or your S3 bucket. Keep 7 daily, 4 weekly, and 3 monthly backups. Delete the rest.
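
The daily tier of that policy is a few lines of shell. A sketch (apply_daily_retention is a hypothetical helper; weekly and monthly tiers would copy selected dumps into their own prefixes before this prune runs):

```shell
#!/bin/bash
# Keep the newest $2 dumps in directory $1, delete the rest (daily tier only).
apply_daily_retention() {
  local dir="$1" keep="$2"
  # ls -1t sorts newest first; tail skips the files we keep
  ls -1t "$dir"/*.sql.gz 2>/dev/null | tail -n +"$((keep + 1))" | while read -r f; do
    rm -f -- "$f"
  done
}
```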

Mistake 7: No Alerting on Failure

Your backup script fails silently. Nobody checks the logs. Three months later you discover the last successful backup was from January. Add a health check endpoint, a dead man's switch (like Healthchecks.io), or at minimum a || curl -X POST your-alert-webhook at the end of the script.
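
The pattern is simple enough to sketch. Here the hook URLs are injected as strings and merely printed, so the wiring is visible without a live webhook (run_with_alert and the example URLs are our inventions):

```shell
#!/bin/bash
# Run a backup command; ping a dead man's switch on success, alert on failure.
# Hooks are printed here; real use would be: curl -fsS "$ok_hook" >/dev/null
run_with_alert() {
  local cmd="$1" ok_hook="$2" fail_hook="$3"
  if sh -c "$cmd"; then
    echo "PING ${ok_hook}"
  else
    echo "ALERT ${fail_hook}" >&2
    return 1
  fi
}
```

From the crontab line, something like run_with_alert "/usr/local/bin/backup.sh" "$OK_URL" "$FAIL_URL" turns a silent failure into a missed heartbeat your monitoring can see.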

[INTERNAL-LINK: monitoring and alerting -> /blog/how-to-set-up-opentelemetry-tracing]


How Does Temps Handle Database Backups?

Temps includes a built-in backup system for managed databases, using the same pg_dump sidecar pattern described above -- but automated through the platform dashboard. According to DB-Engines (2025), PostgreSQL has held the number-one spot among relational databases in developer popularity for three consecutive years, which is why Temps treats PostgreSQL backup as a first-class feature rather than an add-on.

What Temps Automates

When you create a managed PostgreSQL database on Temps, the platform automatically provisions a backup sidecar container alongside it. Here's what's included out of the box:

  • Scheduled pg_dump -- Configurable interval (hourly, every 6 hours, daily) set from the dashboard
  • Compression and encryption -- Backups are gzip-compressed and encrypted at rest
  • Off-site upload -- Backups are stored on separate infrastructure from your database
  • Retention policy -- Configurable from the dashboard, defaults to 30 days
  • One-click restore -- Select a backup from the dashboard and restore to a new or existing database
  • Backup status monitoring -- The dashboard shows last backup time, size, and success/failure status
  • Alerts -- Notifications if a backup fails or hasn't completed in the expected window

What You'd Otherwise Build Yourself

Everything in the Docker Compose setup above -- the sidecar container, the backup script, the cron schedule, the S3 upload, the retention cleanup, the restore testing -- Temps handles automatically. For teams that don't want to maintain backup infrastructure, it removes a meaningful operational burden.

The backup configuration is accessible through the Temps dashboard under each database's settings. No YAML editing, no SSH, no manual crontab management.

[INTERNAL-LINK: managed databases on Temps -> /docs/databases]


Frequently Asked Questions

How often should you back up PostgreSQL in Docker?

It depends on your Recovery Point Objective -- how much data you can afford to lose. For most applications, every 6 hours provides a reasonable balance between storage costs and data safety. High-write applications should back up hourly or implement WAL archiving for continuous protection. The Veeam Data Protection Trends report (2024) found that 58% of restores fail on the first attempt, so frequency matters less than actually testing your restores.

Can you use pg_dump on a running PostgreSQL database?

Yes. pg_dump uses PostgreSQL's MVCC (Multi-Version Concurrency Control) to take a consistent snapshot without locking the database for writes. It holds an ACCESS SHARE lock, which doesn't block normal operations. For very large databases under heavy write load, schedule dumps during low-traffic periods to minimize the impact on autovacuum and replication lag.

What's the difference between pg_dump and pg_dumpall?

pg_dump backs up a single database. pg_dumpall backs up every database in the PostgreSQL cluster plus global objects like roles and tablespaces. In Docker, where you typically run one database per container, pg_dump is usually sufficient. Use pg_dumpall if your container hosts multiple databases or if you need to preserve role definitions across restores.

Should you use pg_dump custom format or plain SQL?

Plain SQL (--format=plain) produces human-readable output that restores with psql. Custom format (--format=custom) produces a compressed binary archive that restores with pg_restore, supports parallel restore jobs, and allows selective table restoration. For databases under 10GB, plain SQL with gzip is simpler. Above 10GB, custom format's parallel restore can reduce your RTO significantly -- PostgreSQL documentation recommends using --jobs with custom format to match available CPU cores.


Wrapping Up

Docker volumes are not backups. They're storage. The distinction matters the moment something goes wrong -- and with 1.7% annual drive failure rates (Backblaze, 2024), something eventually will.

The setup isn't complicated. A sidecar container running pg_dump on a cron schedule, compressing the output, uploading to off-site storage, and cleaning up old dumps covers the vast majority of PostgreSQL backup needs. The whole thing fits in a single Docker Compose file and a bash script.

What separates teams that recover from data loss and teams that don't isn't the backup tool. It's whether they tested their restores. Automate that too.

If you'd rather skip the infrastructure management entirely, Temps handles PostgreSQL backups automatically with built-in scheduling, retention, and one-click restore. But whether you build it yourself or use a platform, the important thing is to have backups running before you need them.

[INTERNAL-LINK: get started with Temps -> /docs/getting-started]

#postgresql #backup #docker #pg_dump #disaster-recovery #self-hosted #postgresql-backup-docker