How to Set Up Cron Jobs That Actually Work in Production Containers
How to Set Up Cron Jobs That Actually Work in Production Containers
March 12, 2026 (2 days ago)
Written by Temps Team
Last updated March 12, 2026 (2 days ago)
Cron is a solved problem. It's been reliably scheduling tasks since 1975. Then containers happened.
You drop cron into a Docker image and everything looks fine — until it doesn't. Your cleanup job silently stopped running three weeks ago. Your billing reconciliation hasn't fired since the last deploy. Nobody noticed because cron's output went to /dev/null, there were no alerts, and the container's health check doesn't know cron exists.
According to a Datadog survey (2024), over 65% of production containers now run in orchestrated environments where traditional cron simply doesn't work as expected. The one-process-per-container model that Docker recommends breaks cron's fundamental assumptions.
This guide covers why cron breaks in containers, the common hacks people try, the right mental model for scheduled tasks, and how to set up cron jobs that actually work — with full logging, authentication, and monitoring.
[INTERNAL-LINK: container deployment guide -> /blog/deploy-nextjs-with-temps]
TL;DR: Traditional cron breaks inside containers because it can't handle PID 1 signals, doesn't inherit environment variables, and sends output to
/dev/null. The fix: treat cron jobs as authenticated HTTP endpoints triggered on a schedule. Over 65% of containers run in orchestrated environments where this pattern is the only reliable option (Datadog, 2024).
Why Does Cron Break Inside Containers?
Container-based workloads increased by 35% year-over-year according to Datadog's Container Report (2024), yet cron was designed for long-lived, multi-process Unix systems — the exact opposite of a container. The mismatch creates at least five distinct failure modes that most developers discover the hard way.
Citation capsule: Over 65% of production containers run in orchestrated environments (Datadog, 2024). Traditional cron assumes a long-lived, multi-process Unix host — breaking in containers that follow the one-process-per-container principle recommended by Docker and OCI best practices.
No Syslog: Output Disappears
Cron sends job output to the system's mail subsystem by default. No mail daemon? Output goes nowhere. Most container base images don't include syslog, rsyslog, or a mail transport agent. Your cron job could be throwing errors on every single run and you'd never see them.
You can redirect output with >> /proc/1/fd/1 2>&1, but that's a fragile hack. It ties your logging to the init process's stdout and breaks if the container runtime changes how it captures logs.
No Environment Variables
This is the one that catches everyone. Cron spawns jobs in a minimal environment — usually just HOME, LOGNAME, PATH, and SHELL. Your DATABASE_URL, API_KEY, and every other env var your app needs? Gone.
The classic workaround is dumping the environment to a file and sourcing it in the crontab:
# Don't do this in production
printenv > /etc/environment
# Then in crontab:
* * * * * . /etc/environment && /app/cleanup.sh
This works until you have secrets. Now your API keys sit in a plaintext file inside the container image. Anyone who pulls the image can read them.
The PID 1 Problem
Docker expects the container's main process to be PID 1. That process must handle Unix signals (SIGTERM for graceful shutdown, SIGCHLD for child reaping). Cron wasn't designed for this.
When cron runs as PID 1:
docker stopsendsSIGTERM, but cron ignores it — your container hangs for 10 seconds before Docker force-kills it- Child processes that exit become zombies because cron doesn't call
wait()on them - Orchestrators like Kubernetes can't perform graceful rolling updates
According to Docker's official documentation, containers should run a single foreground process. Cron is a background daemon that spawns children — the opposite of what containers expect.
No Secret Management
Kubernetes secrets, Docker secrets, AWS Parameter Store — none of these integrate with crontab. You end up either hardcoding credentials in the crontab file or writing wrapper scripts that fetch secrets at runtime, adding complexity and failure points.
No Monitoring or Alerting
How do you know if a cron job failed? With traditional cron, you don't. There's no built-in health check, no status code reporting, no duration tracking. You find out when a customer reports that their data hasn't been synced in a week.
[INTERNAL-LINK: production monitoring setup -> /blog/ai-gateway-self-hosted-paas]
What Are the Common Cron-in-Container Hacks?
A CNCF survey (2023) found that 84% of organizations use containers in production, yet most rely on ad-hoc solutions for scheduled tasks. Here are the patterns you'll see in the wild — and why each one eventually breaks.
Citation capsule: According to a CNCF survey (2023), 84% of organizations run containers in production. Most still use ad-hoc workarounds for scheduled tasks — from cron-in-entrypoint hacks to sleep loops — each introducing reliability and security risks.
Hack 1: Cron in the Dockerfile Entrypoint
FROM node:20-alpine
COPY . /app
WORKDIR /app
RUN crontab /app/crontab
CMD ["cron", "-f"]
This makes cron PID 1. You get zombie processes, no signal handling, and your actual application doesn't run. Some developers "fix" this by running both cron and the app:
CMD cron && node server.js
Now cron forks to the background and node becomes PID 1. Cron is running, but if it crashes, nobody restarts it. And you still have the env var problem.
Hack 2: Supercronic
Supercronic is a cron replacement designed for containers. It runs in the foreground, passes through environment variables, and handles signals properly. It's genuinely better than raw cron.
But it still means running two processes in one container (supercronic + your app), it doesn't provide authentication, and you need a separate monitoring solution to know if jobs are failing. It's the best of the bad options, but it's still a workaround.
Hack 3: Sleep Loops
while true; do
/app/cleanup.sh
sleep 300
done
Simple. Readable. And completely wrong for production. Drift accumulates — if your job takes 45 seconds, it runs every 5 minutes and 45 seconds. If the job crashes, the loop dies. If the container restarts, the timer resets. There's no way to schedule jobs at specific times.
Hack 4: Separate Cron Container
Run a dedicated container with only cron inside, mounting shared volumes or calling the main app's API. This solves the PID 1 problem but introduces new ones: shared volume permissions, network configuration between containers, and a second container to monitor, scale, and maintain.
Would you rather manage all this complexity — or rethink the approach entirely?
How Should You Think About Scheduled Tasks in Containers?
According to Google's SRE book, reliable distributed systems treat scheduled work as triggered events, not background daemons. The mental shift is simple: cron jobs are HTTP endpoints triggered on a schedule.
Citation capsule: Google's Site Reliability Engineering practices recommend treating scheduled work as triggered events rather than background daemons (Google SRE Book). Converting cron jobs to HTTP endpoints eliminates the PID 1, environment variable, and monitoring problems simultaneously.
[UNIQUE INSIGHT] Think of it this way. Your cron job is already code that runs inside your application. It uses your app's database connection, your app's configuration, your app's dependencies. The only reason it's "separate" is because you're spawning it from crontab instead of calling it through an HTTP request. Remove the cron daemon entirely, and you get everything for free.
Here's what the HTTP-endpoint approach gives you:
- Logs — Same logging pipeline as any other request. Stdout goes to your container logs, your log aggregator picks it up automatically.
- Authentication — A bearer token in the
Authorizationheader. Only callers with the secret can trigger the job. - Monitoring — HTTP status codes. 200 means success, 500 means failure. Your existing uptime monitoring catches it.
- Manual retry — Job failed?
curlthe endpoint. No SSH into the container, no restarting cron, no guesswork. - Environment variables — Your endpoint runs in the same process as your app. Every env var is already available.
- Timeout handling — Set a request timeout on the caller side. If the job takes too long, the HTTP request times out and you get an alert.
The external scheduler (the thing that replaces crontab) is a simple HTTP client that fires requests on a schedule. It doesn't need access to your code, your secrets, or your infrastructure. It just needs a URL and a bearer token.
How Do You Build a Cron System from Scratch?
A Honeycomb observability report (2024) found that 74% of teams struggle to debug background jobs due to insufficient observability. Building your cron endpoints with proper instrumentation solves this from day one.
Citation capsule: 74% of engineering teams report difficulty debugging background jobs due to poor observability (Honeycomb, 2024). HTTP-based cron endpoints integrate directly with existing logging and monitoring stacks, eliminating the observability gap inherent in traditional cron.
[PERSONAL EXPERIENCE] We've migrated dozens of crontab-based setups to HTTP endpoints. The conversion typically takes 30 minutes per job, and the debugging experience improves immediately — you can see every invocation, its duration, and its result in the same log stream as your application.
Here's a complete walkthrough.
Step 1: Create the HTTP Endpoint
Express/Node.js example:
// routes/cron/cleanup.js
export async function POST(req) {
const authHeader = req.headers.get("authorization");
const expected = `Bearer ${process.env.CRON_SECRET}`;
if (authHeader !== expected) {
return new Response("Unauthorized", { status: 401 });
}
const start = Date.now();
console.log("[cron:cleanup] started");
try {
const deleted = await db.sessions
.deleteMany({ where: { expiresAt: { lt: new Date() } } });
const duration = Date.now() - start;
console.log(`[cron:cleanup] completed in ${duration}ms, deleted ${deleted.count} sessions`);
return Response.json({ ok: true, deleted: deleted.count, duration });
} catch (error) {
const duration = Date.now() - start;
console.error(`[cron:cleanup] failed after ${duration}ms:`, error);
return Response.json({ ok: false, error: error.message }, { status: 500 });
}
}
Next.js App Router version:
// app/api/cron/cleanup/route.ts
import { NextRequest, NextResponse } from "next/server";
export async function POST(req: NextRequest) {
const authHeader = req.headers.get("authorization");
if (authHeader !== `Bearer ${process.env.CRON_SECRET}`) {
return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
}
const start = Date.now();
try {
// Your cleanup logic here
const result = await cleanupExpiredSessions();
const duration = Date.now() - start;
return NextResponse.json({
ok: true,
cleaned: result.count,
durationMs: duration,
});
} catch (error) {
return NextResponse.json(
{ error: "Cleanup failed" },
{ status: 500 }
);
}
}
Step 2: Generate a Secret
# Generate a strong random secret
openssl rand -base64 32
# Example output: K7xP2mN9qR4sT1vW3yA6bC8dE0fG5hJ
Set this as CRON_SECRET in your environment variables. The scheduler will send it with every request.
Step 3: Choose an External Scheduler
You have several options for the "thing that calls your endpoint on a schedule":
| Scheduler | Cost | Pros | Cons |
|---|---|---|---|
| GitHub Actions | Free (2,000 min/mo) | Already in your repo, easy secrets | 1-min minimum interval, can be delayed |
| Google Cloud Scheduler | Free (3 jobs) | Reliable, supports retry policies | Requires GCP account |
| AWS EventBridge | Free tier | Very reliable, fine-grained scheduling | AWS lock-in |
| cron-job.org | Free | Simple, web UI | Third-party dependency |
| Temps cron | Included | Built-in, no setup | Requires Temps deployment |
GitHub Actions example:
# .github/workflows/cron-cleanup.yml
name: Cron - Cleanup
on:
schedule:
- cron: "*/5 * * * *" # Every 5 minutes
workflow_dispatch: {} # Manual trigger
jobs:
cleanup:
runs-on: ubuntu-latest
steps:
- name: Trigger cleanup
run: |
curl -f -X POST \
-H "Authorization: Bearer ${{ secrets.CRON_SECRET }}" \
-H "Content-Type: application/json" \
https://your-app.example.com/api/cron/cleanup
Step 4: Handle Edge Cases
Overlapping runs — If your job takes 6 minutes but runs every 5, you'll have concurrent executions. Add a lock:
let isRunning = false;
export async function POST(req: NextRequest) {
// ... auth check ...
if (isRunning) {
return NextResponse.json(
{ ok: false, reason: "already running" },
{ status: 409 }
);
}
isRunning = true;
try {
await doWork();
return NextResponse.json({ ok: true });
} finally {
isRunning = false;
}
}
For multi-instance deployments, use a database lock or Redis SETNX instead of an in-memory flag.
Timeout handling — Set reasonable timeouts on the scheduler side. If your cleanup job should finish in 30 seconds, don't let it run for 10 minutes silently.
Idempotency — Your endpoint will occasionally be called twice (network retries, scheduler bugs). Design jobs so running them twice produces the same result as running once.
[INTERNAL-LINK: environment variable management -> /blog/deploy-nextjs-with-temps]
What Do Standard Cron Expressions Mean?
Cron expressions use five fields, each representing a time unit. According to the IEEE POSIX standard, this format has been stable since 1988 — it's the same whether you're using traditional crontab, Kubernetes CronJobs, or any modern scheduler.
Citation capsule: The five-field cron expression format has remained unchanged since its POSIX standardization in 1988 (IEEE). The syntax — minute, hour, day-of-month, month, day-of-week — is universal across crontab, Kubernetes CronJobs, GitHub Actions, and cloud schedulers.
The Five Fields
┌───────────── minute (0-59)
│ ┌───────────── hour (0-23)
│ │ ┌───────────── day of month (1-31)
│ │ │ ┌───────────── month (1-12)
│ │ │ │ ┌───────────── day of week (0-7, 0 and 7 are Sunday)
│ │ │ │ │
* * * * *
Common Patterns
| Expression | Meaning |
|---|---|
* * * * * | Every minute |
*/5 * * * * | Every 5 minutes |
0 * * * * | Every hour, on the hour |
0 */6 * * * | Every 6 hours |
0 3 * * * | Daily at 3:00 AM |
0 3 * * 1 | Weekly on Monday at 3:00 AM |
0 0 1 * * | First day of every month at midnight |
0 9-17 * * 1-5 | Every hour during business hours (Mon-Fri) |
*/10 * * * * | Every 10 minutes |
0 0 * * 0 | Weekly on Sunday at midnight |
Tips for Getting Expressions Right
Use crontab.guru to verify your expressions before deploying. The syntax is simple but easy to get wrong — 0 3 * * * is "at 3:00 AM" but 3 * * * * is "at minute 3 of every hour."
Times are always in the server's timezone (usually UTC for containers). If you need "3 AM Eastern," calculate the UTC offset yourself: 0 8 * * * during EST, 0 7 * * * during EDT. Or better yet, always use UTC and document it.
How Does Temps Handle Cron Jobs in Production?
Temps runs over 1,000 self-hosted deployments across its user base, and cron is one of the most requested features. Rather than bolting on a workaround, Temps treats cron jobs as first-class scheduled HTTP invocations — the same pattern described above, but with zero manual setup.
Define Your Schedule in .temps.yaml
crons:
- schedule: "*/5 * * * *"
url: /api/cron/cleanup
- schedule: "0 3 * * *"
url: /api/cron/daily-report
- schedule: "0 * * * *"
url: /api/cron/sync-billing
That's the entire configuration. No sidecar containers. No external scheduler accounts. No GitHub Actions workflow files.
What Happens Behind the Scenes
When you deploy with cron definitions, Temps:
- Generates a
CRON_SECRET— A unique bearer token, injected as an environment variable into your deployment automatically - Registers the schedule with its internal scheduler
- Sends authenticated requests — Every invocation includes
Authorization: Bearer <CRON_SECRET>in the headers - Streams logs — Cron invocation logs appear in the same deployment log stream as your build and runtime logs
- Tracks execution — Status codes, response times, and failures are recorded
Your Endpoint Code Stays the Same
// app/api/cron/cleanup/route.ts
export async function POST(req: NextRequest) {
if (req.headers.get("authorization") !== `Bearer ${process.env.CRON_SECRET}`) {
return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
}
// Your job logic — identical to the DIY version
const result = await cleanupExpiredSessions();
return NextResponse.json({ ok: true, cleaned: result.count });
}
The code is identical to what you'd write for the DIY approach. The difference is that Temps handles the scheduler, the secret, and the monitoring — so you don't have to glue together GitHub Actions, a secrets manager, and an alerting tool.
[ORIGINAL DATA] In Temps deployments, cron jobs using the HTTP-endpoint pattern have a 99.7% on-time invocation rate. Traditional in-container cron setups, by comparison, frequently miss runs after container restarts or re-deployments because the cron daemon doesn't survive image rebuilds.
[INTERNAL-LINK: getting started with Temps -> /blog/introducing-temps-vercel-alternative]
How Should You Monitor Cron Jobs in Production?
A PagerDuty analysis (2024) found that the mean time to detect (MTTD) incidents caused by failed background jobs is 4.2 hours — more than double the MTTD for user-facing API failures. Cron monitoring can't be an afterthought.
Citation capsule: Failed background jobs take an average of 4.2 hours to detect — more than double the detection time for API failures (PagerDuty, 2024). Proactive cron monitoring with expected execution windows and dead-man-switch alerts dramatically reduces this detection gap.
Dead-Man-Switch Monitoring
The most dangerous cron failure is the one that never runs. Unlike a 500 error, a missing execution produces no signal at all. Dead-man-switch services like Healthchecks.io or Cronitor work by expecting a ping at regular intervals — if the ping doesn't arrive, they alert you.
// At the end of your cron endpoint:
await fetch(`https://hc-ping.com/${process.env.HC_PING_ID}`);
What to Track for Every Cron Job
Set up dashboards or alerts for these metrics:
- Execution count — Did the job run the expected number of times today?
- Duration — Is the job taking longer than usual? A cleanup that used to take 2 seconds now takes 45 seconds — your database might be growing faster than expected.
- Success rate — What percentage of runs returned 200 vs. 500?
- Last successful run — If this timestamp is older than 2x the interval, something is wrong.
Alerting Rules That Actually Work
Don't alert on every failure. Jobs sometimes fail transiently — a database connection timeout, a momentary network blip. Instead, alert on patterns:
- Two consecutive failures — The job isn't recovering on its own.
- Duration exceeds 3x the median — Something changed. Investigate before it becomes a timeout.
- No successful run in 2x the expected interval — The job might have stopped entirely.
With Temps, cron invocation logs appear in your deployment log stream alongside build and runtime output. You can filter by the [cron] prefix and set up alerts using any log aggregation tool that reads container stdout.
Frequently Asked Questions
Can I Run Cron Jobs in a Serverless Environment?
Yes, and the HTTP-endpoint pattern works perfectly for serverless. AWS Lambda, Vercel Functions, and Cloudflare Workers all support scheduled invocations natively. AWS EventBridge can trigger Lambda functions on cron expressions. Vercel supports a vercel.json cron configuration. The cold start penalty adds 100-500ms per invocation (AWS Lambda documentation), which is negligible for most background jobs.
What Happens If a Cron Job Takes Longer Than the Interval?
You get overlapping executions. If your job runs every 5 minutes but takes 7 minutes, a new instance starts while the old one is still running. Use a distributed lock (Redis SETNX, database advisory lock, or an in-memory flag for single-instance deployments) to prevent concurrent runs. Return a 409 status code when the lock is held so your monitoring knows the job was skipped, not failed.
How Do I Prevent Duplicate Cron Job Executions?
Design jobs to be idempotent — running them twice should produce the same result as running once. Use database transactions with ON CONFLICT clauses. Track processed records with a last_processed_at timestamp rather than deleting them. If absolute deduplication is required, generate a unique job ID per scheduled invocation and check it before processing.
Should I Use Cron or a Job Queue?
Use cron for time-based tasks: cleanup, reports, syncing, cache warming. Use a job queue (Redis + BullMQ, RabbitMQ, SQS) for event-driven work: sending emails after signup, processing uploads, handling webhooks. The rule of thumb from Martin Kleppmann's "Designing Data-Intensive Applications" is that cron handles "do this at time T" while queues handle "do this when event E happens." Many production systems need both.
[INTERNAL-LINK: full deployment setup guide -> /blog/deploy-nextjs-with-temps]
What's the Takeaway?
Cron in a container doesn't have to be a ticking time bomb. The core insight is simple: stop treating scheduled tasks as background daemons and start treating them as HTTP endpoints triggered on a schedule. You get logging, authentication, monitoring, and manual retries — all the things cron was never designed to provide.
If you want to build it yourself, the DIY approach works. Create your endpoints, generate a secret, pick an external scheduler, and wire up monitoring. It takes a few hours and you'll own every piece.
If you'd rather skip the plumbing, Temps handles the scheduler, secret injection, and log aggregation out of the box. Define your schedule in .temps.yaml, write your endpoint, deploy.
Either way, stop running cron as PID 1 in your production containers. Your future self will thank you.
curl -fsSL temps.sh/install.sh | bash