Docker Init Systems — PID 1, Signal Handling, and Zombie Processes
You run docker stop myapp and wait. After 10 seconds, Docker force-kills the container. Your application never received the shutdown signal, never flushed its write buffers, never closed database connections. Running docker top on a long-running container reveals dozens of zombie processes consuming PIDs. Both problems share the same root cause: your application is running as PID 1 in the container, and it was never designed for that responsibility.
The PID 1 Problem
In Linux, PID 1 is special. It is the init process — the parent of all other processes. The kernel treats PID 1 differently from every other process:
- Signals are not delivered by default. For signals whose default action would terminate the process, the kernel only delivers them to PID 1 if PID 1 has explicitly registered a handler. If your app has no SIGTERM handler, docker stop has no effect for 10 seconds until Docker sends SIGKILL.
- Zombie reaping. When a child process exits, its parent must call wait() to clean up the zombie entry. Orphaned processes are re-parented to PID 1, so if PID 1 does not reap zombies, they accumulate for the life of the container.
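To make the reaping duty concrete, here is a minimal Python sketch (Linux/Unix only, Python 3.9+) of the non-blocking wait loop an init process runs when children exit. The function name reap_children is hypothetical, and this illustrates the idea rather than how any particular init is implemented.

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) tuples, one per reaped zombie.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns at once if none has exited.
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, os.waitstatus_to_exitcode(status)))
    return reaped
```

A process that never runs a loop like this (or an equivalent SIGCHLD handler) leaves every exited child behind as a defunct entry in the process table.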
# Inside a container — your app is PID 1
docker exec myapp ps aux
# PID USER COMMAND
# 1 root node server.js <-- PID 1, no init system
# 15 root /bin/sh -c ...
# 20 root [defunct] <-- zombie process!
# 21 root [defunct] <-- another zombie!
Why Your Container Ignores Ctrl+C
When you run a container interactively and press Ctrl+C, Docker sends SIGINT to PID 1. But if PID 1 has not registered a SIGINT handler, the kernel silently drops the signal.
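# This container will not respond to Ctrl+C: sleep registers no SIGINT handler
docker run --rm alpine sleep 300
# Press Ctrl+C: nothing happens
# You have to open another terminal and run: docker stop <container>
# Note: a plain Python script would exit here, because CPython installs its own
# SIGINT handler (KeyboardInterrupt). It still ignores SIGTERM when running as PID 1.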
The same applies to docker stop, which sends SIGTERM. If PID 1 has no SIGTERM handler, Docker waits for the stop timeout (default 10 seconds), then sends SIGKILL — an ungraceful, immediate termination.
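Whether a signal reaches PID 1 therefore depends on whether the process has installed a handler. As a small illustration, the following Python sketch inspects the current process's own signal dispositions. The helper name describe_disposition is hypothetical and only wraps the standard signal.getsignal call.

```python
import signal


def describe_disposition(signum):
    """Report how the current process is set up to react to a signal."""
    handler = signal.getsignal(signum)
    if handler is None:
        return "unknown"  # a handler exists but was not installed from Python
    if handler == signal.SIG_DFL:
        return "default"  # as PID 1, a terminating default action is dropped
    if handler == signal.SIG_IGN:
        return "ignored"
    return "handled"      # a Python-level handler is registered


if __name__ == "__main__":
    for sig in (signal.SIGINT, signal.SIGTERM):
        print(sig.name, describe_disposition(sig))
```

In a freshly started CPython interpreter this typically reports SIGINT as handled (the interpreter installs a handler that raises KeyboardInterrupt) and SIGTERM as default, which is why an unmodified Python script running as PID 1 does not react to docker stop.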
Shell Form vs Exec Form
The way you write CMD and ENTRYPOINT in your Dockerfile determines whether your application runs as PID 1 or under a shell wrapper.
# SHELL FORM — runs under /bin/sh -c
# PID 1 is /bin/sh, NOT your application
CMD node server.js
# Actual: /bin/sh -c "node server.js"
# PID 1: /bin/sh
# PID 7: node server.js
# EXEC FORM — runs directly, no shell wrapper
# PID 1 IS your application
CMD ["node", "server.js"]
# PID 1: node server.js
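Shell form is worse because /bin/sh becomes PID 1 and does not forward signals to its child processes. When Docker sends SIGTERM, the shell receives it but does not pass it on to node server.js, so your application never knows it should shut down. Some shells replace themselves with the command when it is a single simple command, but that behavior varies by shell and version, so do not rely on it.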
Always use exec form for CMD and ENTRYPOINT.
# BAD — shell form
ENTRYPOINT python main.py
CMD gunicorn app:app
# GOOD — exec form
ENTRYPOINT ["python", "main.py"]
CMD ["gunicorn", "app:app"]
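One way to enforce this rule is a small check in CI. The sketch below is a hypothetical helper (find_shell_form and is_exec_form are names invented for illustration) that flags CMD and ENTRYPOINT lines not written as a JSON array. It is deliberately simplistic: it looks at one line at a time and ignores line continuations, so a CMD on the continuation line of a HEALTHCHECK instruction would be flagged as well.

```python
import json

INSTRUCTIONS = ("CMD", "ENTRYPOINT")


def is_exec_form(argument):
    """Exec form is a JSON array of strings, e.g. ["node", "server.js"]."""
    try:
        value = json.loads(argument)
    except ValueError:
        return False
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def find_shell_form(dockerfile_text):
    """Return (line_number, line) pairs for CMD/ENTRYPOINT lines in shell form."""
    findings = []
    for number, raw in enumerate(dockerfile_text.splitlines(), start=1):
        line = raw.strip()
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0].upper() not in INSTRUCTIONS:
            continue  # not a CMD/ENTRYPOINT instruction
        if not is_exec_form(parts[1]):
            findings.append((number, line))
    return findings
```

Running find_shell_form over the BAD example above would report both lines, while the GOOD example produces an empty list.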
Tini — A Proper Init for Containers
Tini is a tiny (< 1MB) init system designed specifically for containers. It runs as PID 1, forwards signals to your application, and reaps zombie processes.
# Use Docker's built-in --init flag (uses tini internally)
docker run -d --init --name myapp myapp:latest
# Now the process tree looks correct:
docker exec myapp ps aux
# PID USER COMMAND
# 1 root /sbin/docker-init -- node server.js <-- tini (bundled as docker-init) is PID 1
# 7 root node server.js <-- your app is PID 7
# Install tini in your Dockerfile
FROM node:20-alpine
# Alpine
RUN apk add --no-cache tini
# Debian/Ubuntu (the package installs /usr/bin/tini, so adjust the ENTRYPOINT path)
# RUN apt-get update && apt-get install -y tini && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]
With tini:
- SIGTERM from docker stop is forwarded to node server.js.
- Zombie processes are automatically reaped.
- Your application shuts down gracefully.
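To show what those duties amount to, here is a rough Python sketch of an init wrapper (Unix only, Python 3.9+). It is not how tini is implemented. The name run_as_init and the list of forwarded signals are assumptions made for illustration, and the sketch ignores details a real init handles, such as signals that arrive before the handlers are installed.

```python
import os
import signal

# Illustrative subset; a real init forwards nearly every signal.
FORWARDED_SIGNALS = (signal.SIGTERM, signal.SIGINT, signal.SIGHUP)


def run_as_init(argv):
    """Run argv as a child, forward signals to it, reap children, return its exit code."""
    child_pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)

    def forward(signum, _frame):
        try:
            os.kill(child_pid, signum)  # pass the signal on to the application
        except ProcessLookupError:
            pass  # the application has already exited

    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED_SIGNALS}
    try:
        while True:
            # Blocks until any child exits; orphans adopted by PID 1 are reaped here too.
            pid, status = os.waitpid(-1, 0)
            if pid == child_pid:
                return os.waitstatus_to_exitcode(status)
    finally:
        # Restore the handlers that were in place before the call.
        for sig, handler in previous.items():
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
```

A wrapper script could end with sys.exit(run_as_init(sys.argv[1:])). Note that the return value is negative when the child was killed by a signal, whereas tini reports 128 plus the signal number. In practice, prefer tini or --init over a hand-rolled wrapper.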
dumb-init Alternative
dumb-init (by Yelp) is another lightweight init system. It works identically to tini for most use cases.
FROM python:3.12-slim
RUN pip install dumb-init
WORKDIR /app
COPY . .
ENTRYPOINT ["dumb-init", "--"]
CMD ["python", "main.py"]
# Or download the binary directly
RUN wget -O /usr/local/bin/dumb-init https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64
RUN chmod +x /usr/local/bin/dumb-init
ENTRYPOINT ["dumb-init", "--"]
| Init System | Size | Signal Forwarding | Zombie Reaping | Notes |
|---|---|---|---|---|
| None (app as PID 1) | 0 | Only if app handles signals | Only if app calls wait() | Problematic for most apps |
| --init (Docker flag) | Built-in | Yes | Yes | Easiest, no Dockerfile change |
| tini | ~30 KB | Yes | Yes | Most popular, built into Docker |
| dumb-init | ~50 KB | Yes (signal rewriting) | Yes | Can rewrite signals |
Graceful Shutdown Patterns
Having the right init system is only half the solution. Your application must handle SIGTERM and shut down gracefully — close connections, flush buffers, finish in-progress requests.
Node.js
// server.js: graceful shutdown for Node.js
const http = require('http');
const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end('OK');
});
server.listen(3000, () => console.log('Listening on 3000'));
// Handle SIGTERM from docker stop
process.on('SIGTERM', () => {
  console.log('SIGTERM received. Shutting down gracefully...');
  server.close(() => {
    console.log('HTTP server closed. Exiting.');
    process.exit(0);
  });
  // Force exit after 5 seconds if connections won't close
  setTimeout(() => {
    console.error('Forced shutdown after timeout');
    process.exit(1);
  }, 5000);
});
// Also handle SIGINT (Ctrl+C in development)
process.on('SIGINT', () => {
  console.log('SIGINT received.');
  process.exit(0);
});
Python
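# main.py: graceful shutdown for Python
import signal
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler
server = HTTPServer(('0.0.0.0', 8000), SimpleHTTPRequestHandler)
def graceful_shutdown(signum, frame):
    print(f"Signal {signum} received. Shutting down...")
    # shutdown() blocks until serve_forever() returns. The handler runs in the
    # main thread, which is inside serve_forever(), so calling shutdown()
    # directly here would deadlock. Run it from a separate thread instead.
    threading.Thread(target=server.shutdown).start()
signal.signal(signal.SIGTERM, graceful_shutdown)
signal.signal(signal.SIGINT, graceful_shutdown)
print("Server starting on port 8000")
server.serve_forever()  # returns once shutdown() has been called
server.server_close()   # release the listening socket
print("Server stopped")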
Go
// main.go: graceful shutdown for Go
package main
import (
    "context"
    "log"
    "net/http"
    "os"
    "os/signal"
    "syscall"
    "time"
)
func main() {
    srv := &http.Server{Addr: ":8080"}
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("OK"))
    })
    // Start server in a goroutine
    go func() {
        log.Println("Server starting on :8080")
        if err := srv.ListenAndServe(); err != http.ErrServerClosed {
            log.Fatalf("Server error: %v", err)
        }
    }()
    // Wait for SIGTERM or SIGINT
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGTERM, syscall.SIGINT)
    <-quit
    log.Println("Shutting down gracefully...")
    // Give in-flight requests 10 seconds to complete
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()
    if err := srv.Shutdown(ctx); err != nil {
        log.Fatalf("Forced shutdown: %v", err)
    }
    log.Println("Server stopped")
}
Java (Spring Boot)
// Spring Boot has supported graceful shutdown since 2.3, but on versions where
// it is not the default it must be enabled explicitly:
// application.yml
// server:
//   shutdown: graceful
// spring:
//   lifecycle:
//     timeout-per-shutdown-phase: 30s
// For non-Spring Java applications:
public class Main {
    public static void main(String[] args) {
        // The JVM runs shutdown hooks when it receives SIGTERM or SIGINT
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("SIGTERM received. Cleaning up...");
            // Close database connections
            // Flush write buffers
            // Complete in-flight requests
            System.out.println("Shutdown complete.");
        }));
        // Your application logic (startServer() is a placeholder)
        startServer();
    }
}
Docker Stop Timeout
When docker stop is called, Docker sends SIGTERM and waits for the stop timeout before sending SIGKILL.
# Default timeout: 10 seconds
docker stop myapp
# Custom timeout: 30 seconds for applications that need more time
docker stop --time 30 myapp
# The Dockerfile cannot set the stop timeout, only the stop signal
# STOPSIGNAL SIGTERM is the default, but you can change it
STOPSIGNAL SIGTERM
# docker-compose.yml: set stop timeout per service
services:
  api:
    image: myapp:latest
    stop_grace_period: 30s # 30 seconds to shut down gracefully
  worker:
    image: myworker:latest
    stop_grace_period: 60s # Workers processing long jobs need more time
    stop_signal: SIGTERM
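A longer grace period only helps if the worker uses it to finish its current job. One common shape for this, sketched here in Python with hypothetical names (stop_requested, request_stop, run_worker), is a signal handler that only sets a flag, with the job loop checking the flag between jobs.

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: only set a flag; the worker loop does the real cleanup."""
    stop_requested.set()


def run_worker(jobs, handle_job):
    """Process jobs in order, finishing the current job before honouring a stop request."""
    completed = []
    for job in jobs:
        if stop_requested.is_set():
            break  # do not start new work once SIGTERM has arrived
        handle_job(job)
        completed.append(job)
    return completed


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, request_stop)
    signal.signal(signal.SIGINT, request_stop)
    done = run_worker(range(5), lambda job: print(f"processing job {job}"))
    print(f"completed {len(done)} jobs")
```

With this structure, the stop grace period should be at least as long as the slowest single job, since a job that is already running is allowed to complete before the loop exits.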
Health Check + Graceful Shutdown Combo
Combine health checks with graceful shutdown so the orchestrator stops sending traffic before the container begins its shutdown sequence.
FROM node:20-alpine
RUN apk add --no-cache tini curl
WORKDIR /app
COPY . .
HEALTHCHECK --interval=10s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]
// server.js: health endpoint that reflects shutdown state
// Assumes app is an Express application and server is the value returned by app.listen()
let isShuttingDown = false;
app.get('/health', (req, res) => {
  if (isShuttingDown) {
    res.status(503).json({ status: 'shutting_down' });
  } else {
    res.status(200).json({ status: 'healthy' });
  }
});
process.on('SIGTERM', () => {
  isShuttingDown = true; // Health check fails → load balancer stops traffic
  console.log('SIGTERM: health check will now return 503');
  // Wait a few seconds for load balancer to detect the unhealthy status
  setTimeout(() => {
    server.close(() => process.exit(0));
  }, 5000);
});
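This pattern supports zero-downtime deployments: the container marks itself unhealthy, the load balancer drains connections, and only then does the application close its listener and exit. It only works if the delay before server.close() is at least as long as the load balancer needs to notice the failing health check, and if the whole sequence fits inside the stop grace period. With the HEALTHCHECK above (10-second interval, 3 retries), Docker itself needs roughly 30 seconds of failed checks to mark the container unhealthy, so either the 5-second delay and the default 10-second stop timeout must be raised, or the load balancer needs its own, faster probe.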
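The timing constraints can be checked with simple arithmetic. The Python sketch below uses hypothetical helper names, and its detection estimate is a rough upper bound that assumes every probe fails and may run for its full timeout. It is not an exact model of Docker's scheduler.

```python
def detection_window(interval_s, retries, timeout_s=0.0):
    """Rough upper bound on the time until a container is marked unhealthy."""
    return retries * (interval_s + timeout_s)


def drain_delay_is_sufficient(drain_delay_s, interval_s, retries, timeout_s=0.0):
    """True if the app keeps serving long enough for the failing checks to register."""
    return drain_delay_s >= detection_window(interval_s, retries, timeout_s)


def fits_in_grace_period(drain_delay_s, close_timeout_s, grace_period_s):
    """True if draining plus closing connections finishes before SIGKILL."""
    return drain_delay_s + close_timeout_s <= grace_period_s


if __name__ == "__main__":
    # Values from the HEALTHCHECK above: --interval=10s --timeout=3s --retries=3
    print(detection_window(10, 3, 3))
    print(drain_delay_is_sufficient(5, 10, 3, 3))
```

For the HEALTHCHECK values above, the window comes out at 39 seconds, so a 5-second drain delay is too short if the load balancer relies on Docker's health status. Whatever delay you settle on, the stop grace period has to cover it plus the time needed to close connections.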
Wrapping Up
The PID 1 problem catches everyone eventually — that first time docker stop takes exactly 10 seconds and your application did not log "shutting down." The fix is straightforward: use --init or tini as your ENTRYPOINT, always use exec form for CMD, implement SIGTERM handlers in your application code, and set an appropriate stop grace period. These four steps give you clean shutdowns, no zombie processes, and zero-downtime deployments.
In the next post, we will cover Docker Overlay Networks — how containers communicate across multiple hosts using VXLAN tunneling, encrypted overlays, and the ingress network.
