Docker Init Systems — PID 1, Signal Handling, and Zombie Processes

8 min read
Goel Academy
DevOps & Cloud Learning Hub

You run docker stop myapp and wait. After 10 seconds, Docker force-kills the container. Your application never received the shutdown signal, never flushed its write buffers, never closed database connections. Running docker top on a long-running container reveals dozens of zombie processes consuming PIDs. Both problems share the same root cause: your application is running as PID 1 in the container, and it was never designed for that responsibility.

The PID 1 Problem

In Linux, PID 1 is special. It is the init process — the parent of all other processes. The kernel treats PID 1 differently from every other process:

  1. Default signal handling is disabled. The kernel only delivers a signal to PID 1 if PID 1 has explicitly registered a handler for that signal (SIGKILL and SIGSTOP sent from the host are the exception). If your app has no SIGTERM handler, docker stop has no effect for 10 seconds, until Docker gives up and sends SIGKILL.
  2. Zombie reaping. When a child process exits, its parent must call wait() to clean up the zombie entry. If PID 1 does not reap zombies, they accumulate forever.

# Inside a container — your app is PID 1
docker exec myapp ps aux
#  PID  USER  COMMAND
#    1  root  node server.js   <-- PID 1, no init system
#   15  root  /bin/sh -c ...
#   20  root  [defunct]        <-- zombie process!
#   21  root  [defunct]        <-- another zombie!
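
What "reaping" means can be sketched in a few lines of Python. An init process sits in a wait() loop, collecting each dead child's exit status so the kernel can free its PID table entry — this is an illustrative sketch, not a real init system:

```python
import os

# Fork a few children that exit immediately
for _ in range(3):
    if os.fork() == 0:      # in the child
        os._exit(0)         # child exits -> becomes a zombie until reaped

# The loop an init process runs: wait() collects each zombie's exit
# status, which lets the kernel remove the process table entry
reaped = 0
while True:
    try:
        pid, status = os.wait()   # blocks until some child exits
        reaped += 1
    except ChildProcessError:     # no children left to reap
        break

print(f"reaped {reaped} zombies")  # -> reaped 3 zombies
```

If PID 1 never runs this loop, every exited child stays `[defunct]` for the container's lifetime.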

Why Your Container Ignores Ctrl+C

When you run a container interactively and press Ctrl+C, Docker sends SIGINT to PID 1. But if PID 1 has not registered a SIGINT handler, the kernel silently drops the signal.

# This container will not respond to Ctrl+C
docker run --rm alpine sleep 300
# Press Ctrl+C — nothing happens: sleep runs as PID 1 and registers
# no SIGINT handler, so the kernel drops the signal
# You have to open another terminal and run: docker stop <container>

The same applies to docker stop, which sends SIGTERM. If PID 1 has no SIGTERM handler, Docker waits for the stop timeout (default 10 seconds), then sends SIGKILL — an ungraceful, immediate termination.
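
On the application side, the fix is simply to register a handler — once one is installed, the kernel delivers the signal even to PID 1. A minimal illustrative Python snippet, signalling the current process to stand in for docker stop:

```python
import os
import signal

received = []

def on_term(signum, frame):
    # With a handler registered, the kernel delivers SIGTERM even to PID 1
    received.append(signum)

signal.signal(signal.SIGTERM, on_term)

# Simulate `docker stop` by sending SIGTERM to ourselves
os.kill(os.getpid(), signal.SIGTERM)

print("handled:", received == [signal.SIGTERM])  # -> handled: True
```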

Shell Form vs Exec Form

The way you write CMD and ENTRYPOINT in your Dockerfile determines whether your application runs as PID 1 or under a shell wrapper.

# SHELL FORM — runs under /bin/sh -c
# PID 1 is /bin/sh, NOT your application
CMD node server.js
# Actual: /bin/sh -c "node server.js"
# PID 1: /bin/sh
# PID 7: node server.js

# EXEC FORM — runs directly, no shell wrapper
# PID 1 IS your application
CMD ["node", "server.js"]
# PID 1: node server.js

Shell form is worse because /bin/sh does not forward signals to child processes. When Docker sends SIGTERM, the shell receives it, but the shell does not pass it to node server.js. Your application never knows it should shut down.

Always use exec form for CMD and ENTRYPOINT. (If you genuinely need shell features in an entrypoint script, end the script with exec "$@" so the shell replaces itself with your application and signals land where they should.)

# BAD — shell form
ENTRYPOINT python main.py
CMD gunicorn app:app

# GOOD — exec form
ENTRYPOINT ["python", "main.py"]
CMD ["gunicorn", "app:app"]

Tini — A Proper Init for Containers

Tini is a tiny (< 1MB) init system designed specifically for containers. It runs as PID 1, forwards signals to your application, and reaps zombie processes.

# Use Docker's built-in --init flag (runs tini as PID 1)
docker run --init myapp:latest

# Now the process tree looks correct:
docker exec myapp ps aux
#  PID  USER  COMMAND
#    1  root  /sbin/docker-init -- node server.js   <-- tini (mounted as docker-init) is PID 1
#    7  root  node server.js                        <-- your app is PID 7

# Install tini in your Dockerfile
FROM node:20-alpine

# Alpine
RUN apk add --no-cache tini

# Debian/Ubuntu
# RUN apt-get update && apt-get install -y tini && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY . .

ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]

With tini:

  • SIGTERM from docker stop is forwarded to node server.js.
  • Zombie processes are automatically reaped.
  • Your application shuts down gracefully.

dumb-init Alternative

dumb-init (by Yelp) is another lightweight init system. It works identically to tini for most use cases.

FROM python:3.12-slim

RUN pip install dumb-init

WORKDIR /app
COPY . .

ENTRYPOINT ["dumb-init", "--"]
CMD ["python", "main.py"]

# Or download the binary directly (note: slim images ship without wget — install it first)
RUN wget -O /usr/local/bin/dumb-init \
      https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64 && \
    chmod +x /usr/local/bin/dumb-init

ENTRYPOINT ["dumb-init", "--"]

| Init System          | Size     | Signal Forwarding           | Zombie Reaping           | Notes                           |
|----------------------|----------|-----------------------------|--------------------------|---------------------------------|
| None (app as PID 1)  | 0        | Only if app handles signals | Only if app calls wait() | Problematic for most apps       |
| --init (Docker flag) | Built-in | Yes                         | Yes                      | Easiest, no Dockerfile change   |
| tini                 | ~30 KB   | Yes                         | Yes                      | Most popular, built into Docker |
| dumb-init            | ~50 KB   | Yes (signal rewriting)      | Yes                      | Can rewrite signals             |

Graceful Shutdown Patterns

Having the right init system is only half the solution. Your application must handle SIGTERM and shut down gracefully — close connections, flush buffers, finish in-progress requests.

Node.js

// server.js — graceful shutdown for Node.js
const http = require('http');

const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end('OK');
});

server.listen(3000, () => console.log('Listening on 3000'));

// Handle SIGTERM from docker stop
process.on('SIGTERM', () => {
  console.log('SIGTERM received. Shutting down gracefully...');
  server.close(() => {
    console.log('HTTP server closed. Exiting.');
    process.exit(0);
  });

  // Force exit after 5 seconds if connections won't close
  setTimeout(() => {
    console.error('Forced shutdown after timeout');
    process.exit(1);
  }, 5000);
});

// Also handle SIGINT (Ctrl+C in development)
process.on('SIGINT', () => {
  console.log('SIGINT received.');
  process.exit(0);
});

Python

# main.py — graceful shutdown for Python
import signal
import sys
from http.server import HTTPServer, SimpleHTTPRequestHandler

server = HTTPServer(('0.0.0.0', 8000), SimpleHTTPRequestHandler)

def graceful_shutdown(signum, frame):
    print(f"Signal {signum} received. Shutting down...")
    # Do NOT call server.shutdown() here: it blocks until serve_forever()
    # returns, and serve_forever() is running on this same thread — that
    # would deadlock. Raising SystemExit unwinds the serving loop instead.
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
signal.signal(signal.SIGINT, graceful_shutdown)

print("Server starting on port 8000")
try:
    server.serve_forever()
finally:
    server.server_close()  # release the listening socket

Go

// main.go — graceful shutdown for Go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("OK"))
	})

	// Start server in a goroutine
	go func() {
		log.Println("Server starting on :8080")
		if err := srv.ListenAndServe(); err != http.ErrServerClosed {
			log.Fatalf("Server error: %v", err)
		}
	}()

	// Wait for SIGTERM or SIGINT
	quit := make(chan os.Signal, 1)
	signal.Notify(quit, syscall.SIGTERM, syscall.SIGINT)
	<-quit
	log.Println("Shutting down gracefully...")

	// Give in-flight requests 10 seconds to complete
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Fatalf("Forced shutdown: %v", err)
	}
	log.Println("Server stopped")
}

Java (Spring Boot)

// Spring Boot supports graceful shutdown out of the box (since 2.3),
// but it must be enabled in application.yml:
// server:
//   shutdown: graceful
// spring:
//   lifecycle:
//     timeout-per-shutdown-phase: 30s

// For non-Spring Java applications, use a JVM shutdown hook:
public class Main {
    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("SIGTERM received. Cleaning up...");
            // Close database connections
            // Flush write buffers
            // Complete in-flight requests
            System.out.println("Shutdown complete.");
        }));

        // Your application logic
        startServer();
    }
}

Docker Stop Timeout

When docker stop is called, Docker sends SIGTERM and waits for the stop timeout before sending SIGKILL.

# Default timeout: 10 seconds
docker stop myapp

# Custom timeout: 30 seconds for applications that need more time
docker stop --time 30 myapp

# The Dockerfile cannot change the timeout, but STOPSIGNAL changes
# which signal `docker stop` sends (SIGTERM is the default)
STOPSIGNAL SIGTERM

# docker-compose.yml — set stop timeout per service
services:
  api:
    image: myapp:latest
    stop_grace_period: 30s  # 30 seconds to shut down gracefully

  worker:
    image: myworker:latest
    stop_grace_period: 60s  # workers processing long jobs need more time
    stop_signal: SIGTERM
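
The SIGTERM → grace period → SIGKILL sequence is easy to simulate. This illustrative Python sketch spawns a child that ignores SIGTERM — behaving like an app with no handler running as PID 1 — and walks through the same three steps Docker does:

```python
import signal
import subprocess
import sys
import time

# A child that ignores SIGTERM, like an app with no handler as PID 1
child = subprocess.Popen([
    sys.executable, "-c",
    "import signal, time; signal.signal(signal.SIGTERM, signal.SIG_IGN); time.sleep(60)",
])
time.sleep(0.5)  # give the child time to install its disposition

child.terminate()              # step 1: send SIGTERM
try:
    child.wait(timeout=2)      # step 2: grace period (docker stop default: 10s)
    print("exited gracefully")
except subprocess.TimeoutExpired:
    child.kill()               # step 3: SIGKILL — cannot be ignored
    child.wait()
    print("force-killed after timeout")
```

Because the child never acts on SIGTERM, the grace period expires and it is force-killed — the same ungraceful ending your app gets without a signal handler.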

Health Check + Graceful Shutdown Combo

Combine health checks with graceful shutdown so the orchestrator stops sending traffic before the container begins its shutdown sequence.

FROM node:20-alpine
RUN apk add --no-cache tini curl

WORKDIR /app
COPY . .

HEALTHCHECK --interval=10s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]

// server.js — health endpoint that reflects shutdown state
// (assumes an Express-style `app` plus the `server` returned by app.listen)
let isShuttingDown = false;

app.get('/health', (req, res) => {
  if (isShuttingDown) {
    res.status(503).json({ status: 'shutting_down' });
  } else {
    res.status(200).json({ status: 'healthy' });
  }
});

process.on('SIGTERM', () => {
  isShuttingDown = true; // Health check fails → load balancer stops traffic
  console.log('SIGTERM: health check will now return 503');

  // Wait a few seconds for the load balancer to detect the unhealthy status
  setTimeout(() => {
    server.close(() => process.exit(0));
  }, 5000);
});

This pattern ensures zero-downtime deployments: the container marks itself unhealthy, the load balancer drains connections, and only then does the application close its listener and exit.

Wrapping Up

The PID 1 problem catches everyone eventually — that first time docker stop takes exactly 10 seconds and your application did not log "shutting down." The fix is straightforward: use --init or tini as your ENTRYPOINT, always use exec form for CMD, implement SIGTERM handlers in your application code, and set an appropriate stop grace period. These four steps give you clean shutdowns, no zombie processes, and zero-downtime deployments.

In the next post, we will cover Docker Overlay Networks — how containers communicate across multiple hosts using VXLAN tunneling, encrypted overlays, and the ingress network.