
Docker Resource Limits — CPU, Memory, and Why Your App Gets OOM Killed

· 9 min read
Goel Academy
DevOps & Cloud Learning Hub

Your application ran fine for three weeks. Then at 2 AM, the container was killed with exit code 137. No error in the application logs. No exception. Just dead. The culprit: the Linux OOM (Out of Memory) killer. Your container consumed all available memory, and the kernel chose violence. Resource limits prevent this — but most people never set them, and those who do often set them wrong.

Why Resource Limits Matter

Without resource limits, containers compete for the host's CPU and memory like a free-for-all buffet. One misbehaving container — a memory leak, an infinite loop, a runaway query — can starve every other container on the host.

The risks are concrete:

  • OOM kills. The kernel kills your process (exit code 137) when memory is exhausted. No graceful shutdown. No final log message.
  • Noisy neighbor. One container's CPU spike causes latency spikes in all other containers on the same host.
  • Cascading failures. Container A eats all memory, container B (the database) gets OOM-killed, container C loses its database connection and crashes.
  • Unpredictable scheduling. Without limits, orchestrators like Kubernetes cannot make intelligent placement decisions.

Memory Limits

Setting Memory Limits

# Hard memory limit: container is killed if it exceeds 512 MB
docker run -d --name api --memory=512m myapp:latest

# Memory + swap limit (total of memory + swap)
docker run -d --name api --memory=512m --memory-swap=1g myapp:latest
# 512 MB RAM + 512 MB swap = 1 GB total

# Disable swap entirely (recommended for production)
docker run -d --name api --memory=512m --memory-swap=512m myapp:latest
# Setting --memory-swap equal to --memory means zero swap

# Soft limit (reservation) — not enforced, used for scheduling hints
docker run -d --name api \
  --memory=512m \
  --memory-reservation=256m \
  myapp:latest

| Flag | What It Does | Enforced? |
|---|---|---|
| `--memory` | Hard limit; container killed if exceeded | Yes (OOM kill) |
| `--memory-swap` | Total memory + swap allowed | Yes |
| `--memory-reservation` | Soft limit, scheduling hint | No (best effort) |
| `--oom-kill-disable` | Prevent OOM killing (dangerous) | N/A |

# Check current memory limit
docker inspect myapp --format '{{.HostConfig.Memory}}'
# 536870912 (512 MB in bytes)
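Limits are not fixed at creation time: `docker update` can change them on a running container. A quick sketch, assuming the `api` container from the examples above and a running daemon:

```shell
# Raise the hard memory limit without restarting the container.
# When swap is disabled, --memory-swap must be raised together with --memory.
docker update --memory=1g --memory-swap=1g api

# Verify the new limit took effect (1073741824 bytes = 1 GB)
docker inspect api --format '{{.HostConfig.Memory}}'
```

`docker update` also accepts `--cpus`, `--cpu-shares`, and `--memory-reservation`, so CPU limits can be adjusted live the same way.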

How OOM Kills Happen

When a container exceeds its memory limit, the Linux kernel's OOM killer terminates the process with SIGKILL (signal 9). The container exits with code 137 (128 + 9).

# Simulate an OOM kill. ubuntu:latest does not ship stress-ng, but
# "tail /dev/zero" buffers its input endlessly and exhausts the limit.
# Swap is disabled so the kill is deterministic.
docker run -d --name oom-test --memory=100m --memory-swap=100m ubuntu:latest \
  tail /dev/zero

# Watch it get killed
docker wait oom-test
# 137

docker inspect oom-test --format '{{.State.OOMKilled}}'
# true

CPU Limits

Docker offers three mechanisms for CPU control, and they work differently.

--cpus (Fractional CPU Limit)

# Limit to 1.5 CPU cores
docker run -d --name api --cpus=1.5 myapp:latest

# Limit to half a CPU core
docker run -d --name worker --cpus=0.5 myapp:latest

This is a hard limit. Even if the host has 8 idle cores, the container cannot exceed 1.5 cores. This is the simplest and most recommended flag.
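You can confirm how the daemon stored the limit: `--cpus` is persisted in `HostConfig.NanoCpus` as billionths of a core. A sketch, assuming the `api` container started with `--cpus=1.5` above:

```shell
# --cpus=1.5 is stored as 1.5 billion "nano CPUs" (1500000000)
docker inspect api --format '{{.HostConfig.NanoCpus}}'
```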

--cpu-shares (Relative Weight)

# Default is 1024. Give the API double priority over the worker
docker run -d --name api --cpu-shares=2048 myapp:latest
docker run -d --name worker --cpu-shares=512 myworker:latest

CPU shares only matter when containers compete for CPU. If the host is idle, a container with 512 shares can use all available CPU. Shares are relative weights, not hard limits.

--cpuset-cpus (Pin to Specific Cores)

# Pin container to CPU cores 0 and 1
docker run -d --name api --cpuset-cpus="0,1" myapp:latest

# Pin to cores 0 through 3
docker run -d --name worker --cpuset-cpus="0-3" myworker:latest

| Flag | Type | Behavior When Host Is Idle |
|---|---|---|
| `--cpus` | Hard limit | Cannot exceed, even if cores are free |
| `--cpu-shares` | Relative weight | Uses all available CPU |
| `--cpuset-cpus` | Core pinning | Restricted to specified cores |

Monitoring with docker stats

docker stats is your real-time dashboard for container resource usage.

# Live stats for all containers
docker stats

# Specific containers
docker stats api worker db

# One-shot for scripting (no streaming)
docker stats --no-stream \
  --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}\t{{.NetIO}}\t{{.BlockIO}}"

NAME      CPU %    MEM USAGE / LIMIT    MEM %    NET I/O          BLOCK I/O
api       2.34%    245MiB / 512MiB      47.85%   1.2MB / 500kB    10MB / 0B
worker    85.3%    480MiB / 512MiB      93.75%   500kB / 200kB    5MB / 1MB
db        12.5%    1.2GiB / 2GiB        60.00%   3MB / 15MB       50MB / 200MB

That worker at 93.75% memory is about to get OOM-killed. Time to either increase the limit or fix the memory leak.

# Monitor and alert when memory exceeds 80%
docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | while read name pct; do
  value=$(echo "$pct" | sed 's/%//')
  if [ "$(echo "$value > 80" | bc)" -eq 1 ]; then
    echo "WARNING: $name is at $pct memory"
  fi
done

Resource Limits in Docker Compose

# docker-compose.yml
services:
  api:
    image: myapp:latest
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 128M

  worker:
    image: myworker:latest
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 256M

  db:
    image: postgres:16-alpine
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 2G
        reservations:
          cpus: '1.0'
          memory: 1G

Note: With the `docker compose` CLI plugin (Compose v2), the deploy.resources section is applied even without Swarm mode. With the legacy Compose file format version 2, set mem_limit and cpus directly at the service level instead.
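For reference, the same limits for one service in the legacy file format look like this (a sketch; field names per the Compose file format v2.x specification):

```yaml
# Legacy Compose file format v2 equivalent for the api service
version: '2.4'
services:
  api:
    image: myapp:latest
    mem_limit: 512M
    mem_reservation: 128M
    cpus: 1.0
```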

Cgroup Enforcement Under the Hood

Docker resource limits are enforced by Linux cgroups (control groups). When you set --memory=512m, Docker creates a cgroup with that memory limit.

# Find the cgroup for a container
docker inspect myapp --format '{{.Id}}'
# abc123def456...

# Check the memory limit in the cgroup filesystem (cgroup v2)
cat /sys/fs/cgroup/system.slice/docker-abc123def456.scope/memory.max
# 536870912

# Check current memory usage
cat /sys/fs/cgroup/system.slice/docker-abc123def456.scope/memory.current
# 245366784

# Check CPU limit
cat /sys/fs/cgroup/system.slice/docker-abc123def456.scope/cpu.max
# 150000 100000 (150ms per 100ms period = 1.5 CPUs)
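Rather than copying the ID by hand, the lookup can be scripted. A sketch, assuming cgroup v2 with the systemd cgroup driver (paths differ under cgroup v1 or the cgroupfs driver):

```shell
# Resolve a container's full ID and read its limits from the cgroup filesystem
CID=$(docker inspect --format '{{.Id}}' myapp)
CGROUP="/sys/fs/cgroup/system.slice/docker-${CID}.scope"

cat "$CGROUP/memory.max"      # hard memory limit in bytes
cat "$CGROUP/memory.current"  # current memory usage in bytes
cat "$CGROUP/cpu.max"         # quota and period in microseconds
```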

Detecting OOM Kills

When a container is OOM-killed, Docker records it. But the information is in different places depending on what you check.

# Check if the container was OOM-killed
docker inspect myapp --format '{{.State.OOMKilled}}'
# true

# Check the exit code (137 = SIGKILL from OOM)
docker inspect myapp --format '{{.State.ExitCode}}'
# 137

# Check docker events for OOM events
docker events --filter event=oom --since 24h

# Check kernel logs for OOM messages
dmesg | grep -i "out of memory"
dmesg | grep -i "oom"

# Detailed OOM kill information
dmesg | grep -A 5 "Memory cgroup out of memory"

# Script to check all containers for recent OOM kills
for id in $(docker ps -aq); do
  name=$(docker inspect -f '{{.Name}}' "$id" | sed 's/\///')
  oom=$(docker inspect -f '{{.State.OOMKilled}}' "$id")
  exit_code=$(docker inspect -f '{{.State.ExitCode}}' "$id")
  if [ "$oom" = "true" ] || [ "$exit_code" = "137" ]; then
    echo "OOM KILLED: $name (exit code: $exit_code)"
  fi
done

Java Memory Inside Containers

Java is notorious for container memory issues. The JVM has its own memory management (heap, metaspace, thread stacks) that can conflict with container limits.

# BAD: JVM does not respect container limits (older JVMs)
docker run --memory=512m openjdk:8 java -jar app.jar
# JVM sees host's 16GB RAM, allocates 4GB heap, gets OOM-killed

# GOOD: Tell JVM to respect container limits
docker run --memory=512m openjdk:21 java \
  -XX:+UseContainerSupport \
  -XX:MaxRAMPercentage=75.0 \
  -jar app.jar
# JVM allocates 75% of 512MB = 384MB heap

| JVM Flag | Purpose |
|---|---|
| `-XX:+UseContainerSupport` | Detect container memory limits (default on since JDK 10) |
| `-XX:MaxRAMPercentage=75.0` | Use 75% of container memory for heap |
| `-XX:InitialRAMPercentage=50.0` | Start with 50% of container memory |
| `-XX:MinRAMPercentage=50.0` | Min heap for small containers |

Leave 25% of container memory for non-heap JVM memory (metaspace, thread stacks, native memory, GC overhead).
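The split is easy to sanity-check with arithmetic before deploying. A quick sketch (the 75/25 split is the rule of thumb from above, not a JVM default):

```shell
# Worked example: heap vs. non-heap budget for a 512 MB container
LIMIT_MB=512
HEAP_MB=$(( LIMIT_MB * 75 / 100 ))    # what -XX:MaxRAMPercentage=75.0 yields
NONHEAP_MB=$(( LIMIT_MB - HEAP_MB ))  # metaspace, stacks, native memory, GC

echo "heap=${HEAP_MB}MB non-heap=${NONHEAP_MB}MB"
# heap=384MB non-heap=128MB
```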

Node.js Memory Inside Containers

Node.js has a similar problem. V8's default heap limit does not respect container limits.

# Set Node.js max heap to match container limit
docker run --memory=512m myapp:latest \
  node --max-old-space-size=384 server.js
# 384MB heap + ~128MB for V8 overhead = within 512MB limit

# Or derive the heap size from the container limit at startup (75% here)
docker run --memory=512m myapp:latest \
  node --max-old-space-size=$(( 512 * 3 / 4 )) server.js

In your Dockerfile:

FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm ci --omit=dev

# Set heap to 75% of expected container memory
ENV NODE_OPTIONS="--max-old-space-size=384"
CMD ["node", "server.js"]

GPU Limits

For machine learning workloads, Docker can allocate GPU access.

# Use all GPUs
docker run --gpus all nvidia-ml-app:latest

# Use specific GPUs
docker run --gpus '"device=0,1"' nvidia-ml-app:latest

# Use a specific number of GPUs
docker run --gpus 2 nvidia-ml-app:latest

Requires the NVIDIA Container Toolkit installed on the host.

Best Practices for Right-Sizing

| Step | Action | Tool |
|---|---|---|
| 1 | Run without limits and observe | `docker stats` |
| 2 | Set memory limit at 1.5x observed peak | `--memory` |
| 3 | Set CPU limit at observed average + 50% headroom | `--cpus` |
| 4 | Load test and watch for OOM kills | `docker events --filter event=oom` |
| 5 | Adjust limits based on production metrics | Prometheus + Grafana |

# Step 1: Sample usage periodically (e.g. every minute from cron) for 24 hours
docker stats --no-stream --format '{{.Name}},{{.CPUPerc}},{{.MemUsage}}' >> resource-log.csv

# Step 2: Analyze peaks
# If peak memory was 320MB, set limit to ~512MB
# If average CPU was 0.3 cores with spikes to 0.8, set limit to 1.0

Do not set limits too tight. A memory limit that is too close to actual usage means every traffic spike triggers an OOM kill. Leave at least 25-50% headroom.
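The sizing rule above can be scripted. A sketch, assuming you already have a peak-memory figure from your resource log; the power-of-two rounding is just a convention for tidy limits:

```shell
# Derive a memory limit from observed peak usage with ~50% headroom,
# rounded up to the next power of two
PEAK_MB=320
WITH_HEADROOM=$(( PEAK_MB * 3 / 2 ))  # 480

LIMIT_MB=1
while [ "$LIMIT_MB" -lt "$WITH_HEADROOM" ]; do
  LIMIT_MB=$(( LIMIT_MB * 2 ))
done
echo "suggested --memory=${LIMIT_MB}m"
# suggested --memory=512m
```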

Wrapping Up

Resource limits are not optional in production. Without them, a single misbehaving container can take down everything on the host. Set memory limits to prevent OOM kills from cascading, set CPU limits to prevent noisy neighbor problems, and monitor with docker stats to catch issues before they become outages. For Java and Node.js, remember to configure the runtime's memory settings to match the container limit — they do not do this automatically.

In the next post, we will cover Docker Storage — the differences between bind mounts, named volumes, and tmpfs mounts, with performance benchmarks and guidance on when to use each type.