Kubernetes Deployments — Rolling Updates, Rollbacks, and Scaling
It is 2 AM. Your team pushed a bad image to production. Users are seeing 500 errors. Without Deployments, you would be scrambling to SSH into servers and manually swap containers. With Deployments, you type one command and Kubernetes rolls back to the last working version in seconds. That is the power we are exploring today.
The Deployment-ReplicaSet-Pod Hierarchy
Before diving into commands, understand the chain of ownership:
Deployment (manages)
└── ReplicaSet (manages)
    └── Pod (runs containers)
- Deployment defines what you want (image, replicas, update strategy)
- ReplicaSet ensures the right number of pods are running
- Pod actually runs the containers
You almost never create ReplicaSets directly. Deployments create them for you. Each time you update a Deployment, it creates a new ReplicaSet and scales down the old one.
# See the hierarchy in action
kubectl get deployments
kubectl get replicasets
kubectl get pods
# Watch a deployment's ReplicaSets
kubectl get rs -l app=nginx --sort-by='.metadata.creationTimestamp'
Creating a Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    app: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapi:1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "250m"
            memory: "256Mi"
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
# Apply and watch the rollout
kubectl apply -f deployment.yaml
kubectl rollout status deployment/api-server
# You should see:
# Waiting for deployment "api-server" rollout to finish: 0 of 3 updated replicas are available...
# deployment "api-server" successfully rolled out
Rolling Update Strategy
By default, Deployments use a rolling update strategy. It replaces pods gradually so your app never goes fully offline.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # At most 1 extra pod during update (4+1=5)
      maxUnavailable: 1    # At most 1 pod can be down (4-1=3 minimum)
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapi:2.0.0
| Parameter | Value | Effect |
|---|---|---|
| maxSurge: 0 | No extra pods | Slower but uses no additional resources |
| maxSurge: 25% | 25% extra pods (default) | Balanced speed and resource usage |
| maxSurge: 100% | Double the pods | Fastest rollout, blue-green style |
| maxUnavailable: 0 | No downtime allowed | Requires maxSurge > 0, safest option |
| maxUnavailable: 25% | 25% can be down (default) | Balanced availability |
A safe production setting for zero-downtime rollouts is maxSurge: 1, maxUnavailable: 0: capacity never drops below the desired replica count, but pods are replaced one at a time, so large Deployments roll out slowly.
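The percentage forms in the table above resolve to whole pod counts: Kubernetes rounds maxSurge up and maxUnavailable down against the desired replica count. A minimal shell sketch of that arithmetic (the numbers are illustrative, not read from a cluster):

```shell
#!/bin/sh
# How percentage-based maxSurge / maxUnavailable resolve to pod counts:
# maxSurge rounds UP, maxUnavailable rounds DOWN.
replicas=4
surge_pct=25
unavail_pct=25

# maxSurge = ceil(replicas * pct / 100), via integer ceil-division
max_surge=$(( (replicas * surge_pct + 99) / 100 ))
# maxUnavailable = floor(replicas * pct / 100)
max_unavailable=$(( replicas * unavail_pct / 100 ))

echo "pods may peak at $(( replicas + max_surge ))"                   # 4 + 1 = 5
echo "ready pods never drop below $(( replicas - max_unavailable ))"  # 4 - 1 = 3
```

With 4 replicas and the 25% defaults, both values resolve to 1 pod, matching the 4+1=5 / 4-1=3 bounds shown in the manifest comments.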
# Trigger a rolling update by changing the image
kubectl set image deployment/api-server api=myapi:2.0.0
# Watch pods being replaced in real time
kubectl get pods -l app=api -w
# Check rollout progress
kubectl rollout status deployment/api-server
Rollbacks — Your Safety Net
Every change to a Deployment's pod template creates a new revision (scaling alone does not). You can roll back to any retained revision with a single command.
# View rollout history
kubectl rollout history deployment/api-server
# See details of a specific revision
kubectl rollout history deployment/api-server --revision=2
# Roll back to the previous version
kubectl rollout undo deployment/api-server
# Roll back to a specific revision
kubectl rollout undo deployment/api-server --to-revision=3
# Verify the rollback
kubectl rollout status deployment/api-server
kubectl describe deployment api-server | grep Image
By default, Kubernetes keeps the last 10 revisions. You can change this:
spec:
  revisionHistoryLimit: 5  # Keep only 5 old ReplicaSets
Tip: add a kubernetes.io/change-cause annotation after each change so your rollout history shows why each revision was created. (The older --record flag populated the same field, but it has been deprecated.)
# Annotate the change cause
kubectl annotate deployment/api-server kubernetes.io/change-cause="Upgraded to v2.0.0 for new auth module"
Scaling — Manual and Automatic
Manual Scaling
# Scale up to handle traffic
kubectl scale deployment/api-server --replicas=10
# Scale back down after peak
kubectl scale deployment/api-server --replicas=3
# Conditional scaling — only if current replicas match
kubectl scale deployment/api-server --replicas=5 --current-replicas=3
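The --current-replicas flag is a precondition: the scale is applied only if the observed replica count matches the value you expect, like a compare-and-set. A sketch of that semantics with illustrative values (not a real API call):

```shell
#!/bin/sh
# Compare-and-set semantics behind `kubectl scale --current-replicas`:
# scale only happens when the observed count equals the expected count.
observed=3   # replicas the cluster currently reports (illustrative)
expected=3   # value passed as --current-replicas
desired=5    # value passed as --replicas

if [ "$observed" -eq "$expected" ]; then
  replicas=$desired        # precondition holds: apply the new scale
else
  replicas=$observed       # precondition fails: leave the count unchanged
  echo "precondition failed: expected $expected, found $observed" >&2
fi
echo "replicas: $replicas"
```

This guards against racing another actor (for example, an HPA) that may have already changed the replica count since you last looked.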
Horizontal Pod Autoscaler (HPA)
Let Kubernetes scale for you based on CPU, memory, or custom metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 min before scaling down
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60              # Remove at most 10% per minute
    scaleUp:
      stabilizationWindowSeconds: 0    # Scale up immediately
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15              # At most double the pods every 15 seconds
# Create HPA from command line
kubectl autoscale deployment api-server --min=3 --max=20 --cpu-percent=70
# Monitor HPA
kubectl get hpa api-server -w
# Check HPA decisions
kubectl describe hpa api-server
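Under the hood, the autoscaler's core rule (from the Kubernetes HPA documentation) is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to minReplicas/maxReplicas. A shell sketch with made-up utilization numbers, not values read from a live cluster:

```shell
#!/bin/sh
# Sketch of the HPA scaling rule:
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
current_replicas=4
current_cpu=90   # observed average utilization (%), illustrative
target_cpu=70    # averageUtilization from the HPA spec

# ceil division with integer arithmetic
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))

# clamp to the minReplicas / maxReplicas bounds from the spec
min=3; max=20
[ "$desired" -lt "$min" ] && desired=$min
[ "$desired" -gt "$max" ] && desired=$max

echo "desired replicas: $desired"   # ceil(4 * 90 / 70) = ceil(5.14) = 6
```

With multiple metrics configured, the HPA evaluates this rule per metric and takes the largest result, which is why the CPU and memory targets above can each independently force a scale-up.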
Pausing and Resuming Rollouts
Need to make multiple changes before triggering a rollout? Pause it:
# Pause the deployment — changes accumulate but don't trigger rollout
kubectl rollout pause deployment/api-server
# Make multiple changes
kubectl set image deployment/api-server api=myapi:3.0.0
kubectl set resources deployment/api-server -c=api --limits=cpu=500m,memory=512Mi
# Resume — all changes roll out together as a single update
kubectl rollout resume deployment/api-server
This is useful when you need to update the image and resource limits simultaneously. Without pausing, each change would trigger a separate rollout.
Deployment Conditions
Deployments report conditions that tell you exactly what is happening:
| Condition | Status | Meaning |
|---|---|---|
| Available | True | Minimum required pods are ready |
| Progressing | True | Rollout is in progress or recently completed |
| Progressing | False | Rollout is stuck (use kubectl describe for details) |
| ReplicaFailure | True | Cannot create pods (image pull error, insufficient resources) |
# Check conditions programmatically
kubectl get deployment api-server -o jsonpath='{range .status.conditions[*]}{.type}{": "}{.status}{" - "}{.message}{"\n"}{end}'
Recreate Strategy — The Other Option
Sometimes you need all old pods killed before new ones start (for example, when you cannot run two versions of a database migration at once):
spec:
  strategy:
    type: Recreate  # Kill all old pods, then start new ones
This causes downtime but guarantees only one version runs at a time.
Deployment Management Cheat Sheet
# Create
kubectl apply -f deployment.yaml
# Status
kubectl rollout status deployment/api-server
# Update image
kubectl set image deployment/api-server api=myapi:2.0.0
# Rollback
kubectl rollout undo deployment/api-server
# Scale
kubectl scale deployment/api-server --replicas=5
# History
kubectl rollout history deployment/api-server
# Restart all pods (rolling restart)
kubectl rollout restart deployment/api-server
# Delete
kubectl delete deployment api-server
Next up: Kubernetes Services — how pods find and talk to each other using ClusterIP, NodePort, LoadBalancer, and DNS-based service discovery.
