Deployment Strategies — Blue-Green, Canary, Rolling, and Feature Flags

October 4, 2025 · 7 min read

DevOps & Cloud Learning Hub

Your code passed all tests, the PR is approved, and you're ready to deploy. But how you deploy matters as much as what you deploy. A bad deployment strategy turns a minor bug into a site-wide outage, while a good one lets you roll back in seconds with zero customer impact.

Why Deployment Strategy Matters

The deployment strategy you choose directly affects:

Risk: How many users are impacted if something goes wrong?
Speed: How quickly can you roll back?
Cost: How much extra infrastructure do you need?
Complexity: How hard is it to operate?

There's no single "best" strategy. The right choice depends on your application, traffic patterns, and risk tolerance.

Rolling Update (Kubernetes Default)

A rolling update gradually replaces old pods with new ones, maintaining availability throughout:

# rolling-update-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1    # At most 1 pod down during update
      maxSurge: 2           # At most 2 extra pods during update
  selector:
    matchLabels:
      app: payment-api
  template:
    metadata:
      labels:
        app: payment-api
    spec:
      containers:
        - name: payment-api
          image: payment-api:v2.1.0
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 3
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10

Rolling Update Timeline (6 replicas):

Time 0:  [v1] [v1] [v1] [v1] [v1] [v1]        ← All old
Time 1:  [v1] [v1] [v1] [v1] [v1] [v2] [v2]    ← 1 old pod removed, 2 new pods surge
Time 2:  [v1] [v1] [v1] [v1] [v2] [v2] [v2]    ← Old pods terminate as new ones become ready
Time 3:  [v1] [v1] [v2] [v2] [v2] [v2]
Time 4:  [v2] [v2] [v2] [v2] [v2] [v2]          ← All new
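
The limits behind that timeline follow directly from maxSurge and maxUnavailable. A minimal sketch of the arithmetic, assuming hypothetical helper names (resolve, rollingUpdateBounds) that are not part of any Kubernetes client; percentage values are rounded the way Kubernetes documents them (surge up, unavailable down):

```javascript
// Resolve an absolute number or a percentage string against the replica count.
// Kubernetes rounds maxSurge percentages up and maxUnavailable percentages down.
function resolve(value, replicas, roundUp) {
  if (typeof value === 'number') return value;
  const fraction = (parseFloat(value) / 100) * replicas;
  return roundUp ? Math.ceil(fraction) : Math.floor(fraction);
}

// Pod-count limits the rollout has to stay within.
function rollingUpdateBounds(replicas, maxSurge, maxUnavailable) {
  return {
    maxTotalPods: replicas + resolve(maxSurge, replicas, true),
    minAvailablePods: replicas - resolve(maxUnavailable, replicas, false),
  };
}

console.log(rollingUpdateBounds(6, 2, 1));         // { maxTotalPods: 8, minAvailablePods: 5 }
console.log(rollingUpdateBounds(6, '25%', '25%')); // { maxTotalPods: 8, minAvailablePods: 5 }
```

For the manifest above, the rollout never runs more than 8 pods and never drops below 5 available ones.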

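Pros: Simple, built into Kubernetes, no second environment to pay for (only the temporary surge pods). Cons: Both versions serve traffic simultaneously, so the API must stay backward compatible, and a rollback is itself another rolling update (for example via kubectl rollout undo), which takes minutes rather than seconds.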

Blue-Green Deployment

Blue-green maintains two identical environments. Traffic switches instantly from "blue" (current) to "green" (new):

# blue-green with Kubernetes services
# Step 1: Green deployment is running alongside blue
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api-green
spec:
  replicas: 6
  selector:
    matchLabels:
      app: payment-api
      version: green
  template:
    metadata:
      labels:
        app: payment-api
        version: green
    spec:
      containers:
        - name: payment-api
          image: payment-api:v2.1.0
---
# Step 2: Switch traffic by updating the service selector
apiVersion: v1
kind: Service
metadata:
  name: payment-api
spec:
  selector:
    app: payment-api
    version: green    # ← Change from "blue" to "green"
  ports:
    - port: 80
      targetPort: 8080

Blue-Green Switch:

Before:  Users → Load Balancer → [Blue: v1.0] (active)
                                  [Green: v2.0] (idle, tested)

Switch:  Users → Load Balancer → [Blue: v1.0] (idle, standby)
                                  [Green: v2.0] (active)

Rollback: Users → Load Balancer → [Blue: v1.0] (active again)
                                   [Green: v2.0] (idle)

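Pros: Near-instant switchover and rollback, zero downtime, and the new version is fully tested before it receives traffic. Cons: Double the infrastructure cost while both environments run, and both versions share one database, which complicates schema migrations.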
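
Both the cutover and the rollback come down to one field on the Service. A small sketch of building that patch body, assuming the blue/green version labels from the manifests above; otherColor and selectorPatch are hypothetical helper names:

```javascript
// The two "colors" are the version labels used in the manifests.
function otherColor(color) {
  if (color === 'blue') return 'green';
  if (color === 'green') return 'blue';
  throw new Error(`unknown color: ${color}`);
}

// Build a patch body that repoints the Service selector at one color.
function selectorPatch(targetColor) {
  return JSON.stringify({ spec: { selector: { version: targetColor } } });
}

const active = 'blue';
const cutover = selectorPatch(otherColor(active));
const rollback = selectorPatch(active);

console.log(cutover);  // {"spec":{"selector":{"version":"green"}}}
console.log(rollback); // {"spec":{"selector":{"version":"blue"}}}
```

One way to apply it would be kubectl patch service payment-api -p '<patch body>'. Because the same one-line patch works in either direction, rollback is as cheap as the cutover.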

Canary Deployment

A canary deployment routes a small percentage of traffic to the new version before rolling it out to everyone:

Canary Progression:

Stage 1:  5% traffic → v2.0  |  95% traffic → v1.0
          Monitor for 10 minutes...
          ✓ Error rate < 0.1%, latency normal

Stage 2:  25% traffic → v2.0  |  75% traffic → v1.0
          Monitor for 15 minutes...
          ✓ Error rate < 0.1%, latency normal

Stage 3:  50% traffic → v2.0  |  50% traffic → v1.0
          Monitor for 15 minutes...
          ✓ Error rate < 0.1%, latency normal

Stage 4:  100% traffic → v2.0
          ✓ Deployment complete

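If any stage fails the health criteria, all traffic is routed back to v1.0. Making that rollback automatic requires a controller that watches the metrics and adjusts the traffic weights for you.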
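
The promotion logic itself is a small state machine. A minimal sketch, assuming the stage weights and the 0.1% error threshold from the progression above; it checks only the error rate (latency is left out for brevity), and nextWeight is a hypothetical name:

```javascript
const STAGES = [5, 25, 50, 100]; // traffic weights from the progression
const MAX_ERROR_RATE = 0.001;    // the 0.1% threshold

// Decide the next canary weight from the current weight and observed error rate.
// Returns 0 (all traffic back to v1.0) when the stage fails its health check.
function nextWeight(currentWeight, errorRate) {
  if (errorRate >= MAX_ERROR_RATE) return 0;
  const index = STAGES.indexOf(currentWeight);
  if (index === -1) throw new Error(`unknown stage: ${currentWeight}`);
  return STAGES[Math.min(index + 1, STAGES.length - 1)];
}

console.log(nextWeight(5, 0.0005)); // 25
console.log(nextWeight(25, 0.02));  // 0
```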

Canary with Argo Rollouts

Argo Rollouts extends Kubernetes with advanced deployment strategies:

# argo-rollout-canary.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payment-api
spec:
  replicas: 10
  strategy:
    canary:
      canaryService: payment-api-canary
      stableService: payment-api-stable
      trafficRouting:
        istio:
          virtualServices:
            - name: payment-api-vsvc
              routes:
                - primary
      steps:
        # Step 1: 5% traffic to canary
        - setWeight: 5
        - pause: { duration: 10m }

        # Step 2: Run analysis
        - analysis:
            templates:
              - templateName: success-rate
            args:
              - name: service-name
                value: payment-api-canary

        # Step 3: 25% traffic
        - setWeight: 25
        - pause: { duration: 15m }

        # Step 4: 50% traffic
        - setWeight: 50
        - pause: { duration: 15m }

        # Step 5: Full rollout
        - setWeight: 100
  selector:
    matchLabels:
      app: payment-api
  template:
    metadata:
      labels:
        app: payment-api
    spec:
      containers:
        - name: payment-api
          image: payment-api:v2.1.0
---
# Automated analysis template
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      interval: 60s
      count: 5    # finite count so the inline analysis step can complete (value is illustrative)
      successCondition: result[0] >= 0.99
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status=~"2.."}[5m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))

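Argo Rollouts automates the canary process: it shifts traffic step by step, evaluates the Prometheus query during the analysis step, and aborts the rollout, returning traffic to the stable version, if the success rate falls below 99%. Note that this threshold is looser than the 0.1% error rate used in the progression earlier; align the two before relying on either.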
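
To make the success condition concrete, here is the same ratio the PromQL query computes, written out as a sketch over plain request counts. The function names and the input shape (status code mapped to request count) are assumptions for illustration, not part of Argo Rollouts or Prometheus:

```javascript
// requestsByStatus maps HTTP status codes to request counts in the window.
function successRate(requestsByStatus) {
  let total = 0;
  let ok = 0;
  for (const [status, count] of Object.entries(requestsByStatus)) {
    total += count;
    if (/^2..$/.test(status)) ok += count; // same match as status=~"2.."
  }
  return total === 0 ? null : ok / total;  // null: no traffic, nothing to judge
}

// Mirrors successCondition: result[0] >= 0.99
function passes(rate, threshold = 0.99) {
  return rate !== null && rate >= threshold;
}

console.log(passes(successRate({ 200: 990, 201: 5, 500: 5 }))); // true
console.log(passes(successRate({ 200: 980, 503: 20 })));        // false
```

Treating "no traffic" as a non-pass is a design choice in this sketch: a canary that received no requests has not proven anything.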

Feature Flags

Feature flags decouple deployment from release. You deploy code to production but control who sees it:

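// Feature flag with Unleash (open-source)
const express = require('express');
const { initialize, isEnabled } = require('unleash-client');

const app = express();

// Starts the client; it fetches flag state from the Unleash API in the background
const unleash = initialize({
  url: 'https://unleash.internal/api',
  appName: 'payment-service',
  customHeaders: { Authorization: process.env.UNLEASH_API_TOKEN },
});

// Assumes auth middleware has already populated req.user, and that
// newCheckoutHandler / currentCheckoutHandler are defined elsewhere
app.post('/checkout', async (req, res) => {
  // Context lets Unleash target by user, region, or plan
  const context = {
    userId: req.user.id,
    properties: {
      region: req.user.region,
      plan: req.user.plan,
    },
  };

  if (isEnabled('new-payment-flow', context)) {
    // New checkout experience (the rollout percentage is set in Unleash, not in code)
    return newCheckoutHandler(req, res);
  }

  // Existing checkout experience
  return currentCheckoutHandler(req, res);
});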

Feature Flag Rollout:

Day 1:  Deploy v2.0 with flag OFF (0% see new feature)
Day 2:  Enable for internal team (dogfooding)
Day 3:  Enable for 5% of users (beta)
Day 7:  Enable for 25% of users
Day 14: Enable for 100% of users
Day 21: Remove flag and old code (cleanup!)
                                  ↑ Don't forget this step!
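
A percentage rollout only works if each user gets a stable answer as the percentage grows. One common way to get that is to hash the user into a fixed bucket. The sketch below uses SHA-256 from Node's crypto module purely for illustration; it is not Unleash's own algorithm, and bucketFor / isInRollout are hypothetical names:

```javascript
const crypto = require('crypto');

// Map a (flag, user) pair to a stable bucket in [0, 100).
function bucketFor(flagName, userId) {
  const digest = crypto.createHash('sha256').update(`${flagName}:${userId}`).digest();
  return digest.readUInt32BE(0) % 100;
}

// A user is in the rollout when their bucket falls below the current percentage.
function isInRollout(flagName, userId, percent) {
  return bucketFor(flagName, userId) < percent;
}

// The same user always lands in the same bucket for a given flag.
console.log(bucketFor('new-payment-flow', 'user-42') === bucketFor('new-payment-flow', 'user-42')); // true
```

Because the bucket never changes, anyone included at 5% is still included at 25% and 100%, so users do not flip between the old and new experience as the rollout widens.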

Popular feature flag tools:

Tool	Type	Pricing	Best For
LaunchDarkly	SaaS	$$$	Enterprise, advanced targeting
Unleash	OSS / SaaS	Free / $$	Self-hosted, full control
Flipt	OSS	Free	Simple, GitOps-friendly
Flagsmith	OSS / SaaS	Free / $$	Multi-platform SDKs
ConfigCat	SaaS	Free tier	Small teams, simple needs

Strategy Comparison

Strategy	Risk	Rollback Speed	Infra Cost	Complexity	Best For
Rolling Update	Medium	Minutes (re-deploy)	1x	Low	Standard deployments
Blue-Green	Low	Seconds (switch)	2x	Medium	Critical services
Canary	Very Low	Seconds (route back)	1.1x	High	High-traffic services
A/B Testing	Low	Seconds	1.1x	High	UX experiments
Shadow/Dark	None	N/A (no user impact)	2x	Very High	Major rewrites
Feature Flags	Very Low	Instant (toggle)	1x	Medium	Decoupled releases

Database Migrations During Deployments

The hardest part of any deployment strategy is the database. When v1 and v2 run simultaneously, they must both work with the same schema:

Safe Migration Pattern (Expand-Contract):

Phase 1 — Expand (backward compatible):
  ┌──────────────────────────────────────┐
  │ ALTER TABLE users                     │
  │   ADD COLUMN email_v2 VARCHAR(255);   │
  │                                       │
  │ -- Both v1 (uses email) and           │
  │ -- v2 (uses email_v2) work            │
  └──────────────────────────────────────┘

Phase 2 — Migrate data:
  ┌──────────────────────────────────────┐
  │ UPDATE users                          │
  │   SET email_v2 = LOWER(email)         │
  │   WHERE email_v2 IS NULL;             │
  └──────────────────────────────────────┘

Phase 3 — Contract (after v1 is fully gone):
  ┌──────────────────────────────────────┐
  │ ALTER TABLE users                     │
  │   DROP COLUMN email;                  │
  │                                       │
  │ -- Keep the name email_v2: renaming   │
  │ -- it would break v2, which reads it  │
  └──────────────────────────────────────┘

UNSAFE migration (breaks blue-green/canary):
  ALTER TABLE users RENAME COLUMN email TO email_address;
  -- v1 pods immediately crash: column "email" does not exist

SAFE migration:
  1. Add new column (v1 and v2 both work)
  2. Deploy v2 (writes to both columns)
  3. Backfill old data
  4. Remove v1 completely
  5. Drop old column

Never rename or drop a column while both versions are running. The expand-contract pattern ensures zero-downtime migrations.
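
On the application side, the "writes to both columns" step could look like the sketch below. It assumes the email / email_v2 columns from the example and the lowercase normalization used in the backfill; emailColumnsForWrite and readEmail are hypothetical helpers, and no database driver is involved:

```javascript
// Values v2 writes for a user: the legacy column keeps v1 working,
// the new column holds the normalized value v2 reads.
function emailColumnsForWrite(email) {
  return { email, email_v2: email.toLowerCase() };
}

// Read path for v2: prefer the new column, fall back to the legacy
// column for rows the backfill has not reached yet.
function readEmail(row) {
  return row.email_v2 ?? row.email.toLowerCase();
}

console.log(emailColumnsForWrite('Ada@Example.com'));
// { email: 'Ada@Example.com', email_v2: 'ada@example.com' }
console.log(readEmail({ email: 'Bob@Example.com', email_v2: null })); // bob@example.com
```

The fallback in readEmail is what lets v2 go live before the backfill finishes; once every row has email_v2 and v1 is gone, the fallback and the legacy write can be removed together with the old column.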

Closing Note

The best deployment strategy is the one your team can operate confidently. Start with rolling updates, add canary when your traffic justifies it, and use feature flags to decouple deployments from releases. In the next post, we'll explore Infrastructure Testing — how to verify your Terraform, Ansible, and cloud resources are correct before they hit production.
