DevOps Is Not a Tool — Culture, CALMS, and the Three Ways
Your company just posted a job listing for a "DevOps Engineer" who must know Jenkins, Terraform, Kubernetes, and 47 other tools. Congratulations — you have completely missed the point of DevOps.
The Origin Story: How a Frustrated Belgian Changed Everything
In 2008, Andrew Shafer proposed a talk called "Agile Infrastructure" at the Agile Conference in Toronto. Almost nobody showed up. But one person did — Patrick Debois, a Belgian IT consultant who was exhausted by the wall between dev and ops teams.
Debois had been working on a large data center migration. The developers would throw code over the wall, operations would scramble to deploy it, things would break at 2 AM, and everyone would blame each other. Sound familiar?
In 2009, inspired by "10+ Deploys Per Day," a talk that John Allspaw and Paul Hammond of Flickr gave at the O'Reilly Velocity conference, Debois organized the first DevOpsDays conference in Ghent, Belgium. The Twitter hashtag #DevOps was born, and the rest is history.
But here is the critical thing everyone forgets: DevOps was never about tools. It was about breaking down silos between teams.
What DevOps Actually Is
DevOps is a cultural and professional movement that emphasizes collaboration between software developers and IT operations. It aims to shorten the systems development lifecycle while delivering features, fixes, and updates frequently and reliably.
Let me be blunt about what DevOps is NOT:
# DevOps is NOT:
# ❌ A job title
# ❌ A team name
# ❌ A tool or product you can buy
# ❌ Just automation
# ❌ Just CI/CD
# DevOps IS:
# ✅ A culture of shared responsibility
# ✅ Breaking down silos between teams
# ✅ Continuous improvement
# ✅ Automating everything you can
# ✅ Measuring what matters
The CALMS Framework
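The CALMS framework started life as CAMS (Culture, Automation, Measurement, Sharing), coined by Damon Edwards and John Willis in 2010. Jez Humble (co-author of Continuous Delivery) later added the L for Lean. It is a way to assess whether an organization is truly adopting DevOps or just slapping a new label on the same old problems.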
| Pillar | What It Means | Red Flag If Missing |
|---|---|---|
| Culture | Shared responsibility, blameless postmortems, trust between teams | Finger-pointing after outages, "that's not my job" mentality |
| Automation | Automate builds, tests, deployments, infrastructure provisioning | Manual deployments, "works on my machine" syndrome |
| Lean | Limit WIP, eliminate waste, value stream mapping, small batch sizes | Huge releases every quarter, features nobody asked for |
| Measurement | Track deployment frequency, lead time, MTTR, change failure rate | No metrics, gut-feel decision making, vanity dashboards |
| Sharing | Knowledge sharing, cross-functional teams, open communication | Knowledge hoarded by individuals, tribal knowledge |
Here is a quick self-assessment you can run with your team:
# Quick CALMS Self-Assessment
# Rate each pillar 1-5 (maximum total: 25). Be honest.
echo "=== CALMS Assessment ==="
echo "Culture: Do devs and ops share on-call? (1-5)"
echo "Automation: Can you deploy with one command? (1-5)"
echo "Lean: Is your WIP limit defined and enforced? (1-5)"
echo "Measurement: Do you track DORA metrics? (1-5)"
echo "Sharing: Does your team do blameless postmortems? (1-5)"
echo ""
echo "Score 20+: You are doing DevOps"
echo "Score 15-19: Getting there"
echo "Score <15: You have a DevOps title, not DevOps culture"
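If you want the script to do the adding up for you, here is a minimal sketch. The function name `calms_band` is made up for this post, and the bands are simply the ones printed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sums five 1-5 ratings and maps the total to the bands above.
calms_band() {
  if [ "$#" -ne 5 ]; then
    echo "usage: calms_band CULTURE AUTOMATION LEAN MEASUREMENT SHARING" >&2
    return 2
  fi
  local total=0
  local score
  for score in "$@"; do
    case "$score" in
      [1-5]) total=$((total + score)) ;;
      *) echo "each rating must be an integer from 1 to 5, got: $score" >&2
         return 2 ;;
    esac
  done
  if [ "$total" -ge 20 ]; then
    echo "$total: You are doing DevOps"
  elif [ "$total" -ge 15 ]; then
    echo "$total: Getting there"
  else
    echo "$total: You have a DevOps title, not DevOps culture"
  fi
}

# Example ratings for Culture, Automation, Lean, Measurement, Sharing
calms_band 4 3 2 3 4
```

Rejecting anything outside 1-5 keeps a stray 10 from quietly pushing a team into the top band.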
The Three Ways of DevOps
The Phoenix Project, the novel by Gene Kim, Kevin Behr, and George Spafford, introduced The Three Ways, the foundational principles of DevOps.
The First Way: Flow (Systems Thinking)
Optimize the entire system, not individual silos. Work should flow left-to-right from Dev to Ops to the customer with minimal friction.
# The First Way in practice: a deployment pipeline
# Work flows continuously from commit to production
pipeline:
  - stage: commit
    action: "Developer pushes code"
  - stage: build
    action: "Automated build triggers"
  - stage: test
    action: "Unit + integration tests run"
  - stage: staging
    action: "Deploy to staging automatically"
  - stage: production
    action: "One-click deploy to production"
# Key principle: Make work visible, limit WIP,
# reduce batch sizes, eliminate bottlenecks
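To make the flow idea concrete, here is a rough sketch of a stage runner that pushes work left-to-right and stops at the first broken stage. `run_pipeline` and the `name:command` format are invented for illustration, and the commands are placeholders (`true`), not real build steps.

```shell
#!/usr/bin/env bash
# Hypothetical stage runner: each argument is a "name:command" pair.
run_pipeline() {
  local entry
  local name
  local cmd
  for entry in "$@"; do
    name=${entry%%:*}   # text before the first colon
    cmd=${entry#*:}     # text after the first colon
    echo "[$name] running"
    if ! sh -c "$cmd"; then
      echo "[$name] failed, stopping pipeline"
      return 1
    fi
  done
  echo "pipeline finished"
}

# Placeholder commands; swap in your own build, test, and deploy steps
run_pipeline "build:true" "test:true" "staging:true" "production:true"
```

Stopping at the first failure is the point: a broken build never reaches staging, so defects do not travel downstream.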
The Second Way: Feedback (Amplify Feedback Loops)
Create fast, constant feedback from right-to-left. When production breaks, the team that built it should know immediately — not three weeks later from a customer complaint.
# The Second Way in practice: fast feedback
# Monitoring that alerts the right people
# Instead of: ops gets paged, opens ticket, waits 3 days
# Do this: the team that deployed gets immediate feedback
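# Example: open an incident on the dev team's PagerDuty service
# The REST API expects an API key and a From header (a valid PagerDuty user email).
# YOUR_API_KEY, oncall@example.com, and SERVICE_ID are placeholders.
curl -X POST https://api.pagerduty.com/incidents \
  -H "Authorization: Token token=YOUR_API_KEY" \
  -H "Accept: application/vnd.pagerduty+json;version=2" \
  -H "From: oncall@example.com" \
  -H "Content-Type: application/json" \
  -d '{
    "incident": {
      "type": "incident",
      "title": "High error rate after deploy v2.3.1",
      "service": {
        "id": "SERVICE_ID",
        "type": "service_reference"
      },
      "urgency": "high"
    }
  }'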
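Something has to decide when to make that call. As a sketch, assuming you can get a count of failed and total requests for the minutes after a deploy, a threshold check might look like this. `error_rate_exceeds` and the sample numbers are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeeds (exit 0) when errors/total is strictly above threshold_percent.
error_rate_exceeds() {
  local errors=$1
  local total=$2
  local threshold_percent=$3
  if [ "$total" -le 0 ]; then
    return 1   # no traffic, nothing to alert on
  fi
  # Integer-only comparison:
  # errors/total > threshold/100  is the same as  errors*100 > threshold*total
  [ $((errors * 100)) -gt $((threshold_percent * total)) ]
}

# Example: 620 failed requests out of 10000 after a deploy, 5% threshold
if error_rate_exceeds 620 10000 5; then
  echo "error rate above 5%: notify the team that deployed"
else
  echo "error rate within threshold"
fi
```

Cross-multiplying avoids floating point, which plain shell arithmetic does not support.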
The Third Way: Continuous Learning (Experimentation and Learning)
Foster a culture of experimentation and learning from failure. Allocate time for improvement, run game days, and conduct blameless postmortems.
# Blameless Postmortem Template (The Third Way)
## Incident Summary
- **Date:** 2025-02-20
- **Duration:** 45 minutes
- **Impact:** 12% of users saw 500 errors on checkout
## Timeline
- 14:02 — Deploy v2.3.1 rolled out
- 14:15 — Monitoring alert fired (error rate > 5%)
- 14:18 — On-call engineer acknowledged
- 14:32 — Root cause identified (DB connection pool exhaustion)
- 14:47 — Rollback completed, errors resolved
## Root Cause
The connection pool was sized for 50 connections, but the new feature opened three times as many connections per request.
## Action Items
- [ ] Add connection pool metrics to dashboard
- [ ] Load test new features before deploy
- [ ] Add circuit breaker for DB connections
## What Went Well
- Alert fired within 13 minutes
- Rollback was one command
## What We Learned
- Need better capacity planning for DB-heavy features
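The timeline in a postmortem is also data. Here is a small sketch that turns the timestamps from the template above into durations; it assumes every event falls on the same day, and the helper names are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for same-day HH:MM timestamps.
to_minutes() {
  # awk reads "08" as decimal 8, which avoids the shell treating it as octal
  echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

minutes_between() {
  # $1 = start time, $2 = end time
  echo $(( $(to_minutes "$2") - $(to_minutes "$1") ))
}

# Timestamps copied from the template's timeline
deploy=14:02; alert=14:15; ack=14:18; cause=14:32; resolved=14:47

echo "time to detect:      $(minutes_between "$deploy" "$alert") min"
echo "time to acknowledge: $(minutes_between "$alert" "$ack") min"
echo "time to diagnose:    $(minutes_between "$ack" "$cause") min"
echo "time to restore:     $(minutes_between "$deploy" "$resolved") min"
```

For this incident the four lines come out as 13, 3, 14, and 45 minutes, which matches the 45-minute duration in the summary.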
Anti-Patterns: You Are Doing DevOps Wrong If...
Watch out for these common traps that organizations fall into:
1. Renaming Ops to DevOps. You took your operations team, changed their title to "DevOps Engineers," and declared victory. Nothing else changed. Developers still throw code over the wall.
2. The DevOps Team silo. You created a new "DevOps Team" that sits between Dev and Ops. You just added a third silo. Congratulations.
3. Tool worship. "We use Kubernetes, so we do DevOps." No. You can run Kubernetes and still have a terrible deployment process with zero collaboration.
4. Automation without culture change. You automated deployments but developers are still not allowed to deploy to production. That is just faster waterfall.
# The anti-pattern detector
# If any of these are true, you have a problem:
echo "Anti-Pattern Checklist:"
echo "[ ] Devs cannot deploy their own code"
echo "[ ] Only one person knows how the pipeline works"
echo "[ ] Postmortems assign blame to individuals"
echo "[ ] 'DevOps' is a separate team, not a practice"
echo "[ ] Release day is still a stressful event"
echo "[ ] You measure lines of code instead of outcomes"
DevOps vs SRE vs Platform Engineering
These three approaches are related but distinct. Here is how they compare:
| Aspect | DevOps | SRE | Platform Engineering |
|---|---|---|---|
| Origin | Patrick Debois, 2009 | Google, 2003 | Evolution of DevOps, ~2020 |
| Focus | Culture + collaboration | Reliability + engineering | Developer experience + self-service |
| Key Metric | Deployment frequency | Error budgets, SLOs | Developer productivity |
| Who Does It | Everyone (culture shift) | Dedicated SRE team | Platform team |
| Approach | Break down silos | "class SRE implements DevOps" | Build internal platforms |
| Tools | CI/CD, IaC, monitoring | SLI/SLO frameworks, toil tracking | Internal developer portals (Backstage) |
| Risk Model | Move fast, iterate | Error budgets allow controlled risk | Golden paths reduce risk |
As Ben Treynor (VP Engineering at Google) said: "SRE is what happens when you ask a software engineer to design an operations function."
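The error budget in the table is plain arithmetic: whatever the SLO does not promise is yours to spend. A quick sketch, assuming an availability SLO measured over a fixed window (the function name and the 99.9% / 30-day figures are illustrative, not from any particular team):

```shell
#!/usr/bin/env bash
# Hypothetical helper: minutes of allowed unavailability for an availability SLO.
error_budget_minutes() {
  # $1 = SLO target in percent (e.g. 99.9), $2 = window length in days
  awk -v slo="$1" -v days="$2" 'BEGIN {
    printf "%.1f\n", (100 - slo) / 100 * days * 24 * 60
  }'
}

# Example: a 99.9% target over 30 days
echo "error budget: $(error_budget_minutes 99.9 30) minutes"
```

A 99.9% target over 30 days leaves about 43 minutes, so a single 45-minute full outage would spend the whole budget.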
DORA Metrics: Measuring What Matters
The DevOps Research and Assessment (DORA) team identified four key metrics that predict software delivery performance:
# The Four DORA Metrics
# (Elite/Low thresholds shift between yearly reports; treat these as rough guides)
# 1. Deployment Frequency
# How often do you deploy to production?
# Elite: Multiple times per day | Low: Less than once per month
# 2. Lead Time for Changes
# Time from commit to production
# Elite: Less than 1 hour | Low: More than 6 months
# 3. Change Failure Rate
# % of deployments causing a failure
# Elite: 0-15% | Low: 46-60%
# 4. Mean Time to Restore (MTTR)
# How long to recover from a failure
# Elite: Less than 1 hour | Low: More than 6 months
# Check your team's performance:
echo "=== DORA Metrics Quick Check ==="
echo "Deployment Frequency: ______ (daily/weekly/monthly/quarterly)"
echo "Lead Time for Changes: ______ (hours/days/weeks/months)"
echo "Change Failure Rate: ______% "
echo "MTTR: ______ (minutes/hours/days/weeks)"
These metrics are not vanity numbers. DORA, now part of Google Cloud, has surveyed thousands of organizations, and its 2019 State of DevOps report found that elite performers deploy 208x more frequently and have 106x faster lead times than low performers. And they are more stable, not less.
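Two of the four metrics fall out of simple counts. Here is a minimal sketch, assuming you can pull the number of deployments and the number that caused a failure from your own deploy log or CI system. The function names and sample numbers are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; inputs are counts you collect yourself.
change_failure_rate() {
  # $1 = deployments that caused a failure, $2 = total deployments
  awk -v failed="$1" -v total="$2" 'BEGIN {
    if (total <= 0) { print "n/a"; exit }
    printf "%.1f\n", failed / total * 100
  }'
}

deployments_per_day() {
  # $1 = total deployments, $2 = days in the period
  awk -v total="$1" -v days="$2" 'BEGIN {
    if (days <= 0) { print "n/a"; exit }
    printf "%.2f\n", total / days
  }'
}

# Example: 42 deployments in a 30-day period, 5 of which caused a failure
echo "Deployment frequency: $(deployments_per_day 42 30) per day"
echo "Change failure rate:  $(change_failure_rate 5 42)%"
```

Lead time and time to restore need timestamps rather than counts, so they are left out of this sketch.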
Where to Start
If you are reading this and thinking "my organization does none of this," do not panic. Start small:
- Pick one CALMS pillar and improve it this quarter
- Measure your DORA metrics — you cannot improve what you do not measure
- Run a blameless postmortem after your next incident
- Automate one manual step in your deployment process
- Share knowledge — write a runbook, do a lunch-and-learn
DevOps is a journey, not a destination. The organizations that win are the ones that never stop improving.
Next up in our DevOps series, we will dive into Git Workflows — comparing Trunk-Based Development, GitFlow, and GitHub Flow to help you pick the right branching strategy for your team.
