
Platform Engineering — Internal Developer Platforms Explained

Goel Academy
DevOps & Cloud Learning Hub

Platform engineering is the discipline of building and maintaining self-service toolchains and workflows that enable developers to ship software faster without filing tickets or waiting on ops teams. If DevOps was about breaking down silos, platform engineering is about building the roads that make the journey smooth.

What Is Platform Engineering?

Platform engineering emerged because "you build it, you run it" is hard to sustain as an organization grows. When every team has to configure its own CI/CD pipelines, Kubernetes manifests, observability stacks, and security policies, cognitive load explodes. Platform engineering addresses this by creating a curated, self-service layer on top of your infrastructure: an Internal Developer Platform (IDP).

┌───────────────────────────────────────────────┐
│           Developer Self-Service UI           │
│         (Portal / CLI / API / GitOps)         │
├───────────────────────────────────────────────┤
│          Internal Developer Platform          │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  │
│  │ Templates │  │ Workflows │  │ Policies  │  │
│  └───────────┘  └───────────┘  └───────────┘  │
├───────────────────────────────────────────────┤
│           Infrastructure & Services           │
│    Kubernetes · Cloud · Databases · Queues    │
└───────────────────────────────────────────────┘

The platform team doesn't replace DevOps — it productizes the best DevOps practices into reusable building blocks.

Platform Engineering vs DevOps vs SRE

These three disciplines overlap but serve distinct purposes:

Aspect         | DevOps                       | SRE                            | Platform Engineering
Focus          | Culture & collaboration      | Reliability & SLOs             | Developer experience & self-service
Primary user   | Dev + Ops teams              | Production systems             | Application developers
Key metric     | Deployment frequency         | Error budget / SLOs            | Developer satisfaction, onboarding time
Approach       | Break silos, automate        | Software engineering for ops   | Build internal products
Output         | Practices & pipelines        | Runbooks, SLIs, toil reduction | Internal Developer Platform
Team structure | Embedded or cross-functional | Dedicated SRE team             | Dedicated platform team
Failure mode   | "Everyone does ops" burnout  | Alert fatigue                  | Over-engineering unused features

Platform engineering treats developers as customers and the platform as a product. You measure success by adoption, not enforcement.

Components of an Internal Developer Platform

A mature IDP has five core layers:

# IDP Component Architecture
idp_layers:
  developer_interface:
    - service_catalog               # Backstage, Port, Cortex
    - cli_tools                     # Custom CLI, Scaffolding
    - self_service_portal           # Templates, One-click environments

  integration_and_delivery:
    - ci_cd_pipelines               # GitHub Actions, ArgoCD
    - gitops_workflows              # Flux, ArgoCD
    - artifact_management           # Container registry, Helm charts

  resource_management:
    - infrastructure_orchestration  # Terraform, Crossplane
    - environment_management        # Namespaces, accounts
    - database_provisioning         # Operators, managed services

  security_and_governance:
    - rbac_policies                 # OPA, Kyverno
    - secret_management             # Vault, External Secrets
    - compliance_checks             # Automated audits

  observability:
    - monitoring_stack              # Prometheus, Grafana
    - logging_pipeline              # ELK, Loki
    - tracing                       # Jaeger, Tempo

Golden Paths: Opinionated but Not Mandatory

A golden path is the recommended, pre-paved way to accomplish a task. It's not a wall — developers can go off-path, but the golden path is so good that they rarely want to.

# Example: Golden path CLI to create a new microservice
$ platform create service \
    --name payment-api \
    --language go \
    --template microservice-grpc \
    --team payments

✓ Repository created: github.com/acme/payment-api
✓ CI/CD pipeline configured (GitHub Actions)
✓ Kubernetes namespace: payments-payment-api
✓ Monitoring dashboards provisioned
✓ Service registered in Backstage catalog
✓ Security scanning enabled (Trivy + Snyk)
✓ README and ADR templates added

Your service is ready at: https://payment-api.dev.acme.internal

One command replaces dozens of manual steps, tickets, and context-switching. The developer gets a production-ready service in minutes.
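To make the idea concrete, here is a minimal Python sketch of how a golden-path command like the one above might expand a short request into an ordered provisioning plan. Everything in it is an assumption for illustration: the `ServiceRequest` fields simply mirror the CLI flags, and `plan_golden_path`, the `org` default, and the step wording are hypothetical rather than the API of any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRequest:
    """Inputs a developer supplies; mirrors the CLI flags shown above."""
    name: str
    language: str
    template: str
    team: str
    org: str = "acme"  # hypothetical default organisation


def plan_golden_path(req: ServiceRequest) -> list[str]:
    """Return the ordered steps the platform would run for this request.

    This only builds a plan. A real platform would call the Git host,
    the CI system, the cluster API, and so on for each step.
    """
    namespace = f"{req.team}-{req.name}"
    return [
        f"create repository github.com/{req.org}/{req.name} from template {req.template}",
        f"configure CI/CD pipeline for {req.language}",
        f"create Kubernetes namespace {namespace}",
        "provision monitoring dashboards",
        f"register {req.name} in the service catalog (owner: {req.team})",
        "enable security scanning",
    ]


request = ServiceRequest(
    name="payment-api", language="go", template="microservice-grpc", team="payments"
)
plan = plan_golden_path(request)
for number, step in enumerate(plan, start=1):
    print(f"{number}. {step}")
```

Keeping the plan as plain data makes it easy to show developers a dry run before anything is created, which helps build trust in the golden path.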

Self-Service Infrastructure with Backstage

Backstage, originally built at Spotify and now a CNCF project, is one of the most widely adopted open-source frameworks for building developer portals. It provides a unified UI for your entire IDP.

# backstage/catalog-info.yaml: service registration
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-api
  description: Payment processing service
  annotations:
    github.com/project-slug: acme/payment-api
    backstage.io/techdocs-ref: dir:.
    prometheus.io/alert: payment-api-alerts
    pagerduty.com/service-id: P1234ABC
  tags:
    - go
    - grpc
    - payments
spec:
  type: service
  lifecycle: production
  owner: team-payments
  system: payment-system
  dependsOn:
    - resource:payments-db
    - component:auth-service
  providesApis:
    - payment-api

Backstage gives you a software catalog (who owns what), TechDocs (docs as code), scaffolder (service templates), and a plugin ecosystem with 100+ community plugins for Kubernetes, CI/CD, cost tracking, and more.
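A CI step could sanity-check descriptors like the one above before they are merged. The following Python sketch assumes the YAML has already been parsed into a dictionary, and it checks only a simplified subset of the rules: the `validate_component` helper and the name pattern are illustrative assumptions, and Backstage's own validation remains the authority.

```python
import re

# Fields a Component descriptor is expected to carry under `spec`.
REQUIRED_SPEC_FIELDS = ("type", "lifecycle", "owner")
# Simplified name rule (an assumption): alphanumerics separated by -, _ or .
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]+([-_.][A-Za-z0-9]+)*$")


def validate_component(entity: dict) -> list[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if entity.get("apiVersion") != "backstage.io/v1alpha1":
        problems.append("apiVersion should be backstage.io/v1alpha1")
    if entity.get("kind") != "Component":
        problems.append("kind should be Component")
    name = entity.get("metadata", {}).get("name", "")
    if not (1 <= len(name) <= 63 and NAME_PATTERN.match(name)):
        problems.append(f"invalid metadata.name: {name!r}")
    spec = entity.get("spec", {})
    for field in REQUIRED_SPEC_FIELDS:
        if not spec.get(field):
            problems.append(f"missing spec.{field}")
    return problems


good_entity = {
    "apiVersion": "backstage.io/v1alpha1",
    "kind": "Component",
    "metadata": {"name": "payment-api"},
    "spec": {"type": "service", "lifecycle": "production", "owner": "team-payments"},
}
bad_entity = {
    "apiVersion": "backstage.io/v1alpha1",
    "kind": "Component",
    "metadata": {"name": "payment api"},  # space is not allowed
    "spec": {"type": "service"},          # lifecycle and owner missing
}

print(validate_component(good_entity))
print(validate_component(bad_entity))
```

Catching a missing `owner` at review time keeps the "who owns what" promise of the catalog intact.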

Platform Team Structure

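In Team Topologies terms, the platform team is its own team type. It serves stream-aligned teams through an X-as-a-Service interaction, rather than acting as an enabling team that coaches them: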

┌───────────────────────────────────────────┐
│           Stream-Aligned Teams            │
│    (Payments, Search, Checkout, etc.)     │
│             ↕ Self-Service ↕              │
├───────────────────────────────────────────┤
│        Platform Team (4-8 people)         │
│ ┌─────────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Platform    │ │ DevEx    │ │ Infra    │ │
│ │ Product Mgr │ │ Engineer │ │ Engineer │ │
│ └─────────────┘ └──────────┘ └──────────┘ │
│       Treats platform as a product        │
└───────────────────────────────────────────┘

Key roles:

  • Platform Product Manager — Gathers developer feedback, prioritizes the roadmap
  • DevEx Engineers — Build the portal, CLI, templates, and golden paths
  • Infrastructure Engineers — Manage the underlying cloud, Kubernetes, and Terraform modules

The golden rule: the platform team should never be a bottleneck. If developers still file tickets to get things done, the platform has failed.

Paved Roads vs Guardrails

These are complementary concepts:

Paved roads make the right thing easy:

# Terraform module: paved road for creating an S3 bucket
module "app_bucket" {
  source = "git::https://github.com/acme/terraform-modules//s3-secure"

  bucket_name = "payment-api-uploads"
  team        = "payments"
  environment = "production"

  # Encryption, versioning, lifecycle policies
  # are all pre-configured in the module
}

Guardrails prevent the wrong thing:

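# Kyverno policy: guardrail preventing privileged containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce  # reject non-compliant Pods instead of only auditing
  rules:
    - name: deny-privileged
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Privileged containers are not allowed."
        pattern:
          spec:
            # =(...) is Kyverno's equality anchor: the check applies only when
            # the field is present, so Pods that omit securityContext still pass.
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"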

Together, they create an environment where developers move fast and stay compliant.
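Stripped of policy syntax, a guardrail is a yes/no admission decision on a resource. The Python sketch below illustrates the decision that the Kyverno policy above expresses, assuming the Pod manifest has been parsed into a dictionary. The function names are hypothetical, and this is not how Kyverno is implemented internally.

```python
def privileged_containers(pod: dict) -> list[str]:
    """Return the names of containers that request privileged mode."""
    spec = pod.get("spec", {})
    offenders = []
    for key in ("containers", "initContainers", "ephemeralContainers"):
        for container in spec.get(key) or []:
            security_context = container.get("securityContext") or {}
            # A missing securityContext or missing flag means "not privileged".
            if security_context.get("privileged") is True:
                offenders.append(container.get("name", "<unnamed>"))
    return offenders


def admit(pod: dict) -> tuple[bool, str]:
    """Decide whether the Pod may be created, with a human-readable reason."""
    offenders = privileged_containers(pod)
    if offenders:
        return False, "Privileged containers are not allowed: " + ", ".join(offenders)
    return True, "ok"


safe_pod = {"spec": {"containers": [{"name": "app", "image": "payment-api:1.0"}]}}
risky_pod = {
    "spec": {
        "containers": [
            {"name": "app", "image": "payment-api:1.0"},
            {"name": "debug", "securityContext": {"privileged": True}},
        ]
    }
}

print(admit(safe_pod))
print(admit(risky_pod))
```

The rejection message names the offending container, so the developer can fix the manifest without asking the platform team what went wrong.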

Measuring Platform Success

You cannot improve what you don't measure. Here are the metrics that matter:

Metric                               | What It Measures                            | Target
Developer Satisfaction (DevEx Score) | Quarterly survey (1-10)                     | > 8/10
New Service Lead Time                | Time from idea to production-ready service  | < 30 minutes
Developer Onboarding Time            | Time for a new hire to ship first PR        | < 1 week
Golden Path Adoption                 | % of services using standard templates      | > 80%
Self-Service Ratio                   | % of requests handled without tickets       | > 90%
Platform Reliability                 | Uptime of platform services                 | 99.9%
Cognitive Load Score                 | Survey: "How hard is it to ship code?"      | Decreasing trend

Run quarterly developer surveys. Track time-to-first-deploy for new hires. If a developer opens a Jira ticket to get a database, your platform has a gap.
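Two of the metrics in the table are simple ratios, so they are easy to compute from data you probably already have. Here is a small Python sketch that scores them against the targets listed above. The `PlatformStats` fields and the sample numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlatformStats:
    services_total: int
    services_on_golden_path: int
    requests_total: int
    requests_self_served: int


def percentage(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when there is no data."""
    return 0.0 if whole == 0 else round(100 * part / whole, 1)


def scorecard(stats: PlatformStats) -> dict:
    """Compare adoption and self-service ratios with the table's targets."""
    adoption = percentage(stats.services_on_golden_path, stats.services_total)
    self_service = percentage(stats.requests_self_served, stats.requests_total)
    return {
        "golden_path_adoption_pct": adoption,
        "golden_path_target_met": adoption > 80,
        "self_service_ratio_pct": self_service,
        "self_service_target_met": self_service > 90,
    }


# Invented sample data for one quarter.
stats = PlatformStats(
    services_total=120,
    services_on_golden_path=102,
    requests_total=400,
    requests_self_served=352,
)
report = scorecard(stats)
print(report)
```

In this sample, adoption clears its target at 85.0% while the self-service ratio falls short at 88.0%, which points at the remaining ticket-driven requests as the next thing to automate.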

Real IDP Architecture Example

Here's what a production IDP might look like at a mid-size company (50-200 engineers):

Developer Workflow:

PR Merged → GitHub Actions → Build & Test
    │                               │
    ▼                               ▼
ArgoCD GitOps ◄── Helm Charts ── Container
    │                            Registry
    ▼
Kubernetes Cluster
├── Dev namespace (auto from PR)
├── Staging namespace (auto from main)
└── Prod namespace (manual approval)

Crossplane         ──► Cloud Resources (RDS, S3, SQS)
Vault              ──► Secrets injection
OPA/Kyverno        ──► Policy enforcement
Backstage          ──► Service catalog + docs
Prometheus/Grafana ──► Auto-provisioned dashboards

The developer's experience: push code, everything else happens automatically. The platform team's job is to make that "everything else" reliable, secure, and invisible.
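The namespace rules in the diagram (dev from a PR, staging from main, prod only after approval) can be written down as one small routing function. The Python sketch below is an assumption-laden illustration: the event names and the namespace naming scheme are hypothetical, not something GitHub Actions or ArgoCD defines.

```python
from typing import Optional


def target_namespace(
    service: str,
    event: str,
    pr_number: Optional[int] = None,
    approved: bool = False,
) -> Optional[str]:
    """Map a delivery event to the namespace it should deploy into.

    Returns None when nothing should be deployed.
    """
    if event == "pull_request" and pr_number is not None:
        # One short-lived dev namespace per pull request.
        return f"{service}-dev-pr-{pr_number}"
    if event == "merge_to_main":
        # Staging always tracks the main branch.
        return f"{service}-staging"
    if event == "promote":
        # Production requires an explicit manual approval.
        return f"{service}-prod" if approved else None
    return None


print(target_namespace("payment-api", "pull_request", pr_number=42))
print(target_namespace("payment-api", "merge_to_main"))
print(target_namespace("payment-api", "promote"))
print(target_namespace("payment-api", "promote", approved=True))
```

Encoding the rule once, instead of in every team's pipeline, is what lets the platform change the promotion policy later without touching application repositories.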

Getting Started

You don't need to build a full IDP on day one. Start small:

  1. Week 1-2: Survey developers — what's their biggest pain point?
  2. Month 1: Build one golden path (e.g., "new service" template)
  3. Month 2: Add self-service for the top-requested resource
  4. Month 3: Deploy Backstage with a basic service catalog
  5. Quarter 2: Expand to CI/CD standardization and policy guardrails

The best platforms are built iteratively, driven by developer feedback, not by what looks impressive in an architecture diagram.
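Choosing the "top-requested resource" does not need special tooling. A quick tally of ticket categories is often enough to start. The Python sketch below assumes you can export a list of categories from your ticketing system. The sample data is invented.

```python
from collections import Counter

# Invented sample: categories of infrastructure tickets filed last quarter.
tickets = ["database", "dns", "database", "s3-bucket", "database", "dns", "queue"]


def top_requests(categories: list[str], n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequently requested categories with their counts."""
    return Counter(categories).most_common(n)


ranking = top_requests(tickets)
for category, count in ranking:
    print(f"{category}: {count}")
```

In this sample, databases lead the ranking, so database provisioning would be the first candidate for self-service.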

Closing Note

Platform engineering is reshaping how organizations think about developer productivity. But platforms don't exist in a vacuum — they need to be resilient. In the next post, we'll explore Chaos Engineering and how to deliberately break your systems to make them stronger.