Top 50 Kubernetes Interview Questions for DevOps Engineers

Goel Academy
DevOps & Cloud Learning Hub

Whether you are preparing for a DevOps engineer role, a platform engineering position, or the CKA/CKAD certification, these 50 questions cover what interviewers actually ask. Each answer is concise enough to say in an interview but detailed enough to demonstrate real understanding.

Beginner Questions (1-15)

1. What is Kubernetes and why do we need it?

Kubernetes is an open-source container orchestration platform that automates deployment, scaling, and management of containerized applications. We need it because manually managing containers across servers does not scale — Kubernetes handles scheduling, self-healing, load balancing, and rolling updates automatically.

2. What is a Pod?

A Pod is the smallest deployable unit in Kubernetes. It wraps one or more containers that share a network namespace (one IP address and one port space, so they reach each other over localhost) and can mount the same volumes. In practice, most pods run a single container.

kubectl run nginx --image=nginx:latest
kubectl get pods -o wide
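To make the shared-volume point concrete, here is a minimal sketch of a two-container Pod. The Pod name, container names, images, and paths are illustrative assumptions, not part of any standard setup.

```yaml
# Hypothetical two-container Pod: both containers mount the same emptyDir.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-helper
spec:
  volumes:
    - name: shared-data
      emptyDir: {}              # scratch volume that lives as long as the Pod
  containers:
    - name: web
      image: nginx:1.25
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
    - name: content-writer
      image: busybox:1.36
      # Rewrites index.html every 10 seconds; nginx serves it from the shared volume.
      command: ["sh", "-c", "while true; do date > /data/index.html; sleep 10; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
```

Because both containers also share one network namespace, the helper could reach nginx at localhost:80 without a Service.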

3. What is the difference between a Pod and a Deployment?

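A Pod is a single instance. The kubelet restarts its containers according to restartPolicy, but if the Pod is deleted or its node fails, nothing recreates it. A Deployment manages a ReplicaSet, which keeps the desired number of pod replicas running. If a pod is lost, the ReplicaSet creates a replacement, and the Deployment adds rolling updates and rollbacks on top.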

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3              # Deployment ensures 3 pods always run
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25

4. What are the types of Kubernetes Services?

| Type | Scope | Use Case |
| --- | --- | --- |
| ClusterIP | Internal only | Service-to-service communication |
| NodePort | External via node IP:port (30000-32767) | Development, testing |
| LoadBalancer | External via cloud LB | Production traffic |
| ExternalName | DNS CNAME alias | Pointing to external services |
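As a rough illustration, a ClusterIP Service for the `web` Deployment from question 3 might look like the sketch below. The Service name and port numbers are assumptions chosen to match that example.

```yaml
# Minimal sketch of a ClusterIP Service; name and ports are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP          # the default; NodePort or LoadBalancer would go here instead
  selector:
    app: web               # routes to pods carrying this label
  ports:
    - port: 80             # port the Service exposes inside the cluster
      targetPort: 80       # container port that receives the traffic
```

Changing only the `type` field is usually enough to move between ClusterIP, NodePort, and LoadBalancer, since each type builds on the previous one.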

5. What is a Namespace?

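A Namespace is a logical partition of a cluster, often described as a virtual cluster. It scopes resource names and is the unit to which RBAC rules and resource quotas are usually applied. Teams typically get their own namespace. A namespace by itself does not isolate network traffic or nodes.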

kubectl get namespaces
kubectl create namespace staging
kubectl get pods -n staging

6. How do you scale a Deployment?

# Imperative
kubectl scale deployment web --replicas=5

# Declarative: change spec.replicas in the manifest and re-apply it
kubectl apply -f manifest.yaml

# Or use HPA for automatic scaling
kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=70

7. What is kubectl? Name 5 essential commands.

kubectl is the CLI tool for interacting with Kubernetes clusters. Five essential commands:

kubectl get pods                     # List resources
kubectl describe pod <name>          # Detailed resource info
kubectl logs <pod-name>              # View container logs
kubectl exec -it <pod> -- /bin/sh    # Shell into a container
kubectl apply -f manifest.yaml       # Apply configuration

8. What is the difference between kubectl create and kubectl apply?

create is imperative — it fails if the resource already exists. apply is declarative — it creates the resource if it does not exist and updates it if it does. In production, always use apply.

9. What are Labels and Selectors?

Labels are key-value pairs attached to objects. Selectors filter objects by their labels. Services route traffic to pods using label selectors. Deployments manage pods using label selectors.

kubectl get pods -l app=web,environment=production
kubectl label pod nginx tier=frontend

10. What is a ConfigMap?

A ConfigMap stores non-sensitive configuration data as key-value pairs. Pods consume ConfigMaps as environment variables or mounted files.

kubectl create configmap app-config \
  --from-literal=DB_HOST=postgres.production.svc \
  --from-literal=LOG_LEVEL=info

11. What is the difference between a ConfigMap and a Secret?

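ConfigMaps store non-sensitive data in plaintext. Secrets hold sensitive data (passwords, tokens, keys). Secret values are base64-encoded, which is an encoding and not encryption: by default they are stored unencrypted in etcd, so enable encryption at rest with an encryption provider and restrict access with RBAC.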
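For reference, a Secret manifest can be sketched as follows. The name `db-secret` and key `password` mirror the environment-variable example in question 29, and the value is a placeholder assumption.

```yaml
# Minimal sketch of a Secret; the value is a placeholder, not a real credential.
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
stringData:                # plain text on input; the API server stores it base64-encoded under data
  password: change-me
```

Reading the object back with `kubectl get secret db-secret -o yaml` shows the base64 form, which anyone with read access can decode.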

12. What are the Pod phases?

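Pending (accepted by the cluster but not yet running, for example waiting for scheduling or image pull), Running (bound to a node with at least one container running, starting, or restarting), Succeeded (all containers terminated successfully and will not be restarted), Failed (all containers have terminated and at least one terminated in failure), Unknown (pod state cannot be determined, usually because of a node communication failure).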

13. What is a DaemonSet?

A DaemonSet ensures one pod runs on every node (or every matching node). Use cases: log collectors (Fluentd), monitoring agents (Prometheus node-exporter), network plugins (Calico/Cilium).

14. What is a StatefulSet?

A StatefulSet manages stateful applications. Unlike Deployments, it provides stable network identities (pod-0, pod-1), ordered startup/shutdown, and persistent storage per pod. Used for databases, message queues, and distributed systems.
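The three StatefulSet properties map to specific fields, roughly as in the sketch below. The names, image tag, and storage size are assumptions, and `db-secret` is assumed to exist already. Note that this would start three independent PostgreSQL instances: replication between them is not configured here.

```yaml
# Headless Service: gives each pod a stable DNS name such as db-0.db
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None
  selector:
    app: db
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db            # ties pod DNS names to the headless Service above
  replicas: 3                # pods are created in order: db-0, db-1, db-2
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: password
            - name: PGDATA   # keep data in a subdirectory of the mounted volume
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:      # one PVC per pod: data-db-0, data-db-1, data-db-2
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```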

15. How do you check the logs of a crashed pod?

# Current container logs
kubectl logs <pod-name>

# Previous container logs (after a crash restart)
kubectl logs <pod-name> --previous

# Follow logs in real-time
kubectl logs <pod-name> -f

# Logs from a specific container in a multi-container pod
kubectl logs <pod-name> -c <container-name>

Intermediate Questions (16-35)

16. Explain Kubernetes networking model.

Every pod gets its own IP address. Pods can communicate with any other pod across nodes without NAT. This is the flat network model. It is implemented by CNI plugins (Calico, Cilium, Flannel) that set up routing between nodes.

17. What is an Ingress?

An Ingress exposes HTTP/HTTPS routes from outside the cluster to Services inside the cluster. It provides path-based and host-based routing, TLS termination, and load balancing. It requires an Ingress Controller (NGINX, Traefik, ALB) to function.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 8080

18. What is a PersistentVolume (PV) and PersistentVolumeClaim (PVC)?

A PV is a piece of storage provisioned by an admin or dynamically by a StorageClass. A PVC is a user request for storage. PVCs bind to PVs — this abstraction separates storage provisioning from consumption.
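A claim for dynamically provisioned storage could be sketched like this. The claim name, size, and the StorageClass name `standard` are assumptions; the available StorageClasses differ per cluster.

```yaml
# Minimal sketch of a PVC; assumes a StorageClass named "standard" exists.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce              # mounted read-write by a single node
  storageClassName: standard
  resources:
    requests:
      storage: 5Gi
```

A pod consumes the claim by listing a volume with `persistentVolumeClaim.claimName: app-data` and mounting it like any other volume.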

19. Explain RBAC in Kubernetes.

RBAC (Role-Based Access Control) has four objects: Role (namespace-scoped permissions), ClusterRole (cluster-wide permissions), RoleBinding (grants a Role or ClusterRole to users, groups, or ServiceAccounts within one namespace), ClusterRoleBinding (grants a ClusterRole across the whole cluster).

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
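A Role grants nothing until it is bound to a subject. A minimal sketch of a matching RoleBinding follows; the binding name and the ServiceAccount `ci-reader` are hypothetical.

```yaml
# Binds the pod-reader Role to a (hypothetical) ServiceAccount in the same namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
subjects:
  - kind: ServiceAccount
    name: ci-reader
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

One way to check the result is `kubectl auth can-i list pods -n production --as=system:serviceaccount:production:ci-reader`.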

20. What is Helm and why use it?

Helm is a package manager for Kubernetes. It bundles Kubernetes manifests into charts with templating and versioning. Benefits: reusable templates, parameterized deployments, easy rollbacks, dependency management.

helm install my-app ./my-chart --values production-values.yaml
helm upgrade my-app ./my-chart --values production-values.yaml
helm rollback my-app 1

21. What are Liveness, Readiness, and Startup probes?

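| Probe | Purpose | On Failure |
| --- | --- | --- |
| Liveness | Is the container alive? | Container is restarted |
| Readiness | Can it accept traffic? | Pod is removed from Service endpoints |
| Startup | Has it finished starting? | Liveness and readiness probes stay disabled until it succeeds; the container is restarted if it exceeds failureThreshold |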
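In a manifest, the three probes sit side by side on the container. The fragment below is a sketch with illustrative paths and timings, not tuned values.

```yaml
# Fragment of a pod spec; paths, ports, and thresholds are illustrative.
containers:
  - name: web
    image: nginx:1.25
    ports:
      - containerPort: 80
    startupProbe:
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
      failureThreshold: 30     # allows up to 30 x 5s = 150s for startup
    livenessProbe:
      httpGet:
        path: /
        port: 80
      periodSeconds: 10
      failureThreshold: 3      # restart after 3 consecutive failures
    readinessProbe:
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
```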

22. How does HPA work?

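HPA (Horizontal Pod Autoscaler) watches metrics (CPU, memory, custom or external metrics) and adjusts the replica count of a workload. The controller evaluates metrics every 15 seconds by default and sets the replica count to ceil(currentReplicas × currentValue / targetValue), clamped to the configured minimum and maximum. CPU and memory targets require metrics-server and resource requests on the containers.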

kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=20
kubectl get hpa
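The declarative counterpart of the `kubectl autoscale` command above can be sketched with the `autoscaling/v2` API. The object name is an assumption; the target and bounds mirror the command.

```yaml
# Minimal sketch of an HPA targeting the web Deployment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the containers' CPU requests
```

As a worked example, 4 replicas averaging 90% CPU against this 70% target would be scaled to ceil(4 × 90 / 70) = 6 replicas.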

23. What is a Network Policy?

A Network Policy is a firewall rule for pods. By default, all pods can communicate. Network Policies restrict which pods can talk to which. Requires a CNI that supports them (Calico, Cilium — not Flannel).
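A common pattern is a default-deny policy plus explicit allow rules. The sketch below assumes a `production` namespace and pods labeled `app: web` and `app: api`; these labels and the port are illustrative.

```yaml
# 1. Deny all ingress to every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}            # empty selector matches all pods
  policyTypes:
    - Ingress
---
# 2. Allow only web pods to reach api pods on TCP 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web
      ports:
        - protocol: TCP
          port: 8080
```

Policies are additive, so the second rule opens one path through the first without replacing it.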

24. Explain the difference between a Job and a CronJob.

A Job runs a pod to completion (batch processing, migration). It retries on failure. A CronJob creates Jobs on a schedule (like Linux cron). Use cases: database backups, report generation, cleanup scripts.
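A scheduled job might be sketched as below. The name, schedule, and the placeholder command are assumptions; a real backup would run an actual backup tool.

```yaml
# Minimal sketch of a CronJob; the command is a placeholder.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup
spec:
  schedule: "0 2 * * *"          # every day at 02:00
  concurrencyPolicy: Forbid      # skip a run if the previous one is still going
  jobTemplate:
    spec:
      backoffLimit: 3            # retry a failing Job up to 3 times
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: busybox:1.36
              command: ["sh", "-c", "echo 'backup placeholder'"]
```

Everything under `jobTemplate` is an ordinary Job spec, which is why the same template can be tested as a one-off Job first.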

25. What is a ServiceAccount?

A ServiceAccount provides identity for pods. Pods use ServiceAccounts to authenticate with the API server and other services. Each namespace has a default ServiceAccount. Create dedicated ones for least-privilege access.

26. What is kube-proxy and how does it work?

kube-proxy runs on every node and implements Service networking. It watches Services and EndpointSlices and programs iptables, IPVS, or nftables rules that send Service traffic to ready pod endpoints. When a Service or its endpoints change, the kube-proxy on each node updates its local rules.

27. How do rolling updates work?

A rolling update replaces old pods with new ones incrementally. It is controlled by maxSurge (how many extra pods may exist during the update) and maxUnavailable (how many pods may be unavailable); both default to 25%. Combined with readiness probes, this allows updates without downtime.

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1          # At most one extra pod above the desired count
    maxUnavailable: 0    # Never drop below the desired count during the rollout

28. What is a PodDisruptionBudget?

A PDB limits the number of pods that can be down simultaneously during voluntary disruptions (node drain, cluster upgrade). It prevents Kubernetes from evicting too many pods at once.
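A PDB for the `web` Deployment could be sketched as follows. The budget name and the `minAvailable` value are assumptions.

```yaml
# Keeps at least 2 web pods running during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```

With 3 replicas, a node drain can then evict only one of these pods at a time and waits for a replacement to become ready before evicting the next.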

29. How do you pass environment variables to a pod?

env:
  - name: DB_HOST
    value: "postgres.production.svc"   # Direct value
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: password                  # From Secret
  - name: LOG_LEVEL
    valueFrom:
      configMapKeyRef:
        name: app-config
        key: LOG_LEVEL                 # From ConfigMap

30. What is a Sidecar container?

A sidecar runs alongside the main container in the same pod. It extends functionality without modifying the main container. Examples: log shipping (Fluentd sidecar), service mesh proxy (Envoy), TLS termination. Kubernetes 1.29+ has native sidecar support with restartPolicy: Always in init containers.
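A native sidecar can be sketched as below, with busybox standing in for a real log shipper. All names, images, and paths are illustrative assumptions.

```yaml
# Sketch of a native sidecar: an init container with restartPolicy: Always.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  volumes:
    - name: logs
      emptyDir: {}
  initContainers:
    - name: log-tailer
      image: busybox:1.36
      restartPolicy: Always      # this field turns the init container into a sidecar
      command: ["sh", "-c", "tail -F /var/log/app/app.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
  containers:
    - name: app
      image: busybox:1.36
      # Stand-in application that appends a line to the shared log file.
      command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
```

Declared this way, the sidecar starts before the main container and keeps running for the life of the pod, instead of blocking startup like a regular init container.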

31. What is the difference between Deployment, StatefulSet, and DaemonSet?

| Feature | Deployment | StatefulSet | DaemonSet |
| --- | --- | --- | --- |
| Identity | Random pod names | Stable names (pod-0, pod-1) | One per node |
| Storage | Shared PVC | Per-pod PVC | Usually hostPath |
| Ordering | Parallel | Sequential | N/A |
| Use case | Stateless apps | Databases, queues | Node agents |

32. What is an Init Container?

An init container runs before the main container starts. It runs to completion and must succeed before the next init container or the main container starts. Use cases: database migration, downloading config, waiting for dependencies.
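The "waiting for dependencies" case is often sketched with a small DNS loop, as below. The Service name reuses `postgres.production.svc` from earlier examples; the `cluster.local` suffix assumes the default cluster domain, and the Pod name is hypothetical.

```yaml
# Sketch: block app startup until the database Service resolves in DNS.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
    - name: wait-for-db
      image: busybox:1.28
      command:
        - sh
        - -c
        - until nslookup postgres.production.svc.cluster.local; do echo waiting for db; sleep 2; done
  containers:
    - name: app
      image: nginx:1.25
```

While the init container loops, `kubectl get pods` reports the pod with a status such as `Init:0/1`, which is a quick hint about where startup is stuck.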

33. How do you troubleshoot a pod stuck in Pending state?

kubectl describe pod <pod-name>  # Check Events section
# Common causes:
# - Insufficient CPU/memory (scale up nodes or reduce requests)
# - No matching node (check nodeSelector, affinity, taints)
# - PVC not bound (check PV availability)
# - Image pull error (check image name and registry credentials)

34. What is Resource QoS in Kubernetes?

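| QoS Class | Condition | Eviction Priority |
| --- | --- | --- |
| Guaranteed | Every container sets CPU and memory requests equal to limits | Last to be evicted |
| Burstable | At least one container sets a request or limit, but the pod does not meet the Guaranteed criteria | Evicted after BestEffort |
| BestEffort | No requests or limits set on any container | First to be evicted |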
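The class is derived from the `resources` block, never set directly. A sketch of a pod that would be classified as Guaranteed follows; the name and values are illustrative.

```yaml
# Requests equal limits for both CPU and memory, so the pod is Guaranteed.
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "256Mi"
```

Lowering either request below its limit would make the pod Burstable, and removing the `resources` block entirely would make it BestEffort. The assigned class is visible in `.status.qosClass`.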

35. Explain the Kubernetes control plane components.

| Component | Role |
| --- | --- |
| kube-apiserver | REST API frontend, handles all operations |
| etcd | Key-value store for all cluster state |
| kube-scheduler | Assigns pods to nodes |
| kube-controller-manager | Runs controllers (replication, node, endpoint) |
| cloud-controller-manager | Integrates with cloud provider APIs |

Advanced Questions (36-50)

36. What is an Operator and when would you write one?

An Operator is a custom controller that uses CRDs to manage complex applications. It encodes operational knowledge (scaling, backup, failover) in code. Write one when your application has complex lifecycle requirements that Helm charts cannot handle — databases, message queues, distributed systems.

37. What is a CRD?

A Custom Resource Definition extends the Kubernetes API with your own resource types. Combined with a controller, CRDs let you manage anything through kubectl — databases, certificates, DNS records.

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.myorg.io
spec:
  group: myorg.io
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                version:
                  type: string
                replicas:
                  type: integer
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database

38. How does a Service Mesh work?

A service mesh (Istio, Linkerd) typically injects a sidecar proxy into every pod (Envoy in Istio, linkerd2-proxy in Linkerd). All traffic goes through the proxy, enabling mTLS encryption, traffic management, observability, and retry policies without changing application code.

39. How do you secure a Kubernetes cluster?

Key areas: Enable RBAC (least privilege), enforce Network Policies (default deny), use Pod Security Standards (restricted profile), encrypt Secrets at rest, scan images in CI/CD, rotate certificates, disable anonymous auth, use admission controllers (OPA/Kyverno), and keep Kubernetes version updated.

40. How do you troubleshoot a node in NotReady state?

kubectl describe node <node-name>        # Check Conditions and Events
# Check kubelet on the node:
systemctl status kubelet
journalctl -u kubelet --no-pager -n 50   # Last 50 log lines (use -f instead to follow)
# Common causes: kubelet crashed, certificate expired, disk pressure,
# memory pressure, network partition, container runtime failure

41. What is the difference between kubectl drain, cordon, and uncordon?

kubectl cordon <node>                      # Mark unschedulable (no new pods, existing pods stay)
kubectl drain <node> --ignore-daemonsets   # Cordon + evict pods (respects PDBs; DaemonSet pods stay)
kubectl uncordon <node>                    # Mark schedulable again

42. How do you implement zero-downtime deployments?

Combine: rolling update strategy with maxUnavailable: 0, readiness probes (remove unhealthy pods from service), preStop hooks (drain connections before shutdown), PDB (prevent disruption below minimum), and terminationGracePeriodSeconds high enough for graceful drain.
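Several of these settings live in the same Deployment, roughly as in the fragment below. The timings are illustrative assumptions and would need tuning per application.

```yaml
# Fragment of a Deployment spec; values are illustrative, not tuned.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    spec:
      terminationGracePeriodSeconds: 60   # must cover the preStop delay plus shutdown time
      containers:
        - name: web
          image: nginx:1.25
          readinessProbe:
            httpGet:
              path: /
              port: 80
          lifecycle:
            preStop:
              exec:
                # Delay SIGTERM so endpoints can drop the pod before it stops serving.
                command: ["sh", "-c", "sleep 15"]
```

The preStop delay covers the short window in which a terminating pod may still receive traffic because endpoint updates have not yet reached every node.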

43. Explain multi-cluster Kubernetes patterns.

Active-Active (all clusters serve traffic, near-zero RTO), Active-Passive (standby cluster for failover), Hub-Spoke (central management, edge workloads), Federated (shared config across independent clusters). Tools: KubeFed, Submariner, Cluster API, ArgoCD ApplicationSet.

44. How does etcd work in Kubernetes?

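etcd is a distributed key-value store that uses the Raft consensus protocol and holds all cluster state. Writes need a quorum, meaning a majority of members, so clusters normally run an odd number of members (3 or 5): 3 members tolerate 1 failure and 5 tolerate 2, while an even count adds no extra fault tolerance. It is the most critical component: if etcd is lost without a backup, the cluster state is gone.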

45. What is Pod Security Admission?

PSA replaces PodSecurityPolicy (removed in 1.25). It enforces security profiles at the namespace level: privileged (unrestricted), baseline (prevents known privilege escalations), restricted (hardened best practices). Applied via namespace labels.
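The namespace labels look roughly like this. The namespace name `payments` is a hypothetical example.

```yaml
# Enforce the restricted profile, and also surface warnings and audit entries.
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted   # reject violating pods
    pod-security.kubernetes.io/warn: restricted      # warn the client on violations
    pod-security.kubernetes.io/audit: restricted     # record violations in the audit log
```

A cautious rollout might start with only `warn` and `audit` set, then add `enforce` once existing workloads pass.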

46. How do you handle secrets management in production?

Never store secrets in Git. Use: External Secrets Operator (syncs from Vault/AWS SM/Azure KV), Sealed Secrets (encrypted in Git, decrypted in cluster), CSI Secrets Store Driver (mounts secrets from external providers as volumes), or SOPS with ArgoCD.

47. What is the Kubernetes Garbage Collector?

The garbage collector deletes dependent objects when their owner is deleted. It uses owner references. Cascading deletion has two modes: foreground (delete dependents first, then owner) and background (delete owner first, dependents asynchronously). A third option, orphan, leaves the dependents in place. This is why deleting a Deployment also deletes its ReplicaSet and pods.

48. How do you debug networking issues in Kubernetes?

# 1. Check pod IP and DNS
kubectl exec -it <pod> -- nslookup <service-name>

# 2. Test connectivity between pods
kubectl exec -it <pod-a> -- curl <pod-b-ip>:8080

# 3. Check Service endpoints
kubectl get endpoints <service-name>

# 4. Check Network Policies
kubectl get networkpolicy -n <namespace>

# 5. Check kube-proxy and iptables rules
kubectl logs -n kube-system -l k8s-app=kube-proxy

# 6. Use ephemeral debug container
kubectl debug -it <pod> --image=nicolaka/netshoot --target=<container>

49. What is the difference between Horizontal and Vertical Pod Autoscaling?

HPA changes replica count (more pods). VPA changes resource requests/limits (bigger pods). HPA suits workloads that can scale out, typically stateless services. VPA suits single-instance or stateful workloads that cannot easily add replicas. They should not act on the same metric (for example CPU), because each would undo the other's decisions. Use Multidimensional Pod Autoscaler (available on GKE) or KEDA instead.

50. Design a highly available Kubernetes architecture for a fintech application.

Answer should cover: multi-AZ control plane (3 control plane nodes), dedicated node pools (system, application, database), Pod Anti-Affinity (spread replicas), PDB (minAvailable: 2), HPA (CPU + custom metrics), Network Policies (default deny + allow list), mTLS via service mesh, Secrets in HashiCorp Vault, Velero backups to cross-region S3, monitoring with Prometheus + Grafana, alerting with PagerDuty, GitOps with ArgoCD, and DR cluster in a second region with active-passive failover.


These 50 questions cover the breadth of what you will face in a Kubernetes interview. Do not just memorize answers — spin up a cluster on Minikube or kind and actually run the commands. Interviewers can tell the difference between someone who read the docs and someone who has broken things in a real cluster and fixed them.