
Kubernetes Logging — EFK Stack, Loki, and Fluent Bit

· 6 min read
Goel Academy
DevOps & Cloud Learning Hub

A pod crashes at 3 AM, restarts, and by the time you check in the morning, kubectl logs shows only the current container's output — kubectl logs --previous can still reach the crashed container's logs, but only until the pod is deleted or rescheduled. Kubernetes does not persist logs beyond the lifetime of a pod, and on a busy cluster even node-level log files rotate away within hours. If you are not shipping logs to a central store, you are debugging with one eye closed.

Kubernetes Logging Architecture

The container runtime writes container logs to files on each node under /var/log/pods/, with symlinks in /var/log/containers/ that most collectors tail. There are two primary patterns to collect these logs:

Node-level logging (DaemonSet): A log collector runs on every node as a DaemonSet, reads log files from /var/log/containers/, and ships them to a central store. This is the most common approach.

Sidecar logging: A logging container runs alongside your application container in the same pod. Used when you need per-application log processing or when logs are not written to stdout.
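
As a concrete sketch of the sidecar pattern (image tags, names, and paths here are illustrative, not from any particular setup): the application writes to a file on a shared emptyDir volume, and a Fluent Bit sidecar tails that file and forwards it.

```yaml
# Hypothetical sidecar pod: the app logs to a shared emptyDir,
# and a Fluent Bit sidecar tails the file and prints it to stdout.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  containers:
    - name: app
      image: my-app:1.0                 # illustrative image
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: log-sidecar
      image: fluent/fluent-bit:2.2
      args: ["-i", "tail", "-p", "path=/var/log/app/*.log", "-o", "stdout"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: app-logs
      emptyDir: {}
```

This keeps per-application log processing close to the app, at the cost of one extra container in every pod.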

┌──────────────────────────────────────────────┐
│ Node                                         │
│ ┌──────────┐  ┌──────────┐  ┌────────────┐   │
│ │  Pod A   │  │  Pod B   │  │ Fluent Bit │   │
│ │ (stdout) │  │ (stdout) │  │ DaemonSet  │   │
│ └────┬─────┘  └────┬─────┘  └─────┬──────┘   │
│      │             │              │          │
│      ▼             ▼              │          │
│   /var/log/containers/*.log ◄─────┘          │
│             │                                │
└─────────────┼────────────────────────────────┘
              │
              ▼
    ┌────────────────────┐
    │ Elasticsearch/Loki │
    └────────────────────┘

Option 1: EFK Stack (Elasticsearch + Fluent Bit + Kibana)

The EFK stack is the traditional enterprise-grade logging solution. Elasticsearch indexes and stores logs, Fluent Bit collects and ships them, and Kibana provides a search and visualization UI.

Deploy Elasticsearch

# Add the Elastic Helm repo
helm repo add elastic https://helm.elastic.co
helm repo update

# Install Elasticsearch (3-node cluster)
helm install elasticsearch elastic/elasticsearch \
  --namespace logging \
  --create-namespace \
  --set replicas=3 \
  --set minimumMasterNodes=2 \
  --set resources.requests.memory=2Gi \
  --set resources.limits.memory=4Gi \
  --set volumeClaimTemplate.resources.requests.storage=100Gi

Deploy Fluent Bit as a DaemonSet

Fluent Bit is the lightweight alternative to Fluentd — written in C, uses about 15 MB of memory per node, and handles thousands of events per second.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush             5
        Log_Level         info
        Daemon            off
        Parsers_File      parsers.conf

    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Parser            cri
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
        Refresh_Interval  10

    [FILTER]
        Name              kubernetes
        Match             kube.*
        Kube_URL          https://kubernetes.default.svc:443
        Kube_Tag_Prefix   kube.var.log.containers.
        Merge_Log         On
        Merge_Log_Key     log_processed
        Keep_Log          Off
        K8S-Logging.Parser   On
        K8S-Logging.Exclude  Off

    [FILTER]
        Name              grep
        Match             kube.*
        Exclude           $kubernetes['namespace_name'] kube-system

    [OUTPUT]
        Name              es
        Match             kube.*
        Host              elasticsearch-master
        Port              9200
        Logstash_Format   On
        Logstash_Prefix   k8s-logs
        Retry_Limit       3
        Replace_Dots      On

  parsers.conf: |
    [PARSER]
        Name          cri
        Format        regex
        Regex         ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$
        Time_Key      time
        Time_Format   %Y-%m-%dT%H:%M:%S.%L%z
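
The ConfigMap alone does nothing — it has to be mounted into a Fluent Bit DaemonSet. A minimal manifest sketch (image tag and service account name are illustrative; in practice the official Fluent Bit Helm chart generates this for you, including the RBAC that the kubernetes filter needs):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit    # needs RBAC to read pod metadata
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:2.2  # illustrative tag
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true            # node log files, read-only
            - name: config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: config
          configMap:
            name: fluent-bit-config     # the ConfigMap defined above
```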

Deploy Kibana

helm install kibana elastic/kibana \
  --namespace logging \
  --set elasticsearchHosts="http://elasticsearch-master:9200"

# Access Kibana
kubectl port-forward svc/kibana-kibana -n logging 5601:5601

Option 2: Grafana Loki (Lightweight Alternative)

Loki is Grafana's answer to Elasticsearch — but instead of indexing the full text of every log line, it only indexes metadata labels (namespace, pod, container). This makes it dramatically cheaper to run and operate.

# Install Loki stack (Loki + Promtail)
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

helm install loki grafana/loki-stack \
  --namespace logging \
  --create-namespace \
  --set grafana.enabled=true \
  --set loki.persistence.enabled=true \
  --set loki.persistence.size=50Gi

Promtail (Loki's log collector) automatically discovers pods and attaches Kubernetes labels. Query logs in Grafana using LogQL:

# All logs from the production namespace
{namespace="production"}

# Error logs from a specific deployment
{namespace="production", app="payment-service"} |= "error"

# Parse JSON logs and filter by status code
{namespace="production"} | json | status_code >= 500

# Count errors per minute
sum(rate({namespace="production"} |= "error" [1m])) by (app)
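
The trade-off behind these queries can be sketched in a few lines of Python (toy data, not Loki's actual implementation): labels are the only index, so a selector like {namespace="production"} is an index lookup, while a line filter like |= "error" is a brute-force scan over just the matching chunks.

```python
# Toy sketch of Loki-style storage: only labels are indexed;
# log lines themselves are scanned, never indexed.
from collections import defaultdict

index = defaultdict(list)  # frozenset of labels -> list of log lines ("chunk")

def push(labels: dict, line: str):
    index[frozenset(labels.items())].append(line)

def query(selector: dict, contains: str = ""):
    """Like {k="v"} |= "substr": match labels first, then scan lines."""
    want = set(selector.items())
    return [
        line
        for labels, chunk in index.items()
        if want <= labels              # label selector: uses the index
        for line in chunk
        if contains in line            # line filter: brute-force scan
    ]

push({"namespace": "production", "app": "payment-service"}, "error: card declined")
push({"namespace": "production", "app": "payment-service"}, "payment ok")
push({"namespace": "staging", "app": "payment-service"}, "error: timeout")

print(query({"namespace": "production"}, "error"))  # ['error: card declined']
```

Because only the small label index is maintained, ingestion is cheap; the cost moves to query time, which is why keeping label cardinality low matters in Loki.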

Comparison: EFK vs Loki vs Cloud-Native

Feature            | EFK Stack                     | Grafana Loki             | CloudWatch / Stackdriver
-------------------|-------------------------------|--------------------------|-------------------------
Full-text indexing | Yes                           | No (labels only)         | Yes
Resource usage     | High (8+ GB RAM)              | Low (512 MB RAM)         | N/A (managed)
Storage cost       | High                          | Low (10-20x cheaper)     | Medium
Query language     | Kibana KQL                    | LogQL                    | Proprietary
Integrates with    | Kibana                        | Grafana                  | Cloud console
Setup complexity   | High                          | Low                      | None
Multi-cluster      | Complex                       | Easy with Grafana Cloud  | Per-account
Best for           | Large enterprises, compliance | Most K8s teams           | Cloud-native shops

For most teams, Loki is the right choice. You likely already have Grafana for metrics — adding Loki gives you logs in the same UI with minimal resource overhead. Choose EFK when you need full-text search across billions of log lines or have compliance requirements that demand Elasticsearch.

Structured Logging Best Practices

The biggest difference between logs you can query and logs that are useless is structure. Always log in JSON format:

{
  "timestamp": "2025-08-23T10:15:32.456Z",
  "level": "error",
  "service": "payment-api",
  "trace_id": "abc123def456",
  "message": "Payment processing failed",
  "error": "insufficient_funds",
  "user_id": "usr_789",
  "amount": 49.99,
  "currency": "USD"
}

In your application, use structured logging libraries:

# Python with structlog
import structlog

logger = structlog.get_logger()
logger.error(
    "payment_failed",
    error="insufficient_funds",
    user_id="usr_789",
    amount=49.99,
    currency="USD",
    trace_id=request.headers.get("X-Trace-ID"),
)
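
If pulling in structlog is not an option, the standard logging module can emit JSON too. A minimal sketch — the "fields" key passed via extra= is my own convention here, not a stdlib one:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""
    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname.lower(),
            "message": record.getMessage(),
        }
        # Merge structured fields attached via `extra={"fields": {...}}`
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("payment-api")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.error("Payment processing failed",
             extra={"fields": {"error": "insufficient_funds", "user_id": "usr_789"}})
```

One JSON object per line is exactly the shape the Fluent Bit Merge_Log option above expects, so these fields land as queryable keys in your backend.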

Multi-Line Log Handling

Stack traces span multiple lines, and by default, each line becomes a separate log entry. Configure Fluent Bit to concatenate them:

# fluent-bit multiline config
[MULTILINE_PARSER]
    name           java-stacktrace
    type           regex
    flush_timeout  2000
    rule      "start_state"  "/^\d{4}-\d{2}-\d{2}/"        "cont"
    rule      "cont"         "/^\s+(at|Caused by|\.{3})/"   "cont"

[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/payment*.log
    multiline.parser  java-stacktrace
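
The two rules above form a tiny state machine: a line matching the start pattern opens a new record, and continuation lines are appended to the current one. The same logic, sketched in Python with the patterns copied from the config:

```python
import re

START = re.compile(r"^\d{4}-\d{2}-\d{2}")       # "start_state" rule
CONT = re.compile(r"^\s+(at|Caused by|\.{3})")  # "cont" rule

def concat_multiline(lines):
    """Group stack-trace continuation lines under their starting line."""
    records = []
    for line in lines:
        if START.match(line) or not records:
            records.append(line)           # start pattern: new record
        elif CONT.match(line):
            records[-1] += "\n" + line     # continuation: append to current
        else:
            records.append(line)           # anything else: its own record
    return records

logs = [
    "2025-08-23 10:15:32 ERROR payment failed",
    "\tat com.shop.Pay.charge(Pay.java:42)",
    "\tat com.shop.Api.handle(Api.java:7)",
    "2025-08-23 10:15:33 INFO retrying",
]
print(len(concat_multiline(logs)))  # 2 records instead of 4 lines
```

The flush_timeout in the real config covers the case this sketch ignores: a record that never sees its next start line must still be emitted eventually.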

Log Retention and Rotation

Without retention policies, your logging storage will grow indefinitely. Configure cleanup based on your compliance needs:

# Elasticsearch: Create an Index Lifecycle Policy
curl -X PUT "elasticsearch-master:9200/_ilm/policy/k8s-log-policy" \
  -H 'Content-Type: application/json' -d '{
    "policy": {
      "phases": {
        "hot":    { "min_age": "0ms", "actions": { "rollover": { "max_size": "50gb", "max_age": "1d" }}},
        "warm":   { "min_age": "7d",  "actions": { "shrink": { "number_of_shards": 1 }}},
        "delete": { "min_age": "30d", "actions": { "delete": {} }}
      }
    }
  }'

# Loki: Set retention in values.yaml
loki:
  config:
    table_manager:
      retention_deletes_enabled: true
      retention_period: 720h   # 30 days
    compactor:
      retention_enabled: true

Centralized Logging for Multi-Cluster

When running multiple clusters, ship all logs to a single central store. With Loki, add a cluster label in Promtail:

# promtail config for multi-cluster
config:
  snippets:
    extraRelabelConfigs:
      - target_label: cluster
        replacement: production-us-east-1

Now you can query logs across clusters in Grafana:

{cluster="production-us-east-1", namespace="checkout"} |= "timeout"

Wrapping Up

Your logging stack is only as good as the structure of your logs. Ship JSON, attach Kubernetes metadata automatically with Fluent Bit, and pick the backend that fits your scale — Loki for most teams, EFK for enterprises that need full-text search, and cloud-native solutions when you want zero operational burden.

With monitoring and logging in place, you can see what is happening and read why. But neither will help you prevent security incidents. In the next post, we will lock down Kubernetes with Pod Security Standards, Network Policies, and OPA Gatekeeper.