ext4 vs XFS vs Btrfs — Choosing the Right Linux Filesystem

Goel Academy
DevOps & Cloud Learning Hub

You're about to format a 10TB disk — which filesystem should you choose? The answer depends on your workload: how many files, how large, sequential vs random I/O, whether you need snapshots, and how much you value data integrity. Let's cut through the marketing and look at what actually matters.

Quick Decision Matrix

| Criteria | ext4 | XFS | Btrfs |
|---|---|---|---|
| Max volume size | 1 EB | 8 EB | 16 EB |
| Max file size | 16 TB | 8 EB | 16 EB |
| Online grow | Yes | Yes | Yes |
| Online shrink | No (offline only) | No | Yes |
| Snapshots | No | No | Yes (native) |
| Compression | No | No | Yes (zstd, lzo, zlib) |
| Checksums | Metadata only | Metadata only | Data + metadata |
| RAID support | No (use mdadm) | No (use mdadm) | Built-in (RAID 0/1/10) |
| Best for | General purpose | Large files, databases | Snapshots, data integrity |

ext4: The Reliable Default

ext4 was declared stable in Linux 2.6.28 (December 2008) and soon became the default filesystem on distributions such as Debian and Ubuntu. It's battle-tested, well-understood, and does nothing surprising, which is exactly what you want in production.

Creating an ext4 Filesystem

# Standard ext4 with optimal defaults
sudo mkfs.ext4 -L datavolume /dev/sdb1

# Optimized for many small files (increase inode count)
sudo mkfs.ext4 -L mailstore -i 4096 /dev/sdb1

# Optimized for few large files (fewer inodes: the largefile4 profile allocates one inode per 4 MiB)
sudo mkfs.ext4 -L mediastore -T largefile4 /dev/sdb1

# Disable journaling for data that can be rebuilt after a crash (like a read-mostly cache)
sudo mkfs.ext4 -O ^has_journal -L cache /dev/sdb1

# Check current filesystem parameters
sudo tune2fs -l /dev/sdb1
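
To make the effect of the `-i` (bytes-per-inode) option concrete, here is a rough sketch that estimates the inode count and inode-table size for a 1 TiB volume. It assumes the ext4 default inode size of 256 bytes and ignores the per-block-group rounding that mke2fs applies, so real numbers will differ slightly. The function names are illustrative, not part of e2fsprogs.

```shell
# Approximate number of inodes created for a given volume size and -i ratio.
estimate_inodes() {
    local size_bytes=$1 bytes_per_inode=$2
    echo $(( size_bytes / bytes_per_inode ))
}

# Approximate space used by the inode tables, assuming 256-byte inodes.
inode_table_mib() {
    local inodes=$1
    echo $(( inodes * 256 / 1024 / 1024 ))
}

one_tib=$(( 1024 * 1024 * 1024 * 1024 ))
for ratio in 4096 16384 65536; do
    inodes=$(estimate_inodes "$one_tib" "$ratio")
    echo "-i $ratio: ~$inodes inodes, ~$(inode_table_mib "$inodes") MiB of inode tables"
done
```

The trade-off is visible in the output: a small ratio gives you plenty of inodes for mail spools and caches, but the inode tables themselves take a noticeable slice of the disk.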

ext4 Tuning

# Mount options for production
# /etc/fstab entry:
# /dev/sdb1 /data ext4 defaults,noatime,errors=remount-ro 0 2

# noatime: don't update access time on reads (avoids metadata writes triggered by reads)
# nodiratime is not needed: noatime already covers directories
# errors=remount-ro: go read-only on errors instead of continuing to write to a damaged filesystem

# Reserve less space for root (default 5% is wasteful on large disks)
sudo tune2fs -m 1 /dev/sdb1 # Reserve only 1%

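# Directory indexing (dir_index) speeds up lookups in large directories.
# It is on by default for ext4; if you enable it on an older filesystem,
# run e2fsck -fD on the unmounted device to index existing directories.
sudo tune2fs -O dir_index /dev/sdb1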
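
The reserved-blocks arithmetic is worth spelling out for the 10TB disk from the introduction. This is a minimal sketch using integer math and treating the disk as 10 TiB; `reserved_gib` is a hypothetical helper, not a tune2fs feature.

```shell
# Space held back for root at a given reserved-blocks percentage.
reserved_gib() {
    local size_gib=$1 percent=$2
    echo $(( size_gib * percent / 100 ))
}

size_gib=$(( 10 * 1024 ))   # 10 TiB expressed in GiB
echo "Reserved at 5% (default): $(reserved_gib "$size_gib" 5) GiB"
echo "Reserved at 1%:           $(reserved_gib "$size_gib" 1) GiB"
```

On a data volume that root never needs to write to in an emergency, most of that default reservation is simply unavailable capacity.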

When to Choose ext4

  • General-purpose Linux servers
  • Boot partitions (best compatibility with GRUB)
  • When you need to shrink the filesystem later
  • When stability and familiarity matter most
  • Small to mid-sized volumes where no single file exceeds 16 TB (the ext4 per-file limit)

XFS: Built for Scale

XFS was designed by Silicon Graphics for handling massive files and parallel I/O. It excels with large sequential workloads and multi-threaded writes.

Creating an XFS Filesystem

# Standard XFS
sudo mkfs.xfs -L bigdata /dev/sdb1

# Optimized for RAID (match stripe unit and width)
# For RAID-10 with 4 disks, 256K stripe:
sudo mkfs.xfs -L raidvol -d su=256k,sw=2 /dev/md0

# Optimized for SSDs (use 4K sector size)
sudo mkfs.xfs -L ssdvol -s size=4096 /dev/nvme0n1p1

# Check filesystem info
xfs_info /dev/sdb1
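
The `su`/`sw` values follow from the array geometry: `su` is the per-disk chunk size and `sw` is the number of data-bearing disks. The sketch below derives them for common RAID levels. It assumes RAID-10 keeps two copies of each block, and `xfs_stripe_opts` is an illustrative helper rather than an XFS tool. mkfs.xfs can usually detect the geometry of md devices on its own, so treat this as a way to sanity-check the values.

```shell
# Print the -d option string for mkfs.xfs given RAID level, disk count,
# and chunk size in KiB.
xfs_stripe_opts() {
    local level=$1 disks=$2 chunk_kib=$3 data_disks
    case "$level" in
        0)  data_disks=$disks ;;              # all disks carry data
        5)  data_disks=$(( disks - 1 )) ;;    # one disk's worth of parity
        6)  data_disks=$(( disks - 2 )) ;;    # two disks' worth of parity
        10) data_disks=$(( disks / 2 )) ;;    # assumes two copies per block
        *)  echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
    echo "su=${chunk_kib}k,sw=${data_disks}"
}

# The RAID-10 example above: 4 disks, 256K chunk
xfs_stripe_opts 10 4 256
```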

XFS Performance Tuning

# Mount options for production
# /dev/sdb1 /data xfs defaults,noatime,logbufs=8,logbsize=256k 0 2

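# logbufs=8: number of in-memory log buffers (8 is the maximum, and the default on current kernels)
# logbsize=256k: larger log buffer size for write-heavy workloads (256k is the maximum)
# allocsize=64m: optional, not in the line above; preallocates in 64MB chunks for streaming writes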

# Defragment online (XFS supports this natively)
sudo xfs_fsr /dev/sdb1

# Check fragmentation level
sudo xfs_db -c frag -r /dev/sdb1

# Grow XFS to fill available space (online, no unmount needed)
sudo xfs_growfs /data
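
After editing /etc/fstab it is easy to assume the options took effect when the filesystem was never remounted. The following sketch checks a mount point for a given option by parsing mounts-format input on stdin (the same layout as /proc/mounts). `has_mount_option` is a hypothetical helper and the sample line is illustrative, not captured from a real system.

```shell
# Succeeds if mount point $1 lists option $2.
# Input fields: device mountpoint fstype options dump pass
has_mount_option() {
    local mount_point=$1 option=$2
    awk -v mp="$mount_point" -v opt="$option" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
        }
        END { exit !found }
    '
}

# On a live system: has_mount_option /data noatime < /proc/mounts
sample='/dev/sdb1 /data xfs rw,noatime,attr2,inode64,logbufs=8,logbsize=256k,noquota 0 0'
if printf '%s\n' "$sample" | has_mount_option /data noatime; then
    echo "/data is mounted with noatime"
fi
```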

XFS Advantages

XFS has a key architectural advantage for parallel workloads: allocation groups. The filesystem is divided into independent groups, each with its own free space index. Multiple threads can allocate space simultaneously without contention.

# Check allocation groups
xfs_info /data | grep agcount
# agcount=4 means 4 independent allocation groups
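
If you want the allocation group count in a script, for example to compare it with the number of writer threads you plan to run, a small parser is enough. This sketch reads xfs_info-style output from stdin; the sample line mimics the usual first line of that output but is illustrative, and `agcount_from_xfs_info` is a made-up name.

```shell
# Extract the agcount value from xfs_info-style output on stdin.
agcount_from_xfs_info() {
    sed -n 's/.*agcount=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# On a live system: xfs_info /data | agcount_from_xfs_info
sample='meta-data=/dev/sdb1   isize=512   agcount=4, agsize=6553600 blks'
ags=$(printf '%s\n' "$sample" | agcount_from_xfs_info)
echo "allocation groups: $ags"
```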

When to Choose XFS

  • RHEL/Rocky/AlmaLinux default (best tested on these distros)
  • Databases (PostgreSQL, MySQL) with large data directories
  • Media storage and streaming workloads
  • Anything over 16TB per volume
  • Parallel I/O workloads (multiple threads writing simultaneously)
  • Kubernetes persistent volumes (a supported filesystem type in many CSI drivers)

Btrfs: The Modern Copy-on-Write Filesystem

Btrfs brings modern filesystem features to Linux: snapshots, compression, checksums, and built-in RAID. It uses copy-on-write (CoW) semantics, which changes how you think about data management.

Creating a Btrfs Filesystem

# Single device
sudo mkfs.btrfs -L snapvol /dev/sdb1

# Multi-device with RAID1 metadata and RAID0 data
sudo mkfs.btrfs -L raidvol -m raid1 -d raid0 /dev/sdb /dev/sdc

# Check filesystem details
sudo btrfs filesystem show /dev/sdb1

Subvolumes and Snapshots

Subvolumes are Btrfs's killer feature. They're like lightweight partitions within a filesystem.

# Mount the filesystem
sudo mount /dev/sdb1 /mnt/btrfs

# Create subvolumes
sudo btrfs subvolume create /mnt/btrfs/@data
sudo btrfs subvolume create /mnt/btrfs/@backups

# List subvolumes
sudo btrfs subvolume list /mnt/btrfs

# Take a read-only snapshot (near-instant; uses extra space only as the data diverges)
sudo btrfs subvolume snapshot -r /mnt/btrfs/@data /mnt/btrfs/@data-snapshot-$(date +%Y%m%d)

# Take a writable snapshot (for testing changes)
sudo btrfs subvolume snapshot /mnt/btrfs/@data /mnt/btrfs/@data-test

# Delete a snapshot when no longer needed
sudo btrfs subvolume delete /mnt/btrfs/@data-test
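
Dated snapshots pile up quickly, so some retention rule is usually needed. Here is a minimal sketch that picks which snapshots to prune while keeping the newest N. It relies on the `YYYYMMDD` suffix used above, which makes lexical order match chronological order. The function name and the sample dates are illustrative.

```shell
# Read snapshot names from stdin (one per line) and print every name
# except the newest $1 entries.
snapshots_to_prune() {
    local keep=$1
    sort | awk -v keep="$keep" '
        { names[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print names[i] }
    '
}

# Keep the two newest of four snapshots; the two oldest are printed.
printf '%s\n' \
    @data-snapshot-20240103 \
    @data-snapshot-20240101 \
    @data-snapshot-20240104 \
    @data-snapshot-20240102 |
    snapshots_to_prune 2
```

Each printed name could then be passed to `sudo btrfs subvolume delete /mnt/btrfs/<name>`. Printing first and deleting second keeps the destructive step easy to review.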

Transparent Compression

# Mount with zstd compression (best ratio/speed balance)
sudo mount -o compress=zstd:3 /dev/sdb1 /data

# Check compression ratio
sudo compsize /data
# Shows disk usage vs uncompressed size, broken down by compression algorithm

# Force-compress existing files
sudo btrfs filesystem defragment -r -czstd /data

| Algorithm | Speed | Ratio | Best For |
|---|---|---|---|
| lzo | Fastest | ~60% | CPU-bound workloads |
| zlib | Slow | ~45% | Archival storage |
| zstd:1 | Fast | ~50% | General purpose |
| zstd:3 | Medium | ~45% | Default recommendation |
| zstd:15 | Slow | ~35% | Maximum compression |

Btrfs Data Integrity

Btrfs checksums all data and metadata, detecting bit rot and silent corruption.

# Scrub the filesystem (verify all checksums — run monthly)
sudo btrfs scrub start /data

# Check scrub status
sudo btrfs scrub status /data

# If corruption is found with RAID1, Btrfs auto-repairs from the good copy

When to Choose Btrfs

  • When you need filesystem-level snapshots (database backups, system rollbacks)
  • When compression saves significant disk space (logs, text-heavy data)
  • When data integrity matters (checksumming detects silent corruption)
  • openSUSE/SUSE environments (Btrfs is their default and best supported)
  • Home NAS and backup servers

Performance Comparison

Real-world performance depends heavily on the workload. Here are general patterns.

# Quick benchmark with fio (install: apt install fio)

# Sequential write test (simulates log writing, database WAL)
fio --name=seqwrite --rw=write --bs=1M --size=4G --numjobs=4 \
--directory=/data --group_reporting --runtime=60

# Random read/write (simulates database OLTP)
fio --name=randmixed --rw=randrw --bs=4K --size=2G --numjobs=8 \
--directory=/data --group_reporting --runtime=60 --rwmixread=70

# Many small files (simulates mail server, container images)
fio --name=smallfiles --rw=randwrite --bs=4K --size=512M --numjobs=16 \
--nrfiles=1000 --directory=/data --group_reporting --runtime=60

| Workload | ext4 | XFS | Btrfs |
|---|---|---|---|
| Sequential write (large files) | Good | Excellent | Good (CoW overhead) |
| Random 4K IOPS | Good | Good | Fair |
| Many small files | Good | Good | Fair |
| Parallel writes | Good | Excellent | Good |
| Compressed data | N/A | N/A | Excellent savings |
| Snapshot creation | N/A | N/A | Instant |

Practical Recommendations by Workload

| Workload | Recommended | Why |
|---|---|---|
| Web server | ext4 | Simple, reliable, low overhead |
| PostgreSQL/MySQL | XFS | Parallel I/O, large file handling |
| Kubernetes node | XFS | RHEL ecosystem default, CSI driver support |
| Backup server | Btrfs | Snapshots + compression save huge space |
| Log aggregation | XFS | Streaming writes, large files |
| NFS server | XFS or ext4 | Proven stability at scale |
| Development VM | Btrfs | Snapshot before experiments, rollback instantly |
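
If you provision machines from scripts, the recommendations above can be encoded as a small lookup. This is only a sketch of that idea: the workload keywords are invented for illustration, and the mapping simply mirrors the recommendations by workload.

```shell
# Map a workload keyword to the filesystem suggested for it.
recommend_fs() {
    case "$1" in
        web)                      echo "ext4" ;;
        database|kubernetes|logs) echo "xfs" ;;
        backup|dev-vm)            echo "btrfs" ;;
        nfs)                      echo "xfs or ext4" ;;
        *) echo "unknown workload: $1" >&2; return 1 ;;
    esac
}

for workload in web database backup nfs; do
    echo "$workload -> $(recommend_fs "$workload")"
done
```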

Converting Between Filesystems

In most cases you can't convert in place. The exceptions are ext3 to ext4 and the btrfs-convert tool, which converts ext2/3/4 to Btrfs. Every other migration requires backup and reformat.

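# 1. Back up (-H preserves hard links, -A ACLs, -X extended attributes)
sudo rsync -aHAXv /old-mount/ /backup-location/
# 2. Unmount and reformat (-f overwrites the existing filesystem signature)
sudo umount /dev/sdb1
sudo mkfs.xfs -f -L newvol /dev/sdb1
# 3. Mount the new filesystem and restore
sudo mount /dev/sdb1 /new-mount
sudo rsync -aHAXv /backup-location/ /new-mount/
# 4. Update /etc/fstab: the filesystem type and UUID have changed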
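
Because reformatting destroys the original, it is worth verifying the backup before running mkfs. A minimal sketch, assuming a plain content comparison is good enough: `diff -r` checks file names and contents but not ownership, permissions, or extended attributes. `trees_match` is a hypothetical helper, and the demo uses throwaway temporary directories in place of the real mount points.

```shell
# Succeeds only if both directory trees hold the same files with the
# same contents.
trees_match() {
    diff -r "$1" "$2" > /dev/null
}

# Demo with temporary directories standing in for /old-mount and /backup-location
src=$(mktemp -d)
dst=$(mktemp -d)
echo "hello" > "$src/file.txt"
cp -a "$src/." "$dst/"

if trees_match "$src" "$dst"; then
    echo "backup matches source, safe to reformat"
else
    echo "backup differs from source, do not reformat" >&2
fi
rm -rf "$src" "$dst"
```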

Filesystem performance is only part of the picture — the network layer often matters more. Next, we explore advanced Linux networking: VLANs, bridges, network namespaces, and the primitives that Kubernetes networking is built on.