ext4 vs XFS vs Btrfs — Choosing the Right Linux Filesystem
You're about to format a 10TB disk — which filesystem should you choose? The answer depends on your workload: how many files, how large, sequential vs random I/O, whether you need snapshots, and how much you value data integrity. Let's cut through the marketing and look at what actually matters.
Quick Decision Matrix
| Criteria | ext4 | XFS | Btrfs |
|---|---|---|---|
| Max volume size | 1 EB | 8 EB | 16 EB |
| Max file size | 16 TB | 8 EB | 16 EB |
| Online grow | Yes | Yes | Yes |
| Online shrink | Offline only | No | Yes |
| Snapshots | No | No | Yes (native) |
| Compression | No | No | Yes (zstd, lzo, zlib) |
| Checksums | Metadata only | Metadata only | Data + metadata |
| RAID support | No (use mdadm) | No (use mdadm) | Built-in (RAID 0/1/10; 5/6 not production-ready) |
| Best for | General purpose | Large files, databases | Snapshots, data integrity |
ext4: The Reliable Default
ext4 has been the default Linux filesystem since 2008. It's battle-tested, well-understood, and does nothing surprising — exactly what you want in production.
Creating an ext4 Filesystem
# Standard ext4 with optimal defaults
sudo mkfs.ext4 -L datavolume /dev/sdb1
# Optimized for many small files (increase inode count)
sudo mkfs.ext4 -L mailstore -i 4096 /dev/sdb1
# Optimized for few large files (one inode per 4 MiB instead of the default 16 KiB)
sudo mkfs.ext4 -L mediastore -T largefile4 /dev/sdb1
# Disable journaling for pure-read workloads (like a read-only cache)
sudo mkfs.ext4 -O ^has_journal -L cache /dev/sdb1
# Check current filesystem parameters
sudo tune2fs -l /dev/sdb1
ext4 Tuning
# Mount options for production
# /etc/fstab entry:
# /dev/sdb1 /data ext4 defaults,noatime,errors=remount-ro 0 2
# noatime — don't update access times on reads (significant I/O savings; also implies nodiratime)
# errors=remount-ro — remount read-only on metadata errors instead of continuing over possible corruption
# Reserve less space for root (default 5% is wasteful on large disks)
sudo tune2fs -m 1 /dev/sdb1 # Reserve only 1%
# Enable directory indexing for directories with many files (on by default in modern e2fsprogs)
sudo tune2fs -O dir_index /dev/sdb1
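To see why the default 5% root reservation is wasteful at scale, here is a quick back-of-the-envelope sketch in plain POSIX shell (no root needed; the 10 TiB figure is just an example size):

```shell
# How much space the root reservation eats on a large volume.
# calc_reserved SIZE_TIB PCT -> reserved space in GiB (integer math).
calc_reserved() {
    size_tib=$1
    pct=$2
    echo $(( size_tib * 1024 * pct / 100 ))
}

echo "10 TiB at 5%: $(calc_reserved 10 5) GiB reserved"
echo "10 TiB at 1%: $(calc_reserved 10 1) GiB reserved"
```

On a 10 TiB volume, dropping the reservation from 5% to 1% returns roughly 400 GiB of usable space.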
When to Choose ext4
- General-purpose Linux servers
- Boot partitions (best compatibility with GRUB)
- When you may need to shrink the filesystem later (ext4 shrinks offline; XFS can't shrink at all)
- When stability and familiarity matter most
- Workloads where individual files stay under ext4's 16 TB per-file limit
XFS: Built for Scale
XFS was designed by Silicon Graphics for handling massive files and parallel I/O. It excels with large sequential workloads and multi-threaded writes.
Creating an XFS Filesystem
# Standard XFS
sudo mkfs.xfs -L bigdata /dev/sdb1
# Optimized for RAID (match stripe unit and width)
# For RAID-10 with 4 disks, 256K stripe:
sudo mkfs.xfs -L raidvol -d su=256k,sw=2 /dev/md0
# Optimized for SSDs (use 4K sector size)
sudo mkfs.xfs -L ssdvol -s size=4096 /dev/nvme0n1p1
# Check filesystem info
xfs_info /dev/sdb1
XFS Performance Tuning
# Mount options for production
# /dev/sdb1 /data xfs defaults,noatime,logbufs=8,logbsize=256k 0 2
# logbufs=8 — increase log buffers for write-heavy workloads
# logbsize=256k — larger log buffer size
# allocsize=64m — pre-allocate in 64MB chunks for streaming writes
# Defragment online (XFS supports this natively)
sudo xfs_fsr /dev/sdb1
# Check fragmentation level
sudo xfs_db -c frag -r /dev/sdb1
# Grow XFS to fill available space (online, no unmount needed)
sudo xfs_growfs /data
XFS Advantages
XFS has a key architectural advantage for parallel workloads: allocation groups. The filesystem is divided into independent groups, each with its own free space index. Multiple threads can allocate space simultaneously without contention.
# Check allocation groups
xfs_info /data | grep agcount
# agcount=4 means 4 independent allocation groups
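The allocation-group count is fixed at mkfs time, so it's worth choosing deliberately for parallel workloads. A common rule of thumb (an assumption here, not an official XFS recommendation) is one group per CPU core that will write concurrently, capped so groups stay reasonably large:

```shell
# Heuristic agcount for a parallel-write workload: one allocation
# group per CPU core, capped at 32. This is a rule of thumb, not
# an XFS rule; mkfs.xfs picks sane defaults on its own.
cores=$(nproc)
agcount=$(( cores < 32 ? cores : 32 ))
# mkfs.xfs accepts this via -d agcount=N; printed here, not run
echo "mkfs.xfs -d agcount=${agcount} -L parvol /dev/sdb1"
```

More groups mean more independent allocators, at the cost of slightly more metadata overhead per group.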
When to Choose XFS
- RHEL/Rocky/AlmaLinux default (best tested on these distros)
- Databases (PostgreSQL, MySQL) with large data directories
- Media storage and streaming workloads
- Anything over 16TB per volume
- Parallel I/O workloads (multiple threads writing simultaneously)
- Kubernetes persistent volumes (default for many CSI drivers)
Btrfs: The Modern Copy-on-Write Filesystem
Btrfs brings modern filesystem features to Linux: snapshots, compression, checksums, and built-in RAID. It uses copy-on-write (CoW) semantics, which changes how you think about data management.
Creating a Btrfs Filesystem
# Single device
sudo mkfs.btrfs -L snapvol /dev/sdb1
# Multi-device with RAID1 metadata and RAID0 data
sudo mkfs.btrfs -L raidvol -m raid1 -d raid0 /dev/sdb /dev/sdc
# Check filesystem details
sudo btrfs filesystem show /dev/sdb1
Subvolumes and Snapshots
Subvolumes are Btrfs's killer feature. They're like lightweight partitions within a filesystem.
# Mount the filesystem
sudo mount /dev/sdb1 /mnt/btrfs
# Create subvolumes
sudo btrfs subvolume create /mnt/btrfs/@data
sudo btrfs subvolume create /mnt/btrfs/@backups
# List subvolumes
sudo btrfs subvolume list /mnt/btrfs
# Take a read-only snapshot (near-instant; space is consumed only as data diverges)
sudo btrfs subvolume snapshot -r /mnt/btrfs/@data /mnt/btrfs/@data-snapshot-$(date +%Y%m%d)
# Take a writable snapshot (for testing changes)
sudo btrfs subvolume snapshot /mnt/btrfs/@data /mnt/btrfs/@data-test
# Delete a snapshot when no longer needed
sudo btrfs subvolume delete /mnt/btrfs/@data-test
Transparent Compression
# Mount with zstd compression (best ratio/speed balance)
sudo mount -o compress=zstd:3 /dev/sdb1 /data
# Check compression ratio (compsize is in the btrfs-compsize package)
sudo compsize /data
# Shows disk usage vs uncompressed size, broken down by compression type
# Recompress existing files in place
# Warning: defragmenting breaks reflink sharing, so existing snapshots stop sharing space
sudo btrfs filesystem defragment -r -czstd /data
| Algorithm | Speed | Ratio (of original size) | Best For |
|---|---|---|---|
| lzo | Fastest | ~60% | CPU-bound workloads |
| zlib | Slow | ~45% | Archival storage |
| zstd:1 | Fast | ~50% | General purpose |
| zstd:3 | Medium | ~45% | Default recommendation |
| zstd:15 | Slow | ~35% | Maximum compression |
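Before committing to a compression mount option, you can estimate the payoff on a sample of your own data. Btrfs's zlib is the same deflate algorithm as gzip, so gzip gives a quick zlib-level estimate (the standalone zstd tool works the same way for zstd levels). Real Btrfs ratios differ somewhat because it compresses per extent; the helper name below is mine:

```shell
# Estimate compressed size as a percentage of the original,
# using gzip as a stand-in for Btrfs's zlib compression.
estimate_pct() {
    orig=$(wc -c < "$1")
    comp=$(gzip -c "$1" | wc -c)
    echo $(( comp * 100 / orig ))
}

# Example: check a log file before enabling compression on /var/log
# estimate_pct /var/log/syslog
```

A result near 100 means the data is already compressed (media, archives) and a compressed mount buys you little.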
Btrfs Data Integrity
Btrfs checksums all data and metadata, detecting bit rot and silent corruption.
# Scrub the filesystem (verify all checksums — run monthly)
sudo btrfs scrub start /data
# Check scrub status
sudo btrfs scrub status /data
# If corruption is found with RAID1, Btrfs auto-repairs from the good copy
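To make the monthly scrub automatic, a systemd timer is the usual approach. A sketch, where the unit names and the /data path are assumptions to adjust for your mount:

```ini
# /etc/systemd/system/btrfs-scrub-data.service
[Unit]
Description=Btrfs scrub of /data

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs scrub start -B /data

# /etc/systemd/system/btrfs-scrub-data.timer
[Unit]
Description=Monthly Btrfs scrub of /data

[Timer]
OnCalendar=monthly
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now btrfs-scrub-data.timer. The -B flag keeps the scrub in the foreground so systemd records success or failure.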
When to Choose Btrfs
- When you need filesystem-level snapshots (database backups, system rollbacks)
- When compression saves significant disk space (logs, text-heavy data)
- When data integrity matters (checksumming detects silent corruption)
- openSUSE/SUSE environments (Btrfs is their default and best supported)
- Home NAS and backup servers
Performance Comparison
Real-world performance depends heavily on the workload. Here are general patterns.
# Quick benchmark with fio (install: apt install fio)
# Sequential write test (simulates log writing, database WAL)
fio --name=seqwrite --rw=write --bs=1M --size=4G --numjobs=4 \
--directory=/data --group_reporting --runtime=60
# Random read/write (simulates database OLTP)
fio --name=randmixed --rw=randrw --bs=4K --size=2G --numjobs=8 \
--directory=/data --group_reporting --runtime=60 --rwmixread=70
# Many small files (simulates mail server, container images)
fio --name=smallfiles --rw=randwrite --bs=4K --size=512M --numjobs=16 \
--nrfiles=1000 --directory=/data --group_reporting --runtime=60
| Workload | ext4 | XFS | Btrfs |
|---|---|---|---|
| Sequential write (large files) | Good | Excellent | Good (CoW overhead) |
| Random 4K IOPS | Good | Good | Fair |
| Many small files | Good | Good | Fair |
| Parallel writes | Good | Excellent | Good |
| Compressed data | N/A | N/A | Excellent savings |
| Snapshot creation | N/A | N/A | Instant |
Practical Recommendations by Workload
| Workload | Recommended | Why |
|---|---|---|
| Web server | ext4 | Simple, reliable, low overhead |
| PostgreSQL/MySQL | XFS | Parallel I/O, large file handling |
| Kubernetes node | XFS | RHEL ecosystem default, CSI driver support |
| Backup server | Btrfs | Snapshots + compression save huge space |
| Log aggregation | XFS | Streaming writes, large files |
| NFS server | XFS or ext4 | Proven stability at scale |
| Development VM | Btrfs | Snapshot before experiments, rollback instantly |
Converting Between Filesystems
In-place conversion is the exception, not the rule: ext3 upgrades to ext4 in place, and btrfs-convert can turn ext4 into Btrfs while keeping a rollback image. For every other path, and for safety even then, migrate via backup and reformat.
# Backup, reformat, restore (-H preserves hard links; -AX preserves ACLs and xattrs)
sudo rsync -aAXHv /old-mount/ /backup-location/
sudo umount /dev/sdb1
sudo mkfs.xfs -f -L newvol /dev/sdb1
sudo mount /dev/sdb1 /new-mount
sudo rsync -aAXHv /backup-location/ /new-mount/
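After the final rsync, verify that the copy actually matches before wiping the backup. A small sketch using only coreutils (the function name is mine, not a standard tool): it hashes every file's contents plus its relative path, sorts, and reduces each tree to a single fingerprint.

```shell
# One order-independent checksum per directory tree; equal
# fingerprints mean identical relative paths and file contents.
tree_fingerprint() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort | sha256sum)
}

# Compare source and destination after migration:
# [ "$(tree_fingerprint /backup-location)" = "$(tree_fingerprint /new-mount)" ] \
#     && echo "trees match"
```

This checks paths and contents only; ownership, permissions, and xattrs are rsync's job via -aAX.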
Filesystem performance is only part of the picture — the network layer often matters more. Next, we explore advanced Linux networking: VLANs, bridges, network namespaces, and the primitives that Kubernetes networking is built on.
