Linux Backup & Disaster Recovery — rsync, tar, and Automation

7 min read
Goel Academy
DevOps & Cloud Learning Hub

It's not a matter of IF your disk fails -- it's WHEN. RAID is not a backup. Snapshots are not a backup. That "I'll set up backups next week" has been on your to-do list for six months. Today we build a real backup strategy that you can deploy in production and actually trust.

The 3-2-1 Backup Rule

Before touching any commands, understand the rule that has saved countless organizations:

| Rule | Meaning | Example |
|---|---|---|
| 3 copies | Keep 3 copies of your data | Original + 2 backups |
| 2 media types | Store on 2 different storage types | Local disk + cloud/tape |
| 1 offsite | At least 1 copy in a different location | S3, remote server, different DC |

If your backups are on the same disk as your data, you don't have backups. Let's fix that.

rsync — The Workhorse of Linux Backups

rsync is the most important backup tool in Linux. It transfers only changed bytes, supports compression, works over SSH, and preserves permissions.

Basic rsync Usage

# Basic local sync (archive mode preserves everything)
rsync -avh /var/www/ /backup/www/

# What the flags mean:
# -a archive (preserves permissions, timestamps, symlinks, owner)
# -v verbose
# -h human-readable sizes

# Dry run first -- see what WOULD happen without changing anything
rsync -avhn /var/www/ /backup/www/

# Delete files on destination that no longer exist on source
rsync -avh --delete /var/www/ /backup/www/

# Exclude patterns
rsync -avh --exclude='*.log' --exclude='.cache' --exclude='node_modules' \
    /home/deploy/ /backup/deploy-home/

Remote Backups Over SSH

This is where rsync really shines -- incremental backups to a remote server:

# Push backup to remote server
rsync -avhz -e "ssh -p 22 -i ~/.ssh/backup_key" \
    /var/www/ backup@remote-server:/backups/www/

# Pull backup from remote server
rsync -avhz -e "ssh -p 22" \
    backup@remote-server:/var/lib/postgresql/ /local-backup/postgres/

# Bandwidth limit (in KB/s) -- don't saturate your production network
rsync -avhz --bwlimit=5000 /data/ backup@offsite:/backups/data/

# Show progress for large transfers
rsync -avhz --progress --stats /data/ backup@remote:/backups/data/

Incremental Snapshots with --link-dest

Combining rsync's --link-dest option with hardlinks creates daily snapshots that look like full backups but only use disk space for changed files:

#!/bin/bash
# incremental-backup.sh — daily snapshots with hardlinks
set -euo pipefail

BACKUP_DIR="/backups/daily"
SOURCE="/var/www"
DATE=$(date +%Y-%m-%d)
LATEST="$BACKUP_DIR/latest"

mkdir -p "$BACKUP_DIR"

# Create backup using hardlinks to previous backup
# (on the first run --link-dest points nowhere; rsync warns and copies everything)
rsync -avh --delete \
    --link-dest="$LATEST" \
    "$SOURCE/" "$BACKUP_DIR/$DATE/"

# Update the 'latest' symlink
ln -sfn "$BACKUP_DIR/$DATE" "$LATEST"

# Remove backups older than 30 days (-mindepth 1 protects $BACKUP_DIR itself)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +

echo "Backup completed: $BACKUP_DIR/$DATE"

Each daily directory looks like a complete copy, but unchanged files are hardlinked to the previous backup -- using almost zero extra space.
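You can reproduce the hardlink effect with plain ln and stat to see why unchanged files cost almost nothing. This is a standalone demo in a temp directory, separate from the backup script above:

```bash
#!/bin/bash
# Demo: hardlinked copies share one inode, so they use no extra data blocks.
set -euo pipefail

demo=$(mktemp -d)
mkdir -p "$demo/day1" "$demo/day2"

# "Day 1" backup contains a 1 MB file
dd if=/dev/zero of="$demo/day1/big.bin" bs=1K count=1024 status=none

# "Day 2" hardlinks the unchanged file instead of copying it
ln "$demo/day1/big.bin" "$demo/day2/big.bin"

# Both directory entries point at the same inode (link count 2)
stat -c 'inode=%i links=%h' "$demo/day1/big.bin"
stat -c 'inode=%i links=%h' "$demo/day2/big.bin"

# Deleting day1 does not harm day2 -- the data lives until the last link is gone
rm -rf "$demo/day1"
stat -c 'links after delete=%h' "$demo/day2/big.bin"

rm -rf "$demo"
```

This is also why you can safely delete old snapshot directories: each file's data survives as long as any newer snapshot still links to it.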

tar — Compressed Archives

tar is ideal for creating portable, compressed backups that you can store anywhere.

# Create a compressed backup with timestamp
tar -czf /backups/www-$(date +%Y%m%d-%H%M%S).tar.gz \
    --exclude='*.log' \
    --exclude='.git' \
    --exclude='node_modules' \
    /var/www/

# List contents without extracting
tar -tzf /backups/www-20250615-140000.tar.gz | head -20

# Extract to a specific directory
tar -xzf /backups/www-20250615-140000.tar.gz -C /restore/

# Use better compression with xz (slower but smaller)
tar -cJf /backups/database-$(date +%Y%m%d).tar.xz /var/lib/mysql/

# Create split archives for large backups (each part 1GB)
tar -czf - /data/ | split -b 1G - /backups/data-$(date +%Y%m%d).tar.gz.part_
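A split archive is only useful if you can put it back together. Reassembly is just cat in glob order, since split names its parts _aa, _ab, and so on. A small end-to-end sketch (tiny -b for the demo; use 1G as above in production):

```bash
#!/bin/bash
# Demo: split a tar.gz into parts, then reassemble and extract it.
set -euo pipefail

work=$(mktemp -d)
mkdir -p "$work/data" "$work/restore"
echo "hello backup" > "$work/data/file.txt"

# Create and split the archive
tar -czf - -C "$work" data | split -b 512 - "$work/data.tar.gz.part_"

# Reassemble: the part_aa, part_ab, ... names make the glob order correct
cat "$work"/data.tar.gz.part_* | tar -xzf - -C "$work/restore"

cat "$work/restore/data/file.txt"   # -> hello backup
rm -rf "$work"
```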

Compression Comparison

| Format | Flag | Speed | Compression Ratio | Best For |
|---|---|---|---|---|
| gzip | -czf | Fast | Good | Daily backups |
| bzip2 | -cjf | Medium | Better | Weekly archives |
| xz | -cJf | Slow | Best | Long-term storage |
| zstd | --zstd | Very fast | Good | Large datasets |

dd — Full Disk Cloning

When you need an exact bit-for-bit copy of a disk or partition:

# Clone entire disk (DANGEROUS — double check device names!)
sudo dd if=/dev/sda of=/dev/sdb bs=64K status=progress conv=sync,noerror

# Create disk image file
sudo dd if=/dev/sda1 of=/backups/sda1-$(date +%Y%m%d).img bs=64K status=progress

# Compressed disk image (saves massive space)
sudo dd if=/dev/sda1 bs=64K status=progress | gzip > /backups/sda1.img.gz

# Restore from compressed image
gunzip -c /backups/sda1.img.gz | sudo dd of=/dev/sda1 bs=64K status=progress

Warning: dd will happily overwrite any disk you point it at. The classic mistake is swapping if= and of=. Always double-check with lsblk before running dd.
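Before trusting a dd image, verify that it actually matches the source. The demo below uses a regular file as a stand-in for a partition (dd treats files and block devices the same way; on a real system you would point it at /dev/sdX as root):

```bash
#!/bin/bash
# Demo: image a "device" with dd, then verify the copy with checksums.
set -euo pipefail

work=$(mktemp -d)
# Stand-in for a partition: a small file of random data
dd if=/dev/urandom of="$work/disk" bs=1K count=64 status=none

# Create a compressed image, exactly as you would for /dev/sda1
dd if="$work/disk" bs=64K status=none | gzip > "$work/disk.img.gz"

# Verify: the decompressed image must hash identically to the source
src_sum=$(sha256sum "$work/disk" | cut -d' ' -f1)
img_sum=$(gunzip -c "$work/disk.img.gz" | sha256sum | cut -d' ' -f1)

if [ "$src_sum" = "$img_sum" ]; then
    echo "Image verified: checksums match"
else
    echo "MISMATCH -- do not trust this image" >&2
    exit 1
fi
rm -rf "$work"
```

On a live system, only checksum a partition that is unmounted or mounted read-only; a filesystem that changes mid-read will never match its image.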

Automated Backup Script

Here's a production-grade backup script that combines everything:

#!/bin/bash
# production-backup.sh — automated backup with logging and verification
set -euo pipefail

# Configuration
BACKUP_ROOT="/backups"
REMOTE_HOST="backup@offsite-server"
REMOTE_PATH="/offsite-backups/$(hostname)"
LOG="/var/log/backup.log"
RETENTION_DAYS=30
DATE=$(date +%Y-%m-%d_%H%M)

mkdir -p "$BACKUP_ROOT"/{db,app,configs}

log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG"; }

log "=== Starting backup ==="

# 1. Database dump
log "Dumping PostgreSQL..."
sudo -u postgres pg_dumpall | gzip > "$BACKUP_ROOT/db/postgres-$DATE.sql.gz"

# 2. Application files (incremental)
log "Syncing application files..."
rsync -ah --delete \
    --link-dest="$BACKUP_ROOT/app/latest" \
    --exclude='*.log' --exclude='.cache' --exclude='tmp/' \
    /var/www/ "$BACKUP_ROOT/app/$DATE/"
ln -sfn "$BACKUP_ROOT/app/$DATE" "$BACKUP_ROOT/app/latest"

# 3. Configuration files
log "Backing up configs..."
tar -czf "$BACKUP_ROOT/configs/etc-$DATE.tar.gz" \
    /etc/nginx/ /etc/systemd/system/ /etc/crontab /etc/ssh/sshd_config

# 4. Push to offsite (copy 2 — different location)
log "Pushing to offsite..."
rsync -ahz -e "ssh -i /root/.ssh/backup_key" \
    "$BACKUP_ROOT/" "$REMOTE_HOST:$REMOTE_PATH/"

# 5. Clean old backups
log "Cleaning backups older than $RETENTION_DAYS days..."
find "$BACKUP_ROOT/db" -name "*.sql.gz" -mtime +"$RETENTION_DAYS" -delete
find "$BACKUP_ROOT/app" -mindepth 1 -maxdepth 1 -type d -mtime +"$RETENTION_DAYS" -exec rm -rf {} +

# 6. Verify latest backup
log "Verifying database dump..."
gunzip -t "$BACKUP_ROOT/db/postgres-$DATE.sql.gz"
BACKUP_SIZE=$(du -sh "$BACKUP_ROOT/app/$DATE" | cut -f1)
log "Backup size: $BACKUP_SIZE"
log "=== Backup completed successfully ==="
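The retention cleanup is the easiest step to get subtly wrong: an unguarded find can delete the backup root itself or the 'latest' symlink. You can test the expression safely against fake timestamps (created with touch -d) before pointing it at real backups:

```bash
#!/bin/bash
# Demo: verify the retention 'find' deletes only old backup directories.
set -euo pipefail

root=$(mktemp -d)
mkdir -p "$root/2025-01-01" "$root/2025-06-14"
ln -s "$root/2025-06-14" "$root/latest"

# Backdate one directory past the 30-day retention window
touch -d '45 days ago' "$root/2025-01-01"

# -mindepth 1 protects the root itself; -type d skips the 'latest' symlink
find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +

ls "$root"   # the fresh directory and the symlink survive
rm -rf "$root"
```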

Scheduling with Cron

# Edit root's crontab
sudo crontab -e

# Daily backup at 2 AM
0 2 * * * /usr/local/bin/production-backup.sh >> /var/log/backup.log 2>&1

# Weekly full tar archive on Sundays at 3 AM
0 3 * * 0 tar -czf /backups/weekly/full-$(date +\%Y\%m\%d).tar.gz /var/www/ /etc/

# Monthly database archive on the 1st
0 4 1 * * /usr/local/bin/monthly-db-archive.sh

# Verify cron is scheduled
sudo crontab -l
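On systemd hosts, a timer is a common alternative to cron: runs are logged to the journal, and Persistent=true catches up on schedules missed while the machine was off. A minimal sketch (the unit names and script path are illustrative):

```ini
# /etc/systemd/system/backup.service
[Unit]
Description=Nightly production backup

[Service]
Type=oneshot
ExecStart=/usr/local/bin/production-backup.sh

# /etc/systemd/system/backup.timer
[Unit]
Description=Run backup.service daily at 2 AM

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with `sudo systemctl enable --now backup.timer` and check the schedule with `systemctl list-timers backup.timer`.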

Testing Your Restores

A backup that hasn't been tested is not a backup. Schedule regular restore tests:

#!/bin/bash
# test-restore.sh — verify backup integrity
RESTORE_DIR="/tmp/restore-test-$(date +%s)"
mkdir -p "$RESTORE_DIR"

echo "Testing rsync backup restore..."
rsync -avh /backups/app/latest/ "$RESTORE_DIR/app/"
FILE_COUNT=$(find "$RESTORE_DIR/app" -type f | wc -l)
echo "Restored $FILE_COUNT files"

echo "Testing database backup restore..."
DB_BACKUP=$(ls -t /backups/db/*.sql.gz | head -1)
gunzip -t "$DB_BACKUP" && echo "DB dump decompresses cleanly" || echo "CORRUPT!"
gunzip -c "$DB_BACKUP" | head -5   # Spot-check that the contents look like SQL

echo "Testing tar backup integrity..."
TAR_BACKUP=$(ls -t /backups/weekly/*.tar.gz | head -1)
tar -tzf "$TAR_BACKUP" > /dev/null && echo "Archive is valid" || echo "CORRUPT!"

# Cleanup
rm -rf "$RESTORE_DIR"
echo "Restore test completed"

Run this monthly. Put it in cron. Set an alert if it fails. The worst time to discover your backups are broken is during an actual disaster.

Backup Monitoring

Add these checks to your monitoring system (Prometheus, Nagios, or a simple script):

# Check backup freshness — alert if latest backup is older than 25 hours
LATEST_BACKUP=$(stat -c %Y /backups/app/latest 2>/dev/null || echo 0)
CURRENT_TIME=$(date +%s)
AGE_HOURS=$(( (CURRENT_TIME - LATEST_BACKUP) / 3600 ))

if [ "$AGE_HOURS" -gt 25 ]; then
    echo "CRITICAL: Latest backup is ${AGE_HOURS} hours old!"
    # Send alert to Slack/PagerDuty
    exit 2
elif [ "$AGE_HOURS" -gt 20 ]; then
    echo "WARNING: Latest backup is ${AGE_HOURS} hours old"
    exit 1
else
    echo "OK: Latest backup is ${AGE_HOURS} hours old"
    exit 0
fi
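If you run Prometheus with node_exporter, the same freshness check can be exported as a metric instead of a shell exit code, via the textfile collector. A sketch, assuming the collector's commonly used directory (adjust TEXTFILE_DIR to your deployment; the metric name is our own choice):

```bash
#!/bin/bash
# Export backup freshness for node_exporter's textfile collector.
set -euo pipefail

# Assumed collector directory; falls back to a temp dir if not writable (demo)
TEXTFILE_DIR="${TEXTFILE_DIR:-/var/lib/node_exporter/textfile_collector}"
mkdir -p "$TEXTFILE_DIR" 2>/dev/null && [ -w "$TEXTFILE_DIR" ] || TEXTFILE_DIR=$(mktemp -d)

LATEST_BACKUP=$(stat -c %Y /backups/app/latest 2>/dev/null || echo 0)

# Write to a temp file first, then move into place, so the exporter
# never scrapes a half-written file
tmp=$(mktemp)
cat > "$tmp" <<EOF
# HELP backup_last_success_timestamp_seconds Unix time of the newest backup.
# TYPE backup_last_success_timestamp_seconds gauge
backup_last_success_timestamp_seconds $LATEST_BACKUP
EOF
mv "$tmp" "$TEXTFILE_DIR/backup.prom"

echo "Wrote $TEXTFILE_DIR/backup.prom"
```

An alert rule can then fire on something like `time() - backup_last_success_timestamp_seconds > 90000` (25 hours), keeping the threshold logic in Prometheus instead of the script.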

Your data is protected. Next, let's stop managing servers one at a time -- in the next post, we'll automate entire fleets with Ansible.