Backing up your Linux desktop with rsync

Author: Brice Burgess

Rsync is a command-line utility traditionally used to synchronize files between two computers, but it can also be used as an effective backup tool. This free and powerful tool is simple enough for anyone to use on their Linux desktop. This article explains how to use rsync to back up your computer to a drive attached to your system. You can use a removable drive, such as an external USB hard drive, so that you can store the backups in a safe place away from your working environment.

First, make sure you have rsync by entering rsync --version at the command line. If you see rsync version 2.X.X protocol version X, you have it. If you see “command not found” or a similar message, you need to download and install rsync. Use your distribution’s package management system to do this, or else download and install the source from the rsync Web site. Make sure your version is greater than 2.6.0.
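
The check described above can be wrapped in a few lines of shell. This is only a sketch: the function name check_rsync is made up for illustration, and the exact wording of the version line varies between rsync releases.

```shell
#!/bin/sh
# Print the installed rsync version, or explain that rsync is missing.
# check_rsync is a hypothetical helper name, not part of rsync itself.
check_rsync() {
  if command -v rsync >/dev/null 2>&1; then
    # The first line of the output names the version,
    # for example "rsync  version 3.2.7  protocol version 31".
    rsync --version | head -n 1
  else
    echo "rsync not found; install it with your distribution's package manager" >&2
    return 1
  fi
}

if check_rsync; then
  echo "rsync is available"
fi
```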

Now it’s time to consider what to back up, where to store the backup, and when to run it.

What should be backed up? Do you want to run a full system or a partial system backup? A full system backup creates a second copy of everything on your hard drive. This has the advantage of providing a means to quickly restore your system to the exact state it was in when you made the backup. However, full system backups take a long time to complete, take up a lot of disk space, and are often unnecessary. When you run full system backups, make sure to use rsync’s --exclude parameter, because certain directories, such as /proc, should not be backed up. See the backup.sh script below as an example.
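
To see what --exclude does without touching a real system, you can try it on throwaway directories. The sketch below is illustrative only: the directory layout under the temporary folder is invented to stand in for a root filesystem.

```shell
#!/bin/sh
# Try --exclude on throwaway directories; every path here is illustrative.
work=$(mktemp -d)
mkdir -p "$work/root/home/wendy" "$work/root/proc" "$work/root/tmp" "$work/backup"
echo "notes" > "$work/root/home/wendy/notes.txt"
echo "scratch" > "$work/root/tmp/scratch.txt"

# A leading slash anchors the pattern at the top of the transfer, so /proc/
# matches only "$work/root/proc", not every directory that happens to be named proc.
rsync -a --exclude=/proc/ --exclude=/tmp/ "$work/root/" "$work/backup/"

# Only home/ should have been copied.
ls "$work/backup"
```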

Partial system backups are faster and more space-efficient, because you copy only important hand-selected data. For instance, you may want to back up only the /home directory, which contains users’ documents, music, and program settings. The operating system files, such as those under /usr (programs) and /var (log files, email, etc.) can be easily reinstalled and don’t need to be backed up.

Where should it be backed up? Your imagination is the limit when it comes to rsync’s backup destination options. The scope of this article, however, is limited to local disk drives. Ideally you’ll want to store your backups on a separate disk so that they’ll be accessible if your hard drive fails. Once you create a backup, you can copy it to CDs, tapes, or other removable storage to increase chances of recovery.

When should it be backed up? Automated daily backups are a good choice for most Linux desktop scenarios. You can use Linux’s built-in scheduler, the cron daemon, with shell scripts to automate your backups.

Using rsync

The basic invocation of rsync is: rsync -a source/ target/. This command copies the contents of the source directory to the target, as if you were executing cp -a source/. target/. Unlike cp, rsync checks for differences between source and destination files (by default it compares file size and modification time) and copies only what has changed. This technique, known as incremental backup, makes rsync a very fast method for updating your backups.
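
The trailing slash on the source matters. The following sketch, run entirely inside a temporary directory with made-up names, shows the difference between copying a directory's contents and copying the directory itself.

```shell
#!/bin/sh
# Compare "source/" and "source" as rsync arguments; all paths are throwaway.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/with_slash" "$work/without_slash"
echo "hello" > "$work/source/file.txt"

# Trailing slash on the source: copy the CONTENTS of source into the target.
rsync -a "$work/source/" "$work/with_slash/"

# No trailing slash: copy the directory itself, creating without_slash/source/.
rsync -a "$work/source" "$work/without_slash/"

# List the resulting files to see where file.txt ended up in each case.
find "$work/with_slash" "$work/without_slash" -type f
```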

Making exact copies using the --delete flag. By default, rsync preserves files that exist in the target but not in the source, which allows multiple sources to be added to a single target destination. For system backups you usually want the opposite: the --delete flag causes rsync to delete any files found in the target that are not present in the source. This ensures that the target is an exact copy of the source, so that if you delete an unwanted document, it is also removed from your backup. To get this behavior, use a command like the following:
rsync -a --delete source/ target/
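
A small experiment makes the difference visible. This is a sketch on temporary directories with invented file names, not something to run against real data.

```shell
#!/bin/sh
# Show how --delete changes what is left in the target; paths are throwaway.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/target"
echo "keep" > "$work/source/keep.txt"
echo "old" > "$work/source/unwanted.txt"

# First backup copies both files.
rsync -a "$work/source/" "$work/target/"

# Delete a file from the source and back up again WITHOUT --delete:
# unwanted.txt is still present in the target.
rm "$work/source/unwanted.txt"
rsync -a "$work/source/" "$work/target/"
ls "$work/target"

# With --delete the target becomes an exact copy of the source again.
rsync -a --delete "$work/source/" "$work/target/"
ls "$work/target"
```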

Keeping multiple backups. It is a good idea to keep a few days’ worth of backups so that you can return to a particular day if necessary. You can do this by rotating the oldest backup to the current one and updating it using rsync. The backup.sh script that follows keeps a single backup; the multi_backup.sh script at the end of this article adds the rotation:

#!/bin/sh
# Author: Brice Burgess - bhb@iceburg.net
# backup.sh -- backup to a local drive using rsync

# Directories to backup. Separate with a space. Exclude trailing slash!
SOURCES="/home/wendy /home/daisy /var/mail"

# Directory to backup to. This is where your backup(s) will be stored.
# Exclude trailing slash!
TARGET="/mnt/usb-harddrive/backup"

# Your EXCLUDE_FILE tells rsync what NOT to back up. Leave it unchanged if you want
# to back up all files in your SOURCES. If performing a FULL SYSTEM BACKUP, i.e.
# your SOURCES is set to "/", you will need to make use of EXCLUDE_FILE.
# The file should contain directories and filenames, one per line.
# An example of an EXCLUDE_FILE would be:
# /proc/
# /tmp/
# /mnt/
# *.SOME_KIND_OF_FILE

EXCLUDE_FILE="/path/to/your/exclude_file.txt"

# Comment out the following line to disable verbose output
VERBOSE="-v"
###########################

if [ ! -x "$TARGET" ]; then
  echo "Backup target does not exist or you don't have permission!"
  echo "Exiting..."
  exit 2
fi

echo "Verifying Sources..."
for source in $SOURCES; do
  echo "Checking $source..."
  if [ ! -x "$source" ]; then
    echo "Error with $source!"
    echo "Directory either does not exist, or you do not have proper permissions."
    exit 2
  fi
done

# Pass the exclude file to rsync only if it actually exists
EXCLUDE=""
if [ -f "$EXCLUDE_FILE" ]; then
  EXCLUDE="--exclude-from=$EXCLUDE_FILE"
fi

echo "Sources verified. Running rsync..."
for source in $SOURCES; do

  # Create directories in $TARGET to mimic the source directory hierarchy
  if [ ! -d "$TARGET/$source" ]; then
    mkdir -p "$TARGET/$source"
  fi

  # $VERBOSE and $EXCLUDE are left unquoted on purpose so that they vanish when empty
  rsync $VERBOSE --exclude="$TARGET/" $EXCLUDE -a --delete "$source/" "$TARGET/$source/"

done

exit 0

Change the $TARGET variable to the path where you’d like your backups to be saved. Save the script (as backup.sh) to your computer and make it executable with the command chmod +x backup.sh.
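
If you decide to use an EXCLUDE_FILE, it may help to test your patterns on a scratch copy first. The sketch below assumes invented file names and a temporary exclude file; it passes the file to rsync with --exclude-from, the same option the script uses.

```shell
#!/bin/sh
# Test an exclude file on throwaway directories; all names are illustrative.
work=$(mktemp -d)
mkdir -p "$work/source/cache" "$work/target"
echo "letter" > "$work/source/letter.txt"
echo "junk" > "$work/source/editor.swp"
echo "junk" > "$work/source/cache/page.html"

# One pattern per line, as described in the script's comments.
cat > "$work/exclude_file.txt" <<'EOF'
/cache/
*.swp
EOF

rsync -a --exclude-from="$work/exclude_file.txt" "$work/source/" "$work/target/"

# Only letter.txt should have been copied.
ls "$work/target"
```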

You’re now ready to make your first backup. Type ./backup.sh to start the process. The script will take a long time to complete the first time you run it, because rsync must make a copy of each file rather than update just changed files. Later runs will complete much faster. If you notice something is wrong, press Ctrl-C to stop the process. Upon completion of the script, you should have a replica of your $SOURCES in your $TARGET.
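
If you would rather preview a run than interrupt it, rsync's --dry-run option (short form -n) reports what would be copied or deleted without changing anything. The following is a sketch on temporary directories with made-up file names.

```shell
#!/bin/sh
# Preview a backup with --dry-run; nothing in the target is modified.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/target"
echo "report" > "$work/source/report.txt"
echo "stale" > "$work/target/stale.txt"

# -v lists the files rsync would transfer or delete.
rsync -av --dry-run --delete "$work/source/" "$work/target/"

# The target is unchanged: stale.txt is still there, report.txt is not.
ls "$work/target"
```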

Automating the process. Assuming backup.sh ran successfully, and that you now have a copy of your important files in the $TARGET directory, it is time to automate the process. We’ll use Linux’s built-in scheduler, the cron daemon, to do this. The cron daemon uses “crontab” files to schedule tasks. You can edit root’s crontab file by becoming the superuser (either by logging in as root or typing su at the command line) and executing crontab -e.

You’ll want to schedule a time for your backup.sh to execute. Crontab syntax is:
[minute] [hour] [day] [month] [dayofweek] [command]

Thus, adding the line:
0 4 * * * /path/to/backup.sh

will execute backup.sh at 4:00 a.m. every day. When you’re finished adding the line, save the file and exit.
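
To make the field order concrete, here is a small sketch that assembles a daily crontab line from an hour, a minute, and a command. The function name daily_cron_line and the log file path are hypothetical; redirecting output to a log is an optional extra, not something the script requires.

```shell
#!/bin/sh
# Build a crontab line that runs a command once a day.
# Usage: daily_cron_line HOUR MINUTE COMMAND
daily_cron_line() {
  # Crontab order is minute first, then hour; the day, month, and
  # day-of-week fields are "*", meaning "every".
  printf '%s %s * * * %s\n' "$2" "$1" "$3"
}

# 4:00 a.m. daily, appending the script's output to a log file.
daily_cron_line 4 0 '/path/to/backup.sh >> /path/to/backup.log 2>&1'
# prints: 0 4 * * * /path/to/backup.sh >> /path/to/backup.log 2>&1
```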

That’s all there is to it. Rsync is a very powerful tool, and you should pat yourself on the back for applying some of its potential. In the future we’ll cover how to back up to a remote machine, show examples of how to keep multiple backups in rotation, and even run rsync within Microsoft Windows. In the meantime, check out Mike Rubel’s excellent resource on rsync to learn how to perform daily and even hourly backups.

Update 11/04/2004: I’ve modified the above script for multiple backup rotations. The modifications will keep a designated number of backups in the target directory named after the date they were executed (YYYY-MM-DD_Hour-Minute). Here’s the modified script:
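
Before reading the modified script, it may help to see why hard links make multiple backups cheap. The sketch below assumes GNU cp (for the -l option) and rsync's default behavior of writing a changed file to a temporary name and renaming it into place; the directory names day1 and day2 are invented for the demonstration.

```shell
#!/bin/sh
# Demonstrate hard-link snapshots on throwaway directories.
work=$(mktemp -d)
mkdir "$work/day1" "$work/source"
echo "thesis draft" > "$work/day1/thesis.txt"

# cp -al recreates the directory tree but hard-links each file instead of
# copying its data, so day2 costs almost no extra disk space.
cp -al "$work/day1" "$work/day2"

# Both names refer to the same inode (same number in the first column).
ls -i "$work/day1/thesis.txt" "$work/day2/thesis.txt"

# Updating day2 with rsync replaces the changed file rather than overwriting
# it in place, so day1 keeps the old contents and day2 gets the new ones.
echo "thesis final version" > "$work/source/thesis.txt"
rsync -a --delete "$work/source/" "$work/day2/"

cat "$work/day1/thesis.txt"
cat "$work/day2/thesis.txt"
```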

#!/bin/sh
# Author: Brice Burgess - bhb@iceburg.net
# multi_backup.sh -- backup to a local drive using rsync. 
#         Uses hard-link rotation to keep multiple backups.

# Directories to backup. Separate with a space. Exclude trailing slash!
SOURCES="/home/wendy /home/daisy /var/mail"

# Directory to backup to. This is where your backup(s) will be stored. No spaces in names!
# :: NOTICE :: -> Make sure this directory is empty or contains ONLY backups created by
#                 this script and NOTHING else. Exclude trailing slash!
TARGET="/mnt/usb-harddrive/backup"

# Set the number of backups to keep (greater than 1). Ensure you have adequate space.
ROTATIONS=3

# Your EXCLUDE_FILE tells rsync what NOT to back up. Leave it unchanged if you want
# to back up all files in your SOURCES. If performing a FULL SYSTEM BACKUP, i.e.
# your SOURCES is set to "/", you will need to make use of EXCLUDE_FILE.
# The file should contain directories and filenames, one per line.
# A good example would be:
# /proc
# /tmp
# *.SOME_KIND_OF_FILE
EXCLUDE_FILE="/path/to/your/exclude_file.txt"

# Comment out the following line to disable verbose output
VERBOSE="-v"

#######################################
########DO_NOT_EDIT_BELOW_THIS_POINT#########
#######################################

# Set name (date) of backup. 
BACKUP_DATE="`date +%F_%H-%M`"

if [ ! -x "$TARGET" ]; then
  echo "Backup target does not exist or you don't have permission!"
  echo "Exiting..."
  exit 2
fi

if [ ! "$ROTATIONS" -gt 1 ]; then
  echo "You must set ROTATIONS to a number greater than 1!"
  echo "Exiting..."
  exit 2
fi

#### BEGIN ROTATION SECTION #### 

BACKUP_NUMBER=1
# incrementor used to determine current number of backups

# list all backups in reverse (newest first) order, set name of oldest backup to $backup
# if the retention number has been reached. 
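# (ls errors are silenced because the glob matches nothing on the very first run)
for backup in $(ls -dXr "$TARGET"/*/ 2>/dev/null); do
  if [ "$BACKUP_NUMBER" -eq 1 ]; then
    NEWEST_BACKUP="$backup"
  fi

  if [ "$BACKUP_NUMBER" -eq "$ROTATIONS" ]; then
    OLDEST_BACKUP="$backup"
    break
  fi

  # POSIX arithmetic; "let" is a bash builtin and is not available in every /bin/sh
  BACKUP_NUMBER=$((BACKUP_NUMBER + 1))
done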

# Check if $OLDEST_BACKUP has been found. If so, rotate. If not, create new directory for this backup.
if [ -n "$OLDEST_BACKUP" ]; then
  # Set oldest backup to current one
  mv "$OLDEST_BACKUP" "$TARGET/$BACKUP_DATE"
else
  mkdir "$TARGET/$BACKUP_DATE"
fi

# Update current backup using hard links from the most recent backup.
# $NEWEST_BACKUP ends in a slash, so "${NEWEST_BACKUP}." means the contents of that directory.
if [ -n "$NEWEST_BACKUP" ]; then
  cp -al "${NEWEST_BACKUP}." "$TARGET/$BACKUP_DATE"
fi
#### END ROTATION SECTION #### 
 

# Check to see if rotation section created backup destination directory 
if [ ! -d "$TARGET/$BACKUP_DATE" ]; then
  echo "Backup destination not available. Make sure you have write permission in TARGET!"
  echo "Exiting..."
  exit 2
fi 

echo "Verifying Sources..."
for source in $SOURCES; do
  echo "Checking $source..."
  if [ ! -x "$source" ]; then
    echo "Error with $source!"
    echo "Directory either does not exist, or you do not have proper permissions."
    exit 2
  fi
done

# Pass the exclude file to rsync only if it actually exists
EXCLUDE=""
if [ -f "$EXCLUDE_FILE" ]; then
  EXCLUDE="--exclude-from=$EXCLUDE_FILE"
fi

echo "Sources verified. Running rsync..."
for source in $SOURCES; do

  # Create directories in $TARGET to mimic the source directory hierarchy
  if [ ! -d "$TARGET/$BACKUP_DATE/$source" ]; then
    mkdir -p "$TARGET/$BACKUP_DATE/$source"
  fi

  # $VERBOSE and $EXCLUDE are left unquoted on purpose so that they vanish when empty
  rsync $VERBOSE --exclude="$TARGET/" $EXCLUDE -a --delete "$source/" "$TARGET/$BACKUP_DATE/$source/"

done

exit 0
