May 2, 2005

CLI Magic: rsnapshot

Author: Joe Barr

A few weeks ago, CLI Magic had a story about rsync for backups. Several readers pointed out higher-level backup utilities based on rsync, and this week we are going to take a look at one of those: rsnapshot. In addition to improving ease of use, rsnapshot allows you to keep multiple point-in-time snapshots of your data, local or remote, without requiring the full set to be included in each one. More backups, less space. What could be better? Come on down to the CLI, and let's take a look.

Rsnapshot is included in the native distributions of OpenDarwin, the three BSDs, Gentoo, and Debian Linux. A package for Slackware is also available. If your platform is not in that list, you can download the latest source tarball from the rsnapshot site and build it yourself. You'll need to have both perl and rsync installed on your system for rsnapshot to work. It's also a good idea to have OpenSSH, logger, GNU cp, and GNU du.

Editing the config file

We will use the same basic example for rsnapshot that we did with rsync: backing up data from one machine on a LAN to another. Naturally, rsnapshot will be installed on the machine where it will run. That's also the machine where the backup archives will live. Both machines must have rsync and ssh installed. Now to the config file.
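Before going further, it may be worth confirming that both tools really are present on each machine. The following is only a minimal sketch: the function name missing_tools is made up for this illustration, and it simply asks the shell whether each command can be found on the PATH.

```shell
# missing_tools (hypothetical helper): print each named command that
# cannot be found on the current PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: run this on both machines. No output means both were found.
# missing_tools rsync ssh
```

An empty result only shows that the commands exist. It says nothing about whether the ssh login from one machine to the other actually works.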

Read through /etc/rsnapshot.conf carefully and edit it to fit your needs. You may need to copy the default config file (/etc/rsnapshot.conf.default) to /etc/rsnapshot.conf as your starting point. Don't edit the default config file itself. Keep it in pristine condition so you can return to it later if your own config file gets badly mangled.
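The copy step can be sketched as a small shell function that refuses to overwrite a working config that already exists. The name init_config is hypothetical, and the two paths in the usage comment are the ones mentioned above.

```shell
# init_config (hypothetical helper): copy the pristine default config to
# the working location, but only if no working copy exists yet.
# $1 = default config file, $2 = working config file
init_config() {
    if [ -e "$2" ]; then
        echo "keeping existing $2"
    else
        cp -p "$1" "$2" && echo "created $2 from $1"
    fi
}

# As root, with the paths used in this article:
# init_config /etc/rsnapshot.conf.default /etc/rsnapshot.conf
```

Because the function never touches its first argument, the default file stays untouched no matter how often it is run.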

You won't need to change all the settings in the config. The default values are reasonable. But some are critical and must be tailored to your environment. The first of these defines the snapshot root directory, which tells rsnapshot where to put the snapshot directories as they are created. The default (/var/cache/rsnapshot/) is fine for our purposes.

The interval section of the config file contains the entries which determine how often snapshots are going to be created and how many copies of each will be kept. Uncomment the line or lines for the intervals you want by removing the # at the beginning of a line. Be sure to maintain the order of the intervals. The first interval specified is assumed to be the most frequent. In the example shown below, I went with the default daily and weekly intervals, specifying that the 7 most recent daily archives and the 4 most recent weekly archives be kept.

#interval	hourly	6
interval	daily	7
interval	weekly	4
#interval	monthly	6
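To make the retention counts concrete: rsnapshot names each snapshot directory after its interval plus a number, with .0 being the most recent. The sketch below lists the directory names that the two active interval lines would eventually keep. The function name snapshot_names is invented for this illustration.

```shell
# snapshot_names (hypothetical helper): list the directory names kept for
# one interval. $1 = interval name, $2 = number of snapshots to keep.
snapshot_names() {
    i=0
    while [ "$i" -lt "$2" ]; do
        printf '%s.%d\n' "$1" "$i"
        i=$((i + 1))
    done
}

snapshot_names daily 7    # daily.0 (newest) through daily.6 (oldest)
snapshot_names weekly 4   # weekly.0 through weekly.3
```

With these settings, at most eleven snapshot directories ever exist under the snapshot root at one time.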

Another section of the config file has to do with including and excluding files from the snapshot. Mine looks like the one below. I've excluded all files whose names end with a tilde, so that the backup copies that text editors leave behind are not retained. To activate one or more include or exclude lines, simply remove the # and replace the ??? with the desired pattern.

# The include and exclude parameters, if enabled, simply get passed directly
# to rsync. If you have multiple include/exclude patterns, put each one on a
# separate line. Please look up the --include and --exclude options in the
# rsync man page for more details. 
# 
#include	???
#include	???
#exclude	???
exclude	*~
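To see which names a pattern like *~ picks out, here is a rough sketch that uses the shell's own pattern matching as a stand-in for rsync's. That substitution is an assumption: the two agree for a simple pattern like this one, but rsync has extra rules for slashes and anchoring that the shell does not model. The helper name is_excluded is hypothetical.

```shell
# is_excluded (hypothetical helper): succeed if the file name ends in a
# tilde, which is what the pattern *~ selects.
is_excluded() {
    case "$1" in
        *~) return 0 ;;
        *)  return 1 ;;
    esac
}

for name in notes.txt notes.txt~ .bashrc~ report~.txt; do
    if is_excluded "$name"; then
        echo "$name: excluded"
    else
        echo "$name: backed up"
    fi
done
```

Only the names with a trailing tilde are skipped. A tilde in the middle of a name, as in report~.txt, does not match.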

In the following sections, you can include or exclude specific files, pass command-line arguments to rsync, and do the same for ssh.

Here's the most important section of the config. It defines what is to be backed up and where the archive is to be kept. In the example below, rsnapshot will back up two user directories from my desktop machine -- which is defined in the /etc/hosts file as desktop -- and store the snapshots on the machine where rsnapshot runs.

###############################
### BACKUP POINTS / SCRIPTS ###
###############################

# LOCALHOST
#backup /home/ localhost/
#backup /etc/ localhost/
#backup /usr/local/ localhost/
#backup /etc/passwd localhost/
#backup /home/foo/My Documents/ localhost/
#backup /foo/bar/ localhost/ one_fs=1, rsync_short_args=-urltvpog
#backup_script /usr/local/bin/backup_pgsql.sh localhost/postgres/

# EXAMPLE.COM
# for these backup points you will need ssh installed on the
# local machine as well as on the remote host
#
backup root@desktop:/home/susan desktop-susan/
backup root@desktop:/home/warthawg desktop-warthawg/

In the example above, the two user home directories specified will be archived within the rsnapshot root directory specified earlier, in directories named desktop-susan and desktop-warthawg.
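The mapping from backup lines to destination directories can be sketched with a little awk. This is an illustration only: backup_destinations is a made-up helper, it assumes the default snapshot root, and it assumes the newest daily snapshot is the directory named daily.0.

```shell
# backup_destinations (hypothetical helper): read rsnapshot.conf text on
# stdin and print, for every active "backup" line, the directory that the
# newest daily snapshot would use under the snapshot root given as $1.
backup_destinations() {
    awk -v root="$1" '$1 == "backup" { print $2 " -> " root "daily.0/" $3 }'
}

backup_destinations /var/cache/rsnapshot/ <<'EOF'
#backup /home/ localhost/
backup root@desktop:/home/susan desktop-susan/
backup root@desktop:/home/warthawg desktop-warthawg/
EOF
```

Commented-out lines are ignored because their first field is #backup rather than backup. The sketch splits fields on any whitespace, so it would mishandle a path containing spaces.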

Testing the config

There are two specific tests you should run before running rsnapshot for the first time with your new configuration file. The first test is simply to check the syntax of the entries in the configuration. To run it, enter rsnapshot configtest at the command line. Rsnapshot will either tell you what it finds wrong or tell you that the syntax is OK.

The second test is for the specific intervals that you have defined. Run it for each one. Since I have daily and weekly intervals specified, I entered these two commands:

rsnapshot -t daily
rsnapshot -t weekly

The output from those commands details exactly what rsnapshot will do. Check in particular that the backup points are as you want them, and that the output is going to be put where you are expecting it.
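If you change the config often, the syntax check and the dry runs can be chained so that the first failure stops the sequence. The following is a sketch under stated assumptions: preflight is a hypothetical wrapper, and the RSNAPSHOT variable is an invention of this example for overriding the command name, not an rsnapshot feature.

```shell
# preflight (hypothetical helper): run the syntax check, then a dry run of
# every interval named as an argument, stopping at the first failure.
preflight() {
    cmd=${RSNAPSHOT:-rsnapshot}
    "$cmd" configtest || return 1
    for interval in "$@"; do
        echo "== dry run: $interval =="
        "$cmd" -t "$interval" || return 1
    done
}

# With the intervals used in this article:
# preflight daily weekly
```

The wrapper only runs the same commands in sequence. You still need to read the dry-run output yourself.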

The final touch

Last but not least, let's automate the whole thing so we can forget all about our backups until the day comes when we need them. It's easy enough to do using cron. The rsnapshot Howto page suggests using this method for automating the tasks.

Remember that crontab entries provide the means to specify the minute, hour, day of month, month of year, or day of week for a job to be run. So -- as root -- enter crontab -e at the console, and then add the following two lines to root's crontab entries:

0 23 * * *     /usr/local/bin/rsnapshot daily
30 23 * * 7    /usr/local/bin/rsnapshot weekly

The daily interval will run at 2300 every day. The weekly interval will run at 2330 on the seventh day of each week, which cron treats as Sunday. Note that the weekly interval starts 30 minutes after the daily, so the first run has time to complete before the second kicks off.
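If the five schedule fields are hard to keep straight, a tiny helper can label them. This is just a sketch for reading crontab lines. The name explain_cron is hypothetical, and it does nothing beyond splitting the line on whitespace.

```shell
# explain_cron (hypothetical helper): label the five schedule fields of a
# crontab line and show the command that follows them.
explain_cron() {
    printf '%s\n' "$1" | {
        read -r minute hour dom month dow command
        echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
        echo "command: $command"
    }
}

explain_cron '0 23 * * *       /usr/local/bin/rsnapshot daily'
explain_cron '30 23 * * 7      /usr/local/bin/rsnapshot weekly'
```

An asterisk in a field means "every value", so the first line fires at minute 0 of hour 23 on every day, while the second is restricted to day 7 of the week.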

That pretty much does it. Once you've got your config file right and the cron entries in place to run rsnapshot automatically, you can sleep easier at night knowing you have multiple backups of your vital data, and that you can get back to any one of them if needed.

In addition to the excellent documentation available on the rsnapshot site, man rsnapshot provides much more detailed operating instructions. It takes a little work to set this utility up, but once you've done it, you're done. After that, rsnapshot works its magic unattended, giving you multiple backups you can revert to, and using only a fraction of the space that would otherwise be required.
