Horcrux is an excellent wrapper around the rsync-based Duplicity, for easily managing automated, encrypted backups to multiple locations.
Horcrux uses what its author, Chris Poole, calls the Voldemort approach, which is multiple backups to multiple locations. If you’re not a Harry Potter fan, here is the idea: a dark wizard or witch can hide a fragment of their soul in a physical object, which is called a Horcrux. Then if the physical body is destroyed, the witch or wizard can be resurrected. Creating multiple Horcruxes is a way to achieve immortality. There is a price to pay, however. Each Horcrux requires an act of murder, and each one diminishes the humanity of its creator.
Fortunately, using Chris Poole’s Horcrux doesn’t require any awful deeds, but merely editing some configuration files. If you’re already a Duplicity user, Horcrux adds the ability to easily send backups to different locations, to encrypt them, and to customize each one if you wish. Horcrux also includes a simple way to test your backups.
Horcrux is a Bash script. Copy it from the download page into a new text file, give it a name, make it executable and put it in a directory that’s in your path. Make it owned by a user with sufficient permissions to read the files you want backed up. Then run it with no options to generate its global configuration file,
~/.horcrux/horcrux.conf. On my server it is owned by
.horcrux/horcrux.conf is in
Let’s take a walk through
source="/" specifies the root directory as the source directory, so you can back up any files in your filesystem. If you prefer, you can narrow it down to a specific source directory:
Always use the full path and don’t forget the trailing slash.
encrypt_key=123456 is the key ID of your GPG encryption key. This is optional, but highly recommended for offsite backups. There are a pasquillion how-tos on creating and managing GPG keys, so I shall not repeat them here. I will give you one great tip: the easy way to generate enough entropy when you’re creating GPG keys is to run this command in a separate terminal:
$ ls -R /
That recursively lists all the files on your filesystem, which will generate more than enough entropy to keep GPG happy.
use_agent=true, because Horcrux will need your GPG passphrase, and
gpg-agent is the best way to manage GPG passphrases. (Again, please refer to any of the many good GPG how-tos to learn how to use gpg-agent.)
remove_n=3 means that you will never have more than three full backups: when a fourth full backup is created, the oldest one is removed. Use this in conjunction with
full_if_old= to control how many full backups will be run and saved.
vol_size=250 splits the backup into 250MB volumes. The default is 25MB, which means that large backups create a huge number of files. You might run into filesystem or quota limits with smaller volume sizes. Possible pitfalls with larger volume sizes are unreliable file transfers, and filesystems that don’t handle bigger files well, which I don’t believe is much of an issue these days.
full_if_old=60 determines how often Horcrux will run a full backup. The default is 360 days. On my server a full backup to the remote backup server takes three days, so I have it run a full backup only every 60 days.
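Pulling the walkthrough together, here is a sketch of what the global configuration file might look like with the values discussed above. It contains only the settings covered in this article, so the file that Horcrux generates may well include others, and the key ID is a placeholder rather than a real key.

```shell
# ~/.horcrux/horcrux.conf -- sketch using the values from the walkthrough.
# Only the settings discussed in this article are shown.

source="/"          # back up from the root directory; keep the trailing slash
encrypt_key=123456  # placeholder GPG key ID; substitute your own
use_agent=true      # let gpg-agent supply the passphrase
remove_n=3          # keep at most three full backups
vol_size=250        # volume size in MB
full_if_old=60      # run a new full backup after 60 days
```

The file uses plain shell variable assignments, so there must be no spaces around the equals signs.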
Where to Send the Backup
Every backup set needs two files: a
config file and an
exclude file. The
config file contains the remote server destination in a form similar to the standard
rsync syntax, like this:
You can also send backups to local attached media, like this:
Note that there are three slashes:
file:// is the URL scheme, and the third
/ is the beginning of the filepath. The various formats are spelled out in the Duplicity man page.
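If the three slashes look odd, it can help to take a local destination apart at the shell prompt. This is only an illustration using shell parameter expansion; the mount point is made up, and the variable names are not Horcrux settings.

```shell
# Hypothetical local destination; the path is invented for illustration.
dest="file:///media/backup/horcrux/"

scheme="${dest%%://*}"   # everything before "://"
path="${dest#*://}"      # everything after "://", starting with the root slash

echo "scheme: $scheme"
echo "path:   $path"
```

The path part begins with its own slash because it is an absolute path, which is where the third slash comes from.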
You can use
ssh-agent to manage ssh logins, or use password-less public key authentication.
Which Files to Back Up
Your file selection is written in an
-exclude file. Give this file a name that helps you remember what this backup does; for example,
dropbox1-exclude. File selection uses the Duplicity syntax. This is a simple example with basic includes and excludes:
+ means include, and no + means exclude. Items are processed in order, so first include
/home/data, then exclude the subdirectory
/etc, and then exclude the subdirectory
/etc/stuff. You can use wildcards to select files by file extensions, for example select
**/* at the end means “Ignore everything else.” A single asterisk is our familiar wildcard that expands to everything except /, and
** is a special globbing pattern that means “everything,” including /. So you could exclude all temporary files like this:
Or include all files with your name in them:
You can specify ranges in square brackets. For example, to match “carla” with either an upper- or lowercase first letter, use
[Cc]arla. A range of numbers looks like
[5-8]. If you want to dive into subdirectories, then
/home/data/images/*/**.jpg matches all subdirectories after
images, and selects all
.jpg files in those directories.
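Square-bracket ranges work much the same way in ordinary shell patterns, so you can try them out at the prompt before committing them to an exclude file. The sketch below is plain shell, not Duplicity, and the filenames are made up. It only demonstrates the bracket syntax, because the shell treats * differently from Duplicity where / is concerned.

```shell
# Succeeds if the name ($2) matches the glob pattern ($1).
matches() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# Made-up filenames, purely for illustration.
for name in Carla-notes.txt carla-notes.txt darla-notes.txt report7.pdf report9.pdf; do
    if matches '[Cc]arla*' "$name" || matches 'report[5-8].pdf' "$name"; then
        echo "match:    $name"
    else
        echo "no match: $name"
    fi
done
```

Both spellings of carla match, darla does not, and only the report whose number falls in the range 5 to 8 is selected.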
Now that you have your three required configuration files, you can run your first backup. Chris Poole has done a nice job of streamlining and shortening Duplicity’s commands, so you can start your first backup like this:
$ horcrux auto dropbox1
The auto option runs a full backup if it does not find an incremental backup set. Good old
rsync, by way of the librsync library, is the engine that powers Duplicity and therefore Horcrux, so the first run always takes the longest because it has to copy everything. Then for subsequent backups only changes are uploaded.
You can create multiple backups that go to different locations, including local media. All you need are
config file and
exclude file pairs for each backup. Your backup filenames must be in this format:
backupname can use letters, numbers, and punctuation marks, except for hyphens.
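The no-hyphens rule is easy to forget, presumably because the hyphen is what separates the backup name from the rest of the filename, as in dropbox1-exclude. A small check like the following can catch a bad name before you create the files. The function name is invented for this sketch and is not part of Horcrux.

```shell
# Hypothetical helper: succeeds only if the backup name is usable,
# meaning it is not empty and contains no hyphen.
valid_backup_name() {
    case "$1" in
        ""|*-*) return 1 ;;   # empty, or contains a hyphen
        *)      return 0 ;;
    esac
}

for name in dropbox1 usb_drive my-backup; do
    if valid_backup_name "$name"; then
        echo "ok:  $name"
    else
        echo "bad: $name"
    fi
done
```

Here dropbox1 and usb_drive pass, while my-backup is rejected.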
Restoring From Backup
You can restore your entire backup set or specific files and directories. This example restores a single file, and it specifies the backup name and restore directory:
$ horcrux -f myfile restore dropbox1 /restore/directory/myfile
You can go back in time and select an older backup by specifying the date in YYYY-MM-DD format:
$ horcrux -t 2012-08-22 -f myfile restore dropbox1 /restore/directory/myfile
There are several other time specifications, such as n days or weeks ago, which you can find by running
horcrux help, or consulting the online documentation.
The simplest way to automate your Horcrux backups is with a cron job. This entry runs the backup at the top of every hour:
00 * * * * horcrux auto dropbox1
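If you have several backup sets, one cron entry can call a small wrapper script instead of needing one line per set. What follows is a sketch rather than anything that ships with Horcrux: the function name is invented, and usbdrive is a made-up backup name. The command to run is passed in as the first argument so that you can do a dry run with echo before pointing it at horcrux.

```shell
#!/bin/sh
# Hypothetical cron wrapper: run each named backup set in turn.
# $1 is the command to run (normally horcrux); the rest are backup names.
run_backups() {
    runner="$1"
    shift
    failed=0
    for name in "$@"; do
        if ! "$runner" auto "$name"; then
            echo "backup failed: $name" >&2
            failed=1
        fi
    done
    # Non-zero if any backup set failed, so cron can report it.
    return "$failed"
}

# Example call; usbdrive is a made-up backup name.
# run_backups horcrux dropbox1 usbdrive
```

Running run_backups echo dropbox1 usbdrive prints the arguments that would be passed for each backup set without starting a backup. Because one failed set does not stop the loop, the remaining sets still get their turn.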