Linux.com

Feature: Backup & Data Recovery

Back up like an expert with rsync

By Joe 'Zonker' Brockmeier on July 17, 2007 (9:00:00 AM)

In the last two months I've been traveling a lot. During the same period my main desktop computer went belly up. I would have been in trouble without rsync at my disposal -- but thanks to my regular use of this utility, my data (or most of it, anyway) was already copied offsite just waiting to be used. It takes a little time to become familiar with rsync, but once you are, you should be able to handle most of your backup needs with just a short script.

What's so great about rsync? First, it's designed to speed up file transfer by copying the differences between two files rather than copying an entire file every time. For example, when I'm writing this article, I can make a copy via rsync now and then another copy later. The second (and third, fourth, fifth, etc.) time I copy the file, rsync copies the differences only. That takes far less time, which is especially important when you're doing something like copying a whole directory offsite for daily backup. The first time may take a long time, but the next will only take a few minutes (assuming you don't change that much in the directory on a daily basis).
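One way to see this for yourself is to sync a scratch directory several times and ask rsync what it actually transferred. The following is only a sketch: the directory and file names are made up, and it runs entirely in a temporary directory. One caveat is that when both source and destination are local paths, rsync copies changed files whole by default instead of sending deltas, so what this shows is rsync skipping unchanged files. The delta transfer itself comes into play over a network connection.

```shell
#!/bin/sh
# Scratch directories so nothing real is touched (names are illustrative).
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
echo "first draft" > "$work/src/article.txt"
echo "notes" > "$work/src/notes.txt"

# First run: everything is copied.
rsync -a "$work/src/" "$work/dst/"

# Second run with nothing changed: --itemize-changes lists no files.
second=$(rsync -a --itemize-changes "$work/src/" "$work/dst/")

# Change one file, then run again: only that file is listed.
echo "second draft, now longer" >> "$work/src/article.txt"
third=$(rsync -a --itemize-changes "$work/src/" "$work/dst/")
echo "$third"
```

The --itemize-changes option (-i for short) is handy whenever you want a compact record of what a run touched.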

Another benefit is that rsync can preserve permissions and ownership information, copy symbolic links, and generally is designed to intelligently handle your files.

You shouldn't need to do anything to get rsync installed -- it should be available on almost any Linux distribution by default. If it's not, you should be able to install it from your distribution's package repositories. You will need rsync on both machines if you're copying data to a remote system, of course.

When you're using it to copy files to another host, the rsync utility typically works over a remote shell, such as Secure Shell (SSH) or Remote Shell (RSH). We'll work with SSH in the following examples, because RSH is not secure and you probably don't want to be copying your data using it. It's also possible to connect to a remote host using an rsync daemon, but since SSH is practically ubiquitous these days, there's no need to bother.

Getting to know rsync

The basic syntax for rsync is simple enough -- just run rsync [options] source destination to copy the file or files provided as the source argument to the destination.

So, for example, if you want to copy some files under your home directory to a USB storage device, you might use rsync -a /home/user/dir/ /media/disk/dir/. By the way, "/home/user/dir/" and "/home/user/dir" are not the same thing to rsync. Without the final slash, rsync will copy the directory in its entirety. With the trailing slash, it will copy the contents of the directory but won't recreate the directory itself at the destination. If you're trying to replicate a directory structure with rsync, you should omit the trailing slash -- for instance, if you're mirroring /var/www on another machine or something like that.
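The trailing-slash rule is easier to remember once you have seen both outcomes side by side. This small sketch uses throwaway directories under a temporary path (the names are placeholders, not real locations) and copies the same source both ways:

```shell
#!/bin/sh
# Scratch layout standing in for a home directory (paths are illustrative).
work=$(mktemp -d)
mkdir -p "$work/home/dir" "$work/no_slash" "$work/with_slash"
echo "hello" > "$work/home/dir/file.txt"

# No trailing slash: the directory itself is recreated at the destination.
rsync -a "$work/home/dir" "$work/no_slash/"

# Trailing slash: only the contents are copied.
rsync -a "$work/home/dir/" "$work/with_slash/"

# List what landed where.
find "$work/no_slash" "$work/with_slash" -type f
```

The first destination ends up with dir/file.txt, the second with file.txt directly inside it.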

In this example, I included the archive option (-a), which actually combines several rsync options. It turns on recursion, copies symlinks as symlinks, and preserves permissions, modification times, group, and owner, which generally makes rsync suitable for making archive copies. (Preserving the owner only works when rsync runs as root on the receiving side.) Note that it doesn't preserve hardlinks; if you want to preserve them, you will need to add the hardlinks option (-H).
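If you are not sure whether hard links matter for your data, a quick experiment makes the difference between -a and -aH visible. This is a sketch with invented file names in a temporary directory; it assumes the temporary filesystem supports hard links, which is the usual case on Linux.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir "$work/src" "$work/plain" "$work/linked"
echo "data" > "$work/src/original.txt"
# Second name for the same file (a hard link).
ln "$work/src/original.txt" "$work/src/hardlink.txt"

# -a alone: both names arrive, but as two unrelated files.
rsync -a "$work/src/" "$work/plain/"

# -aH: the two names still share one inode at the destination.
rsync -aH "$work/src/" "$work/linked/"

# ls -i prints inode numbers; compare the pairs in each destination.
ls -i "$work/plain" "$work/linked"
```

With many hard-linked files, leaving out -H also means the copy takes more space than the original.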

Another option you'll probably want to use most of the time is verbose (-v), which tells rsync to report lots of information about what it's doing. You can double and triple up on this option -- so using -v will give you some information, using -vv will give more, and using -vvv will tell you everything that rsync is doing.

rsync will copy hidden files (files whose names begin with a .) without any special options. If you want to exclude hidden files, you can use the option --exclude=".*"; the similar pattern --exclude=".*/" skips only hidden directories. You can also use the --exclude option to prevent copying things like Vim's swap files (--exclude="*.swp") and the automatic backups (--exclude="*.bak") created by some programs.
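Exclude patterns are worth trying on scratch data before you trust them with a real backup. The sketch below builds a small directory with a few hidden and temporary files (all names invented for the example) and copies it with three --exclude options:

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/src/.cache" "$work/dst"
echo "keep" > "$work/src/report.txt"
echo "skip" > "$work/src/.hidden"
echo "skip" > "$work/src/.cache/blob"
echo "skip" > "$work/src/report.txt.bak"
echo "skip" > "$work/src/draft.swp"

# ".*" drops hidden files and directories; the other two patterns drop
# backup and swap files by suffix.
rsync -a --exclude=".*" --exclude="*.bak" --exclude="*.swp" "$work/src/" "$work/dst/"

# -A also shows dotfiles, so anything that slipped through would be visible.
ls -A "$work/dst"
```

Quoting the patterns matters: without the quotes, the shell may expand them against the current directory before rsync ever sees them.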

Making local copies

Suppose you have an external USB or FireWire drive, and you want to copy data from your home directory to your external drive. A good way to do this would be to keep all your important data under a single top-level directory and then copy it to a backup directory on the external drive using a command like:

rsync -avh /home/user/dir/ /media/disk/backup/

If you want to make sure that local files you've deleted since the last time you ran rsync are deleted from the external drive as well, you'll want to add the --delete option, like so:

rsync -avh --delete /home/user/dir/ /media/disk/backup

Be very careful with the delete option; with it, you can whack a bunch of files without meaning to. In fact, while you're getting used to rsync, it's probably a good idea to use the --dry-run option with your commands to run through the transfer first, without actually copying or synching files. If you do start an rsync transfer and realize that you've botched the command in some way that might result in the destruction of data, press Ctrl-c immediately to terminate the transfer. Some files may be gone, but you may be able to save the rest.
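To get a feel for what --dry-run reports, you can rehearse a destructive sync on scratch data first. In this sketch the directories and file names are placeholders, and the backup directory deliberately contains a file that no longer exists in the source:

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir "$work/src" "$work/backup"
echo "current" > "$work/src/keep.txt"
echo "old" > "$work/backup/stale.txt"   # exists only in the backup

# Rehearsal: reports what --delete would remove, but changes nothing.
preview=$(rsync -avh --delete --dry-run "$work/src/" "$work/backup/")
echo "$preview"

# The backup directory is untouched after the rehearsal.
after_preview=$(ls "$work/backup")
echo "after dry run: $after_preview"

# Real run, once the preview looks right: stale.txt goes, keep.txt arrives.
rsync -avh --delete "$work/src/" "$work/backup/"
```

In verbose mode the rehearsal includes a line beginning with "deleting" for each file that the real run would remove, which is the part to read carefully.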

Making remote copies

What if you want to copy files offsite to a remote host? No problem -- all you need to do is add the host and user information. So, for instance, if you want to copy the same directory to a remote host, you'd use:

rsync -avhe ssh --delete /home/user/dir/ user@remote.host.com:dir/

If you want to know how fast the transfer is going, and how much remains to be copied, add the --progress option:

rsync --progress -avhe ssh --delete /home/user/dir/ user@remote.host.com:dir/

If you don't want to be prompted for a password each time rsync makes a connection -- and you don't -- make sure that you have rsync set up to log in using an SSH key rather than a password. To do this, create an SSH key on the local machine using ssh-keygen -t ed25519, and press Enter when prompted for a passphrase. After the key is created, use ssh-copy-id -i ~/.ssh/id_ed25519.pub user@remote.host.com to copy the public key to the remote host.

What if you need to bring back some of the files you copied using rsync? Use the following syntax:

rsync -avze ssh remote.host.com:/home/user/dir/ /local/path/

The z option compresses data during the transfer. If a file you are copying already exists on the local host and hasn't changed, rsync will just leave it alone -- the same as if you were copying files to a remote host.

Wrapping it up with a script

Once you've figured out what directory or directories you want to sync up, and you've gotten the commands you need to sync everything, it's easy to wrap it all up with a simple script. Here's a short sample:

#!/bin/sh
# Mirror each directory to the same relative path under the remote home directory.
rsync --progress -avze ssh --delete /home/user/bin/ user@remote.host.com:bin/
rsync --progress -avze ssh --delete /home/user/local/data/ user@remote.host.com:local/data/
rsync --progress -avze ssh --delete /home/user/.tomboy/ user@remote.host.com:.tomboy/

Use the --progress option if you're going to be running rsync interactively. If not, there's no need for it.
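As the list of directories grows, it can be tidier to loop over them than to repeat the command. The following is one possible shape for such a wrapper, not a drop-in script: SRC_ROOT, DEST_ROOT, and backup_dirs are hypothetical names chosen for this sketch, and the defaults simply echo the example paths used above. It adds --progress only when a terminal is attached, which is one way to tell an interactive run from an unattended one.

```shell
#!/bin/sh
# Hypothetical wrapper: SRC_ROOT, DEST_ROOT and the directory list are
# placeholders to adapt, not values from any real setup.
SRC_ROOT=${SRC_ROOT:-/home/user}
DEST_ROOT=${DEST_ROOT:-user@remote.host.com:}

# Show progress only when a terminal is attached (an interactive run).
if [ -t 1 ]; then PROGRESS=--progress; else PROGRESS=; fi

backup_dirs() {
    status=0
    for dir in "$@"; do
        # Trailing slashes: copy the contents of each directory into its
        # counterpart on the destination.
        rsync $PROGRESS -az --delete "$SRC_ROOT/$dir/" "$DEST_ROOT$dir/" || status=1
    done
    return $status
}

# Example call (commented out so the sketch has no side effects):
# backup_dirs bin .tomboy
```

The function keeps going if one directory fails and returns a non-zero status at the end, so a calling script can still notice the problem. Note that rsync creates only the last component of a destination path, so for a nested directory such as local/data the parent has to exist on the destination already.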

If you look at the rsync man page, you can easily be confused. However, after a little practice with rsync, you'll find that it's not hard to set up rsync jobs that will help you prepare for the day that your disk drive craps out and you need access to your data right away.

Comments

on Back up like an expert with rsync

Note: Comments are owned by the poster. We are not responsible for their content.

Back up like an expert with rsync

Posted by: Anonymous [ip: 91.11.36.161] on July 22, 2007 12:58 PM
Well, your backup method is an interesting idea, but it could be improved, since it lacks several features a backup should have:

  1. You should be able to restore your data to more than one point in time.

  2. You should be able to tell which files have changed.

The good news is that there is a quite simple solution (zero configuration!) that's based on rsync's library and brings all those improvements: http://www.nongnu.org/rdiff-backup/features.html

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 59.95.27.192] on July 22, 2007 02:59 PM
How do you update/correct an ISO?

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 68.5.130.13] on July 22, 2007 04:31 PM
Great article. rsync is one of my favorite tools. It can also be used to back up Windoze boxes using Cygwin. I created a small batch file on my laptop and dropped a link to it on my Desktop. Once a day, I click the link and my laptop gets backed up to a remote server via ssh. Be sure to spend some time with the rsync docs. There is so much more rsync can do than is presented here.

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 194.106.138.240] on July 23, 2007 02:09 PM
Thanks!! I'm new to rsync and this is a great help. I'm backing up my desktop to my kurobox at this moment.

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 65.12.164.207] on July 24, 2007 04:39 AM
You might want to take a look at dirvish (http://www.dirvish.org/). It is an rsync-based backup tool that is very flexible and efficient. I briefly skimmed the rdiff-backup site and they seem very similar, though I didn't see a mention there of auto-expiring old backups and indexing (dirvish has both).

#

Re: Back up like an expert with rsync

Posted by: wongy on July 25, 2007 07:26 AM
rdiff-backup lets you configure how many days' worth of snapshots to keep. There is also a command to list all the available versions of a particular file.

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 87.194.123.80] on July 26, 2007 02:44 PM
Thanks for the article. rsync + cron will hopefully make me one happy guy.

#

rsync -h option???

Posted by: Anonymous [ip: 195.229.242.57] on July 29, 2007 04:29 PM
What gives with the -h option? Isn't it supposed to be for displaying help?
For example, in the article there is
-avhe and -avh
Am I missing something here? Or is the -h a typo?

#

Re: rsync -h option???

Posted by: Anonymous [ip: 24.63.38.250] on August 06, 2007 01:59 PM
-H is to preserve hardlinks.

#

Re: rsync -h option???

Posted by: Anonymous [ip: 150.125.173.96] on August 08, 2007 08:58 PM
There is no standard; while -h is often for help, it is not always.

#

Re(1): rsync -h option???

Posted by: Anonymous [ip: 218.162.117.174] on August 11, 2007 04:23 AM
It means "human readable" here.
See "man rsync"

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 72.229.252.171] on July 30, 2007 09:21 PM
I tried a simple backup remotely and ownerships are not preserved. Is this because the users don't exist in both places?

#

Add rsync snapshots! Back up like an expert with rsync

Posted by: Anonymous [ip: 193.191.206.94] on August 02, 2007 01:50 PM
http://www.mikerubel.org/computers/rsync_snapshots/ describes a feature called snapshots in rsync

In addition to your explanation, this adds a wayback machine, in the sense that you can keep different snapshots over time.

#

Re: Add rsync snapshots! Back up like an expert with rsync

Posted by: Anonymous [ip: 71.43.20.172] on August 08, 2007 12:46 AM
I'm using Rubel's approach to back up a dozen remote partitions to a central backup server, with snapshots going back a week. 16 GB of data syncs in an average of 55 minutes. On another 300 boxes I'm rsyncing the data partition to a local backup partition, again with snapshots going back a week. This has proven to be very valuable, particularly since we are no longer tied to hundreds of unreliable tape drives.

Rsync has changed the way we work. I can't think of a better or more useful piece of software.

#

Back up like an expert with rsnapshot

Posted by: Anonymous [ip: 41.240.205.166] on August 02, 2007 10:51 PM
You ain't seen nothing yet if you haven't used rsnapshot.

See: http://www.rsnapshot.org/

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 74.132.138.151] on August 13, 2007 04:26 PM
"/home/user/dir/" and "/home/usr/dir" are not the same thing.

one is usr and one is user! (besides the slash)

#

Common rsync pitfalls

Posted by: Anonymous [ip: 212.204.128.10] on September 05, 2007 08:22 PM
When following the above instructions, you may be very unpleasantly surprised once you find out that some of your precious data and meta-data isn't as backed up as you think it is. My buddy halfgaar has written extensively about what to do if you want to make a Linux backup (http://halfgaar-dev/backing-up-unix) that is truly trustworthy.

- Rowan

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 80.60.82.72] on October 05, 2007 10:53 AM
Quote: "/home/user/dir/" and "/home/usr/dir" are not the same thing to rsync.

Very true: remove the 'e' from 'user' in /home/user. It should be /home/usr/dir/, obviously.

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 202.124.142.129] on October 10, 2007 07:02 AM
"/home/user/dir/" and "/home/usr/dir" are really different!!!

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 58.8.166.239] on October 25, 2007 05:22 AM
Excellent article, exactly what I was looking for. I'm a Linux n00b (Ubuntu 7.10), and managed to port my backup scripts from Windows easily, thanks to the above.

#

You don't need -e ssh anymore

Posted by: Anonymous [ip: 66.92.218.124] on December 04, 2007 07:05 PM
Unless you have an ancient version, it isn't necessary
to specify the -e option for SSH (the 'e ssh' in 'rsync -avze ssh').
SSH has been the default since rsync 2.6.0 (1 Jan 2004).

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 202.164.156.93] on December 20, 2007 12:55 PM
How do I schedule a weekly backup with rsync? The default is working normally.

#

Back up like an expert with rsync

Posted by: Anonymous [ip: 117.196.134.29] on February 22, 2008 04:54 AM
This is a very good tutorial on rsync. Thanks for the effort.

Mabin

#
