Author: Ben Martin
The rsync utility is smart enough to send only the bytes of a changed file that the remote system needs to make its copy identical to the local file. When that information is sensitive, using rsync over SSH protects files while in transit. To protect the files when they are on the server, you might first encrypt them with GPG. But the manner in which GPG encrypts slightly changed files foils rsync’s efficiency. rsyncrypto allows you to encrypt your files while still allowing you to leverage the speed of rsync.
One of the aims of encryption is to make any change to the unencrypted file completely alter how the encrypted file appears. This means that somebody who has access to a series of encrypted versions learns little about how the small changes you make to the unencrypted file affect the encrypted file over time. The downside is that such security causes the encryption program to modify most of the encrypted file. If you then use rsync to copy such a file to the remote server, it has to send almost the complete file each time.
The goal of rsyncrypto is to encrypt files in such a manner that only a slight and controlled amount of security is sacrificed in order to let rsync send the encrypted files much more quickly. It aims to leak no more than 20 bits of aggregated information per 8KB of plaintext file.
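To get a feel for that bound, the arithmetic can be sketched in shell. The sketch assumes the 20-bits-per-8KB target scales linearly with file size, which the stated goal does not spell out; `leak_bound_bits` is a hypothetical helper name, not part of rsyncrypto.

```shell
#!/bin/sh
# Rough upper bound on leaked bits for a file of a given size, assuming
# (our assumption) the 20 bits / 8KB target scales linearly with size.
leak_bound_bits() {
    size_bytes=$1
    # 20 bits per 8192 bytes of plaintext, rounded up to a whole bit
    echo $(( (size_bytes * 20 + 8191) / 8192 ))
}

# The 500KB test file used later in this article (512000 bytes)
leak_bound_bits 512000
```

For the 512,000-byte file this prints 1250, that is, roughly 156 bytes of aggregate information for half a megabyte of plaintext.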
For an example of an information leak, suppose you have an XML file and you use rsyncrypto to copy the file to a remote host. Then you change a single XML attribute and use rsyncrypto to copy the updates across. Now suppose an attacker captured the encrypted versions in transit, and thus has copies of both the encrypted file before the change and after the change. The first thing they learn is that only the first 8KB of the file changed, because that is all that was sent the second time. If they can speculate what sort of file the unencrypted file was (for example, an XML file) then they can try to use that guess in an attempt to recover information.
Rsyncrypto encrypts parts of the file independently, thus keeping any changes you make to a single block of the file local to that block in the encrypted version. If you’re protecting a collection of personal files from a possible remote system compromise, such a tradeoff in security might be acceptable. On the other hand, if you cannot allow any information leaks, then you’ll have to accept that the whole encrypted file will change radically each time you change the unencrypted file. If that’s the case, using rsync on GnuPG-encrypted files might suit your needs.
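The difference between a change that stays local and one that spreads can be seen with plain OpenSSL. The following is an illustration only, not rsyncrypto's algorithm: it encrypts two nearly identical files with a chained mode (AES-CBC) and with independently encrypted blocks (AES-ECB), using a fixed throwaway key and IV invented for the demo, and counts how many ciphertext bytes differ. ECB appears here only because it makes locality easy to see; it is not a recommended mode, and nothing here should be read as a description of rsyncrypto's internals.

```shell
#!/bin/sh
# Illustration only: contrasts chained encryption (AES-CBC) with
# independently encrypted blocks (AES-ECB) to show how far a one-byte
# edit spreads through the ciphertext. Key and IV are throwaway values.
set -eu
work=$(mktemp -d)
key=000102030405060708090a0b0c0d0e0f   # demo key (hex), not a secret
iv=0f0e0d0c0b0a09080706050403020100    # demo IV (hex)

# Two 64KB plaintexts that differ in a single byte at offset 1000
dd if=/dev/zero of="$work/v1" bs=1024 count=64 2>/dev/null
cp "$work/v1" "$work/v2"
printf 'X' | dd of="$work/v2" bs=1 seek=1000 conv=notrunc 2>/dev/null

for v in v1 v2; do
    openssl enc -aes-128-cbc -K "$key" -iv "$iv" -in "$work/$v" -out "$work/$v.cbc"
    openssl enc -aes-128-ecb -K "$key" -in "$work/$v" -out "$work/$v.ecb"
done

# cmp -l prints one line per differing byte, so wc -l counts them
cbc_diff=$(( $(cmp -l "$work/v1.cbc" "$work/v2.cbc" | wc -l) ))
ecb_diff=$(( $(cmp -l "$work/v1.ecb" "$work/v2.ecb" | wc -l) ))
echo "chained (CBC): $cbc_diff ciphertext bytes differ"
echo "independent blocks (ECB): $ecb_diff ciphertext bytes differ"
rm -rf "$work"
```

With chaining, everything from the edited block to the end of the file changes, which is what forces rsync to resend nearly the whole file. With independent blocks, at most one 16-byte block changes, which is the kind of locality rsync can exploit and also the kind of information an observer can see.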
On a Fedora 8 machine, you have to download both rsyncrypto and its dependency argtable2 and install them using the standard ./configure; make; make install combination, starting with argtable2.
rsyncrypto is designed to be used as a presync option to rsync. That is, you first use rsyncrypto on the plain unencrypted files to obtain an encrypted directory tree, and you then run rsync to send that encrypted tree to a remote system. The following command syntax shows a template for directory encryption and decryption:
    # to encrypt
    rsyncrypto -r srcdir /tmp/encrypted srcdir.keys mykey.crt
    # to decrypt
    rsyncrypto -d -r /tmp/encrypted srcdir srcdir.keys mykey.crt
The keys and certificates referenced in these commands are generated by OpenSSL, as we’ll see in a moment. In the commands, the srcdir is encrypted and sent to /tmp/encrypted with the individual keys used to encrypt each file in srcdir saved into srcdir.keys. The mykey.crt is a certificate that is used to protect all the keys in srcdir.keys. If you still have all the keys, you can use your certificate in the decryption operation to obtain the plaintext files again. If you lose srcdir.keys, all is not lost, but you must use the private key for mykey.crt to regain the encrypted keys that are also stored in /tmp/encrypted.
The following is a full one-way sync to a remote server using both rsyncrypto and rsync to obtain an encrypted backup on a remote machine. The example first generates a master key and certificate using OpenSSL, then makes an encrypted backup of ~/foodir onto the remote machine v8tsrv:
    $ mkdir ~/rsyncrypto-keys
    $ cd ~/rsyncrypto-keys
    $ openssl req -nodes -newkey rsa:1536 -x509 -keyout rckey.key -out rckey.crt
    $ cd ~
    $ mkdir foodir
    $ date >foodir/df1.txt
    $ date >foodir/df2.txt
    $ rsyncrypto -r foodir /tmp/encrypted foodir.keys ~/rsyncrypto-keys/rckey.crt
    $ rsync -av /tmp/encrypted ben@v8tsrv:~
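The `openssl req` command above prompts for the certificate fields. For scripted setups, a non-interactive variant might look like the following sketch. The `-subj` value, the 2048-bit key size, the 10-year validity, and the temporary `keydir` location are all assumptions chosen for illustration, not values from the example above.

```shell
#!/bin/sh
set -eu
keydir=$(mktemp -d)    # stand-in for ~/rsyncrypto-keys

# -subj supplies the certificate subject so openssl does not prompt
openssl req -nodes -newkey rsa:2048 -x509 -days 3650 \
    -subj "/CN=rsyncrypto-backup" \
    -keyout "$keydir/rckey.key" -out "$keydir/rckey.crt"
chmod 600 "$keydir/rckey.key"

# Sanity check: certificate and private key share the same RSA modulus
crt_mod=$(openssl x509 -noout -modulus -in "$keydir/rckey.crt")
key_mod=$(openssl rsa -noout -modulus -in "$keydir/rckey.key")
if [ "$crt_mod" = "$key_mod" ]; then
    echo "key and certificate match"
fi
```

Keeping rckey.key readable only by its owner seems prudent because, as described earlier, the private key is what recovers the per-file keys if the key directory is lost.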
To test the speed gain of using rsyncrypto as opposed to using other encryption with rsync, I used /dev/urandom to create a file of random bytes, encrypted it with both rsyncrypto and GnuPG, and copied both encrypted versions to a remote system using rsync. I then modified the plaintext file, encrypted the file again, and synced the encrypted file with the remote system. In this case, I used dd to modify 6KB of data at an offset of 17KB into the file and left all the other data intact. The final rsync commands show that the rsyncrypto-encrypted tree only needed to send 58,102 bytes, whereas the GnuPG-encrypted file required the entire file to be sent to the remote system:
    $ cd ~/foodir
    $ rm -f *
    $ dd if=/dev/urandom of=testfile.random bs=1024 count=500
    512000 bytes (512 kB) copied, 0.088045 s, 5.8 MB/s
    $ cd ~
    $ rsyncrypto -r foodir foodir.rcrypto foodir.keys ~/rsyncrypto-keys/rckey.crt
    $ ls -l foodir.rcrypto/
    -rw-r--r-- ... 502K 2008-01-08 19:59 testfile.random
    $ gpg --gen-key
    ...
    $ mkdir foodir.gpg
    $ gpg --output foodir.gpg/testfile.random.gpg -e foodir/testfile.random
    $ ls -l foodir.gpg
    -rw-r--r-- 1 ... 501K 2008-01-08 20:07 testfile.random.gpg
    $ rsync -av foodir.rcrypto ben@v8tsrv:~
    sent 513356 bytes received 48 bytes 342269.33 bytes/sec
    $ rsync -av foodir.gpg ben@v8tsrv:~
    sent 513026 bytes received 48 bytes 342049.33 bytes/sec
    #
    # modify the input file starting at 17KB into the file for 6KB
    #
    $ dd if=/dev/urandom of=~/foodir/testfile.random bs=1024 count=6 seek=17 conv=notrunc
    $ ls -l foodir/testfile.random
    -rw-r--r-- 1 ... 500K 2008-01-08 20:17 testfile.random
    $ rsyncrypto -r foodir foodir.rcrypto foodir.keys ~/rsyncrypto-keys/rckey.crt
    $ gpg --output foodir.gpg/testfile.random.gpg -e foodir/testfile.random
    #
    # See how much gets sent
    #
    $ rsync -av foodir.rcrypto ben@v8tsrv:~
    sent 58102 bytes received 4368 bytes 124940.00 bytes/sec
    $ rsync -av foodir.gpg ben@v8tsrv:~
    sent 513024 bytes received 4368 bytes 1034784.00 bytes/sec
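The `dd` invocation with `seek=17` and `conv=notrunc` is what keeps the plaintext edit local, so it may be worth checking what it does before trusting the byte counts. The sketch below repeats the modification on a scratch copy (the `work` directory and the `before`/`after` file names are made up for the example) and uses `cmp` to report where the two versions differ.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)

# 500KB of random data, as in the test above
dd if=/dev/urandom of="$work/before" bs=1024 count=500 2>/dev/null
cp "$work/before" "$work/after"

# Overwrite 6KB starting 17KB into the file; conv=notrunc keeps the
# rest of the file instead of truncating it after the written region
dd if=/dev/urandom of="$work/after" bs=1024 count=6 seek=17 conv=notrunc 2>/dev/null

size=$(( $(wc -c < "$work/after") ))
# cmp -l lists "offset old new" (1-based offset) for each differing byte
first=$(cmp -l "$work/before" "$work/after" | head -n 1 | awk '{print $1}')
last=$(cmp -l "$work/before" "$work/after" | tail -n 1 | awk '{print $1}')
echo "size: $size bytes, changed range: $first-$last"
rm -rf "$work"
```

The size should stay at 512000 bytes, and the reported range should fall between byte 17409 and byte 23552, the 6KB window that starts 17KB into the file. Without `conv=notrunc`, dd would cut the file off at the end of the written region.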
Using rsyncrypto with rsync, you can protect the files that you send to a remote system while allowing modified files to be sent in a bandwidth-efficient manner. There is a slight loss of security using rsyncrypto, because changes to the unencrypted file do not propagate throughout the entire encrypted file. When this security trade-off is acceptable, you get much quicker, bandwidth-friendly network syncs and still achieve good encryption on the files stored on the remote server.
If you wish to hide file names as well as their content on the remote server, you can use the --name-encrypt=map option to rsyncrypto, which stores a mapping from the original file name to a garbled random file name in a mapping file, and outputs files using only their garbled random file name in the encrypted directory tree.