Author: Ben Martin
The Linux kernel includes support for performing RAID-1 in software. RAID-1 maintains the same filesystem on two or more disks, so that you can lose all but the last disk and still retain all of your data. This seems wonderful until you consider that a RAM error, a power supply failure, or a fault in another hardware component of the machine can still corrupt your precious data. With Chiron FS you can maintain a RAID-1 on the disks of two machines over a network, so if one machine goes down, you’ll still be able to access your filesystem.
Of course, since you are still maintaining a RAID-1, if a machine malfunctions but leaves the network up, then Chiron FS might happily replicate bad files from a failing machine to the others, but that’s a risk you may be willing to take, considering all you need is two low cost PCs and a network link. You can always add other solutions to improve your data security and availability.
The main requirement for Chiron FS is that the places you wish to maintain your RAID-1 be mountable as a filesystem through the Linux kernel. For example, you could use a locally mounted ext3 filesystem and an XFS filesystem that is mounted from a backup machine over NFS or sshfs, and use Chiron FS to make both filesystems store the same contents.
Chiron FS is available as a 1-Click install for openSUSE 10.3 but is not included in the distribution repositories for Fedora or Ubuntu. The download page for Chiron FS includes packages for Ubuntu Hardy and Gutsy, along with Fedora 8. I built the software from source on a 64-bit Fedora 8 machine using Chiron FS version 1.0.0. Installation follows the normal ./configure; make; sudo make install process.
The session below shows the setup and simple use of a Chiron FS filesystem. First I create a directory that will contain the filesystems I wish to replicate. In this example my-other-server-filesystem is on the same disk as local-disk-filesystem, but it could just as easily be on an NFS-mounted filesystem. The --fsname argument is not required by Chiron FS, but it is always a good idea to give your filesystems descriptive names where possible. Next, I create a file in the Chiron FS ~/my-chiron directory and check that it exists in the replica filesystem /tmp/chiron-testing/my-other-server-filesystem.
$ mkdir /tmp/chiron-testing
$ cd /tmp/chiron-testing
$ mkdir local-disk-filesystem
$ mkdir my-other-server-filesystem
$ mkdir ~/my-chiron
$ chironfs --fsname chiron-testing /tmp/chiron-testing/local-disk-filesystem=/tmp/chiron-testing/my-other-server-filesystem ~/my-chiron
$ df ~/my-chiron
Filesystem     1K-blocks     Used Available Use% Mounted on
chiron-testing  15997880 10769840   4402288  71% /home/ben/my-chiron
$ date > ~/my-chiron/test1
$ cat ~/my-chiron/test1
Thu May 29 15:17:30 EST 2008
$ cat /tmp/chiron-testing/my-other-server-filesystem/test1
Thu May 29 15:17:30 EST 2008
...
$ fusermount -u ~/my-chiron
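The manual cat check above can be generalized to whole directory trees. The following is only a minimal sketch and not part of Chiron FS: the function name replicas_in_sync is made up for illustration, and it assumes that a recursive diff of the two replica directories is an acceptable definition of "in sync".

```shell
#!/bin/bash
# Hypothetical helper (not part of Chiron FS): succeed only if two replica
# directories hold identical trees, as judged by a recursive diff.
replicas_in_sync() {
    local first=$1 second=$2
    # -r recurses into subdirectories; -q only reports whether files differ.
    diff -r -q -- "$first" "$second" > /dev/null
}

# Example with the replica directories from the session above:
# replicas_in_sync /tmp/chiron-testing/local-disk-filesystem \
#                  /tmp/chiron-testing/my-other-server-filesystem \
#     && echo "replicas match" || echo "replicas differ"
```

Running a check like this against the replicas directly is safe because it only reads them; writing to them directly is what bypasses replication.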
When using Chiron FS it is a good idea to hide the replica filesystems from being directly accessed (those in /tmp/chiron-testing in the example) so that you don’t use them inadvertently and thus circumvent data replication.
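One simple way to do this is with ordinary directory permissions. The sketch below is an assumption about how you might arrange it, not something Chiron FS does for you: hide_replicas is a hypothetical helper, and it only works if the user who mounts the Chiron FS owns the directory (or is root), since the Chiron FS process still needs to reach the replicas itself.

```shell
#!/bin/bash
# Hypothetical helper: restrict the directory that holds the replicas so that
# only its owner can enter it. The process that mounts Chiron FS must run as
# that owner (or as root), otherwise it loses access to the replicas too.
hide_replicas() {
    local parent=$1
    chmod 700 -- "$parent"
}

# Example: hide_replicas /tmp/chiron-testing
```

This does not protect the owner from writing into a replica by accident; it only keeps other users out.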
Chiron FS includes logging and support for slow mirrors. Logging can let you know when a replica has failed so you can take countermeasures before the final replica fails and you lose all your data. When a read request is issued to Chiron FS, it will obtain the data from one of the replicas using a round-robin algorithm. You can stipulate that a particular replica is slower than the others so that Chiron FS will avoid it when data must be read but still replicate all filesystem updates to that slower replica. This is useful if you are maintaining an offsite backup with Chiron FS; it lets you avoid the speed and cost penalty of reading data back from your offsite replica while still replicating filesystem changes offsite.
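The read policy described above can be pictured with a toy model. This is not Chiron FS code, just a sketch under stated assumptions: the replica paths are examples, a leading colon marks a slow replica, and the fallback used when every replica is marked slow is my own guess rather than documented behavior.

```shell
#!/bin/bash
# Toy model of the read policy described above; this is NOT Chiron FS code.
# Replicas prefixed with ":" are treated as slow: they receive every write
# but are skipped when choosing where to read from.
replicas=("/noaccess/local-a" "/noaccess/local-b" ":/p1export")
next=0   # index of the replica that the round-robin will try next

# Set the global variable "chosen" to the replica that serves the next read.
# (A global is used instead of $(...) so that "next" persists between calls.)
pick_read_replica() {
    local tried=0 candidate
    while (( tried < ${#replicas[@]} )); do
        candidate=${replicas[next]}
        next=$(( (next + 1) % ${#replicas[@]} ))
        tried=$(( tried + 1 ))
        if [[ $candidate != :* ]]; then
            chosen=$candidate
            return 0
        fi
    done
    # Assumed fallback: every replica is marked slow, so use the first one.
    chosen=${replicas[0]#:}
}

# List every replica that a write must reach, slow ones included.
write_targets() {
    local r
    for r in "${replicas[@]}"; do
        printf '%s\n' "${r#:}"
    done
}

for i in 1 2 3; do
    pick_read_replica
    echo "read $i served by $chosen"
done
```

In this model reads alternate between the two local replicas and never touch /p1export, while write_targets still lists all three paths.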
You can also set up your Chiron FS to be mounted through /etc/fstab. In the setup below I create a filesystem at /data which will replicate its data to both /noaccess/local-mirror and over NFS to the /p1export filesystem on the machine p1. The colon before the /p1export in the fstab file tells Chiron FS that this is a slower replica, so it will not try to read from the NFS filesystem. Because Chiron FS is a FUSE filesystem, you have to specify the allow_other option in order to let users other than the user who initially mounted the Chiron FS have access to it.
# cat /etc/fstab
...
p1:/p1export /p1export nfs defaults 0 0
chironfs#:/p1export=/noaccess/local-mirror /data fuse allow_other,log=/var/log/chironfs.log 0 0
...
# mount /data
...
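To make the replica list in that fstab entry easier to read, here is a small illustrative parser. It is a sketch only: describe_replicas is a hypothetical name, and the only syntax it relies on is what the article describes, namely replicas separated by an equals sign and a leading colon marking a slow replica.

```shell
#!/bin/bash
# Illustrative parser for the replica list used in the fstab entry above.
# The function name is hypothetical; only the "=" separator and the leading
# ":" marker are taken from the article.
describe_replicas() {
    local spec=$1 part
    local -a parts
    IFS='=' read -r -a parts <<< "$spec"
    for part in "${parts[@]}"; do
        if [[ $part == :* ]]; then
            echo "${part#:} (slow: written to, avoided for reads)"
        else
            echo "$part (normal: read and written)"
        fi
    done
}

describe_replicas ":/p1export=/noaccess/local-mirror"
```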
I used the above setup with a single NFS replica to perform some benchmarking. For the tests, both the machine running Chiron FS and the machine “p1” are virtual machines hosted on an Intel Q6600 CPU. Because the host has hardware virtualization, the main effect of virtualization on the performance figures will be on the network link between the two virtual machines. In theory, virtualization could even help performance, because both virtual machines run on the same physical hardware. Even so, the figures for accessing a filesystem locally in the virtual machine versus over NFS show that using the network has a significant impact in this benchmark. However, as all of the tests are performed on the same virtual machines, the relative performance should give an impression of what performance changes you can expect.
In the tests I use /noaccess/raw to measure the speed with which the machine can access the same filesystem on which the local replicas are stored with Chiron FS. I also use a /data-localonly filesystem, which consists of two local replicas which are stored in /noaccess. I created the /data-localonly Chiron FS to remove the virtual network link as a factor from the benchmark. As a reminder, the network Chiron FS is using an NFS share mounted at /p1export. Results are shown below.
|Filesystem||Backing storage||Extraction||Removal|
|/noaccess/raw||local kernel filesystem x 1||20||0.6|
|/data-localonly||local kernel filesystem x 2||42||5.4|
|/p1export||nfs filesystem x 1||90||20|
|/data||local kernel filesystem x 1 and nfs filesystem x 1||150||25|
As you can see, the /data-localonly Chiron FS takes about twice as long as a single local filesystem accessed directly (42 versus 20 for the first figure), which is what you would expect when every write has to be performed twice. The NFS filesystem is very slow to access relative to the local filesystem. This NFS penalty carries through to the /data Chiron FS, where operations took longer than the local and NFS figures added together (150 versus 20 + 90, and 25 versus 0.6 + 20). These results indicate that a single slower replica in a Chiron FS can have significant negative performance implications for the Chiron FS as a whole.
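If you want to run a comparable extract-and-remove measurement on your own replicas, something like the following could serve as a starting point. It is a sketch, not the script used for the figures above: time_extract_remove and now_ms are hypothetical names, the tarball path in the example is a placeholder, and the millisecond timestamps assume GNU date.

```shell
#!/bin/bash
# Sketch of an extract-and-remove timing run; not the author's actual script.
# Relies on GNU date for nanosecond timestamps (%N).
now_ms() { echo $(( $(date +%s%N) / 1000000 )); }

# Usage: time_extract_remove TARBALL TARGET_DIR
# Prints the time in milliseconds taken to extract TARBALL into a scratch
# directory under TARGET_DIR, then the time taken to remove that directory.
time_extract_remove() {
    local tarball=$1 target=$2 work start mid end
    work=$(mktemp -d "$target/bench.XXXXXX") || return 1
    start=$(now_ms)
    tar -xf "$tarball" -C "$work"
    mid=$(now_ms)
    rm -rf -- "$work"
    end=$(now_ms)
    echo "extract_ms=$(( mid - start )) remove_ms=$(( end - mid ))"
}

# Example (the tarball path is a placeholder):
# for dir in /noaccess/raw /data-localonly /p1export /data; do
#     echo "$dir: $(time_extract_remove /path/to/linux.tar.bz2 "$dir")"
# done
```

Because the sketch does not flush caches between runs, repeated measurements on the same directory may look faster than the first one.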
To test the colon option, which tells Chiron FS that one replica (the NFS filesystem) is slower than the other (the local disk), I also created a tarball of the extracted Linux kernel, reading the files from both the /data and /noaccess/local-mirror directories. The colon option can only affect read performance, because writes, by their nature, have to be performed on every replica to maintain consistency. For the test, /data is a Chiron FS with both /noaccess/local-mirror and the /p1export NFS filesystem. To create a tarball from /data took about 30 seconds; creating one from /noaccess/local-mirror took only about 6 seconds. Creating a tarball directly on the NFS filesystem (/p1export) took about 30 seconds. The fact that it took the same amount of time on both /data and /p1export leads me to believe that the colon is currently being ignored or that I could not get it to be effective, and thus reads were being distributed to both replicas.
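A read test along these lines can be approximated as follows. This is a sketch under my own assumptions, not the exact commands used above: time_tree_read is a hypothetical helper, the timestamps assume GNU date, and the archive is piped through wc instead of being written to /dev/null because GNU tar skips reading file contents when its archive is /dev/null.

```shell
#!/bin/bash
# Sketch of the read test: time how long it takes to read a whole tree by
# archiving it. The archive is piped through wc because GNU tar skips reading
# file contents when its output file is /dev/null.
time_tree_read() {
    local dir=$1 start end bytes
    start=$(date +%s%N)
    bytes=$(tar -cf - -C "$dir" . | wc -c)
    end=$(date +%s%N)
    echo "bytes=$(( bytes )) read_ms=$(( (end - start) / 1000000 ))"
}

# Example: compare the Chiron FS mount with one of its replicas.
# time_tree_read /data
# time_tree_read /noaccess/local-mirror
```

If the slow-replica marker were honored, the figure for /data would be expected to sit close to the one for the local mirror rather than the one for the NFS mount.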
Removing all the files from the extracted kernel sources took substantially longer when using Chiron FS. The extra time is not prohibitive, but it is definitely noticeable. Extracting the tarball, on the other hand, took around twice as long when using Chiron FS with two local filesystems; since each write to the Chiron FS requires two writes to disk, that is the expected cost, and Chiron FS itself imposes no significant additional overhead for extraction and writing. For the same reason, the speed of the tarball extraction cannot be significantly improved. When using Chiron FS with a filesystem mounted over NFS, you have to live with the slower write performance of the NFS server during operation.
It appears that the current Chiron FS ignores the colon option that should allow you to mark a replica as write-only. Once this functionality is restored you should be able to tell Chiron FS not to read from the NFS server, so you can avoid network congestion and round trips on reads and limit the performance degradation to writes. If you are already happy with your NFS performance, using Chiron FS should not slow things down a great deal and can provide greater filesystem availability and a degree of protection against hardware failure at a very low cost.