March 14, 2006

Software RAID on OpenBSD using RAIDframe

Author: Manolis Tzanidakis

Software RAID provides an easy way to add redundancy or speed up a system without spending lots of money on a RAID adapter. OpenBSD includes support for software RAID using RAIDframe, which was ported from NetBSD and supports RAID levels 0, 1, 4, and 5.

We'll walk through creating a mirrored RAID-1 array with two IDE hard drives, to ensure that your system will continue to work without losing any data in case one drive fails. Both drives should be identical (same vendor and model) and each should be connected as the master drive on its own IDE channel, since a failing drive can take down a whole IDE channel. If you use different drives, the I/O throughput of the array depends on the slower drive and its size on the smaller one.

RAIDframe is not included in the OpenBSD GENERIC kernel by default because it adds about 500K to the kernel. We therefore first install OpenBSD on the primary master disk (wd0) and compile a kernel with RAIDframe support there. Next we create a degraded RAID-1 mirror on the secondary master disk (wd1), transfer the installation to that drive, and finally add wd0 to the array.

Start the installation using your favorite method (see the OpenBSD installation guide if you are new to OpenBSD) and select wd0 for initialization. You only need two partitions for this temporary installation: wd0a for / and wd0b for the swap partition.

For this setup, a minimal installation of the bsd kernel and the base, comp, and etc tarballs will be fine. Feel free to install anything else you might want. Configure the network, passwords, and services as usual and reboot into the new installation.

Source code for the kernel is included on the install CD-ROM as src.tar.gz and on the OpenBSD mirrors as sys.tar.gz. Uncompress the source code from the CD-ROM by issuing the following commands as root:

mount /dev/cd0a /mnt
tar -zxvpf /mnt/src.tar.gz -C /usr/src ./sys
umount /mnt

Replace cd0 with the device name of your CD-ROM drive. (It would be better not to connect a CD-ROM drive as slave on the same IDE channel as one of the array drives, but since CDs are rarely used on a server it should not cause any trouble.)

Now is a good time to apply any patches issued since the release of your OpenBSD version, in order to avoid another time-consuming compilation later. Patches are announced on the errata page, and each patch includes instructions on how to apply it.

Now it's time to compile and install a kernel with RAIDframe support and RAID auto-configuration. Move to the conf directory for the kernel using cd /sys/arch/`uname -m`/conf and create a new kernel configuration, GENERIC.RAID, by issuing the commands:

cat > GENERIC.RAID << EOF
include "arch/`uname -m`/conf/GENERIC" # include GENERIC configuration
option RAID_AUTOCONFIG # automatically configure RAIDframe arrays on boot
pseudo-device raid 4 # RAIDframe disk driver
EOF

Now, run the following commands to configure, compile, and install the kernel:

config GENERIC.RAID
cd ../compile/GENERIC.RAID
make clean depend && make
cp /bsd /bsd.noraid
install -o root -g wheel -m 644 bsd /

After that's done, reboot the system into the RAIDframe-enabled kernel. In case something goes wrong and the new kernel doesn't boot, the old one is saved as 'bsd.noraid' and can be booted by issuing the following command at the boot prompt:

boot> boot wd0a:/bsd.noraid

After successfully booting into the new kernel, initialize the second disk, wd1, using fdisk. If OpenBSD is the only installed operating system, you can use the whole disk as an OpenBSD partition with the command:

fdisk -i wd1

OpenBSD 3.8 cannot boot a kernel on a RAIDframe array, though future versions of OpenBSD should be able to, so for now we need to split the new partition into two slices: wd1a, with type 4.2BSD and size around 64MB, from which we'll boot, and wd1b, with type RAID, which will hold the RAID array. We will create these partitions with disklabel's -E option:

disklabel -E wd1

The wd1b partition must be of type FS_RAID (i.e. RAID) for the auto-configuration feature to work. Create a new filesystem on wd1a and make it bootable:

newfs wd1a
mount /dev/wd1a /mnt
cp /bsd /usr/mdec/boot /mnt
/usr/mdec/installboot -v /mnt/boot /usr/mdec/biosboot wd1
umount /mnt

Read the boot and installboot man pages for more information. Now create a degraded RAID array using wd1b and a "fake" device, wd2b, which must exist as a device node in /dev but not physically on the system:

cat >> /root/raid0.conf << EOF
START array
# numRow numCol numSpare
1 2 0

START disks
/dev/wd2b # the fake device
/dev/wd1b

START layout
128 1 1 1

START queue
fifo 100
EOF

raidctl -C /root/raid0.conf raid0
raidctl -I 0603160 raid0

If both devices were present we'd normally use the -c option to create the RAID array, but since we are creating a degraded array we must force its creation and thus use -C instead. When initializing an array with -I you must specify a unique serial number. In our case that's 0603160, which is the date of initialization concatenated with 0 to show that this is the first array on the system.
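
The serial number convention described above (date plus an array index) can be sketched in a few lines of shell. This is only an illustration: the function name make_serial is made up here, and raidctl itself only requires that the serial number be unique.

```shell
#!/bin/sh
# make_serial DATE INDEX: join a YYMMDD date and an array index,
# following the article's convention. Hypothetical helper, not part of raidctl.
make_serial() {
    printf '%s%s\n' "$1" "$2"
}

make_serial 060316 0              # prints 0603160, the serial used above
make_serial "$(date +%y%m%d)" 1   # today's date, second array on the system
```

The result could then be passed to raidctl -I in place of a hand-typed number.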

To check that the array was created correctly, run:

raidctl -s raid0

As expected the parity status is shown as "DIRTY" at the moment. We must make the array auto-configurable, so that the system will initialize it before mounting the root file system, and also mark it as being eligible to contain the root partition, allowing the system to use it as a root device instead of the boot disk:

raidctl -A root raid0

Create any partitions you need on the array with disklabel -E raid0. On our system we created the following partitions: a: /, b: swap, d: /usr, e: /tmp, f: /var, g: /home.

Now create filesystems on the partitions (replace 'a d e f g' in the following 'for' loop with your partitions):

for i in a d e f g; do newfs raid0${i}; done

Mount the root partition of the array, raid0a, on /mnt: mount /dev/raid0a /mnt. Create directories on it to match your mount points and mount the rest of the newly created filesystems on them. In our case:

cd /mnt
mkdir usr tmp home var
mount /dev/raid0d /mnt/usr
mount /dev/raid0e /mnt/tmp
mount /dev/raid0f /mnt/var
mount /dev/raid0g /mnt/home

Now transfer the installation to the array and create a new fstab to match your partitions:

cd /mnt
tar -Xcpf - / | tar -xvpf -
rm /mnt/etc/fstab
cat >> /mnt/etc/fstab << EOF
/dev/raid0a / ffs rw 1 1
/dev/raid0d /usr ffs rw 1 2
/dev/raid0e /tmp ffs rw 1 2
/dev/raid0f /var ffs rw 1 2
/dev/raid0g /home ffs rw 1 2
EOF

Unmount the new partitions: umount /mnt/*; umount /mnt (ignore any errors), reboot the system, and boot into the RAID array by issuing boot> boot wd1a:/bsd at the boot prompt.

Run mount && uname -v && raidctl -s raid0 to verify that everything works correctly. Mirror wd1's structure to wd0, hot-add wd0b to the array as a hot spare, and reconstruct it:

disklabel wd1 > /root/disklabel.wd1
disklabel -R wd0 /root/disklabel.wd1
raidctl -a /dev/wd0b raid0
raidctl -vF component0 raid0

The reconstruction takes some time to finish, so take a small break. When it's done, rebuild the array's parity:

raidctl -vP raid0

Time for another break. Now make the first disk bootable again:

newfs wd0a
mount /dev/wd0a /mnt
cp /bsd /usr/mdec/boot /mnt
/usr/mdec/installboot -v /mnt/boot /usr/mdec/biosboot wd0
umount /mnt

After making the first disk bootable, reboot the system. Now, running raidctl -s raid0 should show the status for both devices as "optimal" and parity status as "clean." You can now delete unneeded files with cd /root; rm raid0.conf disklabel.wd1. Note that each time you re-compile your kernel you should install it on both wd0a and wd1a.
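
Because the kernel has to be copied to both boot partitions after every rebuild, a small helper can make that step harder to forget. The following is only a sketch: the function name kernel_copy_cmds is invented for illustration, and it prints the commands instead of running them so they can be checked first. It assumes the boot partitions are wd0a and wd1a and that /mnt is free, as in this walkthrough.

```shell
#!/bin/sh
# kernel_copy_cmds KERNEL PARTITION...
# Print the mount/copy/unmount commands for each boot partition.
# Hypothetical helper; nothing is executed here.
kernel_copy_cmds() {
    kernel=$1
    shift
    for part in "$@"; do
        echo "mount /dev/$part /mnt"
        echo "cp $kernel /mnt/bsd"
        echo "umount /mnt"
    done
}

kernel_copy_cmds /bsd wd0a wd1a
```

Once the printed commands look right, the output could be piped to sh as root.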

Before putting the system into production you should simulate a disk failure by disconnecting a drive (after powering the system off, of course) to make sure that it boots correctly, then reconnect it (power off the system first), hot-add it, reconstruct the array, and rebuild the parity as we just did. Repeat the operation with the other drive just to be sure.

To monitor the status of the array automatically you should create a shell script similar to this one:

#!/bin/sh
ARRAY=raid0; MAILTO=root # example values, adjust for your system
if ! raidctl -s $ARRAY | grep -q 'Parity status: clean'; then
raidctl -s $ARRAY 2>&1 | mail -s "[`hostname -s`] Array failed: $ARRAY" $MAILTO
fi

Run the script with cron every 15 minutes and it will notify you via email if a drive fails.
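
A slightly stricter check might also look at the component lines, not just the parity status. The sketch below separates the parsing from the raidctl call so it can be tried on sample text first. The function name array_ok is hypothetical, and it assumes that raidctl -s reports each component as a "device: status" line; the sample text is hand-written for this example, not captured from a real system.

```shell
#!/bin/sh
# array_ok: read `raidctl -s` output on stdin; succeed only if the parity
# is clean and no component is reported as failed.
# Assumption: components appear as "<device>: <status>" lines.
array_ok() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q 'Parity status: clean' || return 1
    if printf '%s\n' "$status" | grep -q ': failed'; then
        return 1
    fi
    return 0
}

# Dry run on hand-written sample text (illustrative only).
sample='Components:
/dev/wd0b: optimal
/dev/wd1b: failed
Parity status: clean'
if printf '%s\n' "$sample" | array_ok; then
    echo healthy
else
    echo degraded
fi
```

On a live system the function would be fed with raidctl -s raid0 2>&1 | array_ok, and the mail command from the script above would run when it fails.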

Software RAID, in general, is no substitute for 'real' hardware RAID controllers with fast SCSI disks, but in most cases it provides a great solution for anyone on a budget.

