July 25, 2006

Live migration of Xen domains

Author: Paul Virijevich

Virtualization is all the rage these days. Advances in x86 performance, as well as the increasing energy requirements of servers, make efficiently provisioning machines a necessity. Xen, an open source virtual machine (VM) monitor, works with just about any Linux distribution. One useful feature for shops that care about high availability is Xen's ability to migrate virtual machines while they are running. By using VM migration, you can pool computing resources just as you can pool storage. Here's how.

The easiest way to install Xen is to use your distribution's package manager. The latest editions of SUSE and Fedora Core make Xen installation a breeze. You can also get source and binary downloads from XenSource, the commercial company behind Xen. BitTorrent downloads are open to anyone; HTTP downloads require an email address. For this article I use SUSE 10.1, which supports Xen out of the box.

The other requirement is access to shared storage. This could be a disk on a storage area network (SAN), but I found the easiest and least expensive way to test live migration is to use ATA over Ethernet (AoE) for shared storage. AoE is included in the kernels shipping with most free distributions, and is downloadable as well. The machine holding the shared storage works as an AoE target (think server), while the client machines (running Xen domains) use the AoE kernel module as an initiator. For more information on how to get AoE up and running, see the article "Reduce network storage cost, complexity with ATA over Ethernet."

This setup requires a minimum of three servers -- one to host the shared storage, and two to allow the migration of the VM. To help keep things straight, name the machines xen_storage, xen_resource_1, and xen_resource_2. Perform a basic minimal install on all three machines. When installing xen_storage, create a separate partition to export with AoE, but do not format this partition. Once this is done, boot both resource machines into Xen. Remember, xen_storage is only acting as shared storage and does not need to know anything about Xen.

It's a good idea to enter the above machine names into the hosts file of all three machines. You can also add the IP of the soon-to-be VM. Here is a suggestion for IP addressing:

  • xen_storage -- 192.168.0.10
  • xen_resource_1 -- 192.168.0.20
  • xen_resource_2 -- 192.168.0.30
  • vm1 -- 192.168.0.40

This will help keep things straight when you're doing the live migration.
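
For reference, the addressing scheme above translates into four hosts file lines. This is only a sketch: hosts_entries is a helper name I made up for illustration, and the column spacing is arbitrary.

```shell
# Print hosts-file entries for the addressing scheme suggested above.
# hosts_entries is a hypothetical helper, not part of any package.
hosts_entries() {
    cat <<'EOF'
192.168.0.10    xen_storage
192.168.0.20    xen_resource_1
192.168.0.30    xen_resource_2
192.168.0.40    vm1
EOF
}

# Preview the lines; as root on each machine you could then run:
#   hosts_entries >> /etc/hosts
hosts_entries
```

Printing first and appending second makes it easy to check the lines before touching /etc/hosts.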

To get the shared storage up and running, install the vblade program on xen_storage. Vblade allows you to export local disks or partitions as AoE devices. It installs easily with a simple make linux. To export the storage partition use the command:

./vbladed 0 0 eth0 /dev/sda3

where sda3 is the extra partition created during installation. At this point, you are done with xen_storage.
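
The two zeros in that command are the AoE shelf and slot numbers, and they decide the device name the resource machines will see later on. A tiny sketch of the mapping, using a made-up helper name (aoe_device_path):

```shell
# Map an AoE shelf/slot pair to the device node the Linux aoe driver creates.
# aoe_device_path is a hypothetical helper used only for illustration.
aoe_device_path() {
    printf '/dev/etherd/e%s.%s\n' "$1" "$2"
}

# Shelf 0, slot 0, as exported by the vbladed command above
aoe_device_path 0 0
```

If you export more than one partition, give each its own shelf/slot pair so the device names do not collide.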

Now it's time to create an install image to boot the VM from. My solution makes use of the yast dirinstall command in SUSE. This runs the SUSE installer, but places the installation into a directory of your choosing. It also allows you to create an image of the installation. The basic idea is that you mount your AoE device, then copy the contents of this directory or image file into it. Xen will then be able to boot a VM from this device. Let's take it step by step.

On xen_resource_1, create a temporary directory to hold the install image -- for example, /tmp/xen. Now, fire up YaST with the command yast dirinstall. Leave the default installation directory alone. What we really want to get out of this is an install image. Select Options and choose Create Image with /tmp/xen as the directory. Be creative and name the image xenimage. Next, change the Software option to minimal install. Finally, go into Detailed Selection and select the package xen-kernel. This will install the needed Xen kernel files into the boot directory of the image.

Sit back and let YaST do its magic. When it's finished, you will have a 146MB file named xenimage.tgz in the directory /tmp/xen.

Now let's get this image ready to boot a VM. Load the AoE module and confirm that xen_resource_1 can see the shared storage with:

modprobe aoe;aoe-discover;aoe-stat

You should now be able to see the exported AoE device at /dev/etherd/e0.0. Next, create a physical volume, a volume group named vg_xen, and a 5GB logical volume named lv_vm1 with:

pvcreate /dev/etherd/e0.0
vgcreate vg_xen /dev/etherd/e0.0
lvcreate -L 5g -n lv_vm1 vg_xen

Now put a file system on the logical volume and mount it with:

mkfs.reiserfs /dev/vg_xen/lv_vm1; mount /dev/vg_xen/lv_vm1 /mnt

Issue a df -h command to verify that you have a 5GB file system available on /mnt. Remember, this 5GB is coming from xen_storage.

Extract xenimage.tgz with:

tar -zxvf /tmp/xen/xenimage.tgz -C /mnt/

The /mnt directory now looks a lot like a typical / directory. However, there are a few more changes we need to make before it's usable. The following commands do the trick:

cp /etc/passwd /etc/shadow /mnt/etc
echo "/dev/hda1 / reiserfs defaults 1 1" > /mnt/etc/fstab
sed -i -e 's/^[2-6]:/#&/' /mnt/etc/inittab

These commands set up the password file, create an fstab file so the domain can mount a root filesystem, and modify the inittab file so that only the first getty is started. The sed expression comments out the inittab entries whose id field is 2 through 6. On SUSE these are the gettys for tty2 through tty6, not runlevels, which is why the VM still starts in runlevel three. If you don't do this, the VM never gets to a login prompt; it will just sit there re-spawning gettys, most likely because the guest has a single console and no tty2 through tty6. I found these useful tips here.
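
To see what the inittab edit does without touching a real file, you can run an equivalent substitution against a few sample lines. The sample below is illustrative only, written in the style of a SUSE inittab rather than copied from one.

```shell
# Build a small sample inittab (illustrative lines, not a complete file)
sample=$(mktemp)
cat > "$sample" <<'EOF'
id:3:initdefault:
1:2345:respawn:/sbin/mingetty --noclear tty1
2:2345:respawn:/sbin/mingetty tty2
6:2345:respawn:/sbin/mingetty tty6
EOF

# Equivalent substitution, written to stdout instead of edited in place:
# lines whose first field (the entry id) is 2 to 6 get a leading "#"
result=$(sed -e 's/^[2-6]:/#&/' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

The initdefault line and the tty1 entry come through untouched; only the entries with ids 2 and 6 are commented out.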

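Now it is time to make the shared storage available to xen_resource_2. First unmount the volume on xen_resource_1 with umount /mnt, so that the VM will be the only thing writing to the file system. Then load the AoE driver on xen_resource_2 and activate the logical volume: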


modprobe aoe
vgscan
vgchange -a y

When you've run these, you should be able to see the entry /dev/vg_xen/lv_vm1 on xen_resource_2. If you do, set up the configuration file for the VM and fire it up.

Both resource machines will use an identical configuration file. On xen_resource_1, create the file /etc/xen/vm/vm1. The contents of the file should look like this:


# -*- mode: python; -*-
# configuration name:
name = "vm1"
# usable ram:
memory = 256
# kernel and initrd:
kernel = "/boot/vmlinuz-xen"
ramdisk = "/boot/initrd-xen"
# boot device:
root = "/dev/hda1"
# boot to run level:
extra = "3"
# network interface:
vif = [ 'mac=aa:cc:00:00:00:01, bridge=xenbr0' ]
hostname = name
# storage devices:
disk = [ 'phy:vg_xen/lv_vm1,hda1,w' ]

Copy this file to xen_resource_2.

The last thing we need to do is to change the Xen daemon's configuration file to allow live migration. Edit /etc/xen/xend-config.sxp and remove the comment character ("#") from these two lines:


#(xend-relocation-port 8002)
#(xend-relocation-address '')

The first line tells Xen to listen for incoming migration requests on TCP port 8002. The second line allows connections from any host. This behavior can be locked down, but leave it be for testing purposes.
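
If you would rather script the change than edit the file by hand, something along these lines might do. It is a sketch that assumes the two lines appear in your file exactly as shown above; enable_relocation is a helper name I made up, and it writes the result to stdout so you can inspect it before replacing anything.

```shell
# Uncomment the two relocation settings in an xend-config.sxp style file.
# enable_relocation is a hypothetical helper; $1 is the path to the file.
# The original file is left alone and the edited text goes to stdout.
enable_relocation() {
    sed -e '/^#(xend-relocation-port /s/^#//' \
        -e '/^#(xend-relocation-address /s/^#//' "$1"
}

# Possible usage, as root on each resource machine:
#   enable_relocation /etc/xen/xend-config.sxp > /tmp/xend-config.sxp.new
#   diff /etc/xen/xend-config.sxp /tmp/xend-config.sxp.new
```

Other commented lines are left as they are, so the diff should show exactly two changes.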

Now you can test out a live migration. Restart the daemon on both resource machines with rcxend restart.

Start up the VM on xen_resource_1 with xm create vm1 -c. This boots up the VM and takes you to a console login. Log in using credentials from xen_resource_1. Take a look around for a minute or two -- everything should appear as if it is a normal machine. You need to set the IP address for the VM. You can use YaST or good old ifconfig. Give it an IP address of 192.168.0.40, and return to the master domain by pressing Ctrl-].

To view a list of running domains, issue the command xm list. Both the master domain (Domain-0) and vm1 should show up in the listing. Now, ping the IP address of vm1 and make sure it is on the network. In fact, ping vm1 from xen_resource_1 and let the ping messages scroll on and on. Remember, as far as the network is concerned, the IP address is physically attached to xen_resource_1. It's about time for some fancy live virtual machine migration.

Open up a new terminal on xen_resource_1 and issue the following command to migrate vm1 to xen_resource_2:

xm migrate --live vm1 xen_resource_2

Notice how the ping messages keep scrolling by uninterrupted. Behold the power of live migration. After a few seconds, log into xen_resource_2 and check to see whether vm1 has migrated by issuing the command xm list. You should see vm1 listed. If the ping were running from another machine on the network, you could pull the power on xen_resource_1 and the ping would keep going. Xen has migrated vm1 in its entirety to xen_resource_2, and the ping did not even hiccup.
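
If you want to check for the domain from a script instead of eyeballing the listing, a small filter over the xm list output could look like this. It is a sketch that assumes the domain name is the first column and that the first line is a header; domain_listed is a made-up helper name.

```shell
# Succeed if a domain name appears in `xm list` style output read from stdin.
# domain_listed is a hypothetical helper; it skips the header line and
# compares the first column of every other line with the requested name.
domain_listed() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Possible usage on xen_resource_2 after the migration:
#   xm list | domain_listed vm1 && echo "vm1 has arrived"
```

Matching the whole first column, rather than grepping for the string, avoids a false hit on a domain called something like vm10.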

I hope that you can see the utility of this setup. With Xen and live migration, hardware can be replaced or upgraded without interruptions in service, and applications can be freed from the hardware they run on.