Hard drives are slow and fail often, and though abolished for working memory ages ago, fixed-size partitions are still the predominant mode of storage space allocation. As if worrying about speed and data loss weren't enough, you also have to worry about whether your partition size calculations were just right when you were installing a server or whether you'll wind up in the unenviable position of having a partition run out of space, even though another partition is maybe mostly unused. And if you might have to move a partition across physical volume boundaries on a running system, well, woe is you.
This article is excerpted from the newly published book The Offical Ubuntu Book, Third Edition published by Prentice Hall Professional, June 2008, Copyright 2008 Canonical, Ltd.
RAID helps to some degree. It'll do wonders for your worries about performance and fault tolerance, but it operates at too low a level to help with the partition size or fluidity concerns. What we'd really want is a way to push the partition concept up one level of abstraction, so it doesn't operate directly on the underlying physical media. Then we could have partitions that are trivially resizable or that can span multiple drives, we could easily take some space from one partition and tack it on another, and we could juggle partitions around on physical drives on a live server. Sounds cool, right?
Very cool, and very doable via logical volume management (LVM), a system that shifts the fundamental unit of storage from physical drives to virtual or logical ones. LVM has traditionally been a feature of expensive, enterprise Unix operating systems or was available for purchase from third-party vendors. Through the magic of free software, a guy by the name of Heinz Mauelshagen wrote an implementation of a logical volume manager for Linux in 1998, which we'll refer to as LVM. LVM has undergone tremendous improvements since then and is widely used in production today, and just as you expect, the Ubuntu installer makes it easy for you to configure it on your server during installation.
LVM theory and jargon
Wrapping your head around LVM is a bit more difficult than with RAID because LVM rethinks the whole way of dealing with storage, which expectedly introduces a bit of jargon that you need to learn. Under LVM, physical volumes, or PVs, are seen just as providers of disk space without any inherent organization (such as partitions mapping to a mount point in the OS). We group PVs into volume groups, or VGs, which are virtual storage pools that look like good old cookie-cutter hard drives. We carve those up into logical volumes, or LVs, that act like the normal partitions we're used to dealing with. We create filesystems on these LVs and mount them into our directory tree. And behind the scenes, LVM splits up physical volumes into small slabs of bytes (4MB by default), each of which is called a physical extent, or a PE.
You take a physical hard drive and set up one or more partitions on it that will be used for LVM. These partitions are now physical volumes (PVs), which are split into physical extents (PEs) and then grouped in volume groups (VGs), on top of which you finally create logical volumes (LVs). It's the LVs, these virtual partitions, and not the ones on the physical hard drive, that carry a filesystem and are mapped and mounted into the OS. If you're confused about what possible benefit we get from adding all this complexity only to wind up with the same fixed-size partitions in the end, hang in there. It'll make sense in a second.
The reason LVM splits physical volumes into small, equally sized physical extents is that the definition of a volume group (the space that'll be carved into logical volumes) then becomes "a collection of physical extents" rather than "a physical area on a physical drive," as with old-school partitions. Notice that "a collection of extents" says nothing about where the extents are coming from and certainly doesn't impose a fixed limit on the size of a volume group. We can take PEs from a bunch of different drives and toss them into one volume group, which addresses our desire to abstract partitions away from physical drives. We can take a VG and make it bigger simply by adding a few extents to it, maybe by taking them from another VG, or maybe by tossing in a new physical volume and using extents from there. And we can take a VG and move it to different physical storage simply by telling it to relocate to a different collection of extents. Best of all, we can do all this on the fly, without any server downtime.
Setting up LVM
Surprisingly enough, setting up LVM during installation is no harder than setting up RAID. Create partitions on each physical drive you want to use for LVM just as you did with RAID, but tell the installer to use them as physical space for LVM. Note that in this context, PVs are not actual physical hard drives; they are the partitions you're creating.
You don't have to devote your entire drive to partitions for LVM. If you like, you're free to create actual filesystem-containing partitions alongside the storage partitions used for LVM, but make sure you're satisfied with your partitioning choice before you proceed. Once you enter the LVM configurator in the installer, the partition layout on all drives that contain LVM partitions will be frozen.
Consider a server with four drives, which are 10GB, 20GB, 80GB, and 120GB in size. Say we want to create an LVM partition, or PV, using all available space on each drive, and then combine the first two PVs into a 30GB volume group and the latter two into a 200GB one. Each VG will act as a large virtual hard drive on top of which we can create logical volumes just as we would normal partitions.
As with RAID, arrowing over to the name of each drive and pressing Enter will let us erase the partition table. Then pressing Enter on the FREE SPACE entry lets us create a physical volume -- a partition that we set to be used as a physical space for LVM. Once all three LVM partitions are in place, we select Configure the Logical Volume Manager on the partitioning menu.
After a warning about the partition layout, we get to a rather spartan LVM dialog that lets us modify VGs and LVs. According to our plan, we choose the former option and create the two VGs we want, choosing the appropriate PVs. We then select Modify Logical Volumes and create the LVs corresponding to the normal partitions we want to put on the system -- say, one for each of /, /var, /home, and /tmp.
You can already see some of the partition fluidity that LVM brings you. If you decide you want a 25GB logical volume for /var, you can carve it out of the first VG you created, and /var will magically span the two smaller hard drives. If you later decide you've given /var too much space, you can shrink the filesystem and then simply move over some of the storage space from the first VG to the second. The possibilities are endless.
Remember, however, that LVM doesn't provide redundancy. The point of LVM is storage fluidity, not fault tolerance. In our example, the logical volume containing the /var filesystem is sitting on a volume group that spans two hard drives. This means that either drive failing will corrupt the entire filesystem, and LVM intentionally doesn't contain functionality to prevent this problem.
When you need fault tolerance, build your volume groups from physical volumes that are sitting on RAID. In our example, we could have made a partition spanning the entire size of the 10GB hard drive and allocated it to physical space for a RAID volume. Then, we could have made two 10GB partitions on the 20GB hard drive and made the first one also a physical space for RAID. Entering the RAID configurator, we would create a RAID 1 array from the 10GB RAID partitions on both drives, but instead of placing a regular filesystem on the RAID array as before, we'd actually designate the RAID array to be used as a physical space for LVM. When we get to LVM configuration, the RAID array would show up as any other physical volume, but we'd know that the physical volume is redundant. If a physical drive fails beneath it, LVM won't ever know, and no data loss will occur. Of course, standard RAID array caveats apply, so if enough drives fail and shut down the array, LVM will still come down kicking and screaming.
If you've set up RAID and LVM arrays during installation, you'll want to learn how to manage the arrays after the server is installed. We recommend the respective how-to documents from The Linux Documentation Project at http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html and http://www.tldp.org/HOWTO/LVM-HOWTO. The how-tos sometimes get technical, but most of the details should sound familiar if you've understood the introduction to the subject matter here.