April 6, 2010

Today's Guide to Linux Virtualization

Virtualization on Linux is nothing new. It has been around for more than 10 years and has advanced considerably, but that doesn't mean it's simple. On the contrary, shops have a dizzying array of virtualization options to manage workloads and storage, and to reduce complexity, costs and energy usage. The question is no longer whether to deploy virtualization; the real question is which virtualization solutions to look at and which workloads to virtualize. We'll help cut through the complexity and set the options straight.

Virtualization isn't always the answer, of course. Some workloads work best on physical servers without any abstraction. But there are quite a few advantages to using virtualization in the server room. In this feature, the first of a four-part series on virtualization, we will begin with an overview of the virtualization solutions on Linux. You can look forward to more details on standard virtualization strategies, virtual appliances, cloud computing, and more, in the weeks ahead.

What's in Virtualization for You

Virtualization isn't a hard sell these days. First and foremost, you have increased server utilization and lower energy costs by using virtualization to maximize the number of workloads running per server. In the bad old days of computing, Linux running on commodity x86 hardware tended to use only a fraction of the available computing power of a server. By using virtualization, you can ensure that the hardware is running a reasonable workload and save power by consolidating several workloads on one system.
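
To make the consolidation argument concrete, here is a minimal sketch of the arithmetic. The utilization figures and the 70 percent ceiling per host are made-up assumptions for illustration, not measurements, and the packing rule (first-fit decreasing) is just a simple heuristic, not how any particular product places workloads.

```python
def consolidate(loads, ceiling=0.70):
    """Pack per-server CPU loads (as fractions of one host) onto few hosts.

    Uses first-fit decreasing, a simple heuristic that is not guaranteed
    to find the minimum number of hosts.
    """
    hosts = []  # each entry is the list of loads placed on that host
    for load in sorted(loads, reverse=True):
        for host in hosts:
            # Small tolerance so floating-point rounding doesn't reject an exact fit.
            if sum(host) + load <= ceiling + 1e-9:
                host.append(load)
                break
        else:
            # No existing host has room, so start a new one.
            hosts.append([load])
    return hosts


# Hypothetical average CPU utilization of eight lightly loaded physical servers.
loads = [0.10, 0.15, 0.05, 0.20, 0.12, 0.08, 0.25, 0.10]
hosts = consolidate(loads)
print(len(loads), "physical servers ->", len(hosts), "virtualization hosts")
```

A real plan would also have to account for memory, disk I/O, network traffic and peak (not just average) load, which this sketch ignores.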

Assuming you've chosen the right tools and mapped everything out well, virtualization also means reduced complexity and easier system management. The more advanced solutions come with well-developed management tools so working with tens or even hundreds of virtual machines is no more complicated than managing a single server from one GUI application. Using Parallels Virtuozzo Containers, for example, it takes only a few clicks to configure and deploy a standard OS template that's ready to roll. Using SUSE Studio, you can whip up a virtual appliance to deploy to VMware or Xen in just a few minutes.

Virtualization also means increased flexibility. For example, when a workload starts to outgrow its resources, virtualization means being able either to grant it more system resources on the same hardware or to move the VM to a beefier server. If you're working with cloud-based solutions, the sky is the limit on workload flexibility -- but that comes at the cost of reduced flexibility when choosing your host OS and tools, and cloud solutions are still maturing and shaking out some of the kinks.

Note that, for the purposes of this story, we're only looking at solutions meant for server virtualization. This means that some of the outstanding workstation options, like VirtualBox, VMware Workstation, and Parallels Desktop, aren't quite applicable. You can use them to run server OSes for testing, of course, but you wouldn't want to try to run mission-critical services on desktop virtualization solutions.

Which is fine, because the depth and breadth of server virtualization solutions for Linux provide plenty of choices for organizations of every size. Whether you're looking to consolidate servers in an enterprise data center, improve a small nonprofit's server infrastructure, or scope out a solution for a start-up Web 2.0 company, Linux should be at the center of the virtualization plan. The question is where it fits, and which projects or vendors to go with.

The Virtualization Landscape

When you say "virtualization," you're saying a mouthful, mainly because the term covers a broad range of technologies, and not all of the terms are well-defined. Initially, if you talked about virtualization on Linux you were talking about full virtualization: a host operating system running a few guest operating systems conned into thinking they were running on their own hardware. The guests might be Linux or they might be other OSes, depending on what type of virtualization you're talking about. These days, you might be talking about operating system virtualization, storage virtualization or virtual appliances. And of course there's the growing and maturing "cloud computing" category. It all depends on what you need to achieve and how you want to get there.

For operating system virtualization, you'll find plenty of mature options to run on top of Linux and to run Linux on top of. The first thing to decide is whether you want to go with full virtualization or container-based virtualization, or if you want to look at cloud-based computing. Let's start by defining our terms. When I say full virtualization, I'm talking about the gamut of solutions that allow you to run one or more operating systems on top of a hypervisor like Xen, Parallels' Bare Metal, VMware ESXi, and Linux's native Kernel-based Virtual Machine (KVM).

Full virtualization is used most widely today, but container-based virtualization is a great solution for some workloads. Container-based virtualization differs in that it's not trying to run multiple OSes. Instead, it's "containing" the guests in their own userspace while they all run on a single OS kernel. Container-based virtualization has the advantage of being more lightweight. To grossly oversimplify, there's less overhead involved because there's no need to emulate the hardware. The downside of container-based virtualization is that it doesn't allow running different OSes on the same hardware. Want to run six Linux instances on a server using container-based virtualization? No problem. Want to run a few Linux guests alongside Microsoft Windows Server? Then you're out of luck using container-based solutions.

For container-based solutions, you can look to Parallels Virtuozzo Containers, or its open source but less full-featured cousin OpenVZ. If you're using FreeBSD, there are jails, and if you're using Solaris/OpenSolaris, there are Zones.

What's the practical difference? Using a full virtualization solution, you'll be able to deploy most operating systems and mix and match them on physical servers. For instance, a smaller organization might have Windows Server 2008 and SUSE Linux Enterprise Server 10 running side-by-side on the same hardware to consolidate its workloads using full virtualization on top of Xen, Parallels, KVM, or VMware, or on top of Microsoft's Hyper-V.

But in a hosting environment, where you've got the need to run dozens of Linux virtual private servers on the same box, something like OpenVZ or Virtuozzo can be a better choice.
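
The rule of thumb above can be written down as a tiny sketch. The function name and the kernel labels ("linux", "windows") are hypothetical, chosen only for illustration, and the rule deliberately ignores everything except whether the guests can share the host's kernel.

```python
def suggest_approach(host_kernel, guest_kernels):
    """Suggest 'container' or 'full' virtualization for a set of guests.

    Containers share the host's kernel, so they only fit when every guest
    runs the same kernel family as the host. Anything else needs full
    virtualization. This is a coarse first filter, not a sizing tool.
    """
    host = host_kernel.lower()
    if all(guest.lower() == host for guest in guest_kernels):
        return "container"
    return "full"


# Six Linux instances on a Linux host: containers are an option.
print(suggest_approach("linux", ["linux"] * 6))

# Linux guests alongside a Windows guest: full virtualization is required.
print(suggest_approach("linux", ["linux", "linux", "windows"]))
```

Even when containers are possible, management tooling, isolation requirements and vendor support may still tip the decision toward full virtualization.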

The management toolkits and feature sets vary widely as well. KVM has been maturing rapidly but it's still not considered to be on par with Xen, VMware or Parallels solutions on several levels. The KVM status page lists the areas that are working and the areas that still need work. Note that, in the long run, KVM will probably reach parity (or very close to it) with the more established solutions. This is especially true given that KVM is in the mainline kernel, and therefore receives a great deal of attention from the kernel community. But in terms of management tools and features today, it's not quite a match for the other options.

Perils, Pitfalls, and Planning

Virtualization is not the magic solution to all of your computing problems, of course. In fact, if you choose the wrong solution or manage virtualization poorly, you can wind up with more headaches than you started out with. To avoid multiplying your workload and paying for the privilege, it's important to set up a roadmap for evaluating solutions, deploying them and managing them at least five years into the future.

First, make sure your organization is prepared to deal with the accounting that goes along with virtualization. Depending on the organization's size and accounting rules, it may be necessary to split hairs over the cost of hardware. When two departments share physical hardware running virtual machines, it's not always clear who's paying for what. It's also not clear who owns the underlying hardware when push comes to shove and some of the virtual machines need more room to breathe.

Does the Web team get to elbow aside the virtual machine running a developer platform in order to create a new Web head when the two teams went halfsies on the hardware? If departments have their own IT staff, who has access to the system from top to bottom to deploy and manage virtual machines, and how is that arranged? While not inherently technical problems, these need to be mapped out ahead of time as surely as the memory and storage requirements.
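
One way to take some of the guesswork out of "who's paying for what" is to agree on a formula up front. The sketch below splits a host's cost in proportion to the memory allocated to each department's virtual machines. The department names, the dollar amount and the choice of memory as the yardstick are all assumptions for illustration; your accounting rules may call for something quite different.

```python
def split_cost(hardware_cost, allocations):
    """Split a shared host's cost in proportion to each department's allocation.

    `allocations` maps a department name to the resources its VMs are
    allocated (here, gigabytes of RAM). Shares are rounded to cents.
    """
    total = sum(allocations.values())
    if total <= 0:
        raise ValueError("total allocation must be positive")
    return {
        dept: round(hardware_cost * amount / total, 2)
        for dept, amount in allocations.items()
    }


# Hypothetical: a $12,000 server shared by two teams, allocated by RAM in GB.
shares = split_cost(12000, {"web": 24, "development": 8})
for dept, cost in shares.items():
    print(dept, cost)
```

A formula like this also gives you a starting point for the harder conversation about who gets the headroom when one team's virtual machines need to grow.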

You also want to guard against "virtual sprawl," which is to say the impulse to deploy virtual machines without much planning because doing so is far easier than provisioning physical hardware. When it comes to infrastructure planning and deployment, it's important to manage virtual machines as if they were physical machines.

Other considerations include the types of workload you'll be deploying, the underlying hardware, the limits of the virtualization solution, and so on. While many workloads do well under virtualization, not all do, at least not without extensive planning. You'll want to make sure that resource-intensive applications have robust hardware, or perhaps reside on their own systems.

And, of course, training is extremely important. While today's virtualization tools shouldn't pose a great challenge to a competent system administrator, there's an enormous difference between getting by and being skilled at using the tools. If your organization is planning a virtualization deployment with a new toolset, training should be part of the process and budget.

We've only scratched the surface in this first story in the series, but we'll be doing a deeper dive into virtualization throughout April. In the coming weeks, we'll take a look at the best strategies for working with virtualization in the enterprise, a more in-depth conversation about the differences between container-based and full virtualization, and a forecast on cloud computing and what it might mean for your organization.
