Condensing Your Infrastructure with System Containers


When most people hear the word containers, they probably think of Docker containers, which are application containers. But there are other kinds of containers, such as system containers like LXC/LXD. Stéphane Graber, technical lead for LXD at Canonical Ltd., will be delivering two talks at the upcoming Open Source Summit NA in September, “GPU, USB, NICs and Other Physical Devices in Your Containers” and “Condensing Your Infrastructure Using System Containers,” which discuss containers in detail.

In this OS Summit preview, we talked with Graber to understand the difference between system and application containers as well as how to work with physical devices in containers.

What are system containers, and how are they different from virtual machines?

Stéphane Graber: The end result of using system containers or a virtual machine is pretty similar. You get to run multiple operating systems on a single machine.

The VM approach is to virtualize everything. You get virtualized hardware and virtualized firmware (BIOS/UEFI), which then boots a full system, from the bootloader to the kernel and then userspace. This allows you to run just about anything that a physical machine would be able to boot, but it comes with quite a bit of overhead for anything that is virtualized and needs hypervisor involvement.

System containers, on the other hand, do not come with any virtualized hardware or firmware. Instead, they rely on your existing operating system’s kernel and so avoid all of the virtualization overhead. As the kernel is shared between host and guest, this does, however, restrict you to Linux guests and is also incompatible with some workloads that expect kernel modifications.

A shared kernel also means much easier monitoring and management, as the host can see every process that’s running in its containers and how much CPU and RAM each of those individual tasks is using, and it will let you trace or kill any of them.

What are the scenarios where someone would need system containers instead of, say, a VM? Can you provide some real use cases where companies are using system containers?

Graber: System containers are amazing for high-density environments or environments where you have a lot of idle workloads. A host that could run a couple hundred idle virtual machines would typically be able to run several thousand idle system containers.

That’s because idle system containers are treated as just a set of idle processes by the Linux kernel and so don’t get scheduled unless they have something to do. Network interrupts and similar events are all handled by the kernel and don’t cause the processes to be scheduled until an actual request is coming their way.
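As an illustrative aside (not from the interview): since a container's tasks are ordinary processes to the host kernel, a host-side script can attribute each process to its container by reading `/proc/<pid>/cgroup`. The sketch below is a minimal example under stated assumptions. The cgroup naming it matches (`/lxc/<name>`, `lxc.payload/<name>`, `lxc.payload.<name>`) is an assumption that varies across LXC versions, and the function names are hypothetical rather than part of any LXD API.

```python
import os
import re
from collections import defaultdict

# Assumed cgroup naming for LXC/LXD containers; adjust for your LXC version.
_CONTAINER_RE = re.compile(r"(?:/lxc/|lxc\.payload[./])([^/\s]+)")


def container_from_cgroup(cgroup_text):
    """Return the container name found in /proc/<pid>/cgroup text, or None."""
    match = _CONTAINER_RE.search(cgroup_text)
    return match.group(1) if match else None


def processes_by_container(proc_root="/proc"):
    """Map container name -> list of host PIDs by scanning proc_root."""
    groups = defaultdict(list)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "cgroup")) as handle:
                name = container_from_cgroup(handle.read())
        except OSError:
            continue  # process exited or is not readable
        if name is not None:
            groups[name].append(int(entry))
    return dict(groups)


if __name__ == "__main__":
    for name, pids in sorted(processes_by_container().items()):
        print(f"{name}: {len(pids)} processes")
```

Run on the host, this would list each detected container with its process count. Host processes that belong to no container are simply skipped.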

Another use case for system containers is access to specialized hardware. With virtual machines, you can use PCI passthrough to move a specific piece of hardware to a virtual machine. This, however, prevents you from seeing it on the host, and you can’t share it with other virtual machines.

Because system containers run on the same kernel as the host, device passthrough is done at the character/block device level, making concurrent access from multiple containers possible so long as the kernel driver supports it. LXD, for example, makes it trivial to pass GPUs, USB devices, NICs, filesystem paths and character/block devices into your containers.

How are system containers different from app containers like Docker/rkt?

Graber: System containers will run a full, usually unmodified, Linux distribution. That means you can SSH into such a container, install packages, apply updates, use your existing management tools, etc. They behave exactly like a normal Linux server would and make it easy to move your existing workloads from physical or virtual machines over to system containers.

Application containers are usually based around a single process or service, with the idea that you will deploy many single-service containers and connect them together to run your application.

That stateless, microservice approach is great if you are developing a new application from scratch as you can package every bit of it as separate images and then scale your infrastructure up or down at a per-service level.

So, in general, existing workloads are a great fit for system containers, while application containers are a good technology to use when developing something from scratch.

The two also aren’t incompatible. We support running Docker inside of LXD containers. This is done thanks to the ability to nest containers without any significant overhead.

When you say condensing your infrastructure, what exactly do you mean? Can you provide a use case?

Graber: It’s pretty common for companies to have a number of single-purpose servers, maybe running the company PBX system, server room environment monitoring system, network serial console, etc.

All of those use specialized hardware, usually through PCI cards, serial devices or USB devices. The associated software also usually depends on a specific, often outdated, version of the operating system.

System containers are a great fit there as you can move those workloads to containers and then just pass the different devices they need. The end result is one server with all the specialized hardware inside it, running a current, supported Linux distribution with all the specialized software running in their individual containers.
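As an illustrative aside (not from the interview): with LXD, passing a device to a container is typically done with `lxc config device add <container> <device-name> <type> [key=value...]`. The sketch below only builds such command lines so they can be reviewed before anything is run. The container names, device names, IDs, and paths are made up for illustration, and the helper function is hypothetical. The device types and option keys shown (`gpu`, `usb`, `nic`, `unix-char`) should be checked against the documentation for your LXD version.

```python
import shlex


def device_add_command(container, device_name, device_type, **options):
    """Build an `lxc config device add` argument list without running it."""
    argv = ["lxc", "config", "device", "add", container, device_name, device_type]
    # Device options are passed as key=value pairs after the device type.
    argv += [f"{key}={value}" for key, value in sorted(options.items())]
    return argv


if __name__ == "__main__":
    # Every name, ID, and path below is a placeholder, not a recommendation.
    commands = [
        device_add_command("render", "gpu0", "gpu"),
        device_add_command("console", "serial0", "unix-char", path="/dev/ttyS0"),
        device_add_command("monitor", "sensor", "usb", vendorid="0123", productid="4567"),
        device_add_command("pbx", "eth1", "nic", nictype="physical", parent="eth1"),
    ]
    for argv in commands:
        print(shlex.join(argv))
```

Printing the commands instead of executing them keeps the example side-effect free. To apply them, each list could be handed to `subprocess.run` on a host where LXD is installed.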

The other case for condensing your infrastructure would be to move your Linux virtual machines over to LXD containers, keeping the virtual machines for running other operating systems and for those few cases where you want an extra layer of security.

Unlike VMs, how do system containers deal with physical devices?

Graber: System containers see physical devices as UNIX character or block devices (/dev/*). So the driver itself sits in the host kernel, with only the resulting userspace interface being exposed to the container.

What are the benefits or disadvantages of system containers over VMs in the context of devices?

Graber: With system containers, if a device isn’t supported by the host kernel, the container won’t be able to interact with it. On the other hand, it also means that you can share supported devices with multiple containers. This is especially useful for GPUs.
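As an illustrative aside (not from the interview): because the container only ever sees the device node that the host kernel driver exposes, a quick host-side check is whether that node exists and whether it is a character or block device. The sketch below is a minimal example using only the Python standard library on Linux. The function names and the example paths are assumptions chosen for illustration.

```python
import os
import stat


def kind_from_mode(mode):
    """Classify a st_mode value as 'char', 'block', or 'other'."""
    if stat.S_ISCHR(mode):
        return "char"
    if stat.S_ISBLK(mode):
        return "block"
    return "other"


def describe_device(path):
    """Return (kind, major, minor) for a node, or None if it cannot be stat'ed."""
    try:
        info = os.stat(path)
    except OSError:
        return None  # no such node, so no host driver is exposing it here
    return kind_from_mode(info.st_mode), os.major(info.st_rdev), os.minor(info.st_rdev)


if __name__ == "__main__":
    # Example paths only; which nodes exist depends on the host's hardware.
    for path in ("/dev/null", "/dev/dri/card0", "/dev/sda"):
        print(path, describe_device(path))
```

A result of `None` for a path suggests the host has no driver exposing that device, in which case there is nothing to pass into a container.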

With virtual machines, you can pass entire devices through PCI or USB passthrough with the driver for them running in the virtual machine. The host doesn’t have to know what the device is or load any driver. However, because a given PCI or USB device can only be attached to a single virtual machine, you will either need a lot more hardware or constantly change your configuration to move it between virtual machines.
