Linux.com

All About the Linux Kernel: Cgroup’s Redesign

By Libby Clark

Over the past few months, big changes have been underway in the Linux kernel's cgroup subsystem and in systemd, the related but independent system and service manager. Rather than building shiny new features, developers are overhauling cgroups (control groups) to impose more structure on an area of the kernel that has become problematic.

Cgroup allows fine-grained resource partitioning among competing processes running on the same machine. It’s technically a kernel subsystem, but it behaves quite differently from typical, more isolated subsystems such as drivers or architecture-specific systems like PCI or USB. Cgroup is a conduit through which other subsystems manage and query kernel resources such as CPU time, amounts of memory, and groups of processes.

What’s the Issue?

The problem is that cgroup features were often built without the involvement of the developers most familiar with the kernel subsystems they interact with.

“This is partly because cgroup tends to add complexity and overhead to the existing subsystems and building and bolting something on the side is often the path of the least resistance,” said Tejun Heo, Linux kernel cgroup subsystem maintainer. “Combined with the fact that cgroup has been exploring new areas without firm established examples to follow, this led to some questionable design choices and relatively high level of inconsistency.”

The biggest issue this inconsistency created was what Heo calls “a major breach of standard kernel API practices.” Because the cgroup interface is a filesystem, it receives much less scrutiny than other kernel APIs. The hierarchical nature of cgroup means users can change permissions on subdirectories and give access to a non-privileged security domain, i.e., non-root users, Heo said. This, in turn, means an individual application can interact directly with the cgroup filesystem and reach the kernel control knobs, effectively turning the raw knobs into a full kernel API without the required review.
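To make the delegation concern concrete, here is a minimal Python sketch that walks a cgroup mount and reports directories owned by someone other than root, which is roughly where an unprivileged application could reach the knobs directly. The helper name `find_delegated_dirs` and the use of directory ownership as the signal are illustrative assumptions, not something the kernel or Heo prescribes.

```python
import os


def find_delegated_dirs(root, privileged_uid=0):
    """Return directories under `root` that are not owned by `privileged_uid`.

    Hypothetical helper: ownership by a non-root user is treated here as a
    rough sign that a subtree was handed to an unprivileged domain.
    """
    delegated = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        # Each cgroup is a directory; its owner decides who may write knobs.
        if os.stat(dirpath).st_uid != privileged_uid:
            delegated.append(dirpath)
    return sorted(delegated)
```

On a typical system one might call `find_delegated_dirs("/sys/fs/cgroup")`, though that mount point is itself an assumption and varies between distributions.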

Other issues include: “the inability to designate a resource to a cgroup due to the orthogonal multiple hierarchies, widespread inconsistencies in hierarchy handling, unnecessarily high level of complexity,” and more, says Heo.
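The "orthogonal multiple hierarchies" problem can be seen in `/proc/<pid>/cgroup`, where each line has the form `hierarchy-ID:controller-list:path`. The sketch below parses that format and shows one process sitting at a different path in each hierarchy, so no single cgroup identifies it. The sample text and the group names in it are made up for illustration.

```python
def parse_pid_cgroup(text):
    """Parse /proc/<pid>/cgroup contents into (id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only; the path may contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = tuple(c for c in controllers.split(",") if c)
        entries.append((int(hierarchy_id), names, path))
    return entries


def distinct_paths(entries):
    """Return the set of cgroup paths a process occupies across hierarchies."""
    return {path for _id, _names, path in entries}


# Made-up sample: one process in three separately mounted hierarchies.
SAMPLE = """\
4:memory:/batch
3:cpu,cpuacct:/interactive/web
2:blkio:/
"""

entries = parse_pid_cgroup(SAMPLE)
for hierarchy_id, names, path in entries:
    print(hierarchy_id, ",".join(names), path)
print(len(distinct_paths(entries)), "distinct paths for one process")
```

With a single unified hierarchy, the same file would be expected to carry one path for the process instead of one per controller set.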

How to Fix It

Cgroup is made of two parts: the cgroup core creates a hierarchical classification of processes running on the system, while a set of 13 controllers links the core with the kernel subsystems. The memory controller, for example, limits the amount of memory a group of processes can allocate from the system, the block controller can limit disk input/output bandwidth, and so on.
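As a rough illustration of how a controller is driven through the filesystem, the following Python sketch creates a child group under a memory controller mount, caps its memory, and moves a process into it. It assumes the multiple-hierarchy layout in which the memory controller exposes `memory.limit_in_bytes` and `cgroup.procs`; the function names are hypothetical, and on a real system the writes would normally require root.

```python
import os


def set_memory_limit(cgroup_dir, limit_bytes):
    """Write a byte limit to the memory controller's limit knob."""
    with open(os.path.join(cgroup_dir, "memory.limit_in_bytes"), "w") as knob:
        knob.write(str(int(limit_bytes)))


def add_process(cgroup_dir, pid):
    """Move a process into the group by writing its PID to cgroup.procs."""
    with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as procs:
        procs.write(str(int(pid)))


def create_limited_group(memory_root, name, limit_bytes):
    """Create a child cgroup directory under `memory_root` and cap its memory."""
    cgroup_dir = os.path.join(memory_root, name)
    # On a cgroup mount, creating the directory creates the group itself.
    os.makedirs(cgroup_dir, exist_ok=True)
    set_memory_limit(cgroup_dir, limit_bytes)
    return cgroup_dir
```

A call might look like `create_limited_group("/sys/fs/cgroup/memory", "demo", 512 * 1024 * 1024)`, where the mount path and group name are assumptions. This is exactly the kind of direct knob access the redesign aims to route through a manager instead.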

Kernel developers are now working to fix these issues by implementing a single unified hierarchy in the cgroup core and improving consistency among the controllers. But because of the patchwork nature of the subsystem and the need to ensure backward compatibility, they won’t be able to completely stop this abuse. That’s where systemd, and any other control agents that may emerge, come in.

Systemd is the common tool for Linux system administrators to control resources. It relies on cgroups to track the state of services, logged-in users, and virtual machines and does so by exposing the kernel resource control knobs to the administrator, said systemd developer Kay Sievers.

Systemd and cgroup developers are working together to turn systemd into a global cgroup manager that creates higher-level control knobs and prevents direct access to the kernel. Many systemd changes have already been released, while the cgroup changes are set to be merged into the upstream kernel. Much work still remains, however.
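To give a sense of what a higher-level knob looks like from the administrator's side, here is a small Python sketch that assembles a `systemd-run` command line with resource-control properties instead of writing to the cgroup filesystem. The property names `MemoryLimit` and `CPUShares` and the slice name `batch.slice` are assumptions for illustration; property names differ between systemd versions (newer releases use names such as `MemoryMax` and `CPUWeight`), so check the man pages for the installed version.

```python
import shlex


def build_systemd_run(command, properties, slice_name=None):
    """Build an argv list for systemd-run without executing it."""
    argv = ["systemd-run"]
    if slice_name:
        argv.append("--slice=" + slice_name)
    # Sort the properties so the generated command line is reproducible.
    for key, value in sorted(properties.items()):
        argv.append("--property={}={}".format(key, value))
    argv.extend(command)
    return argv


# Hypothetical request: run a job in its own slice with two limits.
argv = build_systemd_run(
    ["sleep", "60"],
    {"MemoryLimit": "512M", "CPUShares": "512"},
    slice_name="batch.slice",
)
print(" ".join(shlex.quote(arg) for arg in argv))
```

The sketch only prints the command; passing `argv` to `subprocess.run` would actually start the transient unit, which requires a running systemd and suitable privileges.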

The conversion of the separate controller hierarchies into a single, unified hierarchy will be a “gigantic job” for the kernel and user land alike, Sievers said.

“When complete, the above efforts will give us far more structured way to think about and interact with cgroups,” Heo said, “which in the long run will make cgroup more useful to wider audience and enable capabilities which are currently not possible.”

For more details about the changes, see the cgroup documentation on kernel.org and the systemd man pages:

http://www.freedesktop.org/software/systemd/man/systemd.cgroup.html

http://www.freedesktop.org/software/systemd/man/systemd.slice

http://www.freedesktop.org/software/systemd/man/systemd-run.html


Comments

  • Gus Said:

    Despite some details (like systemd not being supported by some major distros), I find it very positive that Freedesktop seems to be working to coordinate, from a high-level, the integration between low-level components (like the kernel/cgroups) and the top layer (the desktop). I wish it would happen more often, it should lead to better design and less waste. Good work, keep it up!

  • Kevin Wilson Said:

    Hi, Another very good link about cgroups - detailed explanation in pdf by Rami Rosen: (121 pages, but the first part is about namespaces): http://ramirose.wix.com/ramirosen Kevin

  • benoitc Said:

    I was wondering how the new system would work with lxc and such things? To my knowledge, lxc directly creates the files and folders in the cgroup hierarchy. Would it mean that on a system running systemd, it will have to use the systemd dbus api?

  • Bob Said:

    Some followup questions to the article: Is there a timeframe for these changes? What are the implications for existing users of cgroups functionality (will changes be backwards compatible)? Since systemd is mentioned, what impact will this have on upstart-based systems?

