December 13, 2011

Here We Go Again, Another Linux Init: Intro to systemd

In the days of yore we had a System V (SysV) type init daemon to manage Linux system startup, and it was good. It was configured with simple text files easily understood by mortals, and it was a friendly constant amid the roiling seas of change. Then came systemd, and once again we Linux users were cast adrift in uncharted waters. Why all this change? Can't Linux hold still for just a minute?

Ch Ch Ch Changes

Linux has been contentedly using sysvinit (System V initialization) to manage system startups for ever so many years now, except for distributions like Slackware that use the BSD-style init. SysV and BSD init are similar enough that it's easy to use either one without a lot of fuss.

Then came two new init systems for Linux: Ubuntu's Upstart, first released in 2006, and systemd, born in 2009. The systemd code was written primarily by Lennart Poettering. Upstart has been the default in Ubuntu since 6.10 Edgy Eft and is available in most distros. systemd is the default init in Fedora 15 and later, and is also in most distro repos for anyone who wants to try it on their favorite Linux.

Overhauling key subsystems tends to give users the jitters, because it means being forced to learn new ways to administer our systems and to change our workflow, and the prospect of essential services suffering growing pains and being less than reliable isn't happy-making. So what's with this new systemd thingy, and what benefits does it bring to us mere Linux users?

Faster Startups

The purpose of sysvinit is to launch userspace. At boot the kernel launches init as PID 1, the very first process to run at startup. (Run the pstree command to see a nice artistic ASCII diagram of your process tree.) It used to be that the BIOS and sysvinit were equal offenders in dragging boot times out to a minute or more. Both have sped up, but sysvinit is always going to be slow because it starts processes one at a time, performs dependency checks on each one, and waits for each daemon to start before the daemons that depend on it can start.

So why not start processes in parallel? There is a way to do this without all kinds of complexities, and that is to take advantage of the way Unix-type daemons work. Clients of Unix daemons don't need to know if the daemons they depend on are actually running; all they need is for the correct Unix domain sockets to be available. What the heck are these sockets? They are inter-process communication (IPC) sockets, and they are how processes on the local system talk to each other. You can see these with netstat:


$ netstat -a --protocol=unix
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ACC ]     STREAM     LISTENING     4836     /var/run/dbus/system_bus_socket
unix  9      [ ]         DGRAM                    4584     /dev/log
unix  3      [ ]         STREAM     CONNECTED     489456   /tmp/orbit-carla/linc-
unix  3      [ ]         STREAM     CONNECTED     489455   
unix  3      [ ]         STREAM     CONNECTED     489452   /tmp/orbit-carla/


As you can see the sockets have inodes, following the tradition of "everything in Unix is a file." So you can perform various operations on them with standard Linux file utilities, which is a fun topic for another day.
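For the curious, here is a minimal Python sketch of the "socket is a file" idea. It binds a throwaway Unix domain socket to a temporary path and then inspects that path with an ordinary stat call. The function name and the socket path are made up for illustration. Note that the inode reported for the path belongs to the filesystem entry and may not match the number in netstat's I-Node column.

```python
import os
import socket
import stat
import tempfile


def inspect_socket_file():
    """Bind a throwaway Unix domain socket and look at it as a file."""
    sock_dir = tempfile.mkdtemp()
    path = os.path.join(sock_dir, "demo.sock")  # hypothetical socket path

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)  # binding creates the filesystem entry

    info = os.stat(path)                     # same call used for regular files
    is_socket = stat.S_ISSOCK(info.st_mode)  # file type recorded in the mode bits
    inode = info.st_ino                      # inode of the filesystem entry

    # Clean up the socket, its path, and the temporary directory.
    server.close()
    os.unlink(path)
    os.rmdir(sock_dir)
    return is_socket, inode


if __name__ == "__main__":
    is_socket, inode = inspect_socket_file()
    print("is a socket file:", is_socket, "inode:", inode)
```

The same path would show up with a leading "s" in the output of ls -l, which is the file utilities' way of saying the same thing.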

So all sockets for all daemons can be created in one step, and then all daemons can be started in a second step. Any client requests for daemons that are not yet running are queued in the socket buffer, and then answered once the daemons are up and running. I'm no kernel hacker so maybe I'm too easily impressed, but this seems like an ingenious and efficient use of something that has been around for decades, and preferable to trying to invent something brand-new.
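The two-step idea can be tried out in a few lines of Python. This is only a sketch of the principle, not of how systemd itself is written: one process plays all three roles (socket creator, client, and late-starting daemon), and the function name, socket path, and messages are invented for the demonstration. It assumes a Linux system, where a client can connect to a listening Unix socket before anyone has called accept.

```python
import os
import socket
import tempfile


def demo_socket_buffering():
    """Show a client request waiting in the socket until the 'daemon' arrives."""
    sock_dir = tempfile.mkdtemp()
    path = os.path.join(sock_dir, "demo.sock")  # hypothetical socket path

    # Step 1: create the listening socket. No daemon is running yet.
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(5)

    # A client connects and sends its request anyway; the kernel queues both
    # the connection and the data.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    client.sendall(b"hello daemon")

    # Step 2: the "daemon" starts late, accepts the queued connection, and
    # finds the request already waiting for it.
    conn, _ = listener.accept()
    request = conn.recv(1024)
    conn.sendall(b"reply to: " + request)
    reply = client.recv(1024)

    # Clean up sockets and the temporary path.
    for sock in (conn, client, listener):
        sock.close()
    os.unlink(path)
    os.rmdir(sock_dir)
    return request, reply


if __name__ == "__main__":
    print(demo_socket_buffering())
```

The client never had to check whether its daemon was alive; it only needed the socket to exist.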

Hotplugging and On-Demand Services

sysvinit has a static configuration and launches processes one at a time, in order. When we configure sysvinit we've always had to be mindful of launching services in the correct order, like remembering to start networking before starting network services. And we have to make sure that everything we might need is launched at startup, or else we will have to start it manually, because after startup sysvinit goes to sleep and doesn't do any more.
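To make the ordering problem concrete, here is a small Python sketch using the standard library's graphlib module (Python 3.9 or later). It is an illustration only: sysvinit gets its order from statically ordered scripts rather than from a computed graph, and the service names and dependencies below are hypothetical. The first function produces a one-at-a-time order that respects every dependency; the second groups the same services into waves whose members do not depend on each other.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical services mapped to the services that must be up before them.
DEPENDENCIES = {
    "networking": set(),
    "syslog": set(),
    "sshd": {"networking"},
    "cups": {"networking"},
    "nfs": {"networking", "syslog"},
}


def sequential_order(deps):
    """Return a one-at-a-time start order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def parallel_waves(deps):
    """Group services into waves whose members could be started together."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are met
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave as started
    return waves


if __name__ == "__main__":
    print("sequential:", sequential_order(DEPENDENCIES))
    print("waves:", parallel_waves(DEPENDENCIES))
```

With these made-up dependencies the sequential order takes five steps while the wave version needs only two, which is the kind of saving that parallel startup is after.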

This might be adequate for simple servers, but not for desktop and mobile systems. Users roam among different networks and attach and remove all manner of devices: keyboards, headsets, audio interfaces, and storage media full of movies and music. Thanks to Bluetooth and USB we finally have universal plug-in ports, and hotplugging devices is routine instead of an exotic adventure. Remember how, way back in olden times, we were warned never to hotplug PS/2 keyboards, mice, or IDE drives because of the risk of physical damage? Even if nothing got fried, they were detected only at boot.

Auto-detecting and auto-mounting removable devices has gone through a lot of stages in Linux. Remember the fun old days of manually mounting and unmounting CDs and USB sticks? And making fun of Windows and Mac refugees who thought that was weird and dumb? Well, it was weird and dumb. But Linux was still a baby, so we had to deal with it.

Then there are network services that could be on-demand, like file shares, printers, VNC, SSH, and so on. The bottom line is that in these modern times way more stuff happens after startup, so instead of trying to anticipate everything you might need and start it all at boot, why not build a system that launches and stops processes on demand? As an everyday practical matter this seems to address one of my pet peeves, which is how many distros launch Avahi and the Bluetooth daemons at startup. I have no use for either, so I always disable them. A small matter to be sure, but I like the idea of the computer handling these sorts of chores because I have real work to do.
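Here is a rough Python sketch of what "on demand" can look like. It is a toy, not systemd's actual mechanism: a supervisor owns the listening socket and only starts the service when a client shows up. The echo_service function, the socket path, and the messages are all hypothetical, and everything runs in one process to keep the example short.

```python
import os
import select
import socket
import tempfile


def echo_service(listener):
    """Hypothetical service: accept one connection and echo the request back."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(b"echo: " + conn.recv(1024))


def on_demand_demo():
    """Start the service only once a client has actually connected."""
    sock_dir = tempfile.mkdtemp()
    path = os.path.join(sock_dir, "ondemand.sock")  # hypothetical socket path

    # The supervisor owns the listening socket; the service is not running.
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(5)
    starts = 0

    # Nobody has connected, so the supervisor leaves the service stopped.
    ready, _, _ = select.select([listener], [], [], 0)
    idle_at_boot = not ready

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(2.0)  # avoid hanging forever if something goes wrong
    client.connect(path)
    client.sendall(b"ping")

    # A pending connection makes the listener readable: the cue to start.
    ready, _, _ = select.select([listener], [], [], 1.0)
    if ready:
        starts += 1
        echo_service(listener)
    reply = client.recv(1024)

    # Clean up sockets and the temporary path.
    client.close()
    listener.close()
    os.unlink(path)
    os.rmdir(sock_dir)
    return idle_at_boot, starts, reply


if __name__ == "__main__":
    print(on_demand_demo())
```

A daemon that nobody ever talks to, like Avahi or Bluetooth on my machines, would simply never be started under this scheme.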

There have been a lot of attempts at subsystems to manage dynamic handling of hardware and software: HAL (hardware abstraction layer), autofs, devfs, and all kinds of other ones I've forgotten. Now we have D-Bus for advanced inter-process communications and management, such as process lifecycle management. D-Bus uses Unix domain sockets as its transport mechanism, and it seems to be here to stay (for example, KDE and GNOME run on D-Bus). With that extra functionality available, it seems a natural expansion of duties for systemd, as PID 1, to function as the full-time Linux process babysitter and bring the efficiencies of parallelization and dynamic resource management to a running system, rather than simply starting the system and then going to sleep until the next reboot.

This is a bare introduction to the intricacies of systemd and Linux process management. The systemd home page is a great starting point to learn more. Come back next week to learn how to manage and debug systemd on your own systems.
