October 26, 2011

The Kernel Panel at LinuxCon Europe

Linux users got a rare opportunity to hear directly from the hackers at the core of the Linux kernel on Wednesday at LinuxCon Europe. Read on for more on the state of ARM in the Linux kernel, the need for new kernel contributors, and the death of the Big Kernel Lock (BKL).

Linus Torvalds and other kernel developers sat down for a question and answer session at the first LinuxCon Europe. Lennart Poettering, creator of PulseAudio and systemd, served as moderator for the panel, which consisted of Torvalds, Alan Cox, Thomas Gleixner, and Paul McKenney. The four took prepared questions from Poettering, as well as responding to impromptu audience member questions on every topic from version numbers to the future of the kernel project itself.

The panelists introduced themselves first. Torvalds, of course, is the leader of the project. Cox works primarily in system-on-chip these days (although he has had other roles in the past). Gleixner maintains the real-time patches. McKenney works on the read-copy-update (RCU) mechanism. Together they account for just a tiny fraction of the kernel community, but by their roles and experience offer keen insights into the health of the kernel, the health of the kernel community, and the directions it is heading.

Poettering opened by asking the group about a frequently-quoted comment by Torvalds that breaking userspace compatibility was something that had to be avoided at all costs. A nice sentiment, Poettering said, but one that sounds hypocritical considering how often the kernel team really does break compatibility with its releases — sometimes even with trivial changes like the switch from 2.6.x version numbers to 3.0 that happened earlier this year.

Torvalds pointed out that any program which assumed that the kernel's version number would always start with a "2" was broken already, but that the kernel team had also added a "compatibility" mode that would report its version number as 2.6.40 if buggy programs needed it. That encapsulated the kernel team's approach: add something new, but do everything possible to make sure that the old way of doing things continued to work.

Torvalds added that he used to keep ancient binaries around — including the very first shells written for Linux, which used APIs that were deprecated within months — to test against each new kernel release, just to make sure that old code continued to run. API stability is important, he said, but it flows out of not breaking the user experience. "No project is more important than the users of that project," he summarized.

Next, Poettering asked if there was an aging problem with the core kernel development team, noting that the average age of the subsystem maintainers was growing. Torvalds said no, but that it sometimes seemed like it simply because it takes time for a new contributor to "graduate" from maintaining a single driver to managing a set, and eventually to managing an entire subsystem. The others agreed; Cox added that there was plenty of "fresh blood" in the project, in fact more than enough, but there was a bigger problem in the gender gap — a problem that no one seemed able to fix, despite years of trying. Most of the female kernel contributors today work for commercial vendors, he said; with very few participating because of their own hobbyist interest. Torvalds added that another reason it often seems like the kernel crowd is aging rapidly is how ridiculously young they were when they started — he was only 20 himself, and several other key contributors were still teenagers.

Audience members asked questions from microphones placed at the edge of the stage, and several had questions about specific features: the Big Kernel Lock (BKL), the complexity of the ARM tree, and whether or not embedded Linux developers were given as much attention as developers working on desktop and server platforms. Cox reminisced about the BKL, which he called the right solution for the early days of multi-processor support in Linux, even though it had subsequently taken years to replace with more sophisticated methods. It was always a nuisance, he said, but it got Linux SMP support much faster than other OSes, such as the BSDs.

The ARM architecture was controversial in recent months, after Torvalds had to resort to tough talk to get the ARM family to clean up its code and standardize more. The situation is much better today, Torvalds reported. The problem, he said, is that while ARM has a standardized instruction set for the processor, every ARM board has a different approach to other important things like timers and interrupts. Intel had never faced such a glut of incompatible standards for the x86 architecture because the PC platform was so uniform, so it has taken a while for ARM to see the need to take a more active approach towards standardization. Torvalds also said that for the most part the kernel team is very interested in embedded development; what gets tricky is that most embedded Linux devices are designed to be built once and never upgraded. That makes it harder to do testing and ongoing kernel support for embedded platforms.

When asked about the challenges facing the Linux kernel over the next few years, McKenney cited a number of research topics facing all operating systems: scalability, real-time, and validation, to name a few. Torvalds said maintaining the right balance between complexity and the ability to get things done. The sheer amount of new hardware that comes out every year is overwhelming, he said; keeping up with it is a practical (though not theoretical) challenge for the team.

Speaking of the practical, one audience member asked Torvalds what his process was when getting a new pull request from a contributor. "I manage by trusting people," he replied. Whenever a pull request comes in, he looks at the person who sent it. Depending on the person, his process varies: some people have earned enough trust over the years that he believes in their judgment, while others have their own recurring issues that mandate additional review.

In any case, he said, he makes sure that the new code compiles on his own machine because he "hates it" when he can't compile for his own configuration. But for the most part, he said, his role is no longer to validate individual pieces of code, but to orchestrate the work of others. If he knows two people are working in a similar part of the kernel, he needs to be aware of it to avert clashes, but he trusts the maintainers of their individual subsections. That trust is given to the person, he reiterated; the individual earns it, not the company that the individual might work for.

The panel touched on other areas as well, including security, cgroups, and subsystem bloat. In each case, was comes across in a panel discussion such as this is how human the process of writing and maintaining the kernel is. The kernel team can make mistakes, and they have to route around them with bugfixes in subsequent releases. Maintainers may not be interested in a particular area of development, but they will look at and even integrate the patches because they are important to a subset of developers or kernel users.

The kernel may have steep technical challenges, but just as real a threat to productivity is burnout among maintainers. It is fun to watch the kernel team wisecrack and comment on stage, but it is also a healthy reminder that above all else, Linux is a collaborative project, not simply lines of code.

Click Here!