April 6, 2011

Inside the Linux Kernel


One of the highlights of the Linux Foundation Collab Summit's program on Wednesday was the panel of kernel developers discussing issues of kernel development and fielding audience questions. Though fairly unstructured, the panel provided an engaging peek into the world of the kernel for those of us who aren't involved in day-to-day development.


The panel included James Bottomley, Andrew Morton, Thomas Gleixner, and Arnd Bergmann, and was moderated by Jon Corbet of LWN. Bottomley is a distinguished engineer at Novell and the SCSI subsystem maintainer. Morton is the memory management subsystem maintainer. Gleixner manages bug reports for NAND flash, core timers, and the x86 architecture. Bergmann is with Linaro and IBM.

One of the first topics was the current problems with the ARM architecture, which is something of an embedded Wild West at the moment. Bottomley says that we're seeing the "growing pains of ARM," as it becomes an open source architecture. Corbet says that the kernel is a victim of its own success, getting a lot of the contributions that the kernel folks have asked for. One panelist noted that there are about 70 sub-architectures of ARM, which is "too much" for a single maintainer, and that there are "not enough people looking at the big picture."

Another challenge is that developers now working on Linux at the behest of their employers often previously worked on "black box" operating systems. That background encouraged some bad habits that have carried over to Linux and haven't yet been sorted out.

Overall, the major challenges seem to boil down to the speed that ARM developers are working at and how they're shipping devices to market. It will take some time before the market settles down to the point that they can focus on quality and good development procedures rather than shipping things out the door as quickly as possible and moving on to the next, slightly different, ARM-based product that will require a new set of patches.

That brought up the topic of code review. Bottomley said that there's a big disparity in terms of the quality of kernel code reviewers — some look at the details, but miss the big picture. "What I find, at least in storage... there are only a certain number of people I know capable of spotting all these issues." Much more training and mentoring is needed before most code reviewers are up to the challenge of producing reviews that maintainers can trust.

Catching Up to Hardware

The next topic Corbet introduced was "how do we fix the influence of Linux in the market?" Bottomley's answer was mixed: it is companies like Dell and HP that influence the companies that make hardware. Some of those companies "carry the torch" for Linux and others don't. Where they do, we have Linux support; where they don't, we don't. Bottomley says we need "better representation" and mentioned plans to lobby the manufacturers of disk drives to provide better support for Linux, but says that won't have the same pull as a representative from Microsoft who requires a fix or a device won't be compatible with Windows.

"With Linux, because it's open source, the value add is that you don't have to modify hardware to work with operating system; you modify the operating system to work with hardware," says Bottomley. That is a double-edged sword, since it encourages bad habits from vendors.

Standards play a role, but Bottomley says that only about 10% really play a role in his kernel work. "I don't actually want to be sitting in committees going through the minutiae... because it's completely irrelevant. I want to talk to people on standards committee who can talk to us about what is relevant to us... Just joining the standards body would leave us with even less time to work on the kernel."

Control Groups

Next, Corbet brought up the Control Groups feature, which allows processes to be grouped and is behind improved scheduling and memory management techniques as well as the upcoming systemd init daemon being developed by Fedora. "But a lot of kernel developers hate control groups... what's wrong with control groups? How can we fix it?" asked Corbet.

Morton said that the problem was in part the way it was developed — with minor changes being "grafted on" over a number of years, without a larger picture in mind.

Bottomley said that the concept was fine, but a lot of the controllers are "grafted on" and the tasks they want to do cut across several subsystems. "The concept is fine, but the way we implement them... subsystems not talking to each other, got us into the mess we're in now... instead of collaborating, they lead to 'spaghetti code'."

The answer? Gleixner says that it's time to "rewrite and clean it up... Someone has to go in at the concept level [and ask], what's wrong with the implementation, and what can we do better?"

Bottomley says that it's not as simple as that. It's not only a technical issue, but also a political issue that will require different subsystems to work together. "We have difficulty crossing our individual subsystems." Not impossible, he says, but also not easy.

Does that mean the development model is broken? Bottomley says no, the general development model scales well: "if I asked you to name five things that are like Control Groups [as a problem] you'd have a hard time doing it."

"Strategy is to beat people up who are doing it until they talk to each other, and then take credit for it" when it improves, joked Bottomley. Gleixner said that the strategy is to wait until someone is grumpy enough to address a problem, which is maybe not the best method, but effective.


One audience member asked what the kernel folks thought of Linaro, a project for unifying middleware and low-level tools for embedded development. Not surprisingly, Bergmann (a Linaro employee) said "I think it's great!"

Bottomley said that there was "trepidation" about what it would do when it was formed, but at the moment "what has come out of it... it seems to be working reasonably well. The people who I trust who are embedded developers involved with it are fairly happy... It's a reasonably successful project."

Next 20 Years

Another question: where will we be in 20 years? Bottomley says four-fifths of the panel will be retired, but coming out of retirement to fix the 2038 bug. Morton says it's possible a "quantum computer" would come along and make Linux obsolete, but it would emulate x86 "so we'll just keep going."
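For readers unfamiliar with the 2038 bug Bottomley joked about: systems that store time as a signed 32-bit count of seconds since January 1, 1970 (UTC) run out of room in January 2038. The short Python sketch below simulates that wraparound. It is only an illustration, not kernel code, and the helper names (wrap_int32, to_utc) are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Unix time counts seconds from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def wrap_int32(value):
    """Simulate storing value in a signed 32-bit integer (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


def to_utc(seconds):
    """Convert a seconds-since-epoch count to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


# The last moment a signed 32-bit counter can represent.
last_ok = to_utc(INT32_MAX)

# One second later the counter wraps around to a large negative number.
after_wrap = to_utc(wrap_int32(INT32_MAX + 1))

print("last representable second:", last_ok.isoformat())
print("one second later, wrapped:", after_wrap.isoformat())
```

Run as written, the sketch shows the counter topping out on January 19, 2038, and then jumping back to December 1901, which is why the fix involves moving to a wider time type rather than patching individual programs.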

Corbet says "world domination is a good bet" but he is not quite sure what that will look like.

Gleixner says that Real Time Linux might be merged into mainline by then, and that talks along those lines "look promising." "I'm planning to be done with it before I retire," he said, adding that he plans to release a 2.6.38 RT kernel.

As Jim Zemlin said earlier in the day about mobile Linux, it's difficult to predict where things will be in five years, and the 20-year outlook is just impossible to address. One thing is certain: it should be an interesting ride, and Linux will be an important part of the computing landscape at least for the foreseeable future.
