July 25, 2005

Ottawa Linux Symposium, Day 4

Author: David 'cdlu' Graham

The final day of the Ottawa Linux symposium was highlighted by this year's keynote address, delivered by Red Hat's lead Linux kernel developer, Dave Jones.

The day's regular events got going at noon with a session by Sébastien Decugis and Tony Reix of European company Bull entitled "NPTL stabilisation project".

Cleaning up loose threads

Their presentation discussed a 14-month-long project at their company to thoroughly test the relatively young Native POSIX Thread Library (NPTL). Over the course of their tests, they ran over 2000 individual conformance tests one at a time, each in a two-stage process.

The first stage of each test was consulting the POSIX standard's assertions, comparing them to the operational status of the library's routines, and coming up with a test.

The second stage of each test was to write test code for the case, run it, and evaluate the results. The resulting log files could be up to 300 kilobytes and initially took two days to read.

To help, they used a project called TSLogParser to cut the process down to just 15 minutes. Its results table made it easier to understand in great detail what worked and what failed without wading through lengthy and detailed log files.

To date they have found 22 bugs in glibc, 1 in the kernel, and 6 in the POSIX standard itself where specifications are unclear or contradictory.

No room at the Inn-ternet

After the discussion of NPTL, which was far more in depth than related here, I went on to a talk by Hideaki Yoshifuji of Keio University and the USAGI project entitled "Linux is now IPv6 ready".

In the middle of the 1980s, Yoshifuji told us while recounting the history of IPv6, the IPv4 address space was already filling. Seeking to head off the problem of IPv4's limited address space - slightly over four billion possible addresses - the Internet Engineering Task Force created the concept of IPNG, or Internet Protocol: Next Generation, which became known as IPv6.

IPv6 came to life with the introduction of the specifications found in RFC 1883 in December of 1995.

IPv6 introduced a couple of significant changes over IPv4. For one, IPv6 addresses are 128 bits long, giving a theoretical address space of 340,282,366,920,938,463,463,374,607,431,768,211,456 (approximately 3.402*10^38, or 5.6*10^14 moles for chemistry folks) possible IP addresses. That should be at far less risk of running out than the current 32 bits, which provide fewer addresses than there are people on the planet.
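For readers who want to check those figures, here is a minimal Python sketch of the arithmetic. The Avogadro constant and the variable names are my own additions for illustration, not something from Yoshifuji's talk.

```python
# Address-space arithmetic for IPv4 (32-bit) and IPv6 (128-bit) addresses.
# AVOGADRO is an assumption added for the "moles" comparison; it is the
# standard value of the constant, not a figure from the presentation.
IPV4_BITS = 32
IPV6_BITS = 128
AVOGADRO = 6.02214076e23  # entities per mole

# Each extra bit doubles the number of distinct addresses.
ipv4_addresses = 2 ** IPV4_BITS
ipv6_addresses = 2 ** IPV6_BITS

# How many "moles" of addresses IPv6 provides.
ipv6_moles = ipv6_addresses / AVOGADRO

if __name__ == "__main__":
    print(f"IPv4 addresses: {ipv4_addresses:,}")
    print(f"IPv6 addresses: {ipv6_addresses:,}")
    print(f"IPv6 addresses in moles: {ipv6_moles:.2e}")
```

Python integers are arbitrary precision, so the 39-digit IPv6 total comes out exact rather than as a rounded floating-point value.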

The new Internet Protocol also implements the IPSec security feature, a simpler routing architecture, and mobility, allowing IP addresses not to be locked to particular routes.

IPv6 was first introduced into the Linux kernel in the 2.1 development tree in 1996. At the time, neither mobile IPv6 nor IPSec was included.

In 2000, the Universal Playground for IPv6 (USAGI) project was born with the intent of being the Linux IPv6 development project. The USAGI project implemented IPSec into Linux' IPv6 implementation first based on the FreeS/WAN project.

In September 2003, USAGI began testing Linux' IPv6 support against the ipv6ready.org certification program's basic tests. In February 2005, the advanced tests were completed and Linux kernel 2.6.11-rc2 was certified for IPv6. More recent versions of the kernel have yet to be tested but are expected to pass.

Yoshifuji's slides from his presentation will be posted online shortly.

Jones on bug reporters

This year's keynote address was delivered by Red Hat kernel maintainer and kernel AGP driver maintainer Dave Jones, with an introduction by last year's keynote speaker Andrew Morton as per a longstanding OLS tradition.

Jones' keynote was entitled "The need for better bug reporting, testing, and tools".

Morton began his light-hearted introduction by commenting that Red Hat has a lot of world-class engineers... and they also employ Dave Jones.

Jones, he said, joined the kernel development team in 1996 and currently maintains the AGP drivers in the kernel source tree. He's also the main Red Hat kernel guy, and with Red Hat's Linux market share being around or upwards of 50%, that makes Dave Jones an important player in the kernel community.

Morton commented that when the kernel 2.5 development tree was started, Jones volunteered to forward a number of bug-fix patches from kernel 2.4, but when he was ready to merge them in, they no longer fit. By the time he got them ready, the kernel 2.5 source tree had changed so much that the patches no longer lined up with the kernel source code.

Morton noted that a major crisis had recently been averted. Thanks to their leadership skills, though, Red Hat's management had pulled Dave through e-mail's "darkest hour". Morton then read a post on the Linux Kernel Mailing List where someone had threatened to force Jones to be fired from Red Hat.

Jones began his talk by saying that when he agreed at the last OLS to give the keynote (he was asked in front of all the attendees), he did not really know what he was going to talk about, and, he said, he did not figure it out until he moved to Westford, Massachusetts. He then put up on the projector a photo of a glacier with a distant person visible in it, under the heading "Westford, Massachusetts". It was there that he began working with Red Hat's Bugzilla bug tracking system.

Jones then put up what he said was his favourite quote ever:

"I don't think you have a future in the computing industry." - My CS teacher 15 years ago.

Jones noted that from kernels 2.6.0 through 2.6.7, new versions were being released approximately once a month, but since 2.6.7, new kernel versions are only being released about once every three months. The slower development cycle needs to get faster, he said.

From here Jones got into the meat of his address. Kernel upgrades, he noted, frequently break some drivers due to insufficient testing, citing the ALSA sound driver's breakage in kernel 2.6.11 as a major example. He noted that it can be difficult to test every condition of every driver with every release. The AGP driver alone, he said, supports 50 different chipsets, so a small change can cause problems which may not be found until the kernel is released.

Some patches, he confessed, are getting into the kernel with insufficient review as various sectional maintainers don't take the time to read over all of every patch.

The testing that is done at the moment is stress, fuzz, and performance testing, which leaves regression, error path, and code coverage testing undone with every release.

Many bugs are not found in the release candidate stage of kernel development as most users don't test the pre-release kernels, instead waiting for ostensibly stable kernels and then filing bug reports for things that could have been caught earlier if more people used the test releases.

Jones went into a long discussion on bug reporting and managing, concentrating on Bugzilla, which Red Hat uses for its bug tracking. He said that Bugzilla is not the be-all and end-all of bug tracking, but that it is the best we have.

One day, when he was tired of looking at kernel code, Jones grepped the entire GNOME source tree for common bugs and found about 50 of them. He wrote patches for them all and went to see about submitting them. GNOME wanted each bug to be put into Bugzilla as a new bug, with a patch provided. Jones did not want to spend that much time on it, so someone else went through the exercise for him and got the patches in.

It is important to understand bug submitter psychology, noted Jones. Everyone who submits a bug believes their bug is the most serious bug submitted. If someone has a bug that causes one serious problem, and someone else has a bug that causes a different serious problem, these are both serious problems to them, but for the bugged software, the bugs have to be prioritised. User-viewable priorities in bug systems don't help, as everyone sets the highest priority on their bug and the purpose of the system is negated. Jones suggested that in most bug systems, the priority is simply ignored.

Some users lie in their bug reports, said Jones, editing their output to hide kernel module loading messages that warn of the kernel being tainted. Without the information, the bugs can be a lot more difficult to solve.
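As an aside, a taint state that has been edited out of a pasted log can still be read from a running system. The sketch below is my own illustration and not something Jones showed: it assumes a Linux machine exposing the `/proc/sys/kernel/tainted` bitmask, decodes only the two lowest bits (proprietary module, forced module load), and lumps everything else together. The function names are hypothetical.

```python
from pathlib import Path
from typing import List

# Only the two lowest taint bits are decoded; any other bits are reported
# as a raw mask rather than guessed at. This table is an illustrative
# subset, not the kernel's full list of taint flags.
KNOWN_TAINT_BITS = {
    0: "proprietary (non-GPL) module loaded",
    1: "module force-loaded",
}


def decode_taint(value: int) -> List[str]:
    """Turn a taint bitmask into a list of human-readable reasons."""
    reasons = [text for bit, text in KNOWN_TAINT_BITS.items()
               if value & (1 << bit)]
    known_mask = sum(1 << bit for bit in KNOWN_TAINT_BITS)
    remaining = value & ~known_mask
    if remaining:
        reasons.append(f"other taint flags set (mask {remaining:#x})")
    return reasons


def read_taint(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the current taint bitmask from procfs (Linux only)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        reasons = decode_taint(read_taint())
    except (OSError, ValueError):
        print("taint state unavailable on this system")
    else:
        print("not tainted" if not reasons else "tainted: " + "; ".join(reasons))
```

A value of zero means the kernel is untainted, which is the case a maintainer would most like to confirm before digging into a report.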

Many users refuse to file bugs upstream of their distributions, blaming bugs on the distributions instead of the kernel. Some users even change distributions to avoid bugs, and soon find the bug appears in their new distributions as well when they upgrade the kernels.

Others he described as hit-and-run bug reporters. These reporters do not answer questions about their bugs that would help solve them, and the bugs eventually get closed as having insufficient information. Once a bug is closed, the original reporter will often reopen it and be irate that it was closed instead of solved, in spite of their lack of cooperation in getting the information together to help solve it.

Some people submitting bugs include massive amounts of totally irrelevant information, sometimes including the store where they bought their computer or how much they paid for it. Some include thousands of lines of useless log output with only a short, unhelpful description of the problem they are reporting.

A particularly annoying breed of bug reporters is the type that will submit a bug report against an obsolete version of the kernel and refuse to upgrade to a version that has fixed the issue.

The last type of difficult bug reporter Jones described is what he called the "fiddler". These people start adjusting various things trying to get their system to work around the bug they are reporting, to no avail. When a new version of the kernel is released with the bug fixed, it still does not work for them because of all the other changes they made trying to get it to work, though it may start randomly working again with a later upgrade.

Jones said he hopes that future versions of Bugzilla will be capable of talking to each other, allowing different Bugzilla deployments to exchange bugs up or downstream as appropriate.

Many bugs, he said, are submitted to distributors but are never passed upstream to the kernel team to actually be addressed, while other bug reports exist in several different places at once.

The last thing he had to say about Bugzilla and its implementation at Red Hat is a pattern observation he has made.

Use of binary-only kernel modules has dropped off significantly, from several bug reports relating to them per week to only a few per month. However, use of binary-only helpers - driver wrappers that allow Windows drivers to run hardware under Linux - is up.

Jones commented that times have changed since he first joined kernel development at version 2.0.30. The kernel is much more complicated to learn than it used to be. It used to be possible to get up to speed on kernel development fairly quickly, while it now can take a long time to learn the ropes and get used to the kernel.

He went on to discuss valgrind, gcc's -D_FORTIFY_SOURCE, and other approaches to finding and disposing of bugs in the kernel before moving on to a question and answer session with the packed room.

Among the questions asked was whether a distributed computing model could be used to help find and solve bugs, in the same way SETI@Home works. Jones did not think this would be a practical solution, noting that were a bug to be found, there was a good chance it would take down the host system and not actually get as far as being reported back to the coordinating server.

If you ever meet Dave Jones, be sure to ask him about monkeys and spaceships.

Following Jones' keynote address, a series of announcements were made by an OLS organiser thanking the corporate sponsors for their continued support and thanking attendees for not getting arrested this year, among other things.

The final announcement was the selection of next year's keynote speaker: the energetic Greg Kroah-Hartman.

Why some run Linux

I have to take exception to a comment Andy Oram made about the first day of this year's OLS in an article on onlamp.com, where he commented that "some attendees see Linux as something to run for its own intrinsic value, rather than as a platform for useful applications that can actually help people accomplish something" in response to some derogatory comments about OpenOffice.org's memory usage. The Ottawa Linux Symposium is a conference of kernel-space developers, not user-space developers, and in many cases they absolutely do see Linux only for its intrinsic value. It is precisely because of this micro-focused engineering perspective that Linux is as good as it is. If you are looking for a conference where the attendees are looking for practical uses of software for general users, beyond the development of the operating system itself, the Desktop Summit held here in Ottawa, or any of the many Linux conferences around the world, is likely to be a better option.

In the end, OLS is all about sharing knowledge. Senior kernel developers walk around, indistinguishable from those who've submitted one small patch. There are no groupies, no gawking at the community figures walking around... it's just a conference of a group of developers and interested parties, each one of them both knowing something that they can share, and intending to learn something they did not already know. It is what a conference should be.

I'd like to congratulate the organisers on seven years of a well-organised, well-sized conference which has a schedule appropriate to the people attending (no conference like this would ever dare start its sessions at 8:30 in the morning!) and I look forward to returning in future years.

Of the 96 sessions and formalised events scheduled for this year's Linux symposium, I took 42 pages of hand-written notes from attending 23 sessions, and of those, I covered 15 in these summaries. I hope you enjoyed the small sample of this conference I was able to offer.
