July 21, 2006

Day two at OLS: Why userspace sucks, and more

Author: David 'cdlu' Graham

OTTAWA -- Day two of the eighth annual Ottawa Linux Symposium (OLS) was more technical than the first. Of the talks, the discussions on the effects of filesystem fragmentation, using Linux to bridge the digital divide, and using Linux on laptops particularly caught my attention, but Dave Jones' talk titled "Why Userspace Sucks" really stole the show.

The first of these talks, "The Effects of Filesystem Fragmentation," was led by Ard Biesheuvel, a research scientist who works on Personal Video Recorders (PVR) in the Storage Systems & Applications group of Philips Research. Biesheuvel explained that a PVR operates by recording a television signal to a box, and employs metadata to describe what is available. It has some degree of autonomy in what it does, and does not, record by creating a profile of what the user likes to watch, or recording something that a friend's PVR is recording. It records a lot, and it can often record more than one TV show at a time.

With the PVR explained as the demonstration platform, Biesheuvel's talk carried on to filesystem fragmentation. Biesheuvel said that fragmentation is generally expressed as a percentage, but a percentage does not make its impact clear. A new metric is needed to determine the impact of filesystem fragmentation, and he proposed relative speed as a useful one.

Biesheuvel showed a slide of a diagram of a hard drive platter. It showed how data is stored on tracks -- rings of data around the platter -- and each track is offset from the next by an amount appropriate for allowing the disk head to leave one track and get to the next, arriving at the right point to continue.

A gap, he explained, is the space between segments of a file not belonging to the file. Fragments are the non-contiguous pieces of the same file. Hard drives generally handle small gaps by reading through the data on the same track through the gap, while on larger gaps the drive head will seek (travel) to the track of the next fragment and then read it. Ideally, he says, there will be one seek and one rotation of the drive per track of data belonging to the file being read.
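
As a rough illustration of those two definitions (this is not code from the talk), the sketch below counts a file's fragments and measures the gaps between them. It assumes the file's layout is already known as a list of (start block, length) pairs sorted by position on disk; the function name and that input format are hypothetical.

```python
def fragments_and_gaps(extents):
    """Count fragments and measure gaps for one file.

    extents: list of (start_block, length) pairs, assumed sorted by
    ascending start_block and non-overlapping (an assumption of this sketch).
    Returns (fragment_count, list_of_gap_sizes_in_blocks).
    """
    merged = []
    for start, length in extents:
        if merged and merged[-1][0] + merged[-1][1] == start:
            # Physically adjacent pieces read as one contiguous fragment.
            prev_start, prev_length = merged[-1]
            merged[-1] = (prev_start, prev_length + length)
        else:
            merged.append((start, length))

    # A gap is the run of blocks between two consecutive fragments
    # that does not belong to this file.
    gaps = [
        nxt[0] - (cur[0] + cur[1])
        for cur, nxt in zip(merged, merged[1:])
    ]
    return len(merged), gaps


if __name__ == "__main__":
    print(fragments_and_gaps([(0, 10), (10, 5), (20, 5), (100, 10)]))
```

A small gap, such as five blocks, is the kind a drive might simply read through, while a large one would more likely trigger a seek; where that threshold lies depends on the drive and is not modeled here.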

With the background explained, he described the tools for his tests. The first, called pvrsim, operates by simulating a PVR. It writes files between 500MB and 5GB in size to disk, two at a time, endlessly emulating the life-cycle of a PVR. It deletes recordings as space is needed for new ones by a weighted popularity system.
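
The talk did not go into how pvrsim weights its deletions, so the following is only a guess at the general idea, not pvrsim itself. It assumes each recording has a positive popularity score and that less popular recordings should be proportionally more likely to be deleted; the function names, the inverse-popularity weighting, and the use of megabytes are all assumptions of this sketch.

```python
import random


def pick_victim(popularity, rng):
    """Pick one recording to delete; lower popularity means higher odds.

    popularity: dict mapping recording name -> positive popularity score.
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    names = list(popularity)
    weights = [1.0 / popularity[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def make_room(sizes_mb, popularity, free_mb, needed_mb, rng):
    """Delete recordings until needed_mb fits or nothing is left.

    sizes_mb is modified in place. Returns (free_mb, deleted_names).
    """
    deleted = []
    while free_mb < needed_mb and sizes_mb:
        candidates = {name: popularity[name] for name in sizes_mb}
        victim = pick_victim(candidates, rng)
        free_mb += sizes_mb.pop(victim)
        deleted.append(victim)
    return free_mb, deleted


if __name__ == "__main__":
    rng = random.Random(2006)
    sizes = {"news": 500, "film": 5000, "sport": 1200}
    scores = {"news": 1, "film": 5, "sport": 2}
    print(make_room(sizes, scores, free_mb=100, needed_mb=3000, rng=rng))
```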

The next tool is called hddfragchk, which is not yet available for download, but Biesheuvel says it will be made available eventually. The hddfragchk utility shows the hard drive as a diagram of tracks with the data from each file assigned a color. He demonstrated animated GIFs of hddfragchk in operation, showing the progression of the filesystem fragmentation as pvrsim runs.

The first filesystem was XFS, which showed clear color lines with small amounts of fragmentation visible as the files moved around the disk in the highly accelerated animation. The other filesystem he showed was NTFS, which resembled static as you might see on a television that is not receiving signal, as the filesystem allocated blocks wherever it could find room without much apparent planning.

Biesheuvel then went on to show a graph showing an assortment of filesystems and their speed of writing over time. All filesystems showed a decline over time, with some being worse than others, though I did not manage to scribble down the list of which was which.

Relative speed is highly filesystem dependent, he concluded. Filesystems should maintain the design principle that a single data stream should stick to its own extent, while multiple data streams must each be separately assigned their own extents.

Extents were not explicitly explained during the talk, but it can be deduced from the discussion that they are sections of the filesystem pre-allocated to a file. He expressed optimal hard drive fragmentation performance mathematically, and stated that equilibrium is achieved when as many fragments are removed as are created.

Biesheuvel also says that there is a sweet spot in fragmentation prevention with a minimum guarantee of five percent free space. At five percent free space, fragmentation is reduced. Ultimately, he says, relative speed is a useful measure of filesystem fragmentation. The worst filesystem performers do not drop below 60% of optimal speed.
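
The exact formula was not shown in a form I could copy down, so the following is only one plausible reading of "relative speed": measured throughput divided by the throughput the same drive achieves on an unfragmented file. The function name and parameters are hypothetical.

```python
def relative_speed(bytes_transferred, elapsed_seconds, optimal_bytes_per_second):
    """Return measured throughput as a fraction of optimal throughput.

    optimal_bytes_per_second is assumed to come from a separate benchmark
    of the same drive reading or writing one contiguous file.
    """
    if elapsed_seconds <= 0 or optimal_bytes_per_second <= 0:
        raise ValueError("elapsed time and optimal speed must be positive")
    measured = bytes_transferred / elapsed_seconds
    return measured / optimal_bytes_per_second


if __name__ == "__main__":
    # Made-up figures: 600 MB moved in 10 s on a drive that manages 100 MB/s.
    print(relative_speed(600e6, 10.0, 100e6))
```

By this reading, a result of 0.6 would correspond to the 60% floor mentioned for the worst performers.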

Why userspace sucks

Dave Jones, maintainer of the Fedora kernel, gave his "Why Userspace Sucks - (Or, 101 Really Dumb Things Your App Shouldn't Do)" talk in the afternoon for a standing-room only crowd. Jones' talk focused on his efforts at reducing the boot time in Fedora Core 5 (FC5), and the shocking discoveries he made along the way.

He started his work by patching the kernel to print a record of all file accesses to a log to look for waste. He found that, on boot, FC5 was touching 79,000 files and opening 26,000 of them. On shutdown, 23,000 files were touched, of which 7,000 were opened.

The Hardware Abstraction Layer (HAL) tracks hardware being added and removed from the system, to allow desktop apps to locate and use hardware. Jones says that HAL takes the approach "if it's a file, I'll open it." HAL opened and reread some XML files as many as 54 times, he found. CUPS, the printer daemon, performed 2,500 stat() calls and opened 500 files on startup, as it checked for every printer known to man.

X.org also goes overboard, according to Jones. Jones showed that X.org scans through the PCI devices in order of all potential addresses, followed by seemingly random addresses for additional PCI devices, before starting over and giving up. He paid special attention to X fonts, noting that he found that X was opening a large number of TrueType fonts on his test system.

To see what it was up to, he installed 6,000 TrueType fonts. Gnome-session, he found, touched just shy of 2,500 of them, and opened 2,434 fonts. Metacity opened 238, and the task bar manager opened 349. Even the sound mixer opened 860 fonts. The X font server, he found, was rebuilding its cache by loading every font on the system. He described the font problems as bizarre.

The next aspect of his problem identification was timers. The kernel sucks too, he said: USB fires a timer every 256 milliseconds, for example. The keyboard and mouse ports are also polled regularly, to allow support for hot-pluggable PS/2 keyboards and mice. And the little flashing cursor in the console? Yes, its timer doesn't stop when X is running, so the little console cursor will continue to flash, wasting a few more CPU cycles.

Jones says that you don't need the patched kernel and tools that he used to do the tests. Using strace, ltrace, and Valgrind is plenty to do the work to get rid of waste, says Jones.
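
For readers who want to try this, here is a minimal sketch (mine, not Jones') of counting file accesses from strace output saved to a text file. It assumes lines in strace's usual `syscall("path", ...) = result` form and only looks at a handful of syscalls; the function names are hypothetical and real traces may need a more careful parser.

```python
import re
from collections import Counter

# Matches the first quoted argument of a few file-related syscalls,
# e.g. open("/etc/passwd", O_RDONLY) or openat(AT_FDCWD, "/etc/passwd", ...).
SYSCALL_RE = re.compile(r'\b(open|openat|stat|lstat|access)\([^"]*?"([^"]*)"')


def count_file_accesses(lines):
    """Return a Counter keyed by (syscall, path) for the given trace lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def repeated_accesses(counts, threshold=2):
    """Return (syscall, path) pairs seen at least `threshold` times."""
    return {key: n for key, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    import sys

    # Usage: python count_accesses.py trace.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as trace:
        totals = count_file_accesses(trace)
    for (syscall, path), n in totals.most_common(20):
        print(n, syscall, path)
```

Files that show up dozens of times in the output, like the XML files HAL reread, are the obvious places to start looking for waste.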

An audience member asked, after fixing all these little issues, how much time is saved? Jones replied that roughly half the time wasted by unnecessary file access was saved. However, the time saved is taken up by new features and applications that also consume system resources. As a result, says Jones, it is necessary to do this kind of extensive testing regularly.

Another attendee asked, how can we avoid these problems on an on-going basis? One suggestion is to have users who don't program, but wish to be involved in improving Linux, take on the testing work. The last question of the question-and-peanut gallery answer session at the end of the talk asked if KDE was as bad as GNOME in these tests. Jones replied that he had not tried.

As the Q&A continued, the session became more of a Birds of a Feather (BoF) than a presentation. The back-and-forth between Jones and the audience had most of the packed room in stitches most of the way through.

Bridging the digital divide

In the evening, I attended a BoF session run by David Hellier, a research engineer at Australia's Commonwealth Scientific and Industrial Research Organisation (CSIRO), on the topic of bridging the digital divide. His essay on the topic won him an IBM T60 a day earlier.

Hellier says he would like to use Linux and Open Source to help bring education to the millions of extremely poor people throughout the world. In Africa alone, 44 million primary-age children cannot get a basic education.

A participant mentioned that there are 347 languages in the world which more than a million people speak, not all of which have translations of software, though some even smaller ones have translated versions of Linux. Another person pointed out that translating an operating system and applications is only part of the battle. The important part is translating the general knowledge associated with it. Tools that are translated must also be available off line. Remote, poor communities are unlikely to have much in the way of Internet access even if they are lucky enough to have electricity.

Linux developers, Hellier says, are largely employed by big companies. As such, they are in a position to suggest ways to get their companies to help close this digital divide.

How is it different from missionary work, one person asked, to send people with these unfamiliar tools to the depths of the developing world? Hellier responded that the key difference is that governments all over the world are screaming for all the help they can get.

Major software companies are going to the developing world to evangelize their wares, however, and it is important to counteract this effect. The ultimate goal is to help people help themselves, noted Hellier.

The discussion moved on to ask how to address this topic on a more regular basis than at conferences once every year or two in a BoF session. Hellier had started a wiki for discussion on bridging the digital divide, at olsdigitaldivide.wikispaces.com, prior to the start of the session. It was suggested that an IRC channel be created for further discussion, a method that an audience member noted kernel developers have used successfully for years, so the channel #digitaldivide was created on irc.oftc.net.

Hellier also recommended looking at a number of tools, including the Learning Activity Management System, Moodle, and the sysadmin-free usability of Edubuntu.

Linux on the laptop

The last session I attended yesterday was the BoF session run by Patrick Mochel of Intel on the topic of Linux on the laptop. It was an open BoF with no specific agenda and no slides. Mochel noted the presence of several relevant people to the discussion, including some developers of HAL, udev, the kernel, ACPI, and Bluetooth.

The discussion began with talk about suspend and resume support on recent laptops and the weaknesses therein. Mochel noted that while suspend and resume support is a nice thing, it does not buy you anything with the most critical aspect of a laptop -- battery life. This brought about a lengthy discussion of various things that waste electricity in a laptop. The sound device, for example, should be disabled when it is not being actively written to, and network devices that are not being used should be disabled to conserve power.

The discussion evolved quickly, turning next to network states. It is possible, argued Mochel, to have the network device down until a cable is plugged into it, in the case of wired networking, and only come up when a cable-connected interrupt is received. This can be important because a network card that is on is wasting power if it is not connected to a network.

Removing a kernel module does not necessarily reduce power to a device, someone noted. Fedora only removes modules when suspend cannot be achieved without doing so, commented another.

Another participant asked whether there is any documentation on how drivers should work with regard to power management. The answers were less than straightforward, with one person asking if there's documentation on how drivers should work for anything at all. Another suggested posting a patch to the Linux kernel mailing list and seeing the reaction.

The topic of tablet PCs and rotating touch screens was brought up. Touch screen support has been improving over the last few years, it was noted, but mainly in userland. Someone commented that the orientation of the rotating monitors on tablets is determined by differential altimeters, which sense air pressure differences between the ends of the screen.

Rotating screens are not only a problem for X, says Linux International's Jon 'maddog' Hall, but for consoles as well. Pavel Machek replied that 2.6.16 and newer kernels allow command line tools to rotate the console.

The conversation then moved to biometrics, in light of the fingerprint scanner present on many newer IBM laptops. Microsoft, came a comment, is pushing for a biometric API in its next version of Windows. A biometric API exists for Linux, and sort of works. It supports the fingerprint scanner by comparing the image taken by the scanner to ones stored, a solution noted by others present to be less than secure since the image is not hashed -- something that has been done for user passwords on Linux for years.

The second of four days of the conference saw more technical talks than the first, with Dave Jones' talk on userspace being the highlight of the day.

