Linux.com

Feature: News

Day two at OLS: Why userspace sucks, and more

By David 'cdlu' Graham on July 21, 2006 (8:00:00 AM)

OTTAWA -- Day two of the eighth annual Ottawa Linux Symposium (OLS) was more technical than the first. Of the talks, the discussions on the effects of filesystem fragmentation, using Linux to bridge the digital divide, and using Linux on laptops particularly caught my attention, but Dave Jones' talk titled "Why Userspace Sucks" really stole the show.

The first of these talks, "The Effects of Filesystem Fragmentation," was led by Ard Biesheuvel, a research scientist who works on Personal Video Recorders (PVR) in the Storage Systems & Applications group of Philips Research. Biesheuvel explained that a PVR operates by recording a television signal to a box, and employs metadata to describe what is available. It has some degree of autonomy in what it does, and does not, record by creating a profile of what the user likes to watch, or recording something that a friend's PVR is recording. It records a lot, and it can often record more than one TV show at a time.

With the PVR explained as the demonstration platform, Biesheuvel's talk carried on to filesystem fragmentation. Biesheuvel says that fragmentation is generally expressed as a percentage, but a percentage alone does not make its impact clear. A new metric is needed to determine the impact of filesystem fragmentation, and a useful one is relative speed.

Biesheuvel showed a slide of a diagram of a hard drive platter. It showed how data is stored on tracks -- rings of data around the platter -- and each track is offset from the next by an amount appropriate for allowing the disk head to leave one track and get to the next, arriving at the right point to continue.

A gap, he explained, is the space between two segments of a file that does not belong to that file. Fragments are the non-contiguous pieces of the same file. Hard drives generally handle small gaps by reading through the data on the same track through the gap, while on larger gaps the drive head will seek (travel) to the track of the next fragment and then read it. Ideally, he says, there will be one seek and one rotation of the drive per track of data belonging to the file being read.
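To make the terms concrete, here is a minimal sketch in Python. It is not taken from the talk. It assumes a file's layout is given as a list of (start block, length) extents in ascending disk order. The names fragments_and_gaps, count_seeks, and read_through_limit are hypothetical, and the 64-block threshold is an arbitrary stand-in for the point at which a drive would seek instead of reading through a gap.

```python
from typing import List, Tuple

# Hypothetical representation of one piece of a file: (start_block, length_in_blocks).
Extent = Tuple[int, int]


def fragments_and_gaps(extents: List[Extent]) -> Tuple[int, List[int]]:
    """Return the fragment count and the size of each gap between fragments.

    Pieces that touch each other are merged first, because contiguous
    pieces of a file form a single fragment.
    """
    if not extents:
        return 0, []
    ordered = sorted(extents)
    merged = [list(ordered[0])]
    for start, length in ordered[1:]:
        last_start, last_length = merged[-1]
        if start <= last_start + last_length:
            # Contiguous (or overlapping) with the previous piece: extend it.
            merged[-1][1] = max(last_length, start + length - last_start)
        else:
            merged.append([start, length])
    # A gap is the run of foreign blocks between two consecutive fragments.
    gaps = [nxt[0] - (cur[0] + cur[1]) for cur, nxt in zip(merged, merged[1:])]
    return len(merged), gaps


def count_seeks(gaps: List[int], read_through_limit: int) -> int:
    """Count gaps too large to read through, so the head would have to seek."""
    return sum(1 for gap in gaps if gap > read_through_limit)


# Made-up layout: two touching pieces, a small gap, then a large gap.
file_extents = [(0, 100), (100, 50), (160, 40), (5000, 200)]
fragment_count, gap_sizes = fragments_and_gaps(file_extents)
seeks = count_seeks(gap_sizes, read_through_limit=64)
print(fragment_count, gap_sizes, seeks)
```

In this made-up layout the first two pieces merge into one fragment, the 10-block gap is small enough to read through, and only the large gap costs a seek.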

With the background explained, he described the tools for his tests. The first, called pvrsim, operates by simulating a PVR. It writes files between 500MB and 5GB in size to disk, two at a time, endlessly emulating the life-cycle of a PVR. It deletes recordings as space is needed for new ones by a weighted popularity system.

The next tool is called hddfragchk, which is not yet available for download, but Biesheuvel says it will be made available eventually. The hddfragchk utility shows the hard drive as a diagram of tracks with the data from each file assigned a color. He demonstrated animated GIFs of hddfragchk in operation, showing the progression of the filesystem fragmentation as pvrsim runs.

The first filesystem was XFS, which showed clear color lines with small amounts of fragmentation visible as the files moved around the disk in the highly accelerated animation. The other filesystem he showed was NTFS, which resembled static as you might see on a television that is not receiving signal, as the filesystem allocated blocks wherever it could find room without much apparent planning.

Biesheuvel then presented a graph of the write speed of an assortment of filesystems over time. All filesystems showed a decline over time, with some being worse than others, though I did not manage to scribble down the list of which was which.

Relative speed is highly filesystem dependent, he concluded. Filesystems should maintain the design principle that a single data stream should stick to its own extent, while multiple data streams must each be separately assigned their own extents.

Although extents were not explicitly explained during the talk, it can be deduced from the discussion that they are sections of the filesystem pre-allocated to a file. He expressed optimal hard drive fragmentation performance mathematically, and stated that equilibrium is achieved when as many fragments are removed as are created.

Biesheuvel also says that there is a sweet spot in fragmentation prevention with a minimum guarantee of five percent free space. At five percent free space, fragmentation is reduced. Ultimately, he says, relative speed is a useful measure of filesystem fragmentation. The worst filesystem performers do not drop below 60% of optimal speed.

Why userspace sucks

Dave Jones, maintainer of the Fedora kernel, gave his "Why Userspace Sucks - (Or, 101 Really Dumb Things Your App Shouldn't Do)" talk in the afternoon for a standing-room only crowd. Jones' talk focused on his efforts at reducing the boot time in Fedora Core 5 (FC5), and the shocking discoveries he made along the way.

He started his work by patching the kernel to print a record of all file accesses to a log to look for waste. He found that, on boot, FC5 was touching 79,000 files and opening 26,000 of them. On shutdown, 23,000 files were touched, of which 7,000 were opened.

The Hardware Abstraction Layer (HAL) tracks hardware being added and removed from the system, to allow desktop apps to locate and use hardware. Jones says that HAL takes the approach "if it's a file, I'll open it." HAL opened and reread some XML files as many as 54 times, he found. CUPS, the printer daemon, performed 2,500 stat() calls and opened 500 files on startup, as it checked for every printer known to man.

X.org also goes overboard, according to Jones. Jones showed that X.org scans through the PCI devices in order of all potential addresses, followed by seemingly random addresses for additional PCI devices, before starting over and giving up. He paid special attention to X fonts, noting that he found that X was opening a large number of TrueType fonts on his test system.

To see what it was up to, he installed 6,000 TrueType fonts. Gnome-session, he found, touched just shy of 2,500 of them, and opened 2,434 fonts. Metacity opened 238, and the task bar manager opened 349. Even the sound mixer opened 860 fonts. The X font server, he found, was rebuilding its cache by loading every font on the system. He described the font problems as bizarre.

The next aspect of his problem identification was timers. The kernel sucks too, he said: USB fires a timer every 256 milliseconds, for example. The keyboard and mouse ports are also polled regularly, to allow support for hot-pluggable PS/2 keyboards and mice. And the little flashing cursor in the console? Yes, its timer doesn't stop when X is running, so the little console cursor will continue to flash, wasting a few more CPU cycles.

Jones says that you don't need the patched kernel and tools that he used to do the tests. Using strace, ltrace, and Valgrind is plenty to do the work to get rid of waste, says Jones.
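As an illustration of the strace approach (a sketch, not Jones' own tooling), the Python below tallies file-related system calls in a trace log. The names CALL_RE, FILE_SYSCALLS, and summarize are hypothetical, and the sample lines are hand-written in the style of strace output rather than captured from a real system.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches lines such as:  open("/etc/fonts/fonts.conf", O_RDONLY) = 3
# An optional leading PID column and an optional AT_FDCWD argument are skipped.
CALL_RE = re.compile(r'^(?:\d+\s+)?(\w+)\((?:AT_FDCWD, )?"([^"]*)"')

# The system calls counted here; this list is an illustrative choice.
FILE_SYSCALLS = {"open", "openat", "stat", "lstat", "access"}


def summarize(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count file-related calls per system call name and per path."""
    calls: Counter = Counter()
    paths: Counter = Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if match and match.group(1) in FILE_SYSCALLS:
            calls[match.group(1)] += 1
            paths[match.group(2)] += 1
    return calls, paths


# Hand-written sample in strace's output style (not a real capture).
sample_log = """\
open("/etc/fonts/fonts.conf", O_RDONLY) = 3
stat("/usr/share/fonts/a.ttf", {st_mode=S_IFREG|0644, st_size=1024, ...}) = 0
open("/etc/fonts/fonts.conf", O_RDONLY) = 3
1234 open("/usr/share/fonts/a.ttf", O_RDONLY) = 4
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
read(3, "...", 4096) = 120
""".splitlines()

calls, paths = summarize(sample_log)
print(calls.most_common())
# Paths touched more than once are candidates for wasted work.
print([path for path, count in paths.items() if count > 1])
```

A real log could be gathered with something like strace -f -e trace=file -o trace.log followed by the command to examine, and then passed to summarize() as an open file.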

An audience member asked, after fixing all these little issues, how much time is saved? Jones replied that roughly half the time wasted by unnecessary file access was saved. However, the time saved is taken up by new features and applications that also consume system resources. As a result, says Jones, it is necessary to do this kind of extensive testing regularly.

Another attendee asked, how can we avoid these problems on an on-going basis? One suggestion is to have users who don't program, but wish to be involved in improving Linux, take on the testing work. The last question of the question-and-peanut gallery answer session at the end of the talk asked if KDE was as bad as GNOME in these tests. Jones replied that he had not tried.

As the Q&A continued, the session became more of a Birds of a Feather (BoF) than a presentation. The back-and-forth between Jones and the audience had most of the packed room in stitches most of the way through.

Bridging the digital divide

In the evening, I attended a BoF session run by David Hellier, a research engineer at Australia's Commonwealth Scientific and Industrial Research Organisation (CSIRO), on the topic of bridging the digital divide. His essay on the topic won him an IBM T60 a day earlier.

Hellier says he would like to use Linux and Open Source to help bring education to the millions of extremely poor people throughout the world. In Africa alone, 44 million primary aged children cannot get a basic education.

A participant mentioned that there are 347 languages in the world which more than a million people speak, not all of which have translations of software, though some even smaller ones have translated versions of Linux. Another person pointed out that translating an operating system and applications is only part of the battle. The important part is translating the general knowledge associated with it. Tools that are translated must also be available off line. Remote, poor communities are unlikely to have much in the way of Internet access even if they are lucky enough to have electricity.

Linux developers, Hellier says, are largely employed by big companies. As such, they are in a position to suggest ways to get their companies to help close this digital divide.

How is it different from missionary work, one person asked, to send people with these unfamiliar tools to the depths of the developing world? Hellier responded that the key difference is that governments all over the world are screaming for all the help they can get.

Major software companies are going to the developing world to evangelize their wares, however, and it is important to counteract this effect. The ultimate goal is to help people help themselves, noted Hellier.

The discussion moved on to how to address this topic more regularly than in a BoF session at a conference once every year or two. Before the session began, Hellier had started a wiki for discussion on bridging the digital divide at olsdigitaldivide.wikispaces.com. It was also suggested that an IRC channel be created for further discussion, a method, noted an audience member, that kernel developers have used successfully for years, so an IRC channel, #digitaldivide, was created on irc.oftc.net.

Hellier also recommended looking at a number of tools, including the Learning Activity Management System, Moodle, and the sysadmin-free usability of Edubuntu.

Linux on the laptop

The last session I attended yesterday was the BoF session run by Patrick Mochel of Intel on the topic of Linux on the laptop. It was an open BoF with no specific agenda and no slides. Mochel noted the presence of several relevant people to the discussion, including some developers of HAL, udev, the kernel, ACPI, and Bluetooth.

The discussion began with talk about suspend and resume support on recent laptops and the weaknesses therein. Mochel noted that while suspend and resume support is a nice thing, it does not buy you anything with the most critical aspect of a laptop -- battery life. This brought about a lengthy discussion of various things that waste electricity in a laptop. The sound device, for example, should be disabled when it is not being actively written to, and network devices that are not being used should be disabled to conserve power.

The discussion evolved quickly, turning next to network states. It is possible, argued Mochel, to have the network device down until a cable is plugged into it, in the case of wired networking, and only come up when a cable-connected interrupt is received. This can be important because a network card that is on is wasting power if it is not connected to a network.

Removing a kernel module does not necessarily reduce power to a device, someone noted. Fedora only removes modules when suspend cannot be achieved without doing so, commented another.

Another participant asked whether there's any documentation on how drivers should work with regard to power management. The answers were less than straightforward, with one person asking if there's documentation on how drivers should work for anything at all. Another suggested posting a patch to the Linux kernel mailing list and seeing the reaction.

The topic of tablet PCs and rotating touch screens was brought up. Touch screen support has been improving over the last few years, it was noted, but mainly in userland. Someone commented that the orientation of the rotating monitors on tablets is determined by differential altimeters that sense the air pressure difference between the two ends of the screen.

Rotating screens are not only a problem for X, says Linux International's Jon 'maddog' Hall, but for consoles as well. Pavel Machek replied that 2.6.16 and newer kernels allow command line tools to rotate the console.

The conversation then moved on to biometrics, in light of the fingerprint scanner present on many newer IBM laptops. Microsoft, came a comment, is pushing for a biometric API in its next version of Windows. A biometric API exists for Linux, and sort of works. It supports the fingerprint scanner by comparing the image taken by the scanner to ones stored, a solution noted by others present to be less than secure since the image is not hashed -- something that has been done for user passwords on Linux for years.

The second of four days of the conference saw more technical talks than the first, with Dave Jones' talk on userspace being the highlight of the day.

Comments

on Day two at OLS: Why userspace sucks, and more

Note: Comments are owned by the poster. We are not responsible for their content.

Scratch-centric design.

Posted by: Anonymous Coward on July 22, 2006 05:49 AM
"Why userspace sucks

Dave Jones, maintainer of the Fedora kernel, gave his "Why Userspace Sucks - (Or, 101 Really Dumb Things Your App Shouldn't Do)" talk in the afternoon for a standing-room only crowd. Jones' talk focused on his efforts at reducing the boot time in Fedora Core 5 (FC5), and the shocking discoveries he made along the way."

Probably because F/OSS is more concerned with "scratching their itches", than user-centric design.

#

Re:Scratch-centric design.

Posted by: Anonymous Coward on July 22, 2006 06:57 AM
If you talk to new Linux users, yes they may want a faster boot up. However, if you talk to the most hardcore programming Linux geeks you can imagine, they would be very concerned about such a huge level of bloat (relatively speaking). I would not be surprised if Linux using big-business got involved to streamline HAL and other such processes, since large levels of disk access can only be a bad thing, especially for embedded systems.

#

Re:Scratch-centric design.

Posted by: Jeremy Akers on July 22, 2006 08:41 AM
Quite the contrary.

If you actually read the article, you'll note that the applications that are causing the problems, are applications that have received heavy modifications recently due to user demand. In other words, the features that were added were due to user centric design, not in spite of it. Developers are struggling to make F/OSS so user friendly, that they are starting to sacrifice performance for features.

In case you can't find the relevant pieces:

"The Hardware Abstraction Layer (HAL) tracks hardware being added and removed from the system, to allow desktop apps to locate and use hardware" ...

"CUPS, the printer daemon, performed 2,500 stat() calls and opened 500 files on startup, as it checked for every printer known to man." ...

"Jones showed that X.org scans through the PCI devices in order of all potential addresses" ...

"The keyboard and mouse ports are also polled regularly, to allow support for hot-pluggable PS/2 keyboards and mice"

Hardware support, automatic hardware detection, plug and play, etc. Features users have been screaming for. When we didn't have them, people screamed about F/OSS not being user centric. Now they're there, and because they cause a performance hit at boot up, again people like you are bitching because it's not user centric enough.

So go use Windows and stop reading Newsforge... :p
If you're not either paying for the software or contributing to it in one way or another, you're a freeloader. Nothing wrong with that, freeload all you want. Find a problem? File a bug report, get it fixed. Want a new feature? Fill out a feature request, you'd be surprised how often simple requests that do make sense will get added. But being a freeloader, and bitching about people not bending to your will is a little... selfish? rude? I dunno, you pick.

Not that I'm accusing anyone of being a freeloader here, but chances are... :p

-Jeremy

#

Re:Scratch-centric design.

Posted by: Anonymous Coward on July 22, 2006 09:37 AM
"So go use Windows and stop reading Newsforge... :p"

Is that the best you can come up with? I'll address that attitude in a minute, but first "performance" is a part of "user-centric" design. Not just ease-of-use, or looking pretty. It's been that way since Apple first made the GUI mainstream. Now as for the attitude. Need I point out that it's F/OSS that's coming to people hat in hand. Not the other way around. The real world already had their solutions before F/OSS even showed up. So F/OSS taking their ball and going home isn't going to make a lick of difference to the rest of the world.

#

Re:Scratch-centric design.

Posted by: Jeremy Akers on July 22, 2006 12:42 PM
"but first "performance" is a part of "user-centric" design"

Prove it. User centric design is whatever the users decide. Open source software always had (and still does have) a reputation for being lightweight and fast. Only because users have been screaming for these features have they been added. There is a cost to everything. You can't have it all for free.

"Need I point out that it's F/OSS that's coming to people hat in hand."

I see you pointing, but at what I'm not sure. You show me the hat or the hand. Until then, it's an empty claim.

"The real world already had their solutions before F/OSS even showed up."

HAHA. "The real world" Like this is an imaginary one? The problem with your argument is that before F/OSS showed up, people had to pay out the ass for those solutions, and they were stuck with what they got from their vendor. Now the solutions are free and you can customize them to your needs and even re-sell your customizations. Yet somehow that isn't good enough for you... Fine, go pay for the closed solution that you can't modify, we don't care, it's your loss.

"So F/OSS taking their ball and going home isn't going to make a lick of difference to the rest of the world."

Well I took the ball and went home, so why are you still following me?

-Jeremy

#

Re:Scratch-centric design.

Posted by: Anonymous Coward on July 22, 2006 08:09 PM
The flames are unfortunate, but I'm replying here as the parent nicely summarizes a problem I too have noticed.

It's not just daemons and X that have the problem, and GNOME isn't alone, either. Try strace -efile . Due to the various places distributions put various files and testing for system and user configs, it's not uncommon at all for an app to check ten or more different locations for a single config file or resource (icon, cursor, font, etc) multiplied by sometimes fifty or more such files! (As a former power user of over a decade of a rather less Freedom oriented OS, it reminded me of the hundreds of lines of registry accesses I used to look at.) Fortunately, on a conventional desktop, many of these are common to many apps, and after the first access, the file or lack of file will be cached so it's just a cache check, but opening those first applications that actually read the data from disk into cache can be /quite/ the killer, timewise.

Thinking about it, it's not realistic to expect all distributions to settle on a common location for everything, but I wonder why the first such file opened couldn't be in a standardized location, say /etc/fslocations and ~/.fslocations or similar, and list the distribution/local reference location for all these things. One reference location to look (well, one each for system and user), which would list the single location for each category of file. As KDE or GNOME or whatever installed, it would update this file with additional local paths (entries appropriately namespace organized so as not to conflict with each other) as appropriate. Only if this file didn't exist, couldn't be read (wrong permissions), or didn't have the appropriate entry would apps fall back to checking their dozen different locations for each file.

That would add a couple more lookups per app, but would subtract hundreds in many cases, perhaps thousands in some.

Even if this weren't to become a full community standard, individual toolkits/DEs/whatever could have their own standard. In that case, it might add a few more lookups as the index file was looked for in all the various possible locations, but it would still be one file's lookups, while subtracting the same lookups for often scores of other files.
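A rough Python sketch of the index-file idea follows. Everything in it is an assumption made for illustration: the "category = directory" file format, the category names, and the functions parse_index and candidate_paths are hypothetical, since the proposal above does not define them.

```python
import posixpath
from typing import Dict, List


def parse_index(text: str) -> Dict[str, str]:
    """Parse hypothetical 'category = directory' lines.

    Blank lines and anything after a '#' are ignored.
    """
    index: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            index[key.strip()] = value.strip()
    return index


def candidate_paths(category: str, filename: str,
                    index: Dict[str, str],
                    fallback_dirs: List[str]) -> List[str]:
    """Return the paths an app would probe, in order.

    With an index entry there is exactly one path to try; without one,
    the app falls back to probing every directory it knows about.
    """
    if category in index:
        return [posixpath.join(index[category], filename)]
    return [posixpath.join(directory, filename) for directory in fallback_dirs]


# Made-up index contents, standing in for a file such as /etc/fslocations.
index = parse_index("""
# namespaced entries, one per category
kde.icons = /usr/share/icons
gnome.fonts = /usr/share/fonts
""")

fallbacks = ["/usr/share/icons", "/usr/local/share/icons", "/opt/kde/share/icons"]
indexed = candidate_paths("kde.icons", "folder.png", index, fallbacks)
unindexed = candidate_paths("xfce.icons", "folder.png", index, fallbacks)
print(len(indexed), len(unindexed))
```

The sketch only builds the list of paths to probe. The saving described above comes from that list shrinking to a single entry whenever the index has a matching category.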

As for CUPS and etc, what about having a single discovery-lockfile, again, one at the system level and another at the admin-level user level, that toggled discovery on and off. To add a piece of hardware or just get the flexibility of being able to do so, with the performance tradeoff that goes with the flexibility of course, toggle it on. Toggle it off for streamlined use when such detection isn't needed.

Obviously it's not quite that simple, but the idea seems workable enough.

Duncan

#

LWN article

Posted by: Anonymous Coward on July 23, 2006 12:01 AM
There's a fascinating LWN (http://www.lwn.net/) article titled "The 2006 Linux File Systems Workshop" (http://lwn.net/Articles/190222/) that goes into more detail about what is needed in future Linux filesystems. There are a few research papers mentioned; I found "File System Aging - Increasing the Relevance of File System Benchmarks" (http://www.eecs.harvard.edu/margo/papers/sigmetrics97-fs/paper.pdf) the most interesting.

#

Why userspace sucks? Boot Up Only What's Known!

Posted by: Anonymous Coward on July 23, 2006 03:39 AM
Why are you checking for every piece of hardware, or device known to man on every BOOT anyway! Your boot sequence thought process is seriously flawed. You're treating even a completed HDD install of Linux as if you are booting a "LIVE CD". Only Boot to what is known, and configured previously on HDD installed systems. Add a "Scan For Hardware" (changes) menu option to Lilo/Grub. Then pass the "Scan For Hardware" app/script, on to the Desktop GUI people. They could then add to that, a graphical way to scan-for-hardware, and load/unload kernel modules.
That way, you see, the Linux Desktop GUI's can come up to speed with Windows 95 in this regard. Currently, users who can't/won't use the command line, are painted into a corner when devices aren't found at boot. This method fixes both problems, and should net you wildly quicker boots!

Cleaning up the wacky calls in the code is great work to do. But adding some real-world "common sense" to the boot process couldn't hurt either.

#

Re:Why userspace sucks? Boot Up Only What's Known!

Posted by: Anonymous Coward on July 24, 2006 02:47 AM
Now as an option this may be good but this FEATURE makes my life so easy.

For example last year I blew a big chunk on my motherboard. It was dead as a dodo. Got a new one put it in the box reconnected all the existing hardware and rebooted.

Straight from reboot I'm using the machine without any alterations. Purely because of this feature of the kernel checking the hardware on startup.

Same happened on a windows machine and it was a complete new install as it only runs on the one hardware setting (dependably). Oh and I am now a FILTHY CRIMINAL using pirated software, because the key check doesn't match.

#

Re:Why userspace sucks? Boot Up Only What's Known!

Posted by: Anonymous Coward on July 24, 2006 07:11 AM
"Now as an option this may be good, but this FEATURE makes my life so easy."

Is it worth five minutes of your life, every time you start your computer? Particularly when, when confronted with seldom occurring events such as you describe, you need only to click on the Scan-For-Hardware menu option in Lilo/Grub on first boot. And, one of the obvious reasons it needs to be included as part of this specific solution.

#

This story has been archived. Comments can no longer be posted.



 