July 27, 2004

The Linux filesystem challenge

Author: Preston St. Pierre

Linux boasts the widest array of filesystem support among mainstream operating systems. However, Microsoft (with Longhorn) and Apple (with Tiger) have made it clear that they consider the filesystem of the future to be a database of information to be mined, and that client PCs will be a major part of the next chapter in the "search wars." The future of Linux may depend on whether Linux filesystems continue to innovate.

Why filesystems matter

The filesystem mediates between the operating system and the storage device, mapping what the operating system understands as directories and files onto what the device understands, such as tracks and sectors. This seems like an essential but mundane function -- not one that has a major bearing on IT decision-making. However, anyone who has ever had to defragment a Windows disk, or watch fsck grind through a long recovery on a Linux ext2 disk partition, can appreciate how important the filesystem can suddenly become.

Filesystems have a major impact on how secure and reliable your data is, as well as how flexible your applications can be in interacting with that data. This latter point may not be obvious. Think, however, about Windows's expectation that files of a certain type must have a certain extension; absent that extension, applications are often at a loss for how to handle the file. Of course there are application-level workarounds for these problems, but they point to a clear tension in application design: how much should the filesystem be doing to facilitate application execution, and how much should the application be compensating for functionality not in the filesystem?
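To make the extension problem concrete, here is a minimal Python sketch of one such application-level workaround: guessing a file's type from its name versus sniffing its leading bytes. The MAGIC table and both helper functions are hypothetical and deliberately tiny; real type detection covers far more formats.

```python
import mimetypes

# Hypothetical, deliberately small table of leading "magic" bytes.
MAGIC = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
}

def type_from_name(filename):
    """Guess the type from the extension alone, as the text describes."""
    return mimetypes.guess_type(filename)[0]

def type_from_content(data):
    """Application-level workaround: inspect the first bytes of the file."""
    for prefix, mime in MAGIC.items():
        if data.startswith(prefix):
            return mime
    return None

pdf_bytes = b"%PDF-1.4 example"           # stand-in for a real file's contents
by_name = type_from_name("report")        # no extension, so nothing to go on
by_content = type_from_content(pdf_bytes) # the content still identifies it
```

The second function is exactly the kind of compensation the paragraph above describes: work the application does because the filesystem records nothing about the file's type.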

When it comes to fault tolerance and data integrity, the industry consensus is that filesystems should do the heavy lifting. This was a major challenge for Linux in the transition from the 2.2 to the 2.4 series of kernels.

By late 2000 Linux had become popular for low-end server systems. The low cost of the operating system, the low cost of the hardware needed to run it, and the relatively high performance made Linux compelling. What you did not find in late 2000, however, were significant Linux deployments for mission-critical applications. One of the biggest limitations was the filesystem.

Proprietary versions of Unix all offered journaling filesystems: filesystems that not only mediated between operating system and storage media, but kept a log of mediation activity for rapid recovery in the event of a system crash. Notable among these were IBM's JFS and SGI's XFS. Linux still relied on ext2, which achieved high performance in terms of speed but suffered from a slow and sometimes painful recovery process.
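The idea behind journaling can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not how JFS, XFS, or any real filesystem is implemented; the JournaledStore class and its method names are invented for illustration.

```python
# Toy model of a journaling layer. All names here are hypothetical.
class JournaledStore:
    def __init__(self):
        self.journal = []  # append-only log of intended writes
        self.data = {}     # stands in for the main on-disk structures

    def write(self, key, value, crash_before_apply=False):
        # Step 1: record the intended change in the journal first.
        entry = {"key": key, "value": value, "applied": False}
        self.journal.append(entry)
        if crash_before_apply:
            return  # simulate a crash after logging but before the real write
        # Step 2: apply the change and mark the journal entry as done.
        self.data[key] = value
        entry["applied"] = True

    def recover(self):
        # Replay only the logged-but-unapplied entries, rather than
        # checking every structure the way a full fsck pass would.
        replayed = 0
        for entry in self.journal:
            if not entry["applied"]:
                self.data[entry["key"]] = entry["value"]
                entry["applied"] = True
                replayed += 1
        return replayed

store = JournaledStore()
store.write("a.txt", "hello")
store.write("b.txt", "world", crash_before_apply=True)
replayed = store.recover()
```

Recovery cost here depends on the length of the journal, not the size of the store, which is the property that made journaling attractive for large server disks.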

Ext2 made Linux a great choice as a Web server. Having one of many mirrored Web heads behind a site go down and come back up slowly was an acceptable cost if the upside was very fast performance in serving flat-file Web pages. But ext2 severely limited Linux's suitability for mission-critical environments.

The release of the 2.4 kernel in January 2001 was a watershed moment for Linux. Linux gained a native journaling filesystem in ext3, and both JFS and XFS were supported options for 2.4 kernels as well. With other improvements in fault tolerance and scalability, Linux could take on an ever larger server role in the enterprise.

Still, issues of data integrity, recovery, and fault tolerance remained. These are the very same issues that arise in the world of databases and database application development. The parallel shouldn't be surprising. One can argue that a filesystem is nothing but another kind of database.

Linux filesystems today

In the enterprise, Linux is viewed primarily as a server operating system. Not surprisingly, then, filesystem innovation has been driven by server needs. The performance and fault tolerance that come with a journaling filesystem were the earliest need. There's a good technical comparison of journaling filesystems in Linux Gazette.

Work has progressed more slowly on incorporating attributes into filesystems. Attributes are short name-value pairs that are associated with each file; familiar stuff to anyone from the database world. "Phone number," "email," and "mime type" are examples of entities that could be attributes. Attributes help a filesystem present its structure to the operating system in a rich and meaningful way.
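As a rough illustration of why name-value attributes matter for search, the following Python sketch models them as an in-memory table and runs a database-style query over it. The file paths, attribute values, and the find_files helper are all made up for the example; a real attributed filesystem would store and index this data itself.

```python
# Toy in-memory model of per-file attributes. Paths and values are invented.
attributes = {
    "notes/alice.vcf": {"mime type": "text/vcard", "email": "alice@example.com"},
    "notes/bob.vcf": {"mime type": "text/vcard", "phone number": "555-0100"},
    "photos/cat.png": {"mime type": "image/png"},
}

def find_files(attrs, name, value):
    """Return the sorted paths whose attribute `name` equals `value`."""
    return sorted(
        path for path, pairs in attrs.items() if pairs.get(name) == value
    )

# Query by attribute instead of by directory or file name.
vcards = find_files(attributes, "mime type", "text/vcard")
alice = find_files(attributes, "email", "alice@example.com")
```

The query never looks at where a file lives in the directory tree, which is the shift from hierarchical to attribute-based organization that the rest of this article is concerned with.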

The search wars

The early days of the Web saw fierce competition among search engines: Alta Vista, Lycos, Magellan, and Inktomi all strove to dominate the search market. Seemingly out of nowhere, Google emerged as the clear winner. Google's ascendance signaled not the end of the "search wars," but rather the beginning.

Microsoft realizes that it has so far lost the search war, much as it lost the early stages of the browser war. To make a comeback, it must find and dominate an area of search technology where Google is not already entrenched. It has chosen the desktop, an area about which many users are asking these days, "Why is it easier to search the Internet than my hard drive?"

Microsoft will try to leverage its ability to tweak the operating system (hence WinFS) to become a leader in desktop search capability. It can then couple that with its online presence to offer unified search. Google already dominates online search, and will have to find an application-level solution to extend its capabilities to the desktop.

This competition has broad implications for the enterprise. In any large company the thousands of enterprise desktops house valuable data. Any software that figures out how to make that data easy to retrieve will be a compelling choice for the enterprise desktop.

There are implications beyond the desktop, however. Think about why Oracle urges deploying its database configured for raw disk I/O. Such a configuration increases performance because it enables Oracle to have its database function as a filesystem. If Microsoft can enable its filesystem to function as a database, then at least on small to midsized applications SQL Server may be able to compete with Oracle as never before.

In an interview with New Scientist, IDC operating system analyst Dan Kusnetzky says, "A number of people have started to say we need to use the technology developed for databases and Web searching and use them for the filesystem."

Where does Linux stand in all of this? If Linux is really to compete on the desktop, and if Linux is to advance its hold in the server space, then it must enter the search wars, and do so at the filesystem level.

The first serious effort to incorporate attributes with Linux came from the application side, not the filesystem side, and not surprisingly it came from someone with a long history at Apple: Andy Hertzfeld. Hertzfeld's Eazel brought us the Nautilus file browser, an elegant addition to the Linux desktop interface that attempted to deliver many of the benefits of an attributed filesystem: automatic viewing/previewing of file contents, attribute-based rather than hierarchical folders, and intelligent recognition of file type for application handling.

Alas, Eazel was ahead of its time, and suffered the fate of many dot-coms. Nautilus, however, lives on as part of the GNOME Desktop.

The real future for Linux, though, depends on filesystem innovations that enable Linux to keep up or lead in the race with Longhorn and Tiger.

Longhorn, Microsoft's next-generation operating system, expected in 2006, will include WinFS, a filesystem built on an object-relational database structure. Microsoft expects this to improve speed and stability, and also to greatly facilitate search capability.

In Tiger, expected in the first half of next year, Apple will debut a new search technology called Spotlight. Not only will Spotlight speed searching, but it will return richer data about the files it searches "by indexing the descriptive informational items already saved within your files and documents called metadata."

The next-generation Linux filesystem should facilitate comparable functionality, rather than requiring applications to compensate for capabilities the filesystem lacks. There's some genuine awareness and discussion of this on the GNOME Desktop mailing list. The GNOME developers realize that they need attribute functionality in the filesystem, and that they need it on a timetable that puts them ahead of the WinFS release in Longhorn.

Linux already has a viable next-generation filesystem candidate in ReiserFS. ReiserFS is not just a journaling filesystem, but one that uses an innovative database structure (so-called "dancing trees"). While ReiserFS does not have a native concept of attributes per se, its ability to handle lots of small files with negligible performance hits means that all the metadata functionality we associate with attributes can be built in.

For now the emphasis is on "can be." This is a clear direction in which Linux is moving, but we're not there yet.

Looking to the future

All indications are that Linux, Windows, and Mac OS are moving in a common direction with filesystem innovation. Linux's continued success depends on who gets there first, and how the market reacts to the Linux approach.

Much also depends on what happens competitively within the Linux market. Right now, more real innovation seems to be coming from Novell/SUSE than from Red Hat. Novell's Miguel de Icaza and Nat Friedman have been very clear about the competitive challenge presented by Longhorn. SUSE already ships with ReiserFS as the default filesystem (Red Hat defaults to ext3).

Linux is a ways yet from having a fully attributed, database-driven, journaling filesystem. The direction of future development looks promising, though. Linux will certainly compete as the search wars come to the desktop. Linux's value to the enterprise depends on it.
