May 17, 2007

LogFS: A new way of thinking about flash filesystems

Author: Joe 'Zonker' Brockmeier

Storage manufacturers are getting ready to ship solid state disks, and Linux-based devices like One Laptop per Child's XO and Intel's Classmate don't contain standard hard disks. To improve performance on the wide array of flash memory storage devices now available, project leader Jörn Engel has announced LogFS, a scalable filesystem designed specifically for flash devices.

Linux users already have two mature flash filesystems to choose from -- JFFS2 and YAFFS -- so why do they need another? Engel discounts both of these options, for different reasons. Engel says that YAFFS "has never made a serious attempt of kernel integration, which may disqualify it" for many potential users. At the same time, Engel says that memory consumption and mount time for JFFS2 are unacceptable on larger flash devices. "Unlike most filesystems, there is no tree structure of any sorts on the medium, so the complete medium needs to be scanned at mount time and a tree structure kept in-memory while the filesystem is mounted. With bigger devices, both mount time and memory consumption increase linearly."
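To make the scaling problem concrete, here is a minimal Python sketch of a mount that has to scan the whole medium. It is a toy model and not JFFS2's actual on-flash format or code: the `Node` record, `make_medium`, and `mount_by_scanning` are hypothetical names invented for illustration. The point is only that, with no tree stored on the medium, every record must be read once and an index held in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One log record on the medium (hypothetical, heavily simplified)."""
    inode: int    # which file the record belongs to
    version: int  # a higher version supersedes a lower one
    offset: int   # physical position of the record on the medium


def make_medium(n_files, versions_per_file):
    """Fill a toy medium with several versions of each file's record."""
    medium = []
    for version in range(versions_per_file):
        for inode in range(n_files):
            medium.append(Node(inode, version, len(medium)))
    return medium


def mount_by_scanning(medium):
    """Build the in-memory index by reading every record on the medium."""
    index = {}        # inode -> newest Node seen so far
    records_read = 0
    for node in medium:
        records_read += 1
        current = index.get(node.inode)
        if current is None or node.version > current.version:
            index[node.inode] = node
    return index, records_read


small_index, small_reads = mount_by_scanning(make_medium(100, 3))
large_index, large_reads = mount_by_scanning(make_medium(1000, 3))
print(small_reads, len(small_index))   # 300 100
print(large_reads, len(large_index))   # 3000 1000
```

A medium ten times larger costs ten times the reads at mount and ten times the index entries in RAM, which is the linear growth Engel objects to. A filesystem that stores its tree on the medium can instead start from a known root and read only what it needs.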

How much storage can LogFS support? Engel says he's not sure. "Honestly, I don't know. All important data structures are 64-bit, so it is possible to work with exabyte devices as soon as they become available. Whether LogFS will really scale that far remains to be proven, though."

Realistically, Engel says, "I would imagine something like 64 megabytes to be a reasonable size." He notes that LogFS will work with smaller devices, but that JFFS2 may be a better choice for very small flash filesystems. LogFS "certainly works with as little as 2 megabytes, but JFFS2 has less overhead and the mount time problems don't matter much for such small devices.... JFFS2 remains a good choice for the devices it was once designed for."

Why not fix the problems with these filesystems, or work on making ext3 suitable for flash filesystems? Engel says he doesn't think that the existing flash filesystems could be improved. "Why else would I have started this project in the first place?"

The first problem with using ext3 on flash devices, Engel says, is the erase operation that flash requires. Unlike a hard disk, he says, a "sector" of a flash device can be written only once before it has to be erased:

"After that a relatively large piece of flash needs to be erased. The size of these erase blocks differ -- it is usually between 16 and 128KB.

"After this erase, all data is gone and cannot be recovered. So a flash filesystem has to make sure that no important data is in the area before it gets nuked. If there is, the filesystem has to move it elsewhere first.

"Moving data elsewhere means there is no fixed relationship between the physical location of data on the device and the logical location of data in terms of file and file offset. Ext2/ext3 and most of the other disks filesystems depend on such a fixed relationship, so they don't work as flash filesystems."
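The bookkeeping this implies can be sketched in a few lines of Python. This is an illustrative toy, not LogFS code: the `ToyFlash` class, its four-page erase blocks, and the `location` map are all assumptions made for the example. It shows the two properties Engel describes: pages are written once and then only erased a whole block at a time, and live data has to be moved (and its mapping updated) before the erase.

```python
PAGES_PER_BLOCK = 4  # assumption: tiny erase blocks keep the demo readable


class ToyFlash:
    """Write-once pages grouped into erase blocks (hypothetical model)."""

    def __init__(self, n_blocks):
        # None means "erased and writable"; anything else is written data.
        self.blocks = [[None] * PAGES_PER_BLOCK for _ in range(n_blocks)]
        self.location = {}  # logical key -> (block, page) of the live copy

    def _free_page(self, avoid=None):
        for b, block in enumerate(self.blocks):
            if b == avoid:
                continue
            for p, page in enumerate(block):
                if page is None:
                    return b, p
        raise RuntimeError("no free page left")

    def write(self, key, data, avoid=None):
        # An update never overwrites in place; it goes to a fresh page.
        b, p = self._free_page(avoid)
        self.blocks[b][p] = (key, data)
        self.location[key] = (b, p)  # any older copy is now stale

    def read(self, key):
        b, p = self.location[key]
        return self.blocks[b][p][1]

    def erase(self, victim):
        # Move every still-live page out of the block before wiping it.
        for p, page in enumerate(self.blocks[victim]):
            if page is not None and self.location.get(page[0]) == (victim, p):
                self.write(page[0], page[1], avoid=victim)
        self.blocks[victim] = [None] * PAGES_PER_BLOCK


flash = ToyFlash(n_blocks=3)
flash.write("a", "v1")
flash.write("b", "v1")
flash.write("a", "v2")          # the first copy of "a" becomes stale
before = flash.location["b"]
flash.erase(0)                  # "b" and the live "a" are relocated first
after = flash.location["b"]
print(before, after, flash.read("a"), flash.read("b"))
```

After the erase, `b` reads back unchanged but lives at a different physical address. A filesystem that computes where data lives from the file and offset alone has no place to record that move, which is the fixed relationship Engel says ext2 and ext3 depend on.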

Another problem with ext3, Engel says, is wear leveling. Each segment of flash memory can be erased only a limited number of times before it becomes unreliable. Filesystems designed for standard hard disks do not spread erases evenly, so they would wear out the most frequently rewritten segments first and effectively render the device unusable. Engel says flash-specific filesystems are necessary to give a flash device a longer life.
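A rough Python sketch shows why the placement policy matters. It is a deliberately simple model and not the algorithm LogFS uses: `pick_block` and `simulate` are hypothetical helpers, every write is assumed to cost one erase, and real wear leveling also has to deal with data that never changes. One run rewrites a single fixed block, as a disk filesystem updating a hot structure in place would; the other always picks the least-erased block.

```python
def pick_block(erase_counts):
    """Return the index of the block with the fewest erases (ties: lowest index)."""
    return min(range(len(erase_counts)), key=lambda b: (erase_counts[b], b))


def simulate(n_blocks, n_writes, wear_leveling):
    """Count erases per block for a stream of rewrites of one hot item."""
    erase_counts = [0] * n_blocks
    for _ in range(n_writes):
        if wear_leveling:
            target = pick_block(erase_counts)
        else:
            target = 0  # fixed location: the same block is erased every time
        erase_counts[target] += 1
    return erase_counts


fixed = simulate(n_blocks=8, n_writes=800, wear_leveling=False)
leveled = simulate(n_blocks=8, n_writes=800, wear_leveling=True)
print(max(fixed), max(leveled))   # 800 100
```

With the same workload, the worst-worn block sees 800 erases in the fixed layout and 100 when writes are spread over eight blocks. Under this model the device lasts eight times as long before any one segment reaches its limit.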

On the flip side, Engel says that LogFS wouldn't be well-suited for hard disks. "In most cases I would think not. Hard disks have a horrible habit of being slow. Average seek times are still in the 10ms ballpark. That means that filesystems that want to be fast on hard disks have to arrange their data in a way to minimize seeks.

"LogFS does not worry about this problem at all. As a side effect of its design, writes rarely cause seeks, so write performance can be quite good. But read performance on hard disks should be about the worst in any benchmark imaginable."
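Back-of-the-envelope arithmetic shows how quickly seeks add up. The sketch below takes only the 10ms figure from Engel's quote; the fragment counts are made-up inputs, and the model assumes one seek per non-adjacent piece while ignoring transfer time and rotational latency.

```python
SEEK_MS = 10  # from the quote: "in the 10ms ballpark"


def seek_cost_ms(fragments):
    """Seek time to read a file stored as `fragments` non-adjacent pieces."""
    return fragments * SEEK_MS


contiguous = seek_cost_ms(1)     # file laid out in one run
scattered = seek_cost_ms(500)    # file spread across the log in 500 pieces
print(contiguous, scattered)     # 10 5000
```

Under these assumptions, a file scattered across 500 places costs five seconds in seeks alone, against 10ms for a contiguous one. Flash has no moving head, so the same scattered layout carries no comparable penalty there.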

Not quite ready

At the moment, LogFS is not ready for production. The LogFS FAQ advises that users avoid using the filesystem with production data, and that it should be ready in "mid 2007."

Engel says that LogFS "survives all my test cases," but that his test cases don't cover a few specific areas that one might wish to be ready for, such as system crashes and errors when performing read/write/erase operations on the flash device. Engel says that system crashes are "the top item" on his to-do list for LogFS.

Engel says that error handling "may be a problem users can live with if their device is reliable enough. But that decision should not be taken lightly and [devices should be] properly tested first."

The scarcity of "real" flash devices is another problem, according to Engel. While there are plenty of flash memory devices available, Engel says that most consumer devices have "a layer between the raw flash and the filesystem to make their device behave as if it were a hard disk.

"In order to use a USB stick with a flash filesystem, the block device interface needs to get translated back to a Memory Technology Device, which represents a piece of flash in Linux. This double translation is hugely inefficient. If manufacturers would allow access to the raw flash chips in their devices without going through any translation, flash filesystems would become a lot more useful."

One project that LogFS may be ideal for is the One Laptop per Child (OLPC) project. Engel says that LogFS is only "loosely" affiliated with OLPC -- David Woodhouse, who wrote the original JFFS2 filesystem and has contributed to LogFS, is involved in OLPC through Red Hat. Engel says that Woodhouse introduced him to OLPC programmer Jim Gettys, and that he has since received two OLPC prototypes, which he uses as test hardware for LogFS.

"I would love to see my filesystem used on OLPC machines and believe it would be a nice improvement for them. But that largely depends on how fast it matures."
