Using Flash Memory with uClinux

Author: JT Smith

By Greg Ungerer, Snap Gear

This paper is a discussion of theory and methods for building uClinux systems that boot, run and operate using Flash memory. For the most part this discussion is processor independent.

A brief introduction to Flash is in order first, followed by a description of
the different ways that uClinux system components can be laid out in Flash.
I’ll then spend some time describing the kernel driver configuration and setup,
and talk a little about root filesystem choices.

Although aimed at using Flash memory, most of the layout ideas apply equally
to Flash and ROM. You just won’t have the ability to update it in circuit,
or to run a read/write filesystem with standard ROM devices.

I’ll wrap up with a real life example that should crystalize some of the
finer details. Hopefully by the end you will have an understanding of how to
build uClinux systems that take advantage of Flash Memory, and be aware of
some of the choices you can make, and the implications and trade-offs involved.

1. Flash Memory

For most of the previous 20 years, ROM (and EPROM) have been the mainstay
of non-volatile storage for embedded systems. But today the majority of
modern deeply embedded systems rely on Flash memory.

Flash memory primarily comes in two flavors, NOR and NAND (there are some
other more exotic varieties, AND for example, that I won’t discuss here).

Reading NOR flash is essentially like reading SRAM. You can read values from
random addresses, with the whole of its address space visible. You can execute
code directly from NOR Flash, since it looks like SRAM (this is often
referred to as Execute-In-Place or XIP). Because of this feature I think
it is fair to say that NOR Flash has been a more popular choice in small
embedded systems. All your code can potentially run directly from the Flash,
thus reducing your RAM requirements. In this respect NOR Flash can be used
as a replacement for standard ROM. There are quite a few manufacturers of
NOR Flash devices, including Intel, AMD, Fujitsu and Toshiba. NOR Flash
memory chip capacities typically range from a few kilobytes up to 64Mb.

NAND Flash is generally read a block at a time, and so is not like a typical
random access memory technology. It is more like a hard drive in nature
with block sizes usually of the order of 512 bytes in size. NAND Flash tends
to be cheaper per bit, but is prone to more errors at the bit level, so you
need software to handle bad blocks. You cannot execute code directly from
NAND Flash. One interesting variation using NAND Flash is M-System’s
DiskOnChip devices. They use NAND Flash internally and combine it with
some logic to handle error detection and correction, and also to simplify
external CPU access. Don’t be fooled by the name, the interface to these
devices is nothing like a hard disk drive. Manufacturers of NAND Flash devices
include Samsung and Toshiba – both of which supply NAND Flash technology to
M-Systems for DiskOnChip as well. NAND Flash device capacity typically ranges
from 8Mb up to 128Mb, with DiskOnChip going up to 1024Mb.

Writing to Flash Memory is not like writing to conventional RAM.
For both NOR and NAND Flash there is a sequence of steps needed to initiate
a write of data. A write will almost always involve an erase cycle on
some part of the Flash first.

All Flash memory types have a notion of segments or blocks (or more precisely
erase segments). This is the minimum size unit of the Flash device that can be
erased. You cannot erase a single byte or just a few bytes. The segment size
is very much device dependent. They tend to range from 8Kb up to 128Kb.
An erase operation will leave all Flash memory bits in the erase segment with
a logical value of “1”.

Writing of data is actually a process of either flipping a bit to “0” or
leaving it as a “1”. So although you can essentially write a value to any
address you cannot flip a bit that is “0” back to a “1” without an erase
cycle.

Thus generally when you want to change the contents of the Flash you need to
do an erase cycle followed by a write cycle. This has some interesting
implications, as we see when we look at read/write filesystems for Flash.
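
The erase and write rules above can be modeled in a few lines. This is only a toy sketch in Python, not driver code for any real Flash part; the class name and the 8Kb segment size are assumptions made for the illustration.

```python
ERASED = 0xFF  # every bit of an erased byte reads as 1


class FlashSegment:
    """Toy model of one erase segment (hypothetical, for illustration only)."""

    def __init__(self, size):
        self.cells = bytearray([ERASED] * size)
        self.erase_count = 0

    def erase(self):
        """Erase the whole segment: every bit goes back to 1."""
        for i in range(len(self.cells)):
            self.cells[i] = ERASED
        self.erase_count += 1

    def program(self, offset, data):
        """Write bytes. Bits can only change from 1 to 0, hence the AND."""
        for i, value in enumerate(data):
            self.cells[offset + i] &= value

    def update(self, offset, data):
        """Change contents properly: copy, erase the segment, program it again."""
        image = bytearray(self.cells)
        image[offset:offset + len(data)] = data
        self.erase()
        self.program(0, image)


seg = FlashSegment(8 * 1024)
seg.program(0, b"\xF0")   # erased byte 0xFF becomes 0xF0 as intended
seg.program(0, b"\x0F")   # without an erase the result is 0xF0 & 0x0F
stale = seg.cells[0]      # so the byte does not hold the value we asked for
seg.update(0, b"\x0F")    # erase cycle first, then the write takes effect
```

Note that update() has to rewrite the whole segment just to change one byte, which is exactly the cost a Flash filesystem has to manage.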

The life span of Flash memory is generally measured in terms of the number
of erase cycles that can be done. The exact number varies from device to
device, but typical ranges are from 10,000 to 1,000,000 erase cycles. As a rule
of thumb you do need to be careful about how often you erase/write a Flash
memory segment. This is particularly true when using a read/write filesystem
on top of your Flash.
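
To put those numbers in perspective, here is a rough calculation. It assumes a single segment that is erased at a steady rate with no wear leveling; the once-a-minute rate is an invented example, not a measurement.

```python
def segment_lifetime_days(rated_erase_cycles, erases_per_day):
    """Days until one segment reaches its rated erase count at a steady rate."""
    return rated_erase_cycles / erases_per_day


# A segment rewritten once a minute sees 24 * 60 = 1440 erase cycles a day.
worst_case = segment_lifetime_days(10_000, 24 * 60)
best_case = segment_lifetime_days(1_000_000, 24 * 60)
```

At that rate a 10,000 cycle part wears out the segment in about a week, and even a 1,000,000 cycle part lasts under two years.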

I am largely going to concentrate on NOR Flash for most of this discussion.
It is far more common in smaller systems, and it is by far the most used type
of Flash memory in uClinux systems.

2. System

The amount of Flash and RAM contained in any embedded system will always
be a tradeoff. Often cost is the primary consideration, with Flash memory
generally being more expensive per Megabyte than RAM. Ultimately the
sizing will depend on your requirements and exact system details.

The Flash memory is obviously the device that holds the system code and
data and where execution will start on power up. It is the primary file
store for a uClinux device.

In its simplest form you can just put the uClinux kernel start code at the
processor start address in the Flash memory and just use the Flash
as a single large chunk of storage.

Alternatively, it is often convenient to partition the Flash into separate
regions. This notion is not unlike the partitions of a disk drive, although
on a simpler and smaller scale. A typical Flash partition arrangement might
be something like:

SEGMENT   PURPOSE

0         boot loader
1         factory configuration
2
.
.         kernel
.
X
.
.         root filesystem
.
Y

This arrangement has allocated segment 0 to the boot loader code, segment 1
to store factory configuration (say ethernet MAC address, or kernel command
line arguments), segments 2 through X to the kernel, and finally segments
X through Y to the root filesystem.

This is a simple example. Clearly you could build an arbitrarily complex
set of partitions if that is what your system needs. As we will discuss later
on, the uClinux block devices that support Flash memory handle this notion of
partitioning. Note that the partitions are always a whole number of Flash
segments. You really want to do it this way since you can only erase an
entire segment at a time.
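
A quick host-side check of that rule might look like the following sketch. The uniform 64Kb segment size and the partition sizes are assumptions chosen for the example; they do not describe any particular board.

```python
SEGMENT_SIZE = 64 * 1024  # assumed uniform erase segment size


def check_partitions(partitions, segment_size=SEGMENT_SIZE):
    """Return the names of partitions that are not a whole number of segments."""
    bad = []
    for name, offset, size in partitions:
        if size == 0 or offset % segment_size or size % segment_size:
            bad.append(name)
    return bad


# Hypothetical layout following the table above: (name, offset, size).
layout = [
    ("boot loader", 0 * SEGMENT_SIZE, 1 * SEGMENT_SIZE),
    ("factory configuration", 1 * SEGMENT_SIZE, 1 * SEGMENT_SIZE),
    ("kernel", 2 * SEGMENT_SIZE, 10 * SEGMENT_SIZE),
    ("root filesystem", 12 * SEGMENT_SIZE, 20 * SEGMENT_SIZE),
]

misaligned = check_partitions(layout)
```

An empty result means every partition can be erased and rewritten without touching its neighbors.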

This example also separates the kernel and root filesystem into individual
partitions. This is not the way a normal Linux desktop or server system
would be set up. Typically they would have the kernel binary exist as a
normal file within the root filesystem. The problem with this is that you need
a very clever boot loader (something like LILO or GRUB) that is capable
of figuring out which blocks (that contain the kernel binary) of the hard
drive need to be loaded into RAM. There are two distinct advantages to placing
the kernel image into Flash as a contiguous blob. One is that you can now
run it directly from Flash (XIP). The other is that it is trivial for a boot
loader to figure out where the kernel code is in Flash to execute it
(you may not even need a boot loader).

There are several options for placement of the kernel and root
filesystem. The best choice very much depends on a number of trade-offs.
I’ll list some typical configurations here, and we can look at the advantages
and disadvantages of each.

  a. kernel at fixed offset, root filesystem at fixed offset
  b. kernel followed by root filesystem
  c. compressed kernel and root filesystem

Option (a) has the benefit of the key system components being at fixed
addresses. It is easy for the boot loader to find the kernel, and easy for the
kernel to find the root filesystem. It will also be simple to upgrade the
kernel and filesystem independently of each other. The disadvantage is that
you will inevitably have some wasted Flash space between the end of the kernel
binary and the start of the root filesystem.

Option (b) saves some space by not partitioning the root filesystem separately.
It is also common with uClinux systems to combine the kernel binary and the
root filesystem into a single binary file. This implies that you have to
update the kernel and filesystem in a single pass (although often this is a
good thing).

Option (c) saves a lot of Flash space by compressing the kernel and root
filesystem. This option would require a boot loader that is capable of
uncompressing the image into RAM. So it will use more RAM, but has a
significantly smaller Flash footprint than either of the other options. You could
also compromise here by just compressing the kernel or root filesystem if
that made sense.
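
The trade-off between the three options can be estimated on the host before committing to a layout. The sketch below is only an approximation: it assumes 64Kb segments, assumes each partition is rounded up to a whole number of segments, uses zlib as a stand-in for whatever compressor your boot loader supports, and uses synthetic, highly compressible data in place of a real kernel and root filesystem.

```python
import zlib

SEGMENT = 64 * 1024  # assumed erase segment size


def round_up(nbytes, unit=SEGMENT):
    """Round a byte count up to a whole number of segments."""
    return -(-nbytes // unit) * unit


def footprint_fixed(kernel, rootfs):
    """Kernel and root filesystem each in their own partition."""
    return round_up(len(kernel)) + round_up(len(rootfs))


def footprint_packed(kernel, rootfs):
    """Root filesystem stored directly after the kernel."""
    return round_up(len(kernel) + len(rootfs))


def footprint_compressed(kernel, rootfs):
    """Kernel and root filesystem compressed together as one image."""
    return round_up(len(zlib.compress(kernel + rootfs, 9)))


# Synthetic stand-ins: a 700,000 byte "kernel" and an 800,000 byte "rootfs".
kernel = b"\x4e\x71" * 350_000
rootfs = b"uClinux " * 100_000

fixed = footprint_fixed(kernel, rootfs)
packed = footprint_packed(kernel, rootfs)
compressed = footprint_compressed(kernel, rootfs)
```

Real images compress far less well than this repetitive test data, and the compressed layout still needs RAM for the full uncompressed size, so treat the numbers as an upper bound on the savings.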

There is certainly no reason you could not have multiple filesystem partitions
too. This may make sense if you anticipate only wanting to update a section
of the filesystem, or perhaps if you have a portion of your filesystem that
is read-only, and a partition that you want read/write.

3. To Boot Loader or not to Boot Loader

The first consideration when booting is CPU dependent. Where does the
processor start execution on power up? Many CPUs start execution at a fixed
address (for example ARM, x86). Others read a fixed location in the address
space and use that value as the start address (m68k, ColdFire). This has a
direct impact on where you must put your system startup code in the Flash
memory.
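
For the m68k and ColdFire case, the two values the processor reads at reset are the initial stack pointer followed by the initial program counter, stored as big-endian 32-bit words at the very start of the image. A minimal sketch of building those first 8 bytes on the host follows; the stack and entry addresses are hypothetical and depend entirely on your memory map.

```python
import struct


def m68k_reset_vectors(initial_sp, entry_point):
    """First 8 bytes of an m68k/ColdFire image: initial SP, then initial PC."""
    return struct.pack(">II", initial_sp, entry_point)


def read_reset_vectors(image):
    """Decode the same two values back out of an image."""
    return struct.unpack_from(">II", image, 0)


# Hypothetical values: stack at the top of 4Mb of RAM, code 0x400 into Flash.
vectors = m68k_reset_vectors(0x00400000, 0xF0000400)
```

On a CPU that starts at a fixed address there is no such table, and the first instruction of your startup code simply has to sit at that address.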

At this stage it is worth considering whether you want to use a boot loader
in your system, or just start running the uClinux kernel.

There is no problem with the CPU startup just starting execution of the
kernel. Your uClinux startup code will have to do all hardware setup
(this usually includes setting up chip selects and RAM setup) and it will
have to load the kernel data segment in RAM and clear the bss segment.
But this is all very straightforward. The only difficulty with this scheme
is arranging the kernel code to be at the correct offset in the Flash
memory so that the CPU will start executing it on reset. For most modern
CPUs that start executing at offset 0 or hold the start address near
offset 0, this is quite simple.
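
The data and bss handling is easier to picture with a small model. Real startup code does this in assembler or C using addresses supplied by the linker; the sketch below only simulates the two steps on byte arrays, and all of the offsets in it are invented.

```python
def kernel_startup(flash, ram, data_load, data_run, data_len, bss_run, bss_len):
    """Simulate the two memory setup steps done before the kernel proper runs."""
    # Copy initialized data from where it is stored in Flash to where it runs in RAM.
    ram[data_run:data_run + data_len] = flash[data_load:data_load + data_len]
    # Clear the bss so that uninitialized variables start out as zero.
    ram[bss_run:bss_run + bss_len] = bytes(bss_len)


flash = bytes(range(256)) * 4          # stand-in for the Flash contents
ram = bytearray(b"\xAA" * 1024)        # RAM holds junk at power up
kernel_startup(flash, ram, data_load=0x100, data_run=0x20, data_len=16,
               bss_run=0x30, bss_len=32)
```

The kernel text never moves in this scheme; only the writable sections need RAM, which is what makes running from Flash attractive.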

A boot loader is a small stand alone piece of code that usually does
basic hardware setup (again chip selects, RAM, etc) and then loads or
starts execution of the uClinux kernel proper. You can do some useful
things in a boot loader. It can prompt the user to load one of multiple
kernels present in the Flash, or it can load a kernel and system image
through some other I/O device (for example serial port or Ethernet port).

A boot loader can provide some level of protection against a broken or
corrupted kernel image. This can be important when used with Flash, for
it is possible that a Flash update of the kernel, or other critical data,
is incomplete (power failure in the middle of update, or worse, accidental
loading of buggy code). The boot loader, which can be locked permanently
into Flash Memory, can provide a recovery mechanism to fix this situation.
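
One common way to detect an incomplete update is to store a length and checksum with the image and have the boot loader verify them before jumping to the kernel. The sketch below shows the idea only. The 8 byte header format is made up for this example and is not the format used by any of the boot loaders named in this paper.

```python
import struct
import zlib

HEADER = struct.Struct(">II")  # hypothetical header: payload length, CRC-32


def wrap_image(payload):
    """Host side: prepend the length and CRC-32 to a kernel image."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def image_is_valid(region):
    """Boot loader side: check that the stored image is complete and intact."""
    if len(region) < HEADER.size:
        return False
    length, crc = HEADER.unpack_from(region)
    payload = region[HEADER.size:HEADER.size + length]
    return length > 0 and len(payload) == length and zlib.crc32(payload) == crc


def choose_boot_action(region):
    """Boot the kernel only if it checks out, otherwise fall back to recovery."""
    return "boot kernel" if image_is_valid(region) else "enter recovery"


good = wrap_image(b"kernel" * 1000)
interrupted = good[:len(good) // 2]    # power failed half way through the update
erased = b"\xff" * 64                  # partition erased but never programmed
```

A bootloader built this way can offer to fetch a fresh image over the serial port or network whenever the check fails.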

There are a number of boot loaders in use with uClinux today. Examples of
freely available loaders include CoLilo, My Right Boot (MRB), PPCboot and
Motorola dBUG. A number of companies have proprietary boot loaders for
specialized tasks, for example Snap Gear and Arcturus Networks.

4. uClinux Kernel Block Drivers

There are currently three choices for the block device that will contain
the root filesystem in uClinux:

  1. Blkmem driver
  2. MTD driver
  3. RAM disk driver

The blkmem driver is the oldest and may well still be the most common choice
for use with uClinux. It was specifically designed for uClinux, but it is
relatively simple and only supports a handful of common NOR Flash memory types,
as well as root filesystems in RAM. It is also difficult to configure,
requiring code modification of tables to set up your Flash partitioning.
However, it does provide basic support for erasing and writing Flash regions.

The Linux MTD drivers are the standard Flash memory drivers for Linux. They
support a huge variety of Flash devices, and offer powerful mechanisms for
defining partitions and mappings. For anything other than trivial
setups you create a map driver that defines your exact Flash layout. It can
span multiple Flash devices, with interleaving, and even different Flash
device types in the one system. There are a number of configuration options in
the Linux kernel config devoted to MTD setup. Use the on-line help to decide
on which options you need. It will also definitely help to look at an example
device that uses the MTD drivers to get started.

Thirdly, you can use the Linux RAM disk device. This is commonly used in
standard Linux for diskless booting. The RAM disk driver does not offer any
direct support for any underlying Flash memory, so it is only useful for
the purpose of holding the root filesystem. This may make sense on a system
where you store the root filesystem compressed in Flash.

The MTD drivers clearly provide the most powerful support for Flash. They
also allow you to run real read/write filesystems specifically designed for
Flash memory, such as JFFS and JFFS2. You cannot do this with the blkmem
driver.

5. Root Filesystem

There are a number of choices for root filesystem in uClinux.

Traditionally the ROMfs type has been the most commonly used. It is a simple,
compact, read-only filesystem. It stores all data of a file sequentially,
so it allows for application programs to be executed in place (XIP) in the
filesystem on uClinux targets that support this (which is true for m68k,
ColdFire and ARM). This can make for a considerable reduction in
memory footprint for a running system.

Cramfs is a new filesystem for 2.4 series Linux kernels. It is designed to
be a compact read only filesystem. Its primary advantage is that it stores
all files compressed and decompresses them on the fly. Because it
stores files compressed, you cannot run applications in place (no XIP).
It is quite space efficient in terms of Flash usage, but more RAM will be
required since all application code needs to be copied into RAM for execution.

Some systems will need a read/write root filesystem. By using the Linux
MTD drivers it is possible to run a journaled Flash filesystem like JFFS
or JFFS2 on top of Flash memory. Journaled filesystems are
safe from sudden power loss (that is an unclean shutdown condition), and
don’t require a filesystem check on the next boot up. Since the JFFS and
JFFS2 filesystems are specifically designed for use with Flash memory, they
also provide a feature called wear leveling. This is where the filesystem
code lays out data and updates it in such a way that all parts of the Flash
are erased a similar number of times. This can dramatically increase the
useful lifetime of Flash memory devices. JFFS2 has the distinct advantage
of storing files compressed, so uses much less Flash space. It should be used
in preference to the older JFFS. Something else to be mindful of when using a
journaled filesystem is that some small amount of Flash will be wasted for the
journal overhead and garbage collection system. This wasted space is typically
of the order of 2 Flash segments in size.
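
The effect of wear leveling is easy to see in a simulation. The policy below (always reuse the least erased segment) is a deliberately naive stand-in chosen for the illustration; it is not the algorithm JFFS or JFFS2 actually implement, and the segment and update counts are invented.

```python
class WearLevelledFlash:
    """Naive wear leveling model: each update goes to the least erased segment."""

    def __init__(self, segments):
        self.erase_counts = [0] * segments

    def rewrite(self):
        """Erase and reuse whichever segment has seen the fewest erase cycles."""
        target = min(range(len(self.erase_counts)),
                     key=self.erase_counts.__getitem__)
        self.erase_counts[target] += 1
        return target


SEGMENTS = 32
UPDATES = 3200

levelled = WearLevelledFlash(SEGMENTS)
for _ in range(UPDATES):
    levelled.rewrite()

# Without leveling, a file updated in place hammers the same segment every time.
in_place = [UPDATES] + [0] * (SEGMENTS - 1)
```

With leveling each segment sees 100 erase cycles instead of one segment seeing all 3200, so the device as a whole lasts roughly 32 times longer in this model.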

If you are using a RAM disk then it is common to use ext2 as the filesystem.
You can certainly do this with uClinux too. Ext2 is not particularly space
efficient, and being on a RAM disk any changes you make to it will be lost
on the next reboot (some consider this an advantage in the embedded space,
since you always start your system from a known filesystem state).

There are a number of other filesystems that you could use. Linux has a large
number to choose from! But those listed above are the most commonly used with
uClinux. There is no reason you couldn’t use an MS-DOS FAT type filesystem if
you really wanted to.

One more thing worth noting is the way in which a root filesystem is
constructed for use with uClinux. Usually the root filesystem for an embedded
device is constructed on the development host and then loaded into the target.
Typically you create a directory tree in your development environment that is
a mirror of what you want the final root filesystem to be, then use a host
based tool to construct a binary filesystem image from this directory tree.
The genromfs utility is a prime example. Given a directory tree on a host
it will construct a file that is a ROMfs binary image. Similar tools exist
for many filesystem types (such as mkfs.jffs2 for JFFS2, mkisofs for ISO9660).
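
If you drive the image build from a script, the genromfs step might be wrapped as in the sketch below. The directory name, image name and volume label are placeholders, and build_romfs() will only work on a host that has genromfs installed.

```python
import subprocess


def genromfs_command(tree, image, label="uClinux"):
    """Build the genromfs argument list: -f output image, -d source tree, -V label."""
    return ["genromfs", "-f", image, "-d", tree, "-V", label]


def build_romfs(tree, image):
    """Run genromfs on the host; raises CalledProcessError if it fails."""
    subprocess.run(genromfs_command(tree, image), check=True)


command = genromfs_command("romfs", "romfs.img")
```

Calling build_romfs("romfs", "romfs.img") would turn the contents of the romfs directory into a ROMfs image file ready to be loaded into Flash.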

6. Flash Tools

There are a number of tools available at the application level for use with
Flash Memory on uClinux. Some are specific to the underlying block device
driver in use.

When using the MTD drivers the primary tools available are:

erase        erase some Flash segments
eraseall     erase all of a Flash device
lock         write lock Flash segments
unlock       write unlock Flash segments
mkfs.jffs    construct a JFFS filesystem image from a directory structure
mkfs.jffs2   construct a JFFS2 filesystem image from a directory structure

All of erase, eraseall, lock and unlock are run on the target device against
MTD partition devices. The mkfs.jffs and mkfs.jffs2 tools are generally used
on a host system to build filesystem images that can be loaded into Flash on
a target device. Since the MTD drivers provide standard char and block devices
on the target you can use system tools, like dd, to write contents into Flash
devices.

Netflash is a tool developed specifically for uClinux that is a nice way
to update either MTD or blkmem devices. It takes a file and programs it into
a Flash device (also doing the erase step for you). It can load files over the
network (via tftp, HTTP or NFS) or even use local files to program Flash.

7. A Real Example

It is worthwhile looking at a real life example, just to demonstrate some
of the details we have looked at. The example system is based around a
ColdFire 5272 processor with 2Mb AMD Flash and 4Mb of SDRAM. (For reference
this is the Snap Gear LITE VPN router product).

The system is running a uClinux 2.4.x kernel, and we will use the MTD drivers
for Flash support. We won’t use a read/write filesystem, but instead use a
ROMfs filesystem type.

The AMD Flash selected is a “bottom boot” type. It has a number of small
erase segments at the bottom of the address space with sizes 16kb, 8kb,
8kb and 32kb. All the rest of segments are a uniform 64kb in size.

The Flash memory partition map will be:

SEGMENT   SIZE    MTD-DEVICE   DESCRIPTION

0         16k     mtd0         boot loader
1         8k      mtd1         kernel boot arguments
2         8k      mtd2         factory configuration information
3         32k     mtd3         spare
4         64k     mtd4         runtime non-volatile configuration
5
.
.         1920k   mtd5         kernel + root filesystem
.
34

0-34      2048k   mtd6         all of Flash memory

There are a few interesting things to look at here. Firstly notice how we try
to take advantage of the smaller segments at the bottom (some Flash memory
devices have a set of smaller segments at the top or bottom of the device).
Also you can see that it is possible to overlap areas of the Flash covered by
different mtd partitions, although you would want to be very careful when
doing this. In this example it is used to create a single mtd device that
allows reprogramming the entire Flash in one go.

Although not shown in this example, it is possible to order partitions in an
arbitrary fashion. It is not necessary to sequentially partition Flash segments.
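
The address range of each partition follows directly from the segment sizes given above for this Flash part. The sketch below works them out on the host, assuming a 2048k device; the results can be compared with the ranges the MTD drivers print at boot.

```python
KB = 1024
FLASH_SIZE = 2048 * KB
BOOT_SEGMENTS = [16 * KB, 8 * KB, 8 * KB, 32 * KB]  # small segments at the bottom
UNIFORM_SEGMENT = 64 * KB                            # every remaining segment


def segment_table(flash_size=FLASH_SIZE):
    """Return (offset, size) for every erase segment in the device."""
    uniform_count = (flash_size - sum(BOOT_SEGMENTS)) // UNIFORM_SEGMENT
    sizes = BOOT_SEGMENTS + [UNIFORM_SEGMENT] * uniform_count
    table, offset = [], 0
    for size in sizes:
        table.append((offset, size))
        offset += size
    return table


def span(table, first, last):
    """Start and end address of a partition covering segments first..last."""
    return table[first][0], table[last][0] + table[last][1]


segments = segment_table()
last = len(segments) - 1
partitions = {
    "boot loader": span(segments, 0, 0),
    "kernel boot arguments": span(segments, 1, 1),
    "factory configuration": span(segments, 2, 2),
    "spare": span(segments, 3, 3),
    "runtime configuration": span(segments, 4, 4),
    "kernel + root filesystem": span(segments, 5, last),
    "all of Flash memory": span(segments, 0, last),
}

for name, (start, end) in partitions.items():
    print("0x%08x-0x%08x : %s" % (start, end, name))
```

The four small segments add up to 64k, so the device has 35 segments in total, and everything from 0x00020000 to the end of Flash is left for the kernel and root filesystem image.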

The kernel is stored compressed. The boot loader sets up SDRAM and then
decompresses the kernel into it for execution. The root filesystem is stored
immediately after the compressed kernel image. It is used in place in the
Flash. Typically the root filesystem is around 1.5Mb in size, after removing
the space used by the compressed kernel image.

One distinct advantage of having the kernel and filesystem in a single
combined image is that a firmware upgrade is a one step reprogramming of
the /dev/mtd5 device. For many devices this single firmware upgrade is the
best approach. Version control is much simpler when there is an all-or-nothing upgrade.

The key to this setup is an MTD map driver. You can find the source for this
example at uClinux-2.4.x/drivers/mtd/maps/nettel-uc.c. It defines the mapping
layout above, and it also configures the root filesystem to be on device
mtd5 (it cleverly skips over the kernel binary to find the ROMfs filesystem).
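
One way to locate a ROMfs filesystem that follows a kernel is to look for the 8 byte signature "-rom1fs-" that starts every ROMfs image. The sketch below does this on the host against a combined image file. It is an illustration of the idea only and is not claimed to be the method nettel-uc.c uses; the helper name and the stand-in image contents are invented.

```python
ROMFS_MAGIC = b"-rom1fs-"  # signature at the start of every ROMfs image


def find_romfs(image, start=0):
    """Return the offset of the first ROMfs signature at or after start, or -1."""
    return image.find(ROMFS_MAGIC, start)


# Stand-in combined image: 1000 bytes of "kernel" followed by a ROMfs header.
combined = bytes(1000) + ROMFS_MAGIC + b"\x00" * 24
romfs_offset = find_romfs(combined)
```

A plain byte search like this could in principle be fooled by a compressed kernel that happens to contain the same 8 bytes, so a real implementation would want to validate the header that follows the signature as well.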

During boot the MTD drivers print some diagnostics to let you know what devices
it has found, and what the partition setup is. For example, in the boot log
we will see:

Snap Gear flash probe(0xf0000000,2097152,2): 200000 at f0000000
CFI: Found no Flash device at location zero
Found: Toshiba TC58FVB160
number of JEDEC chips: 1
Creating 7 MTD partitions on “Flash”:
0x00000000-0x00004000 : “Bootloader”
0x00004000-0x00006000 : “Bootargs”
0x00006000-0x00008000 : “MAC”
0x00010000-0x00020000 : “Config”
0x00008000-0x00010000 : “Spare”
0x00020000-0x00200000 : “Image”
0x00000000-0x00200000 : “Flash”

So the MTD drivers found a Toshiba Flash part, and divided it up into our
desired mapping. The MTD drivers have a debug verbosity level that you can
configure in the kernel configuration. Increasing this setting can provide a
lot more feedback about the probing process, and more detailed information
on the Flash devices found.

You can see from the initial probe message that the Flash memory device
is mapped into the CPU address space at address 0xf0000000. This address was
set in the mapping driver (nettel-uc.c).

In the field the kernel and filesystem image is updated using the netflash
utility. The command line is as simple as:

netflash tftp-server imagez.bin

where tftp-server is the address of a local tftp server to fetch the image from.
Netflash does the rest, rebooting the unit when done (a necessary step when
your root filesystem is in-place in the Flash).

8. References

The following URLs will be helpful:

www.uclinux.org

Home of uClinux

www.linux-mtd.infradead.org

Home of the Linux MTD drivers.

www.ucdot.org

uClinux news, and current events.