July 1, 2008

Using Bonnie++ for filesystem performance benchmarking

Author: Ben Martin

Bonnie++ allows you to benchmark how your filesystems perform various tasks, which makes it a valuable tool when you are making changes to how your RAID is set up, how your filesystems are created, or how your network filesystems perform.

Bonnie++ is available for openSUSE 10.3 as a 1-Click, for Ubuntu Hardy, and in the standard Fedora 9 repositories. I installed Bonnie++ from the 64-bit Fedora 9 repositories.

The packages for Ubuntu and Fedora both install Bonnie++ into /usr/sbin, while openSUSE installs it into /usr/bin. Bonnie++ will complain and fail to work if invoked as the root user, so you need to run it as a regular user; if it is installed into /usr/sbin instead of /usr/bin, you will probably have to include its full path to do so. Bonnie++ uses autoconf to generate its Makefile, and the install-bin target is hardwired to install Bonnie++ into sbin, so even if you build from source you have to move it to /usr/bin after installation if you want it there.

Bonnie++ benchmarks three things: data read and write speed, number of seeks that can be performed per second, and number of file metadata operations that can be performed per second. Metadata operations include file creation and deletion as well as getting metadata such as the file size or owner (the result of a fstat(2) call).
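To make the third category concrete, the sketch below times the same kinds of operations (create, stat, delete) on empty files in a scratch directory. It is only an illustration written in Python, not how Bonnie++ itself is implemented; the function name time_metadata_ops and the default file count are hypothetical choices made for this example.

```python
import os
import tempfile
import time


def time_metadata_ops(directory, count=1000):
    """Create, stat, and delete `count` empty files; return operations per second.

    Illustrative only: Bonnie++ is a separate program with its own
    file-naming and timing logic.
    """
    names = [os.path.join(directory, "f%07d" % i) for i in range(count)]
    rates = {}

    def timed(label, operation):
        start = time.perf_counter()
        for name in names:
            operation(name)
        elapsed = time.perf_counter() - start
        # Guard against a zero elapsed time on very fast runs.
        rates[label] = count / elapsed if elapsed > 0 else float("inf")

    timed("create", lambda name: open(name, "w").close())
    timed("stat", os.stat)      # fetches size, owner, timestamps, ...
    timed("delete", os.unlink)
    return rates


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for label, rate in time_metadata_ops(scratch).items():
            print("%-6s %10.0f ops/sec" % (label, rate))
```

Numbers from a toy loop like this are dominated by interpreter overhead and caching, so they are useful for seeing what is being measured rather than as a substitute for a Bonnie++ run.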

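[Figure: RAID-5 layout across four disks, with 64KB data chunks on disks 1-3 and the parity chunk on disk 4; the 8KB block modified by the rewrite test is marked in red]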

If you are deciding what filesystem to use for /tmp, then the metadata throughput might be the most important benchmark. On a filesystem where you expand 20MB tarballs, the write performance may be less important than the number of files you can create per second. The file metadata benchmarks are also important if you are planning to run a Squid web proxy or a mail server using the maildir format to store each email in a single file. Such applications perform many file metadata-intensive operations, and each individual file is often fairly small, meaning that bulk transfer speeds do not play as much of a role as metadata updates.

The metadata benchmarks are also important if you are creating a new filesystem on top of a RAID device. Journaling filesystems can use write barriers to protect their journaled metadata. If you are using a hardware card to provide RAID functionality, these barriers might force the entire cache on the RAID card and all disks assembled into the RAID on that card to be synced, which can lead to extremely poor performance. As a real-world example, an Adaptec 31205 12-port card with a RAID-6 array and XFS using barriers supported fewer than 100 file creates per second in tests I recently performed. Explicitly disabling barriers in XFS when mounting the same filesystem gave closer to 6,000 file creates per second. Though I'm not advocating disabling barriers in XFS, in this particular hardware configuration it could be done without risk of data loss.

The number of seeks per second should be bound fairly tightly by your hardware. If you test a single disk and then a RAID, you should expect that a filesystem on the RAID will give an increase in the number of seeks in proportion to how the RAID is configured. For example, I configured a single disk volume, a RAID-0 stripe set with two disks, and another with six disks. The single disk could get about 200 seeks per second, the two-disk stripe could perform 340 seeks per second, and the six-disk stripe could perform 533 seeks per second.
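As a quick worked example, the figures above can be turned into a speedup relative to the single disk and a fraction of perfectly linear scaling. The snippet uses only the three numbers quoted in this article (treating "about 200" as exactly 200); the variable names are just for illustration.

```python
# Seeks per second reported in the article for each configuration.
seeks = {1: 200, 2: 340, 6: 533}  # number of disks -> seeks/sec

baseline = seeks[1]
scaling = {}
for disks, rate in seeks.items():
    speedup = rate / baseline      # relative to the single disk
    efficiency = speedup / disks   # 1.0 would be perfect linear scaling
    scaling[disks] = (speedup, efficiency)
    print("%d disk(s): %4d seeks/sec, %.2fx the single disk, %.0f%% of linear"
          % (disks, rate, speedup, efficiency * 100))
```

On these numbers the two-disk stripe reaches 1.7 times the single-disk rate and the six-disk stripe a little under 2.7 times, so the gain per added disk shrinks as the stripe widens.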

The raw data read and write benchmarks include both a per-character and a per-block figure. The former uses standard library calls that read or write a single character at a time, while the latter uses function calls that transfer larger blocks at once.

The rewrite test is important if you are running applications that modify data in place, and doubly important if you are running such applications on a parity RAID (such as RAID-5 and RAID-6). The rewrite test reads a block of data, changes it slightly, and writes it back. The blocks are documented as being BUFSIZ bytes in size, which on my 64-bit Fedora 9 installation is 8,192 bytes.
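The read, modify, write-back pattern can be sketched in a few lines. This is a hedged illustration of the access pattern only, not Bonnie++'s own code; the function name rewrite_in_place is hypothetical, and the block size is hard-coded to the 8,192 bytes mentioned above rather than queried from the system.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # BUFSIZ on the author's system; assumed here, not queried


def rewrite_in_place(path, block_size=BLOCK_SIZE):
    """Read each block, change one byte, and write the block back in place."""
    blocks = 0
    with open(path, "r+b") as f:
        while True:
            offset = f.tell()
            block = bytearray(f.read(block_size))
            if not block:
                break
            block[0] = (block[0] + 1) % 256  # the "slight change"
            f.seek(offset)                   # go back to where the block began
            f.write(block)                   # overwrite it; file size is unchanged
            blocks += 1
    return blocks


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "data.bin")
        with open(path, "wb") as f:
            f.write(bytes(4 * BLOCK_SIZE))
        print(rewrite_in_place(path), "blocks rewritten")
```

Every block is both read and written, and each write lands on data that already exists on disk, which is what makes this pattern expensive on parity RAID.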

To see why the rewrite test is important on a parity RAID, imagine that you are creating a RAID-5 using Linux software RAID on four disks. By default the RAID is created with a 64 kilobyte (KB) chunk size, which means that each stripe across the four disks holds three 64KB data chunks and one 64KB parity chunk, as shown in the diagram above, where disks 1-3 are used for data and disk 4 stores the parity. The disk that stores the parity changes depending on which band in the diagram you are accessing. If you modify only 8KB, as the rewrite test does (shown as the red rectangle in the figure), you force the 64KB parity chunk to be recalculated for this modification, and both the original 64KB data chunk and the 64KB parity chunk have to be written. In the diagram, both the dark gray band on disk 1 and the pink parity band on disk 4 have to be written after the parity is recalculated.
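The following sketch models one such stripe, assuming the usual RAID-5 scheme in which the parity chunk is the byte-wise XOR of the data chunks. It is a simplified model for illustration and does not claim to mirror what Linux software RAID does internally, which may work in smaller units than a whole chunk; the helper xor_bytes is a name made up for this example.

```python
import os

CHUNK = 64 * 1024   # 64KB chunk size, as in the article
BLOCK = 8 * 1024    # 8KB modified by the rewrite test


def xor_bytes(*buffers):
    """XOR equal-length byte strings together, byte by byte."""
    result = bytearray(len(buffers[0]))
    for buf in buffers:
        for i, byte in enumerate(buf):
            result[i] ^= byte
    return bytes(result)


# One stripe: three data chunks (disks 1-3) and their parity (disk 4).
data = [os.urandom(CHUNK) for _ in range(3)]
parity = xor_bytes(*data)

# Modify the first 8KB of the chunk on disk 1.
old_chunk = data[0]
new_chunk = os.urandom(BLOCK) + old_chunk[BLOCK:]
data[0] = new_chunk

# Read-modify-write: fold the old data out of the parity and the new data in.
new_parity = xor_bytes(parity, old_chunk, new_chunk)

# This matches recomputing the parity from all three data chunks.
assert new_parity == xor_bytes(*data)

# The parity can still rebuild a lost chunk, for example the one on disk 2.
assert xor_bytes(new_parity, data[0], data[2]) == data[1]
print("parity is consistent after the 8KB update")
```

Either way of updating the parity needs the old contents to be read before anything can be written, so a small in-place change costs extra reads plus two writes (data and parity) instead of a single write.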

You might see some of the metadata benchmarks reported by Bonnie++ as +++++ instead of a real number per second. This happens when that particular benchmark completes too quickly to be measured. To overcome this when benchmarking a particular setup, use the -n option to specify that more files should be used for the metadata tests. With the -n option you can specify up to four parameters. The first is the number of files to create per directory, specified in multiples of 1,024; the second and third are the maximum and minimum size of each file used for testing; and the last is how many directories to create, each containing the number of files you nominated with the first parameter. The defaults are to create 16,384 files with a size of 0 bytes in a single directory, which is equivalent to using -n 16:0:0:1 as a parameter to Bonnie++.
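A small helper can make the four fields easier to keep straight when scripting several runs. The functions below are hypothetical conveniences for building the argument string in the order described above; they are not part of Bonnie++.

```python
def metadata_option(files_k=16, max_size=0, min_size=0, directories=1):
    """Build the value for -n in the files:max:min:directories order."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return "%d:%d:%d:%d" % (files_k, max_size, min_size, directories)


def file_count(files_k):
    """The first field is given in multiples of 1,024 files."""
    return files_k * 1024


print("-n", metadata_option(), "->", file_count(16), "files")
print("-n", metadata_option(256), "->", file_count(256), "files")
```

With the defaults this yields the 16:0:0:1 form quoted above, and a first field of 256 corresponds to 262,144 files.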

You can process the comma-separated output at the bottom of a Bonnie++ run with the bon_csv2html command to format your benchmark results for presentation on the Web. You might also like to use the -q option to Bonnie++ to redirect everything but the comma-separated data to stderr so that you can pipe the stdout of Bonnie++ directly into bon_csv2html to generate HTML output.

By default the name of the machine you're testing is reported for the benchmark run. You can override this with the -m option to record not only the machine name but also information about the filesystem configuration itself.

As each test is performed, Bonnie++ prints a fresh line telling you which test it has reached in the benchmark process. The results are shown at the end in both a text table and as a comma-separated list. The results shown below required increasing the number of files for metadata testing with -n 256 so that Bonnie++ could report the read metadata figures. Without the -n 256 parameter, both the sequential create read and random create read operations completed too quickly, and Bonnie++ would report those numbers only as "+++++".

$ rm -rf /tmp/foo
$ mkdir /tmp/foo
$ /usr/sbin/bonnie++ -d /tmp/foo -n 256
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.

Version 1.03      ------Sequential Output------   --Sequential Input-  --Random-
                -Per Chr-  --Block--  -Rewrite-  -Per Chr-  --Block--  --Seeks--
Machine   Size  K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP   /sec %CP
chunklog    4G  59330  97 269236  64 109173  31  54711  97 290233  38  509.6   1
                  ------Sequential Create------    --------Random Create--------
                -Create--  --Read---  -Delete--  -Create--  --Read---  -Delete--
         files   /sec %CP   /sec %CP   /sec %CP   /sec %CP   /sec %CP   /sec %CP
           256   3862  96 342257  98  10759  69   3704  94 339523  98   1771  11
chunklog,4G,59330,97,269236,64,109173,31,54711,97,290233,38,509.6,1,256,3862,96,342257,98,10759,69,3704,94,339523,98,1771,11

As most applications that perform heavy IO will not read or write data in single characters, the block read, write, and rewrite figures are the most interesting data transfer figures. The %CP column reports the percentage of the CPU that was used to perform the IO for each test. The file metadata tests are shown in the second set of results; in them, files with a zero byte size are created, read (with stat(2)), and finally deleted. The create, read, and delete metadata tests are performed twice: once using file names that are sorted numerically, and once using file names that are just random numbers. Some filesystems perform much better if an application creates and accesses files in a specific order. Because Bonnie++ performs the metadata tests both ways, you can see whether a filesystem has optimized accesses to files by performing accesses in sorted file name order.
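One simple way to read the two sets of metadata figures is to divide each sequential rate by its random-order counterpart. The snippet below does this for the run shown above; the dictionary names are arbitrary.

```python
# Metadata rates (operations per second) from the run shown above.
sequential = {"create": 3862, "read": 342257, "delete": 10759}
random_order = {"create": 3704, "read": 339523, "delete": 1771}

ratios = {op: sequential[op] / random_order[op] for op in sequential}
for op, ratio in ratios.items():
    print("%-6s sequential is %.2fx the random-order rate" % (op, ratio))
```

For this run the create and read rates are nearly identical in both orders, while deletes in sorted order are roughly six times faster than deletes in random order.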

The final line printed by Bonnie++ is the same data that the table contains, formatted as comma-separated values. The Bonnie++ distribution includes the bon_csv2html Perl script, which takes the comma-separated values reported by Bonnie++ and generates an HTML page displaying them. The table below was generated by piping the comma-separated value line from the above test into bon_csv2html.

                          Sequential Output                      Sequential Input          Random
         Size:Chunk Size  Per Char     Block        Rewrite      Per Char     Block        Seeks
                          K/sec % CPU  K/sec % CPU  K/sec % CPU  K/sec % CPU  K/sec % CPU  / sec % CPU
chunklog 4G               59330    97 269236    64 109173    31  54711    97 290233    38  509.6     1

                    Sequential Create                      Random Create
         Num Files  Create       Read         Delete       Create       Read         Delete
                    / sec % CPU  / sec % CPU  / sec % CPU  / sec % CPU  / sec % CPU  / sec % CPU
chunklog       256   3862    96 342257    98  10759    69   3704    94 339523    98   1771    11
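If you would rather post-process the numbers yourself, the comma-separated line is easy to split into named fields. The sketch below assumes the 27-field layout of the line printed by the run above; the field names are labels invented for this example (they follow the column order of the table), and treating a value that starts with a plus sign as "too fast to measure" is likewise an assumption about how such fields appear in the CSV.

```python
FIELDS = [
    "machine", "size",
    "char_write_kps", "char_write_cpu",
    "block_write_kps", "block_write_cpu",
    "rewrite_kps", "rewrite_cpu",
    "char_read_kps", "char_read_cpu",
    "block_read_kps", "block_read_cpu",
    "seeks_ps", "seeks_cpu",
    "num_files",
    "seq_create_ps", "seq_create_cpu",
    "seq_read_ps", "seq_read_cpu",
    "seq_delete_ps", "seq_delete_cpu",
    "rand_create_ps", "rand_create_cpu",
    "rand_read_ps", "rand_read_cpu",
    "rand_delete_ps", "rand_delete_cpu",
]

SAMPLE = ("chunklog,4G,59330,97,269236,64,109173,31,54711,97,290233,38,"
          "509.6,1,256,3862,96,342257,98,10759,69,3704,94,339523,98,1771,11")


def parse_result_line(line):
    """Map one comma-separated result line onto the (hypothetical) FIELDS names."""
    values = line.strip().split(",")
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    result = {}
    for name, raw in zip(FIELDS, values):
        if name in ("machine", "size"):
            result[name] = raw
        elif raw.startswith("+"):
            result[name] = None  # assumed marker for "completed too quickly"
        else:
            result[name] = float(raw)
    return result


if __name__ == "__main__":
    run = parse_result_line(SAMPLE)
    print(run["machine"], "block write:", run["block_write_kps"], "K/sec")
    print(run["machine"], "seeks:", run["seeks_ps"], "per second")
```

Keeping unmeasurable values as None rather than zero avoids accidentally averaging them into later comparisons.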

The Bonnie++ benchmark is a great yardstick to see if you are getting the performance from your hardware that you think you should. You can search around for Bonnie++ results that other people have produced and published using similar hardware. If you are making changes to how your RAID or filesystem is created, Bonnie++ is invaluable for testing whether the changes you think should improve performance actually have a noticeable and positive effect.

Tomorrow, I'll show you how to take the results of multiple Bonnie++ runs and generate a graph showing the relative changes between your benchmarks, so you can instantly see whether your modifications are positive and by how much.
