NFS version 4, published in April 2003, introduced stateful client-server interaction and "file delegation," which allows a client to gain temporary exclusive access to a file on a server. NFSv4 brings security improvements such as RPCSEC_GSS, the ability to send multiple operations to the server at once, new file attributes, replication, client side caching, and improved file locking. Although there are a number of improvements in NFSv4 over previous versions, this article investigates just one of them -- performance.
One issue with migrating to NFSv4 is that all of the filesystems you export have to be located under a single top-level exported directory. This means you have to change your /etc/exports file and also use Linux bind mounts to mount the filesystems you wish to export under your single top-level NFSv4 exported directory. Because the manner in which filesystems are exported in NFSv4 requires fairly large changes to system configuration, many folks might not have upgraded from NFSv3. This administration work is covered in otherarticles. This article provides performance benchmarks of NFSv3 against NFSv4 so you can get an idea of whether your network filesystem performance will be better after the migration.
I ran these performance tests using an Intel Q6600-based server with 8GB of RAM. The client was an AMD X2 with 2GB of RAM. Both machines were using Intel gigabit PCIe EXPI9300PT NICs, and the network between the two machines had virtually zero traffic on it for the duration of the benchmarks. The NICs provide a very low latency network, as described in a past article. While testing performance for this article I ran each benchmark multiple times to ensure performance was repeatable. The difference in RAM between the two machines changes how Bonnie++ is run by default. On the server, I ran the test using 16GB files, and on the client, 4GB files. Both machines were running 64-bit Fedora 9.
The filesystem exported from the server was an ext3 filesystem created on a RAID-5 over three 500GB hard disks. The exported filesystem was 60GB in size. The stripe_cache_size was 16384, meaning that for a three-disk RAID array, 192MB of RAM was used to cache pages at the RAID level. Default cache sizes for distributions might be in the 3-4MB range for the same RAID array. Using a larger cache directly improves write performance of the RAID. I also ran benchmarks locally on the server without using NFS to get an idea of the theoretical maximum performance NFS could achieve.
Some readers may point out that RAID-5 is not a desirable configuration, and certainly running it on only three disks is not a typical configuration. However, the relative performance of NFSv3 to NFSv4 is our main point of interest. I used a three disk RAID-5 because it had a filesystem that could be recreated for the benchmark. Recreation of the filesystem removes factors such as file fragmentation that can adversely effect performance.
I tested NFSv3 with and without the async option. The async option allows the NFS server to respond to a write request before it is actually on disk. The NFS protocol normally requires the server to ensure data has been writen to storage successfully before replying to the client. Depending on your needs, you might be running mounts with the async option on some filesystems for the performance improvement it offers, though you should be aware of what async implies for data integrity, in particular, potential undetectable data loss if the NFS server crashes.
The table below shows the Bonnie++ input, output, and seek benchmarks for the various NFS version 3 and 4 mounted filesystems as well as the benchmark that was run on the server. As expected, the reading performance is almost identical whether or not you are using the async option. You can perform more than five times the number of "seeks" over NFS when using the async option, presumably because the server can avoid actually performing some of them because a subsequent seek is issued before the initial seek was completed. Unfortunately the block sequential output for NFSv4 is not any better than for NFSv3. Without using the async option, output was about 50Mbps, whereas the local filesystem was capable of performing at 91Mbps. When using the async option, sequential block output came much closer to local disk speeds over the NFS mount.
|Configuration||Sequential Output||Sequential Input||Random|
|Per Char||Block||Rewrite||Per Char||Block||Seeks|
|K/sec||% CPU||K/sec||% CPU||K/sec||% CPU||K/sec||% CPU||K/sec||% CPU||/sec||% CPU|
The table below shows the Bonnie++ benchmarks for file creation, read, and deletion. Notice that the async option has a tremendous impact on file creation and deletion.
|Configuration||Sequential Create||Random Create|
|/sec||% CPU||/sec||% CPU||/sec||% CPU||/sec||% CPU||/sec||% CPU||/sec||% CPU|
To test more day-to-day performance I extracted the linux-22.214.171.124.tar uncompressed Linux kernel source tarball and then deleted the extracted sources. Note that the original source tarball was not compressed in order to ensure that the CPU of the client was not slowing down extraction.
|Configuration||Find (m:ss)||Remove (m:ss)|
These tests show no clear performance advantage to moving from NFSv3 to NFSv4.
NFSv4 file creation is actually about half the speed of file creation over NFSv3, but NFSv4 can delete files quicker than NFSv3. By far the largest speed gains come from running with the async option on, though using this can lead to issues if the NFS server crashes or is rebooted.