November 26, 2008

Run your NFS server in the user address space with NFS-GANESHA

Author: Ben Martin

NFS-GANESHA is an NFS version 2-4 server that runs in the user address space instead of as part of the operating system kernel. Filesystem in Userspace (FUSE) lets you run a filesystem in the user address space instead of as part of the Linux kernel, but the FUSE support in the kernels shipped by many Linux distributions does not allow you to export a FUSE filesystem through NFS. NFS-GANESHA lets you expose a FUSE filesystem through NFS without patching your kernel.

NFS-GANESHA accesses the underlying data through a File System Abstraction Layer (FSAL), allowing you to plug in your own storage mechanism and access it from any NFS client. NFS-GANESHA provides a FUSE-compatible FSAL to allow you to quickly access a FUSE filesystem over NFS while avoiding the need for data to bounce through the kernel FUSE mechanism on the NFS server.

Why would you want to do this? Running a filesystem outside the kernel through FUSE lets you use libraries to support your filesystem's functionality -- for example, you can use Berkeley DB to store the file contents -- and also aids greatly during the development of your filesystem because a bug in your code won't cause the kernel itself to oops. FUSE includes a portion that runs inside the Linux kernel, which allows applications to use FUSE filesystems just like any other "regular" kernel filesystem. The Linux kernel FUSE code communicates with the user address space FUSE filesystems on behalf of the application whenever you use a FUSE filesystem. One downside of this arrangement is that some FUSE filesystems don't like being exported over NFS.

Consider what happens when an NFS client wants to access a file on a FUSE filesystem when using the Linux kernel NFS server implementation. The NFS client issues the read to the NFS server. The server notices that this is a FUSE filesystem and issues the read to the user address space FUSE filesystem. The FUSE filesystem replies and the data is copied back into the kernel for the NFS server to send back to the client.

With NFS-GANESHA, the NFS client talks to the NFS-GANESHA server instead, which is in the user address space already. NFS-GANESHA can access the FUSE filesystem directly through its FSAL without copying any data to or from the kernel, thus potentially improving response times. Of course the network streams themselves (TCP/UDP) will still be handled by the Linux kernel when using NFS-GANESHA.

NFS-GANESHA is not in the Fedora, openSUSE, or Ubuntu repositories. The download page includes both source tarballs and binary RPM files. I'll install from source using version 0.99.45 of nfs-ganesha on a 64-bit Fedora 9 machine. The only unusual point during installation is that you must select which FSAL you want in your build. The default FSAL is proxy, which allows NFS-GANESHA to run as a proxy to another NFS server. The two most interesting options for --with-fsal are POSIX, to export a normal kernel filesystem, and FUSE, to export a slightly modified FUSE filesystem through NFS. Unfortunately you can't choose more than one FSAL, so you'll have to configure, build, and install multiple times if you want to use both the POSIX and FUSE FSALs. Other configure options allow you to turn on additional debugging information for your build.

$ tar xjf /FromWeb/nfs-ganesha-0.99.45.tar.bz2
$ cd ./nfs-ganesha-*
$ ./configure --with-fsal=FUSE
$ make
$ sudo make install
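
Since each build carries only one FSAL, the repeated configure, build, and install cycle can be scripted. The following is a minimal sketch, not something shipped with NFS-GANESHA: it assumes the configure script honours the usual autoconf --prefix option and that a make clean target exists, and the /opt/ganesha-<fsal> install locations are hypothetical. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: one configure/build/install cycle per FSAL.
# Assumptions: configure accepts the standard autoconf --prefix option,
# "make clean" exists, and /opt/ganesha-<fsal> is a hypothetical location.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Configure, rebuild from scratch, and install a single FSAL.
build_fsal() {
    fsal=$1
    run ./configure --with-fsal="$fsal" --prefix="/opt/ganesha-$fsal"
    run make clean
    run make
    run sudo make install
}

for fsal in POSIX FUSE; do
    build_fsal "$fsal"
done
```

Run it from the unpacked source directory with DRY_RUN=0 to execute the commands for real. Separate prefixes keep the second install from overwriting the first.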

To test NFS-GANESHA as an NFS server exporting a FUSE filesystem, I used the compFUSEd FUSE filesystem, which transparently (de)compresses any data written to it using one of many compression algorithms. I used the installation, setup, and configuration options I talked about in my previous article about compFUSEd.

You have to slightly modify and then rebuild a FUSE filesystem in order to use it through the NFS-GANESHA FSAL. Basically, you include ganesha_fuse_wrap.h instead of fuse.h and link to an NFS-GANESHA library instead of the FUSE library.

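The first half of the installation commands shown below is like those in the previous article. I then modify cf_main.c to include the ganesha_fuse_wrap.h header instead of fuse.h. The first make fails at link time with an undefined reference to ganefuse_main, because Rules.make still links against the FUSE library; changing LIBS in Rules.make to link with the ganeshaNFS library instead fixes the build.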

$ cd ~
$ mkdir nfs-ganesha-fuse
$ cd ./nfs-ganesha-fuse
$ tar xvf /.../cf-GISMO-200712321.tgz
$ cd ./CompFused/Gismo/
$ vi Makefile
# Set to 1 to include support
$ vi Rules.make

LIBS= -lfuse

$ vi cf_main.c

//#include <fuse.h>
#include <ganesha_fuse_wrap.h>

ret = fuse_main(argc, argv, &cf_oper, 0);

$ make
/tmp/ccalKr2f.o: In function `main':
/home/ben/nfs-ganesha-fuse/CompFused/Gismo/cf_main.c:1226: undefined reference to `ganefuse_main'

$ vi Rules.make

#LIBS= -lfuse
LIBS= -lganeshaNFS


$ make
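
The two manual edits in the listing above could also be applied with sed. The sketch below works on one-line stand-in files in a temporary directory so that it can be tried without the compFUSEd sources; the edit helper is a made-up name for this example, and the patterns assume the lines appear exactly as shown in the listing.

```shell
#!/bin/sh
# Sketch: apply the two edits from the listing with sed instead of vi.
# The files created here are stand-ins, not the real compFUSEd sources.
workdir=$(mktemp -d)
printf '#include <fuse.h>\n' > "$workdir/cf_main.c"
printf 'LIBS= -lfuse\n' > "$workdir/Rules.make"

# Run a sed script over a file via a temporary copy
# (avoids the non-portable "sed -i").
edit() {
    file=$1
    script=$2
    sed "$script" "$file" > "$file.new" && mv "$file.new" "$file"
}

# Include the NFS-GANESHA wrapper header instead of fuse.h.
edit "$workdir/cf_main.c" 's|^#include <fuse\.h>$|#include <ganesha_fuse_wrap.h>|'

# Link against the ganeshaNFS library instead of the FUSE library.
edit "$workdir/Rules.make" 's|^LIBS= -lfuse$|LIBS= -lganeshaNFS|'

# Show the result of both edits.
cat "$workdir/cf_main.c" "$workdir/Rules.make"
```

Against the real tree, the two edit calls would point at cf_main.c and Rules.make in the Gismo directory instead of the stand-ins.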

The commands below start NFS-GANESHA, exporting a compFUSEd FUSE filesystem as an NFS server. The configuration of compFUSEd is shown first. I used a zlib compressed filesystem contained in the ~/.compFUSEd_test.backend directory. To make sure that the Linux kernel is not trying to run an NFS server, I stop the standard server before running compFUSEd.

$ vi /home/ben/.compFUSEd
backend = /home/ben/.compFUSEd_test.backend
compression = /usr/local/lib/compFUSEd/plugins/
writer = /usr/local/lib/compFUSEd/plugins/
chunk_max = 100 # Up to 100 chunk of 8K open per file
chunk_size = 8192 # That's 8K per chunk (uncompressed)
exclude = gz # On this mount we compress everything except .gz files
$ sudo service nfs stop
$ compFUSEd ~/compFUSEd_test

Because the build of compFUSEd is modified to use NFS-GANESHA, running compFUSEd as shown above actually starts an NFS server through NFS-GANESHA. You can then use the commands below to mount the NFS server on /mnt/t and interact with it just like any other NFS filesystem. The test1.txt file is 29 bytes when listed in /mnt/t and 20 bytes when listed in ~/.compFUSEd_test.backend, which verifies that the compFUSEd filesystem is working. The final command shows that the backend file is in fact not just a plain text file but is compressed.

$ sudo mount -o vers=3,udp localhost:/ /mnt/t
$ date >> /mnt/t/test1.txt
$ cat /mnt/t/test1.txt
Tue Nov 18 15:05:37 EST 2008
$ ls -l /mnt/t/test1.txt
-rw-rw-r-- 1 ben ben 29 2008-11-18 15:05 /mnt/t/test1.txt
$ ls -l ~/.compFUSEd_test.backend/
-rw-rw-r-- 1 ben ben 20 2008-11-18 15:05 test1.txt
$ od ~/.compFUSEd_test.backend/test1.txt
0000000 001001 000000 020000 000000 000000 000000 000000 000000
0000020 000035 000000
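
By default od prints the file as 16-bit words in octal, which makes the dump hard to read at a glance. As a small worked conversion, the sketch below turns the three non-zero words into decimal using printf, which accepts C-style octal constants. Two of the results, 8192 and 29, match the chunk_size setting and the 29-byte size seen through /mnt/t; reading them as header fields is an interpretation of the dump, not something confirmed by the compFUSEd documentation. The oct2dec helper name is made up for this example.

```shell
#!/bin/sh
# Convert an octal word, as printed by od, to decimal.
# printf's %d accepts C-style constants, so a leading 0 means octal.
oct2dec() {
    printf '%d\n' "$1"
}

for word in 001001 020000 000035; do
    echo "$word -> $(oct2dec "$word")"
done
# Prints:
# 001001 -> 513
# 020000 -> 8192
# 000035 -> 29
```

Either way, the 20 bytes in the backend are clearly not the plain-text date string that cat shows through the NFS mount.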

The above use of FUSE with NFS-GANESHA allows only a single FUSE filesystem to be exported, and it will be made available at the root directory of the NFS server. By using the -f option when running an NFS-GANESHA FUSE filesystem, you can specify a configuration file for NFS-GANESHA to read that contains a collection of key-value pairs inside bracketed scopes that control where the FUSE filesystem is exported, among other things. The Export scope has a key Tag that lets you select the path that NFS-GANESHA will export the filesystem at. For example, a configuration file with Export { Path = "/foo" }, when passed to the FUSE application using -f, will cause the FUSE filesystem to be available at localhost:/foo. This lets you have multiple FUSE filesystems exported through the local NFS-GANESHA server.
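
As a concrete illustration of the paragraph above, the sketch below writes the one-scope configuration quoted in the text to a file and prints the two commands that would use it. The binary name myfusefs and the file location are hypothetical, any further arguments the FUSE application needs are omitted, and the mount options are simply those used earlier in this article. The exact configuration syntax should be checked against the NFS-GANESHA documentation for your version.

```shell
#!/bin/sh
# Write the minimal Export scope quoted in the text to a temporary file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Export { Path = "/foo" }
EOF

# myfusefs is a hypothetical FUSE application rebuilt against NFS-GANESHA.
# The commands are printed rather than run.
echo "myfusefs -f $conf"
echo "sudo mount -o vers=3,udp localhost:/foo /mnt/t"
```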

Unfortunately, the -f configuration setting described above doesn't work with compFUSEd, because both NFS-GANESHA and compFUSEd use Bison to parse files, and both projects include a yyparse C function. With two yyparse functions at global scope, NFS-GANESHA calls the one in compFUSEd instead of its own, resulting in "interesting" error messages. Hopefully this scoping issue will be resolved in an upcoming NFS-GANESHA release. The developers might also consider changing the command-line options that the NFS-GANESHA FUSE code accepts, prefixing them as long options like --ganeshafuse-foo to avoid potential conflicts at the command-line level too.

If you don't want to patch your Linux kernel FUSE implementation to allow NFS exporting of FUSE filesystems, NFS-GANESHA might be just what you are after. If you have a network card that allows packet data to be shunted into the user address space efficiently, running your FUSE filesystems with NFS-GANESHA might give you a nice performance boost.

