Using Gnulib to improve software portability

Author: Diego 'Flameeyes' Pettenò

Many, if not most, free and open source software projects are developed primarily on Linux-based systems using the GNU C Library (glibc). Projects that use glibc are likely to depend on functions that are not available on systems that use different C libraries, such as the different BSD flavors. When packages are built on systems that don’t use glibc they often fail, because the other C libraries are missing functions found in glibc. The GNU Portability Library can help developers with cross-platform programming needs.

In the past there were many different libraries, such as publib, that tried to provide alternatives to the functions that are missing in the main C library. Unfortunately, handling compatibility libraries proved to be difficult. The additional libraries would require additional tests when running configuration scripts prior to compilation, and add dependencies for non-glibc systems.

As the number of new functions provided by glibc increased, the GNU project started looking at what programs need in order to be portable to operating systems based on different C libraries, and eventually created the GNU Portability Library (Gnulib) project.

Normally, a library is code that is compiled as a shared or static file, and then linked into the final executable. Gnulib is a source code library, more similar to a student’s collection of notes than a usual compiled library. Gnulib can’t be compiled separately; the code in it is intended to be copied into the projects using it.

Using Gnulib

Two requirements limit Gnulib use, one technical and the other legal. First, the software you use it with must also use GNU Autotools, because Gnulib ships its tests for replacement functions and headers as M4 macros, ready for use with GNU Autoconf.

Second, it has to be licensed under the GNU General Public License (GPL) or the GNU Lesser General Public License (LGPL), as the code inside Gnulib is mostly (but not entirely) released under the GPL itself. Some of it is also released under the LGPL, and some of it is available as public domain software.

If you’re working on software released under other licenses, such as the BSD license or Apple Public Source License (APSL), it’s better to avoid the use of functions that are not available in a library licensed with more open terms. For example, you could take the functions present in a BSD-licensed C library to replace missing functions in the current library, whichever license the software is using. Alternatively, you could find replacement functions in other BSD-licensed software or create a “cleanroom” implementation without copying code from GPLed software, leaving the external interface the same but using different code.

It’s usually easy to re-implement functions or just copy missing functions from another project when they are not available through another C library, especially when they are simple functions that consist of fewer than 10 lines of code. Unfortunately, many projects depend firmly on GNU extensions and won’t build with replacement functions, or their code is already so complex that adding special cases to maintain by hand is an extra burden for developers.
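As a rough illustration of how small such a replacement can be, here is a from-scratch sketch of strchrnul(), a GNU extension that behaves like strchr() but returns a pointer to the terminating NUL when the character is not found. The function is not discussed in this article, and the name my_strchrnul is hypothetical, chosen only to avoid clashing with a version the system may already provide.

```c
/* Hypothetical name: a from-scratch stand-in for the GNU strchrnul()
   extension. It works like strchr(), except that when c is not found
   it returns a pointer to the terminating NUL instead of NULL. */
char *my_strchrnul(const char *s, int c)
{
    /* Compare as char, the way strchr() does. */
    while (*s != '\0' && *s != (char)c)
        s++;

    /* Cast away const to match the interface of the original. */
    return (char *)s;
}
```

A caller can use it to find the end of a key in a "key=value" string without checking for NULL: the result points either at the '=' or at the end of the string.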

What Gnulib provides is not only the source code of the missing functions, but an entire framework to allow a project to depend on GNU extensions, while retaining portability with non-GNU based systems.

The core of the framework is the gnulib-tool script, which is the automated tool for extracting and manipulating source code from Gnulib. Using gnulib-tool, you can see the list of available modules (gnulib-tool --list) or test them (one by one, or all together, using the --test or --megatest options), but more importantly you can automatically maintain the replacement functions for a source tree.

A practical example should help explain the concept. Let’s say that there’s a foofreak package that uses the strndup() function (not available on BSD systems, for instance) and the timegm() function (not available on Solaris). To make the source portable, a developer can run gnulib-tool --import strndup timegm from the source directory, and the script will copy (into the default directories) the source code and the M4 Autoconf tests for strndup(), timegm(), and their dependencies. For example, strndup() depends on strnlen().
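To see why one function can pull in another, consider the following minimal sketch of a strndup()-style function built on a strnlen()-style helper. This is not the Gnulib source; the names sketch_strndup and sketch_strnlen are hypothetical and exist only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length of s, examining at most maxlen bytes. */
size_t sketch_strnlen(const char *s, size_t maxlen)
{
    const char *end = memchr(s, '\0', maxlen);
    return end ? (size_t)(end - s) : maxlen;
}

/* Hypothetical stand-in for strndup(): copy at most n bytes of s into
   newly allocated memory and always NUL-terminate the copy.
   Returns NULL if the allocation fails; the caller frees the result. */
char *sketch_strndup(const char *s, size_t n)
{
    /* The dependency: the bounded length comes from the helper above. */
    size_t len = sketch_strnlen(s, n);
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

Without the helper, the copy function would have to duplicate the bounded-length logic, which is the kind of relationship gnulib-tool resolves when it imports dependencies along with the requested modules.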

After running, gnulib-tool tells you to make a few changes to your build files so that the replacements are checked for and used when needed. The Makefile in lib/ must be generated by the configure script, so it has to be added to AC_OUTPUT or AC_CONFIG_FILES. The lib/ subdirectory also has to be added to the SUBDIRS variable in Makefile.am. The M4 tests are not shipped with other packages, so they must be copied into the m4/ directory, and that directory has to be added as an include directory for aclocal. Finally, two macros have to be called in configure.ac to initialize the checks (gl_EARLY and gl_INIT).

You can specify the name of the subdirectories and the prefix of the macros by running gnulib-tool with the parameters --source-base, --m4-base, and --macro-prefix, respectively. It’s also important to note that the replacement functions are built in an auxiliary library called libgnu (by default, but the name can be overridden by using the --lib parameter), so the part of the software using those functions has to be linked against this too.

If later on your project also wants to use the iconv() function, gnulib-tool can detect the currently imported modules and add the required iconv module without rewriting everything from scratch. This makes it simple to add new modules when you use new functions.

The different replacement functions are called “modules” by gnulib-tool, and they consist of some source code, some header files, and an M4 macro test. As some functions depend on the behavior of other functions, modules depend on one another, so adding a single module can generate quite a few additional checks and replacements, which make sure that the behavior is safe.

As some modules are licensed under the GPL, while others are licensed under the LGPL, a package licensed under the latter might want to make sure that no GPL modules are pulled in, as that would break the license. To avoid adding GPLed modules, you can use gnulib-tool’s --lgpl option, which forces the use of LGPL modules.

You can also use alternative code to provide a replacement function instead of using the Gnulib modules, and to avoid problems with dependencies. Gnulib-tool has an --avoid option that prevents specified Gnulib modules from being pulled in.

Following the previous example, if foofreak already contains a strnlen() function, used when the system library doesn’t provide one, it would be possible to use that, instead of importing the strnlen() module from Gnulib, by issuing the command gnulib-tool --import strndup timegm --avoid strnlen. With this syntax the strnlen module will be ignored and the function already present in foofreak will be used. While this option is provided, it’s usually not advisable to use it if you don’t really know what you’re doing. A better alternative would be dropping strnlen() from the code where it was used, and using the replacement provided by Gnulib instead.

Summary

Gnulib is an interesting tool for people working with GPL- or LGPL-licensed software that needs to be portable without dropping the use of GNU extensions, but it has some drawbacks. The major drawback is the license restrictions, which require non-(L)GPL-licensed software to look elsewhere for replacements. It also requires the use of the GNU toolchain with Autotools, as it would be quite difficult to mimic the same tests with something like SCons or Jam.

Finally, the source code sharing between projects breaks one of the basic advantages in the use of libraries: the reuse of the same machine code. When the same function, required by 10 or 20 programs, has to be built inside the executable itself as the system does not provide it, there will be 10 or 20 copies of the same code in memory and on disk, and they may behave in different ways, leading to problems if they are linked inside a library used by third-party software.

Gnulib is worth a try, but you should not use it in critical software or software that might have a limited audience. In those situations, avoid the use of extension functions when possible, and add replacement functions only when they’re actually needed. There’s no point in having a replacement function for something that works on 90% of modern systems and breaks only on obsolete or obscure operating systems or C libraries, especially if the software is written to be run on modern machines.
