June 8, 2005

The GNU Compiler for Java comes of age

Author: Bruce Byfield

The GNU Compiler for Java (GCJ), a free software implementation of Java, has been in development for seven years, but with the Free Software Foundation's recent call for volunteers, the project is suddenly receiving more attention than ever before. For many, GCJ is seen as a means of ensuring that the next version of OpenOffice.org does not require non-free versions of Java for full functionality. Yet the scope of the project goes far beyond this immediate need.

Part of the larger GNU Compiler Collection (GCC), GCJ consists of three parts: gcj, a compiler that converts Java code into machine language; libgcj, a collection of standard Java class libraries; and GIJ (GNU Interpreter for Java), a Java virtual machine (JVM).
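As a rough illustration of how these three parts fit together, consider the minimal sketch below. The class and its greeting method are hypothetical and do not come from the GCJ project; the comments at the end show the two usual ways such a file might be built, assuming gcj and gij are installed and on the path.

```java
// Hello.java: a hypothetical program used only for illustration.
public class Hello {
    // Build the greeting separately so it can be checked without running main.
    static String greeting(String name) {
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        // Use the first command-line argument if given, otherwise a default.
        String name = args.length > 0 ? args[0] : "GCJ";
        System.out.println(greeting(name));
    }
}
// Native path (gcj compiles to machine code and links against libgcj):
//   gcj --main=Hello -o hello Hello.java
//   ./hello
// Bytecode path (gcj emits Hello.class, which gij then interprets):
//   gcj -C Hello.java
//   gij Hello
```

In the first case the result is an ordinary executable that depends on libgcj; in the second, the class file runs under GIJ much as it would under any other JVM.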

Over the years, the project has survived a change in sponsoring companies and changing levels of interest in it. Today implementations of GCJ are just starting to become available. GCJ is partly implemented in several distributions already. Debian, for example, includes pdftk, a command-line tool for manipulating PDF files, in its testing and unstable distributions. Aside from pdftk's dependency on libgcj, nothing marks it as a Java-based program; it simply works the way any compiled program would. The same is true of gcjwebplugin, a plugin for Web browsers in Debian unstable that uses the gij JVM. However, gcjwebplugin does not include a security manager for Java applets, and should therefore be used with caution. Moreover, both these packages rely on GCJ 3.4, since version 4.0 has yet to find its way into Debian.

A better place to see GCJ in action is Fedora Core 4. Currently in its test 3 release, Fedora Core 4 is intended to showcase GCJ 4.0's capabilities.

Project history

Per Bothner started GCJ in 1998 at Cygnus Solutions. Cygnus employees were already contributors to GCC, so the project was a natural extension of the company's interests. In addition to Bothner, original contributors to the project included Tom Tromey, Anthony Green, Warren Levy, and Alexandre Petit-Bianco. Many remain involved in the project today.

In addition to the general advantages of a free Java, Cygnus was also interested in creating a version of Java for embedded systems. A surviving white paper suggests that GCJ would offer better optimization and faster startup times than Sun Java. Although benchmarks are limited in their usefulness because of differences in the implementation of Java, testing of older versions suggests that these are goals that GCJ has yet to achieve, especially with GIJ, possibly because the project remains focused on basic functionality.

When Red Hat purchased Cygnus in 1999, the original GCJ team members became Red Hat employees. However, Red Hat's interest in the project waned. "There have been periods," according to Tromey, "where all the work on it was done on a volunteer basis."

Red Hat began to fast-track GCJ again several years ago, working to get Eclipse, the extensible tool platform, to run with it. When Red Hat created the Fedora Project, with its goal of developing a distribution "in line with the ideals of free software," it again emphasized GCJ. Early in 2004, against the background of a growing demand for a free software licence for Java and Richard Stallman's warning about "the Java trap" -- that is, software that, while free in itself, depends on non-free Java -- Red Hat stepped up its efforts, adding support for AWT and Swing, two cross-platform sets of class libraries for GUI development.

Today, Red Hat employs four full-time employees to work on GCJ. Red Hat's Eclipse team helps test GCJ, while Caolan McNamara, another Red Hat employee, is currently finalizing his efforts to make GCJ work with the OpenOffice.org version 2.0 beta.

There's also a large GCJ community outside of Red Hat. The project interacts closely with other free Java implementations, such as kaffe and the recently announced Project Harmony. GCJ project members are also in contact with the communities working on various Java applications, and with Debian and Ubuntu, two GNU/Linux distributions that have expressed strong interest in shipping GCJ.

The GCJ project also works closely with the GNU Classpath community, which, like GCJ, is developing free versions of Java's class libraries. The interaction is so close that GCJ and Classpath are in the process of merging their code bases. "We thought it wasteful to have two different GNU implementations of the class library," Tromey says. The process, he says, is "more than 90 percent" complete, although he expects that one or two differences will always remain. Presently, the two development trees are merged manually, but the projects are considering eliminating this effort by having libgcj developers routinely contribute to the Classpath tree as well.

"We are trying hard to expand who we talk to," Tromey says. "We've finally realized that building bridges with the rest of the free Java community is vitally important." The project is also trying to increase communication with members of the Java community in the hopes of encouraging its members to develop with the free implementations of Java in mind.

"In GCC 4.0," says Tom Tromey, one of the original GCJ developers and now a Red Hat employee, "we added a new 'Binary Compatibility ABI' to GCJ. This is a different way of compiling code that lets us comply with the Java language's binary compatibility rules. It turns out that this was the key missing piece to let us run a wide range of real Java applications with no application-level changes."

In addition, GCJ itself now compiles a Java application's .jar files to shared libraries, then registers them in a master database for better performance.

These changes have helped to add a flood of new applications that run under GCJ. In the list of packages for a custom installation, 108 packages for Java development are listed, including ones for AWT and Swing; Apache Ant, a build tool intended as an alternative to make; jakarta-commons, a set of reusable Java components; and xerces, a Java parser for XML. Also included in Fedora Core 4 is Eclipse, one of the first programs Red Hat got working with GCJ. Although no formal benchmarks are available for version 4.0, I installed it and found that, subjectively, Eclipse still seems to take more time to start or to open new windows than it does under Sun Java on a low-end machine. However, performance for all these development tools seems acceptable with 512MB of RAM or greater.

Another Java-dependent program in Fedora Core 4 is the OpenOffice.org version 2.0 beta. Given the Free Software Foundation's call for volunteers, this implementation may be of special interest to many. Unfortunately, though, it is still incomplete in the version we examined.

A look under Tools > Options > Java in the OpenOffice.org build in Fedora Core 4 shows it successfully detects GIJ. This small step is apparently enough to eliminate the annoying loop of error dialogs that pop up when users attempt to run a macro without having Java enabled. However, the Java-dependent tools in OpenOffice.org still require some attention. The basic document wizards do not run, and, while a new database can be created using the new Base application, an error message appears, and the tools for tables, queries, forms, and reports do not work. Nor does the mail merge tool recognize any of the Java mail tools available in Fedora Core 4, a problem that makes email merges impossible.

Judging from the implementation in Fedora Core 4, GCJ has reached the stage in which it can be used for development work. However, speed may be an issue on older machines, and work is still required to make GCJ a complete substitute for Sun Java.

Next steps

Tromey seems to agree. "We still have a catch-up period ahead of us," he says. "Java keeps evolving, and since we're not involved with the Java Community Process (and perhaps even if we were) there is a lag between a new official release and our reimplementation of it." For example, GCJ still lacks complete class libraries for Java version 1.4, while work on 1.5 compatibility has already begun. Under these conditions, the project has little time for work such as improving performance or documentation.

Tromey would also like to see better quality assurance and to make the project's processes more open. He also talks about improving cooperation with other free Java projects. "One promising idea," he says, "is having a single core VM with different execution engines, something we've called 'pluggable Just-in-Time Interpreters.'"

Conceivably, the FSF's call for volunteers may suddenly place all these goals within reach. Yet, for now, considerable work remains to be done.

Tromey says, "I often think about our endgame. What will it mean when GNU Classpath is complete? How can we be better than the proprietary JVMs but still be compatible?" If the answers to these questions are less distant than they were seven years ago at the project's start, in some ways, they remain as tantalizingly remote as ever.

Bruce Byfield is a freelance course designer and instructor and a technical journalist. He is a regular contributor to NewsForge, ITMJ, and Linux.com.
