April 12, 2006

Package management meets version control in rPath

Author: Bruce Byfield

rPath is a young company that is rapidly becoming a leader in package management innovation. At a time when traditional package management systems such as APT and dpkg or Yum and RPM are adding elements such as signed packages and plugins, and projects such as Autopackage and Zero Install are focusing on easy-to-use interfaces and giving ordinary users the ability to install desktop applications, rPath takes a top-down approach and focuses on simplifying release management.

rPath's goal, according to a white paper on the company Web site, is "a source control system married to a package system." To achieve this goal, rPath has developed three closely related projects: Conary, a package management system; rPath Linux; and rBuilder, a tool for working with Conary repositories. With these projects, rPath claims to be able to drastically reduce the time required to build a Linux release.

According to Erik Troan, rPath's founder and CTO, the company's development efforts began with the observation that the business of putting together a distribution was an anomaly in the world of free and open source software (FOSS). "Everywhere else you work in the open source community, and it's very collaborative," he says. "But then all of that, across hundreds of projects, gets stripped down to 15 people when it comes time to do a distribution."

The release group not only makes decisions about how to assemble packages into a distribution, but also issues security updates throughout the lifecycle of the release. In addition, in the case of commercial distributions, the team often develops patches for specific customers that are not always merged into the main build. Often, the release team duplicates work that teams with other distributions are doing, yet distributions have become so specialized that one team may not be able to borrow what another is doing.

This traditional approach, although widely used, is both time-consuming and inefficient, Troan suggests. Building a new release for a distribution, according to rPath's studies, takes two to three work months. "That's stripping things out, modifying the install process in your applications, ripping more stuff out, [and] rebuilding things that break some dependencies," Troan explains. "When we come in and do it with our tools, it's a few days. It's really dramatic."

rPath recycles Red Hat veterans

rPath includes many Red Hat alumni in key positions. It was founded by three former Red Hat employees, all of whom played major roles in the original development of Red Hat's core technologies: Erik Troan, who oversaw the releases of Red Hat 4.0 through 7.2 and co-wrote RPM with Marc Ewing; Michael K. Johnson, who led Red Hat's kernel engineering team and was the first leader of the Fedora project; and Matt Wilson, the main programmer for the Anaconda installer, who also led Red Hat's certification program with the American federal government.

The three began work on Conary and related programs in early 2004. After working with a team of five or six programmers for about a year and a half, the team brought in Billy Marshall, the former vice president of North American sales at Red Hat, as CEO. Shortly afterwards, rPath raised $6.4 million in venture capital through North Bridge Venture Partners and General Catalyst Partners.

Since then, the company has grown to 26 people, including 16 engineers. Among its employees are Cristian Gafton, the architect of Red Hat Network and former head of Red Hat OS engineering; Nathan Thomas, former chief architect and director of sales engineering for Red Hat; Marty Wesley, former director of Red Hat product management; and Justin Forbes, who was on the team that ported Linux to the AMD64 platform and is a member of the stable kernel review team.

rPath's business model centers on developing and maintaining custom builds for clients using the main version of rBuilder, and selling support and other services.

The Conary package system

rPath's streamlining of package management begins with Conary, a new package management system. The unique feature of Conary is that it is supported by a structured database repository that operates as a source control system, both in the online repository and the local record of installed packages, complete with branches and the ability to merge changes as needed. The structure is primarily a benefit for release managers, although it can potentially benefit end users by making packages easier to track and upgrade.

This source control system is designed to be more efficient than the major package systems. Along with the highly organized structure that comes with version control, some of Conary's efficiency comes from a precise set of naming conventions for the content. Where Debian or RPM packages are marked by a version number that identifies only their age relative to other versions of the same package, every package and file in a Conary repository is named with a long, unique string that may include its location in the repository, the upstream version number, the source revision, the number of the binary build, and the configuration for a specific architecture.
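To make the naming idea concrete, here is a minimal Python sketch of how such an identifier might be split into its parts. The string layout used here ("/repository@branch/upstream-source-build"), the VersionId class, and the parse_version_id function are assumptions made for illustration, not Conary's documented format or API, and the architecture configuration is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionId:
    """Hypothetical breakdown of a repository-style version identifier."""
    repository: str        # where the content lives
    branch: str            # branch label within that repository
    upstream_version: str  # version number assigned by the upstream project
    source_revision: int   # how many times the source has been revised
    build_count: int       # how many binary builds were made from that source


def parse_version_id(text: str) -> VersionId:
    """Parse '/<repository>@<branch>/<upstream>-<source>-<build>'.

    The layout is an assumption made for this sketch only.
    """
    if not text.startswith("/"):
        raise ValueError("identifier must start with '/'")
    location, _, revision = text[1:].rpartition("/")
    repository, separator, branch = location.partition("@")
    if not separator or not revision:
        raise ValueError("identifier is missing a branch or a revision")
    # Split from the right so that dots or dashes in the upstream
    # version do not confuse the two trailing counters.
    upstream, source, build = revision.rsplit("-", 2)
    return VersionId(repository, branch, upstream, int(source), int(build))


example = parse_version_id("/repo.example.com@demo:1/2.4.1-3-2")
print(example)
```

Because every field is carried in the identifier itself, two builds of the same upstream version from different source revisions remain distinguishable, which a bare version number cannot guarantee.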

This use of unique identifiers gives Conary several advantages. By having each piece of content uniquely named, maintainers do not waste space in the repository with unnecessary, potentially confusing duplications. Similarly, users can tell at a glance from which sources binary packages are compiled. Moreover, builds for different architectures and configurations can be included in the same repository and draw on the same files as much as possible.

The source control system gains further efficiency by having packages reference files or groups of files called "components," rather than containing them, and by using changesets, Conary's version of patches, to increase the speed of updates.
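The two ideas in that paragraph, packages that reference files instead of containing them and updates shipped as changesets, can be sketched in a few lines of Python. This is an illustration of the general technique under simple assumptions, not Conary's implementation: FileStore, make_changeset, and apply_changeset are hypothetical names, and a package version is modeled as nothing more than a mapping from file path to content key.

```python
import hashlib


class FileStore:
    """Shared store in which identical file contents are kept only once."""

    def __init__(self):
        self.blobs = {}

    def add(self, data: bytes) -> str:
        # The key is derived from the content, so adding the same
        # bytes twice returns the same key and stores nothing new.
        key = hashlib.sha1(data).hexdigest()
        self.blobs.setdefault(key, data)
        return key


def make_changeset(old: dict, new: dict) -> dict:
    """Describe only what differs between two path -> key manifests."""
    return {
        "removed": sorted(path for path in old if path not in new),
        "changed": {path: key for path, key in new.items()
                    if old.get(path) != key},
    }


def apply_changeset(manifest: dict, changeset: dict) -> dict:
    """Return the manifest that results from applying a changeset."""
    result = {path: key for path, key in manifest.items()
              if path not in changeset["removed"]}
    result.update(changeset["changed"])
    return result


store = FileStore()
v1 = {"/usr/bin/tool": store.add(b"binary v1"),
      "/etc/tool.conf": store.add(b"settings")}
v2 = {"/usr/bin/tool": store.add(b"binary v2"),
      "/etc/tool.conf": store.add(b"settings")}

update = make_changeset(v1, v2)
print(update["changed"].keys())  # only the binary needs to be transferred
```

In this sketch the unchanged configuration file is stored once and never resent, which is the kind of saving the article attributes to references and changesets.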

From a system administrator's perspective, Conary is a command-line tool similar to apt-get or yum. For instance, the command to install or update a package is conary update packagename. In addition, Conary includes utilities for checking the status of packages on either the local system or in the source control repository, and for verifying updates.

Like most package systems, Conary can include the automatic resolution of dependencies and password authentication for its operations. However, in many ways, Conary is more versatile than most of its predecessors. Users have the option of employing changesets to update the local system, and can roll back changes to the system one at a time to return it to a previous state. Similarly, sections of repositories and individual files on the local system can be "pinned," preventing their removal and allowing multiple versions of the same package to co-exist -- a concept not to be confused with pinning with apt-get, which is a means of prioritizing which repositories to use.
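Rolling back one change at a time is easiest to picture as a journal of previous states. The Python sketch below assumes a deliberately simplified model in which the system is a mapping of package names to versions; SystemState and its methods are hypothetical and say nothing about how Conary actually records its rollbacks.

```python
class SystemState:
    """Toy model of an installed system with one-step-at-a-time rollback."""

    def __init__(self):
        self.installed = {}  # package name -> installed version
        self._journal = []   # stack of (name, previous version or None)

    def update(self, name: str, version: str) -> None:
        # Remember what was there before, so the change can be undone.
        self._journal.append((name, self.installed.get(name)))
        self.installed[name] = version

    def rollback(self) -> None:
        """Undo the most recent update only."""
        if not self._journal:
            raise RuntimeError("nothing to roll back")
        name, previous = self._journal.pop()
        if previous is None:
            del self.installed[name]  # the package was newly installed
        else:
            self.installed[name] = previous


system = SystemState()
system.update("editor", "1.0")
system.update("editor", "1.1")
system.update("shell", "3.0")
system.rollback()  # removes shell, leaves editor at 1.1
print(system.installed)
```

Each further call to rollback() steps the system back by exactly one operation, which mirrors the behavior described above.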

However, as Troan acknowledges, "Really, Conary is a core technology that allows distributions to be put together" rather than a major innovation for daily use. Conary does not even have a graphical interface, although one is in development.

Conary is released under the Common Public License (CPL).

These are only some of the highlights of Conary. Those interested in more details should look at the documentation on the Conary Wiki. "An Introduction to the Conary Software Provisioning System" gives more details about the design of a Conary repository, while the Conary Day to Day page explains Conary from an end-user's perspective.

rPath Linux

The second piece of rPath's technology is rPath Linux, a distribution noteworthy chiefly for using Conary. Otherwise, from an end user's perspective, it is a relatively undistinguished distribution. Using the Anaconda installer, it installs a generic set of fairly recent packages, and sets up a firewall. Its hardware detection seems to be a little weak, since it is the first distribution in several years that failed to detect the pointing device on my old laptop.

To be fair, though, this lack of distinction is understandable. Although perfectly usable, rPath Linux was not primarily designed for everyday computing. Instead, its main role is as a demonstration of Conary and as a basis for rBuilder. As a basis for specialized builds, rPath Linux has to be as general as possible so that it can be customized for a variety of uses.

rPath Linux, licensed under the GNU General Public License (GPL), was released on February 6, with versions for x86 and AMD64. According to Keith Boswell, rPath's vice president of marketing, rPath Linux has had about a thousand downloads since then.

Even as an arena for viewing Conary in action, rPath Linux is less than ideal. Because the first version of the distribution was released only two months ago, the latest version is still largely in sync with its repository. As a result, you may have trouble finding changesets to run Conary through some of its paces. The solution is to install an earlier version, or to check the rBuilder Online page for rPath Linux to find the latest changes to the current release. Presumably, though, this problem will disappear over time.

An alternative solution is to try one of the distributions built from rPath Linux using rBuilder Online. Troan suggests Foresight Linux, a distribution dedicated to supplying the latest software for GNOME. He explains that the developers of Foresight Linux were among the earliest adopters of rPath's technology. "And what's really neat about the project," Troan says, "is that the core of it is just a couple of guys doing it in their spare time. But, because they're using our technology, they don't have to worry about the kernel, the C libraries, or compiler stack; they go right at the pieces they're interested in. They pull out the GNOME desktop that we ship, because they want something different, and they put theirs on top."

rBuilder Online

rBuilder is rPath's Web-based tool for building customized software with a Conary repository. rBuilder offers clients reduced development time in general, and reduced time for multiple operating systems or architectures specifically. By using rBuilder, the company suggests, rPath's clients can move more easily from a traditional licensing model to a subscription model based on frequent updates.

rBuilder itself is not freely available. However, rBuilder Online is a Web-based version that is free for any non-commercial use under its own license.

Built like a wizard or online tutorial, rBuilder Online steps registered users through the process of setting up their own Conary repository and building a custom distribution or application with rPath Linux. The main difference from the regular rBuilder is that, when a project is created, tools for community-based development are added, such as mailing lists.

In addition, project pages in rBuilder Online provide an interface for viewing a sample Conary repository. Each project's main page shows recent versions and commits, as well as a complete release history and a list of files. Selecting an individual package shows similar information, as well as a changelog. Besides its practical purpose, this information is handy when you are learning how a repository is structured and what you can do with Conary. Currently rBuilder Online has 250 registered FOSS projects.

The future of rPath and package management

Official releases of the rPath toolkit have only been available for a couple of months, yet Troan is already upbeat about the company's future. He expects the company to announce the availability of images of rPath Linux for USB drives and hypervisors soon, including VMware and Xen. In fact, rPath recently announced a bonus for any winners of the VMware Ultimate Virtual Appliance Challenge that use rBuilder Online. Troan also talks about build systems that would provide an alternative to building locally with "reference builds, so that [people] can understand what can possibly go wrong with the images that they're using and get really high, reproducible results."

Are the general concepts behind rPath's tools so logical that they should have been developed long before? Troan says no. Not only are the engineering details difficult to work out, he says, but the technology and the market to support them did not exist until recently. Specifically, he suggests that rPath's tools could not have been developed without a well-accepted FOSS operating system and well-supported commodity hardware.

Nor would the simplifying of releases be as attractive if software were still focused on a traditional licensing model; release management may be time-consuming, Troan says, "but that's your business, so it's kind of OK." However, the trend towards subscription software services, in which revenue depends on more frequent releases, makes more efficient release management a higher priority. The trend towards virtualization, he adds, similarly adds to the demand for more releases.

rPath was named one of three "Ones to Watch" at the Open Source Business Conference in February, but whether rPath or its tools will survive in the long term is impossible to predict. For now, rPath offers a comprehensive critique of existing package systems and delivers a carefully thought-out alternative to them. Conary and rBuilder, or tools like them, are likely to become part of other package systems within a few years.

Bruce Byfield is a course designer and instructor, and a computer journalist who writes regularly for NewsForge, Linux.com, and IT Manager's Journal.

