November 10, 2005

New approaches to Linux package management

Author: Irfan Habib

Traditional Linux package management systems such as RPM, Debian's dpkg, and Slackware's pkgtool present several problems for users. Users who want optimized packages often have trouble finding them, different package repositories use conflicting naming conventions, and binary packages are often not available in a timely fashion. However, for users willing to stray from the beaten path, there are alternatives. Two projects have taken up the challenge of building a package management system that overcomes these shortcomings.

Linux is available for a number of architectures, which is one of its advantages. However, most packages are compiled only for i386 or i686, leaving users of alternative architectures with a dilemma. Users who want packages optimized for other architectures often have to wait longer for pre-compiled packages, or resort to compiling the packages from source on their own, which defeats the purpose of using a package manager.

Availability of binary packages is also a problem for certain projects. I am an avid fan of KDE and a Slackware user. Every time a new version of KDE is released it takes a week or two until the Slackware packages are available, and the alternative, compiling KDE from source, with all its dependencies, can take an entire day.

Another problem with existing package management systems is that their packages contain scripts that automate the installation. If those scripts contain bugs, users may need to employ complicated workarounds to rectify the problems. Even worse, if a package's installation script fails, it might be impossible to remove the package without extensive manual effort.

A new breed of distributed package management systems has emerged that uses Internet-based repositories to give users the packages they require, customized for each user's architecture.

Gentoo Portage

The Portage package management system is a central feature of the Gentoo Linux distribution. Portage, which takes after the Ports system used with *BSD distributions, is a pioneer in the distributed package management paradigm for GNU/Linux distributions.

Portage uses the rsync protocol to update its tree, the collection of build scripts that describes every package available in the distribution. To bring the tree up to date, enter the command:

emerge --sync

This command updates the local Portage tree with the current Portage tree for Gentoo. To update all of the packages on your system, run emerge -u world.

What about installing new software? To search for a specific package, simply run emerge --search packagename.

Portage can also handle dependencies automatically. To install a package, run:

emerge packagename

This emerge command tells Portage to download the source code of the specified application, as well as that of any other applications or libraries needed to satisfy its dependencies. Once downloaded, everything is compiled from source. You can tune the compilation through the CFLAGS variable, usually set in /etc/make.conf, to suit the hardware of the individual computer and the individual user's need for speed.
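The dependency handling described above can be pictured with a small Python sketch. It is not Portage's actual resolver: the dependency map and the build_order function are hypothetical, and the package names are chosen only for illustration. It shows the general idea of collecting everything a package needs and ordering the builds so that each dependency is compiled before the packages that rely on it.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (package -> packages it needs first).
# Real Portage reads this information from the build scripts in its tree.
DEPENDS = {
    "kdebase": ["kdelibs"],
    "kdelibs": ["qt", "arts"],
    "arts": ["qt"],
    "qt": [],
}


def build_order(target, depends):
    """Return every package needed for `target`, dependencies first."""
    needed = {}
    stack = [target]
    # Walk the dependency graph and collect the packages reachable from target.
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue
        needed[pkg] = depends.get(pkg, [])
        stack.extend(needed[pkg])
    # A topological sort yields each package after all of its dependencies.
    return list(TopologicalSorter(needed).static_order())


if __name__ == "__main__":
    for pkg in build_order("kdebase", DEPENDS):
        print("would fetch and compile:", pkg)
```

In this toy model, asking for kdebase schedules qt first and kdebase last, which mirrors the order in which a source-based package manager has to compile things.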

One drawback of Portage is that it relies on the Gentoo Portage tree. If a package is not available in the tree, the user cannot install it with Portage and must resort to compiling it from source by hand.

Portage is one of the innovations that have enabled Gentoo to gain a large user base very quickly, and it solves several of the problems found with traditional package managers.

The Conary Software Provisioning System

The Conary Software Provisioning System is developed by rPath, a company founded by ex-Red Hat engineers. Conary applies new ideas from distributed configuration management tools such as GNU arch and monotone.

According to the designers, "Conary uses networked repositories containing a structured version hierarchy of all the files and organized sets of files in a distribution."

Rather than concentrating on separate package files in the manner of traditional package managers, Conary relies on a tree-like structure that resembles a software configuration control system. The Conary tree, unlike the Portage tree, can be distributed across a network. That is, a package compiled for i386 can be kept in the repository at rpath.com, and someone else might maintain the package for an alternate platform in another repository. The Conary tree can span multiple repositories, and hence potentially provide access to a vast array of packages.
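To make the idea of a tree that spans repositories more concrete, here is a minimal Python sketch. It does not use Conary's real data model or API: the repository contents, the example.org host, the version strings, and the find_package function are all assumptions made for illustration. It only shows how a lookup might walk a list of repositories until one of them offers the package for the requested architecture.

```python
# Hypothetical repositories: host -> {(package, architecture): version}.
# The hosts' contents and the version strings are placeholders, not real data.
REPOSITORIES = {
    "rpath.com": {("kde", "i386"): "1.0"},
    "example.org": {("kde", "ppc"): "1.0"},
}


def find_package(name, arch, repositories, search_path):
    """Return (repository, version) for the first repository in
    search_path that carries `name` built for `arch`, or None."""
    for repo in search_path:
        version = repositories.get(repo, {}).get((name, arch))
        if version is not None:
            return repo, version
    return None


if __name__ == "__main__":
    path = ["rpath.com", "example.org"]
    print(find_package("kde", "i386", REPOSITORIES, path))
    print(find_package("kde", "ppc", REPOSITORIES, path))
```

Here the i386 build is found in the first repository, while the ppc build comes from the second one, so the user sees a single tree even though different people maintain different parts of it.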

Conary uses binaries if they are available, or source if necessary. It stores all version information in a database, which lets it track changes from the source branch all the way to the local versions installed on a given system, and meet dependencies without conflicts. In some ways, Conary acts as a revision control system for package management.

Conary's command-line syntax is very APT-like, and the GUI front end is similar to Synaptic. Conary is designed for the layman, and hence it is easy to use. For example, here are the Conary commands for some common package management operations:

The command conary q packagename shows whether a given package is installed.

The command conary rq packagename lists the newest available upgrade.

The command conary update packagename installs or updates the requested package.

The command conary erase packagename uninstalls the package.

Say I want to upgrade KDE on a Conary-based system. Running conary update KDE resolves all dependencies and updates all of the necessary packages for KDE.

Conary is an exciting tool for users who are frustrated with traditional package management systems. At least two distributions use Conary for package management: rPath Linux, which was created by the designers of Conary, and Foresight Linux.

Conclusion

Conary and Portage were designed to address many of the limitations of traditional package managers. The Linux developer and user base has grown enormously over the past decade, and packaging systems have not kept pace. Many do not scale well to multiple repositories with conflicting or overlapping content, which can make it difficult for developers on different projects to coordinate package releases. Additionally, the increasing number of dependencies in many open source packages poses unique challenges to open source package managers.

One of the main aims of Conary is to enable a loosely coupled, Internet-based, collaborative approach to building Linux distributions, one in which almost any aspect of a Linux system can be changed. With this kind of approach, Linux users might be able to manage their distributions with ease and not have to change their operating systems in order to circumvent the frustrations of a traditional package management system.

With the new breed of package managers, users can now delegate responsibility for all aspects of software installation, upgrades, and removal to the package manager. With any luck, the days when users have to scour the Internet for optimized packages or missing dependencies will soon be over.