July 27, 2004

A 'Linux Desktop Base' could help solve dependency problems

Author: Arvind Narayanan

The package installation problem is one of the primary barriers to
desktop Linux adoption. Most if not all solutions so far have addressed the wrong problem
(at least for desktop users) -- resolving dependencies at package
installation time. A much better approach is to
ensure that as few dependencies exist as possible. While this might
seem a lofty goal, given the open source development emphasis
on reusing as much code as possible, I believe this goal is indeed achievable
through a process of desktop component standardization.

The dependency problem is well analyzed and well understood (for
instance, see Mike Hearn's writeup). Therefore, I will restrict
myself to a quick overview. Software packages have dependencies on
other packages,
which makes the user's task of installation more difficult. The problem
is much worse for binary than for source packages:

  • Even library versions differing by a minor version are frequently
    not binary compatible
  • While compiling from source, it is possible to detect if a library exists
    and compile accordingly, but this is not possible with binaries

The problem is aggravated by inconsistencies between different
distributions.

  • Choice of which packages to include and which version to use
    differs from distro to distro.
  • Distros often give the same package different names, so it is
    difficult to reliably check if a package is installed

The open source culture of reusing code also contributes to the
severity of the problem. Reuse has the desirable properties of reducing
development effort and improving security, but increases dependency
between packages. This is one reason why the dependency problem is more
severe on the Linux desktop than with other OSes, the other reason being
a lack of centralized control.

Current solutions center around packaging formats like rpm and deb,
which are useful for encapsulating package metadata, and package
repositories which are built on top of these package formats.
Repositories offer many advantages, including quality assurance and
ease of automation, and as such are an excellent solution for the
server.

For the desktop, however, the repository approach has several
shortcomings, and it is unclear if it solves anything at all. (We are
looking at a "grandma" or "Aunt Tillie" user.) Firstly, for the majority
of computer users who are on dialup, initiating a repository
installation and waiting for an indefinitely long time is simply not an
option. It also leaves less control in the hands of the app authors, who
are at the mercy of the different repository maintainers to do their packaging
for them. In practice, only the most popular software gets packaged.
Another problem is that once users start mixing packages installed
with and without the packaging system, they start down a slippery
slope toward system chaos.

Thus, even though there are many who delude themselves into believing
that repositories have sorted out the dependency problem, the actual
situation is pathetic: developers are forced to make packages for each
version of each distro they want to support, or rely on volunteers to
do so, and leave users to manually resolve dependencies.

I feel these solutions focus on the wrong problem --
resolving dependencies. The diversity of distributions and the number of
dependencies make it a daunting task to create a dependency resolution
mechanism that is both distro-neutral from the application developer's
point of view and painless for the average PC end user. We must instead
focus on a way to avoid dependencies. (Havoc Pennington pointed this out
recently.)

Let us now look at a few of the features we would like our ideal
system (for avoiding dependencies) to have:

Users would want to make major upgrades to their system as infrequently
as possible, say once in 5 years, to synchronize with their hardware
upgrade cycle. On the other hand, developers would like to make use of
the newest technologies, and would want users to keep their systems
perennially up to date, say once in six months. As a compromise, let us
say that our proposed solution would require users to upgrade their
system no more than once in two years.

Proposing a new distro or a new format obviously has little chance of
succeeding; it would only make things worse. Thus our solution should
be compatible with existing package formats and should be easy for
distros to implement. Finally it should seek to minimize user effort,
and should not require source compilation (even in a way that is
transparent to the user).

The Linux Desktop Base

The Linux Desktop Base is a proposal for a standardization of
packages, package names and package versions. It will be a community
driven standard, probably hosted at freedesktop.org. Its task will be
to release a list of packages that
should be carried by all "LDB compliant" distros. A new version of the
standard will be released once a year. The obvious advantage of such a
standard is that a
developer can create a software package on any LDB compliant distro
with the assurance that it will install and run on any other LDB
compliant distro. If the differences in package versions
and package selection between distros are removed by standardization,
the dependency problem simply goes away, as we shall see in a while.

Some aspects of the current situation are ridiculous, such as the same
package being named differently by different distros, leading to
dependency problems; this happens only because of lack of coordination
between vendors, and the situation is clearly in need of
standardization. Other aspects of the proposal are less obvious; in the
following I will argue why it is a good idea to have standards for
package versions.

Several types of package specification will be used, depending on the
type of the software:

  • The exact version (both major and minor) of the package is
    specified (for libraries whose interface changes frequently).
  • The major version and a minimum minor version of the package are
    specified (for libraries which maintain backward compatibility
    within any single major version).
  • No version number is specified (for utilities like wget which are
    invoked as external commands).
  • A version number is specified, but the package is not
    mandated to be included in the distro.

Observe that the bug fix number (i.e., the 'x' in 1.2.x) is not
standardized. This is essential so that bugfixes can be made without
breaking LDB compatibility. Some packages do not use the standard
major.minor.bugfix version number format, in which case a concept of an
"interface version" which is separate from the package version can be
used, as done by autopackage.
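
The version rules above can be illustrated with a short sketch. This is only one possible reading of the proposal, assuming plain major.minor.bugfix version strings; the names Spec, satisfies and the kind labels are hypothetical and not part of any LDB text. The fourth kind of specification (optional packages) is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical labels for the version rules described in the article.
EXACT = "exact"          # major and minor must both match
MIN_MINOR = "min-minor"  # same major, minor at least the given value
ANY = "any"              # no version constraint (e.g. wget)


@dataclass(frozen=True)
class Spec:
    name: str
    kind: str
    major: Optional[int] = None
    minor: Optional[int] = None


def parse_version(version: str) -> Tuple[int, int]:
    """Split 'major.minor.bugfix' and keep only (major, minor)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def satisfies(spec: Spec, installed_version: str) -> bool:
    """Return True if the installed version meets the spec.

    The bugfix number is deliberately ignored, so bugfix releases
    never break compliance.
    """
    if spec.kind == ANY:
        return True
    major, minor = parse_version(installed_version)
    if spec.kind == EXACT:
        return (major, minor) == (spec.major, spec.minor)
    if spec.kind == MIN_MINOR:
        return major == spec.major and minor >= spec.minor
    raise ValueError(f"unknown spec kind: {spec.kind}")
```

For packages that do not follow the major.minor.bugfix format, the same check would be applied to a separate "interface version" rather than to the package version.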

It is obviously necessary to achieve the standardization in such a way
that the
distros still retain most or all of their freedom to customize their
offerings
according to their needs. Let us now see how this freedom will be
preserved:

  • Packages such as the kernel will not be standardized: very little
    desktop software depends directly on the kernel. Thus a tradeoff is
    made: any software that integrates too tightly into the system will
    have to be upgraded as part of the system.
  • Packages which have ambiguous licensing issues, such as mp3
    decoders, will not be standardized.
  • Packages are standardized with the minimum restrictions possible
    on the version number.
  • Even when the version number is fixed, the distro can still modify
    the library as much as it wants provided that the interface remains the same.
  • When possible, multiple versions of a
    library can be shipped.

Some examples of packages that will be standardized: glibc and other
core GNU libraries, because almost all software depends on them
directly or indirectly; GNOME and KDE libraries, and their language
bindings in a few of the most common languages; applications like wget
and most of the GNU toolchain, since a lot of other programs invoke
them. (Therefore, the last category will be standardized without
specifying a
version number.)

The standardization process

Once a year, the LDB
releases a 'core' package standard, which is the list of packages
to be included in any LDB compliant distro (along with version
specification). Two 'reference
implementations' are also released. One is a minimal LDB distro that
consists of the
minimal superset of the core packages required to have a usable desktop
system, and
nothing more. The second is a minimal development distro that consists
of a minimal superset of the core needed to have a self-contained Linux
system
for development. Thus, developers building and testing their packages
on the minimal LDB development distro are assured that their software
will install and work out of the box on any LDB compliant distro.

The process does not end here. Since only essential software is
included in the core, the benefits of standardization would not be
attained if all other libraries were left unstandardized. For every
non-core library, the maintainer is expected -- in co-ordination with
the LDB maintainers -- to standardize a version of his or her library for the
LDB specification.

What is the incentive for the library maintainer to
do this? Simply that app
developers who use this library and who target LDB based installation
would be able to use the library only if the library itself is
standardized. Of course, in the case of an open
source library, where the library authors do not have exclusive control
over it, a third party (such as an app developer using the library) may
offer to standardize it for LDB.

Each LDB release is maintained for about three years after the first
release. This allows enough distro deployment lead time for
vendors and customers to achieve the two year upgrade cycle discussed
earlier.

Branding is important from the end user perspective. Thus, an LDB
standard would be named "LDB 2005" or similar; compliant distros would
label themselves "LDB 2005 compliant". All the end user needs to
know about her system is that it is an LDB 2005 system; when she needs
to install a package, she looks to download the LDB 2005 version of the
package.

It is hoped that once a major vendor adopts the LDB, adoption would
progress rapidly due to "network effects", or Metcalfe's law. Only a
small critical mass of LDB compliant installations is required before
the difference in ease of installation puts serious commercial pressure
on other vendors to become LDB compliant. Let us now examine some
hurdles distros might face.

The GNOME and KDE requirement.

This might seem controversial for two reasons. The first
is that distros wishing to carry only one of the major desktop
environments might not care for the extra bloat that LDB compliance
would necessitate. This is especially the case for Live CD distros. The
second is that the distro might want to offer support for only one of
the DEs (as in the recent UserLinux controversy). But this is a
non-issue, because LDB compliance only requires the distros to ship the
GNOME and KDE libraries, not the applications or the DEs themselves.
The libraries are small enough that they would only add
an overhead of a couple of hundred MB to a distro. This is a negligible
amount for distros other than Live CDs. As for Live CDs, users don't
install any new software on a Live CD so the LDB is not a factor for
Live CDs.

Another issue I would like to consider is Debian. Debian has its own
charter, philosophy and way of doing things, and is
not affected by commercial pressures; Debian packagers don't particularly care
about your grandma. Nothing wrong with that, of course, but it means
that the chances of Debian complying with the LDB are rather bleak.
However, there is no technical reason why other distros based on Debian and targeted for
the desktop should not be LDB compliant, and there is every commercial
reason for them to do so.

Turning to third party app developers, LDB makes life simple for them.
If the package has no dependencies other than libraries in the LDB core,
then there is no problem at all. The developer simply builds
against the standard versions of all the libraries (for the appropriate
LDB target version), i.e., on any LDB compliant distro. If there are
dependencies not in the core, then the developer knows the standard
version of the libraries that they (recursively) have dependencies on.
Therefore, they simply bundle the libraries along with the package. At
the user's end, at installation time, the process that needs to happen
is that for each bundled library, the system checks if it is already
installed, and if not, installs it. This logic could be built into the
system package installer, implemented by a wrapper script. Observe that
bundling in this case avoids the problems that plague MS-Windows: there
is a guarantee that multiple versions of the same library will not be
installed.
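
The install-time check described above could look roughly like the following sketch. It is an illustration only: the function name, the (name, version) pairs and the install callable are all hypothetical stand-ins for whatever the system package installer or wrapper script would actually provide.

```python
from typing import Callable, List, Set, Tuple

Item = Tuple[str, str]  # hypothetical (name, standardized version) pair


def install_with_bundled_libs(
    package: Item,
    bundled: List[Item],
    installed: Set[Item],
    install: Callable[[Item], None],
) -> List[Item]:
    """Install each bundled library only if it is absent, then the package.

    bundled:   libraries shipped alongside the package
    installed: libraries already present on the system (updated in place)
    install:   callable assumed to wrap the real system package installer
    Returns the items actually installed, in order.
    """
    performed = []
    for lib in bundled:
        if lib in installed:
            continue  # the standardized version is already present; skip it
        install(lib)
        installed.add(lib)
        performed.append(lib)
    install(package)
    performed.append(package)
    return performed
```

Because every library has exactly one standardized version per LDB release, the membership test is enough to guarantee that a second copy of the same library is never installed.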

Delivery format. While package standardization removes the need to
build different versions for different distros, it does nothing about
the need for multiple package formats. LDB doesn't mean that you can do
away with rpms; what it means is that if you do create an rpm, it will
work on all rpm-based distros. Rpms and debs, then, will be the vehicles
of package delivery. But that's still two
packages, one more than developers would like to have to handle.

There are a couple of possible solutions. For simple packages, it might be
acceptable to simply deliver them as a tarball. This misses out on the
obvious advantages of packaging systems, such as uninstallation.
Another option is autopackage, which serves as a distribution
abstraction mechanism. Using autopackage has another advantage: Even
with LDB there would still be differences between distros due to
imperfect LSB (Linux Standards Base) compliance. This might cause package installation
problems which would be resolved by using autopackage. Although
dependencies are the major problem autopackage is intended to
solve, using autopackage in conjunction with LDB is an interesting idea
that would be worth considering.

Assuming that developers target desktops up to two years
old, they would only have to maintain two packages (for each package
format offered) at any point.
This is radically different from the current situation which requires
one build for every version of every distro.
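
The arithmetic behind this comparison can be made explicit with a small sketch. The distro names and version counts below are made-up illustrative figures, not data from any survey, and both function names are hypothetical.

```python
from typing import Dict


def builds_with_ldb(supported_ldb_releases: int, package_formats: int) -> int:
    """One build per supported LDB release per package format."""
    return supported_ldb_releases * package_formats


def builds_without_ldb(versions_per_distro: Dict[str, int]) -> int:
    """One build for every supported version of every distro."""
    return sum(versions_per_distro.values())


# A two-year window with yearly LDB releases covers two releases;
# rpm and deb are the two formats discussed in the article.
with_ldb = builds_with_ldb(supported_ldb_releases=2, package_formats=2)

# Hypothetical example: three distros with 3, 2 and 4 supported versions.
without_ldb = builds_without_ldb({"distro-a": 3, "distro-b": 2, "distro-c": 4})
```

With LDB the build count depends only on the number of supported LDB releases and formats, so it stays fixed no matter how many compliant distros exist.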

There's the issue of distros wanting to have faster release cycles than
the one year cycle of LDB would allow. This is an irreconcilable
problem, one that distros would have to deal with. There's demand for
slower release cycles from the wider desktop market, and for faster
ones from the hobbyist market. It makes sense, then, to split the
product line into two, similar to the split between the enterprise and
the home desktop market that has recently been recognized. The hobbyist
desktop would be a testing ground for new technologies and would not
worry about the LDB; hobbyist users, after all, are sophisticated
enough for package installation not to be a problem for them. Whether
the distro wants to support the hobbyist market at all is for it to
decide.

In conclusion, I feel that LDB adoption has the potential to greatly
enhance the viability of Linux on the desktop. It would, however, take
a significant commitment from vendors to set the ball rolling. One of
the lessons that the fragmentation of proprietary Unix and the success
of Linux have taught us is that product differentiation is ultimately
bad for everyone; standardization of the platform and a service-based
revenue model is the way to go. The LDB is completely in keeping with
this spirit. I hope that major distros will see the merit of this and
adopt the LDB.
