April 20, 2006

Boot faster with parallel starting services

If your Linux box's slow boot time is driving you crazy, consider parallel booting techniques.

What kind of improvements might you see, compared to your current sequential boot process? Well, let's start by seeing what kind of times you're getting currently. Either grab a stopwatch, or write a simple script to time the boot:

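#!/bin/sh
# Filename: time_booting
# Usage: time time_booting <hostname>
# Polls the host over ssh until it answers with its own node name.
HOST=$1
RESULT=""

while [ "$RESULT" != "$HOST" ]
do
	# Discard ssh errors while the host is still down.
	RESULT=$(ssh bainm@"$HOST" uname -n 2>/dev/null)
done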

Start your timing and press the PC's power button:

$ time time_booting hector

real    2m54.014s
user    0m47.140s
sys     0m18.660s

In this example, a simple server took nearly three minutes to boot. To see what kinds of improvements are possible, you need to look at the runlevel being used, and which applications are being fired up and in what order. Find the runlevel by typing sudo runlevel, and then look in the related rc.d directory:

$ sudo ls -l /etc/rc$(sudo runlevel| awk '{print $2}').d
total 0
lrwxrwxrwx  1 root root 18 1999-10-20 01:12 S10sysklogd -> ../init.d/sysklogd
lrwxrwxrwx  1 root root 15 1999-10-20 01:12 S11klogd -> ../init.d/klogd
lrwxrwxrwx  1 root root 13 1999-10-20 01:11 S14ppp -> ../init.d/ppp
lrwxrwxrwx  1 root root 17 1999-10-20 01:43 S18portmap -> ../init.d/portmap
lrwxrwxrwx  1 root root 24 1999-10-20 02:12 S20binfmt-support -> ../init.d/binfmt-support
lrwxrwxrwx  1 root root 15 1999-10-20 01:12 S20exim4 -> ../init.d/exim4
lrwxrwxrwx  1 root root 15 1999-10-20 01:11 S20inetd -> ../init.d/inetd
lrwxrwxrwx  1 root root 13 1999-10-20 01:43 S20lpd -> ../init.d/lpd
lrwxrwxrwx  1 root root 17 1999-10-20 01:10 S20makedev -> ../init.d/makedev
lrwxrwxrwx  1 root root 18 1999-10-20 03:33 S20mono-xsp -> ../init.d/mono-xsp
lrwxrwxrwx  1 root root 15 1999-10-20 12:54 S20mysql -> ../init.d/mysql
lrwxrwxrwx  1 root root 18 1999-10-20 01:44 S20netatalk -> ../init.d/netatalk
lrwxrwxrwx  1 root root 27 1999-10-20 01:44 S20nfs-kernel-server -> ../init.d/nfs-kernel-server
lrwxrwxrwx  1 root root 20 1999-10-20 01:42 S20postgresql -> ../init.d/postgresql
lrwxrwxrwx  1 root root 15 1999-10-20 01:44 S20samba -> ../init.d/samba
lrwxrwxrwx  1 root root 13 1999-10-20 01:43 S20ssh -> ../init.d/ssh
lrwxrwxrwx  1 root root 20 1999-10-20 01:43 S21nfs-common -> ../init.d/nfs-common
lrwxrwxrwx  1 root root 20 2006-04-05 15:44 S22set_time -> /etc/init.d/set_time
lrwxrwxrwx  1 root root 13 1999-10-20 01:12 S89atd -> ../init.d/atd
lrwxrwxrwx  1 root root 14 1999-10-20 01:11 S89cron -> ../init.d/cron
lrwxrwxrwx  1 root root 17 1999-10-20 01:43 S91apache2 -> ../init.d/apache2
lrwxrwxrwx  1 root root 19 1999-10-20 01:10 S99rmnologin -> ../init.d/rmnologin
lrwxrwxrwx  1 root root 23 1999-10-20 01:10 S99stop-bootlogd -> ../init.d/stop-bootlogd

If you want to learn more about the Linux booting process, read "An introduction to services, runlevels, and rc.d scripts."

At boot time, the /etc/init.d/rc script obtains the runlevel and looks in the appropriate directory (typically, something like /etc/rc<runlevel>.d). The script then runs all of the S files in the directory in alphabetical order. Each file is actually a link to a script in /etc/init.d, and its name is prefixed with the letter S and a number. This scheme creates a booting order, and dependencies are handled correctly as long as each service has a higher number than the services it relies on. For example, in the previous listing, S10 has no dependencies and therefore runs first. S11 may depend on S10 and runs next -- and so on.

The example above shows 12 processes with an S20 prefix. These are fired off sequentially and in alphabetical order. If each one takes 10 seconds, that accounts for two minutes of the boot time. Since each process has the same dependency level, you theoretically could run them in parallel. All 12 would then finish in roughly 10 seconds instead of 120, saving 110 seconds and turning a boot time of two minutes and 54 seconds into one minute and four seconds.
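The arithmetic above can be sketched as a small shell function. This is a rough estimate only: it assumes that every service takes the same flat time (the hypothetical 10 seconds used above), that services sharing a number can run fully in parallel, and that the levels still run one after another. The name estimate_boot and its arguments are made up for this illustration.

```shell
#!/bin/sh
# Sketch: group the S* links in an rc directory by priority number and
# compare a sequential start with an idealised level-by-level parallel one.
# The per-service cost is a hypothetical flat figure, not a measurement.

estimate_boot() {
    rc_dir=$1
    secs=${2:-10}    # assumed seconds per service
    # Keep only the priority number of each S link, then count per number.
    ls "$rc_dir" | sed -n 's/^S\([0-9][0-9]*\).*/\1/p' | sort -n | uniq -c |
    awk -v secs="$secs" '
        { services += $1; levels += 1
          printf "S%s: %d service(s)\n", $2, $1 }
        END {
          printf "sequential: %d s\n", services * secs
          printf "parallel per level: %d s\n", levels * secs
        }'
}

# Run only when a directory is given, for example: estimate_boot /etc/rc2.d 10
if [ $# -gt 0 ]; then
    estimate_boot "$@"
fi
```

Real services take very different times to start, so treat the two totals as a rough guide to where the gains might be rather than as a prediction.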

Making a brave move

Understanding why the boot process is slow is one thing, but doing something about it is another. Before you start making changes, make sure that you've backed everything up -- or even better, use a test Linux box. There's no guarantee that you won't do something wrong, and a broken booting process can mean a broken Linux box.

If you're confident you know what you're doing, start by having a look at your /etc/init.d/rc script. At the end of the file you'll find something like the following (although I've simplified it greatly):

for i in /etc/rc$runlevel.d/S*
do
 case "$runlevel" in
  0|6) startup $i stop ;;
  *) startup $i start ;;
 esac
done

You can cause the boot processes to start in parallel by making a minor change to the code. Append & to the start command so that each script runs in the background:

*) startup $i start & ;;

After making this change, I ran my time check and got this result:

$ time time_booting hector

real    1m44.105s
user    0m0.613s
sys     0m0.674s

That's about 70 seconds faster than the sequential boot.

Unfortunately, this technique has a potential drawback. In the normal boot, everything starts and completes in an ordered queue. When every process starts in parallel, nothing guarantees that a service will not reach a critical point before a service it relies on has finished starting. The technique might work for you, but perhaps more from sheer luck than careful planning.
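One way to reduce that risk while staying with plain shell is to run only the scripts that share a number in parallel, and to wait for the whole group before moving on. The following is a minimal sketch of that idea, not the contents of any distribution's rc script. It assumes that scripts with the same number really are independent of each other, and the function name start_by_level is invented for this example.

```shell
#!/bin/sh
# Sketch: start every S* script that shares a priority number in parallel,
# then wait for the whole group before moving on to the next number.
# Assumption: scripts with the same number do not depend on each other.

start_by_level() {
    rc_dir=$1
    current=""
    for script in "$rc_dir"/S[0-9]*; do
        [ -x "$script" ] || continue
        name=${script##*/}
        # Priority = the digits that follow the leading S.
        level=$(printf '%s\n' "$name" | sed 's/^S\([0-9]*\).*/\1/')
        if [ -n "$current" ] && [ "$level" != "$current" ]; then
            wait    # the previous level must finish before this one starts
        fi
        current=$level
        "$script" start &
    done
    wait            # let the final level finish too
}

# Run only when a directory is given, for example: start_by_level /etc/rc2.d
if [ $# -gt 0 ]; then
    start_by_level "$1"
fi
```

This keeps the ordering between levels that the numbering scheme was designed to provide, at the cost of some of the speed gained by backgrounding everything at once. Try it on a test box first.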

Using cinit to ease the process

To avoid this potential problem, you need to modify the script so that it checks that any dependencies have been met before a service is fired off. You can simplify this process by installing cinit, a program designed to enable parallel booting of processes and handle application dependencies. However, cinit is not for the fainthearted or newbies. You can't simply swap from init to cinit -- you really need to know what you're doing. That said, if you do know what you're doing, then cinit can be a big help.

The basic idea behind cinit is simple -- gone is the directory containing an ordered set of links to the applications you need to run. Instead, cinit provides a set of directories (in /etc/cinit) -- one for each service to be run. Each directory contains a link (called on) that points to the application to be run. It also features two subdirectories: needs contains links to services that are critical to your new one, and wants contains links to the nonessential services you would like running before your new one. The on.params file contains (or can be a link to) parameters required by the service.
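The layout just described can be sketched with a few shell commands. This example builds the directories under a scratch location (the CINIT_ROOT variable is an assumption of this sketch, not a cinit setting) so that nothing under /etc is touched. The helper name make_service and the way dependency links are named are also illustrative choices; check the cinit documentation before building a real configuration.

```shell
#!/bin/sh
# Sketch: build one service directory in the layout described above.
# CINIT_ROOT defaults to a scratch directory; names are illustrative only.

make_service() {
    root=$1 service=$2 binary=$3 params=$4
    shift 4                       # remaining arguments: services this one needs
    dir=$root/$service
    mkdir -p "$dir/needs" "$dir/wants"
    rm -f "$dir/on"
    ln -s "$binary" "$dir/on"     # "on" points at the program to run
    if [ -n "$params" ]; then
        printf '%s\n' "$params" > "$dir/on.params"
    fi
    for dep in "$@"; do
        # One link per hard dependency, named after the service it points to.
        link=$dir/needs/$(printf '%s' "$dep" | tr / _)
        rm -f "$link"
        ln -s "$root/$dep" "$link"
    done
}

CINIT_ROOT=${CINIT_ROOT:-$(mktemp -d)}
make_service "$CINIT_ROOT" pcmcia/modprobe /sbin/modprobe pcmcia
make_service "$CINIT_ROOT" pcmcia/cardmgr  /sbin/cardmgr  "" pcmcia/modprobe
ls -l "$CINIT_ROOT/pcmcia/cardmgr" "$CINIT_ROOT/pcmcia/cardmgr/needs"
```

Unlike a hand-built configuration, this sketch always creates needs and wants, even when they stay empty.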

That's cinit in a nutshell. Let's look at an example:

$ ls -l /etc/cinit/pcmcia/modprobe/
lrwxrwxrwx  1 bainm bainm 14 2006-04-06 18:07 on -> /sbin/modprobe
-rw-r--r--  1 bainm bainm  6 2006-04-06 18:09 on.params
$ cat /etc/cinit/pcmcia/modprobe/on.params
pcmcia

You can see that the modprobe service has no dependencies. However, the next example (cardmgr) needs the modprobe service to be in place before it can run:

$ ls -l /etc/cinit/pcmcia/cardmgr/*
lrwxrwxrwx  1 bainm bainm   13 2006-04-06 18:20 pcmcia/cardmgr/on -> /sbin/cardmgr

pcmcia/cardmgr/needs:
total 0
lrwxrwxrwx  1 bainm bainm 20 2006-04-06 18:22 pcmcia_modules -> /etc/cinit/modprobe/
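To see why the needs links matter, here is a minimal sketch that walks them and prints services in an order where every dependency comes before the service that needs it. It only illustrates the idea of dependency ordering and says nothing about how cinit is implemented internally. The function name start_order is hypothetical, and the sketch assumes that service paths contain no spaces.

```shell
#!/bin/sh
# Sketch: print services so that each one's "needs" are listed first.
# Illustration only; this is not cinit's own algorithm.

visited=" "

start_order() {
    # Resolve the link so each service has one canonical name. Positional
    # parameters are per call, which keeps the recursion safe in plain sh.
    set -- "$(cd "$1" 2>/dev/null && pwd -P)"
    [ -n "$1" ] || return 1
    case "$visited" in *" $1 "*) return 0 ;; esac
    visited="$visited$1 "         # mark early so a cycle cannot loop forever
    for dep in "$1"/needs/*; do
        [ -d "$dep" ] && start_order "$dep"
    done
    printf '%s\n' "$1"            # every dependency has been printed by now
}

# Run only when a service directory is given as an argument.
if [ $# -gt 0 ]; then
    start_order "$1"
fi
```

Given a cardmgr directory whose needs subdirectory links to a modprobe directory, the function prints the modprobe path first and the cardmgr path second. Services that do not depend on one another could then be started in parallel.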

You can read more about configuring cinit on Nico Schottelius' Web site. The site also contains a number of sample configurations.

After two days of working with cinit, I managed to get this result:

$ time time_booting hector

real    2m03.003s
user    0m0.841s
sys     0m0.762s

Another product that uses parallel starting services is also worth checking out. It's still under development, and lacks things such as cron support, but it looks like it could become something that everyday users can use rather than just experts.

Depending on how well you understand Linux and how desperate you are, you can shave a good deal of time off your boot. You can go for a quick but potentially risky method, or you can put some time and effort into a more robust answer.
