September 8, 2005

Locating files in real-time with rlocate

Author: Carlos J. G. Duarte

A few months ago Joe Barr introduced locate and friends, and included a brief reference to rlocate. rlocate, by Rasto Levrinc, is based on slocate, which is an improvement on traditional locate, an old Unix command used to perform fast pathname searches. Besides adding a few convenience options, like the -i argument for case-insensitive search, slocate's main feature is secure path searching, which shows each user only the paths he has permission to access. rlocate inherits this behavior.

rlocate updates its path database in real time by using a kernel module that intercepts all path modifications and a daemon that logs those operations to a differences file. Combining the full path database with the differences file gives you instantly up-to-date information on the filesystem.
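
To make the idea of a main database plus a differences file concrete, here is a minimal sketch of how the two could be combined at query time. The file formats and the function name apply_diffs are assumptions made up for this illustration; rlocate's real on-disk formats are different.

```shell
#!/bin/sh
# apply_diffs MAIN DIFFS: print the paths in MAIN after replaying DIFFS.
# Assumed formats (not rlocate's real ones): MAIN has one path per line;
# DIFFS has "+ PATH" for a created path and "- PATH" for a deleted one.
apply_diffs() {
    awk '
        FILENAME == ARGV[1] { live[$0] = 1; next }   # load the main database
        /^\+ /  { live[substr($0, 3)] = 1; next }    # created since last update
        /^- /   { delete live[substr($0, 3)] }       # deleted since last update
        END     { for (p in live) print p }
    ' "$1" "$2" | sort
}
```

Piping the result through grep would then behave like a query that always sees the latest changes, at the cost of replaying the whole differences file on every search.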

Installation

rlocate runs only on Linux 2.6.x series kernels, and requires specific kernel flags under "Default Linux Capabilities." Therefore, you might need to recompile your kernel before you can run rlocate.

At the system level, rlocate requires:

  • Its own group, for security reasons. To add the proper group, run the command groupadd rlocate.
  • A kernel module rlocate.ko, which should reside at /lib/modules/2.6.x/misc/rlocate.ko. To load it, run the command modprobe rlocate.
  • A character device, through which the userspace daemon (which keeps rlocate's information constantly up to date) communicates with the kernel module. You can create the device with the command mknod /dev/rlocate c 253 0.
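
The three steps above can be collected into a small script. This is only a sketch: the DRY_RUN variable and the run helper are inventions of this example, and by default the script prints the commands instead of executing them. Run it as root with DRY_RUN=0 to apply the steps for real.

```shell
#!/bin/sh
# Setup sketch for rlocate. DRY_RUN defaults to 1, so nothing is changed
# unless the script is started with DRY_RUN=0 (as root).
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_rlocate() {
    run groupadd rlocate            # dedicated group
    run modprobe rlocate            # kernel module
    # Create the character device only if it is not there yet.
    if [ ! -e /dev/rlocate ]; then
        run mknod /dev/rlocate c 253 0
    fi
}

setup_rlocate
```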

The modprobe rlocate command creates an entry in the /proc filesystem with some statistics:

# cat /proc/rlocate
version: 0.3.1
excludedir: */proc**/dev*
activated: 1
startingpath: /
output: /usr/local/var/rlocate/rlocate.db
updatedb: 0
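
Because the entry is a plain list of key: value lines, a script can pick out a single field. The helper below is a sketch; the name rlocate_field is hypothetical, and it assumes the layout shown in the listing above.

```shell
#!/bin/sh
# rlocate_field KEY [FILE]: print the value of one "key: value" line.
# FILE defaults to /proc/rlocate; pass a saved copy to try it elsewhere.
rlocate_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }' "${2:-/proc/rlocate}"
}
```

For example, rlocate_field updatedb prints only the value of the updatedb line.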

At the utility level, there are two main binaries to care about. rlocated is the daemon that maintains a database of path differences based on filesystem activity. A difference is a line describing what happened to a given path: whether it was created, deleted, moved, and so on. The daemon should be launched somewhere in the boot scripts (preceded by a modprobe rlocate). rlocate is the user utility that does all the work: creating, updating, and querying the path database.

Updating the database

rlocate -u creates and updates the full path database. In theory it needs to be run only once. However, it's best to run it periodically, from a crontab job for example. When an update is run, the differences database is merged back into the main path database, which speeds up subsequent searches. If the update is never run, the differences database keeps growing. Because each search must process the main path database and then the differences database in order to obtain fresh information, the net result would be slower searches.

Here's an example of a real-life update operation:

# /usr/local/bin/rlocate -u -e /proc,/dev -f NFS,iso9660,smbfs,ncpfs

-u specifies that rlocate is running in update mode; without this argument the command defaults to query mode.
-e specifies the paths to exclude.
-f specifies the filesystems to exclude.
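
To run this update periodically, one option is an entry in root's crontab. The helper below is a sketch that only prints such an entry; the name cron_line and the 04:15 schedule are illustrative choices, not something rlocate ships.

```shell
#!/bin/sh
# cron_line HOUR MINUTE: print a crontab line that runs the update
# command from the example above once a day at the given time.
cron_line() {
    hour="$1"
    minute="$2"
    printf '%s %s * * * %s\n' "$minute" "$hour" \
        "/usr/local/bin/rlocate -u -e /proc,/dev -f NFS,iso9660,smbfs,ncpfs"
}

cron_line 4 15
```

The printed line can be pasted into root's crontab with crontab -e.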

There are in fact two update modes. A full update behaves like the traditional locate command, performing a full scan of the filesystem. A fast update just merges the differences database back into the main path database. Although either mode can be forced with a command-line flag (--full-update and --fast-update respectively), the default update operation automatically chooses the mode it will run in.

By default, out of every 10 rlocate -u operations, the first nine are fast and the tenth is a full one. More precisely, the full scan is done when the updatedb line of /proc/rlocate reaches 0 (see above).
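
The following simulation sketches how such a counter could alternate between the two modes. It does not call rlocate at all. The rule that a counter of 0 triggers a full scan comes from the description above; the reset value of 9 and the decrement after each fast update are assumptions of this example.

```shell
#!/bin/sh
# choose_mode COUNTER: a counter of 0 means a full scan, anything else
# means a fast update.
choose_mode() {
    if [ "$1" -eq 0 ]; then echo full; else echo fast; fi
}

# simulate_updates N: print the mode chosen for N consecutive updates.
# Assumption: the counter is reset to 9 after a full scan and decremented
# after each fast update; the real bookkeeping is done by rlocate itself.
simulate_updates() {
    counter=0
    i=0
    while [ "$i" -lt "$1" ]; do
        mode=$(choose_mode "$counter")
        echo "$mode"
        if [ "$mode" = full ]; then
            counter=9
        else
            counter=$((counter - 1))
        fi
        i=$((i + 1))
    done
}

simulate_updates 11
```

Starting from a counter of 0, the run begins with a full scan, continues with nine fast updates, and ends with another full scan.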

Using rlocate

Et voilà: now that it's properly installed, rlocate behaves exactly the same way as slocate or locate does, except that any modification you make to the filesystem is immediately taken into account. For instance:

# rlocate new-fresh-file
[nothing-- no matches]

# touch new-fresh-file
# rlocate new-fresh-file
/var/tmp/new-fresh-file

# rm -f /var/tmp/new-fresh-file
# rlocate new-fresh-file
[nothing-- no matches]

Since rlocate is based on slocate, it shares the same extras. Here's a regular expression query that finds paths whose extension is repeated:

# touch double-ext.xx.xx
# rlocate -r '\.\([^.][^.]*\)\.\1$' | fgrep xx
/var/tmp/double-ext.xx.xx
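
The pattern can be tried without rlocate by feeding a few sample paths to grep, which also accepts basic regular expressions with backreferences. This sketch assumes that rlocate -r interprets the pattern the same way, as the escaped parentheses suggest; the function name double_ext and the sample paths are made up for the demonstration.

```shell
#!/bin/sh
# double_ext: read paths on stdin and print those whose last two
# dot-separated components are identical (for example ".xx.xx").
double_ext() {
    grep '\.\([^.][^.]*\)\.\1$'
}

printf '%s\n' \
    /var/tmp/double-ext.xx.xx \
    /home/user/archive.tar.gz \
    /tmp/backup.bak.bak \
    /tmp/notes.txt | double_ext
```

Only the first and third sample paths are printed, because \1 must repeat exactly what the group matched and the $ anchors the repetition to the end of the path.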

While rlocate, at version 0.3.1, is still in beta release, I have found it to be stable and reliable.

rlocate is a big improvement over traditional locate software, because it solves the main weakness of the latter: the slow and heavy update process of the pathname database, which now happens in real time. Even though it requires kernel configuration and comprises a few more components than traditional locate, the extra complexity is well worth it for the real-time path searching capabilities rlocate offers.
