January 24, 2008

Use kfsmd to keep track of changes in your filesystems

Author: Ben Martin

Applications can ask the Linux kernel to report changes to selected files and directories. I created the Kernel Filesystem Monitoring Daemon (kfsmd) to make monitoring filesystem changes simple.

There are packages available for both 32- and 64-bit Fedora 7 and 8 and Ubuntu 7.10 Gutsy, as well as 32-bit packages for openSUSE 10.3. You can also download a repo file, which can be used with Fedora 8 and yum. Placing the repo file into /etc/yum.repos.d allows you to install kfsmd and its dependencies with yum install kfsmd on a Fedora 8 machine. You can also compile directly from source if that is your preference.

Command-line clients for kfsmd come in two categories: monitoring and logging. The monitoring client produces output on the console whenever something happens to a filesystem you are watching. The logging clients record those events in either a Berkeley DB4 file or a PostgreSQL database.

The following session shows simple directory monitoring with kfsmd. It creates and populates a temporary directory, /tmp/k, then starts kfsmd-cat (invoked by its full name, kernel-filesystem-monitor-daemon-cat) to watch that directory for any filesystem changes. The main command-line parameter is the watch command, which takes the directory or file to watch as a single argument. While kfsmd-cat was running, I opened a second terminal, created the df5.txt file, and then removed it. kfsmd reported these actions to the console.

$ mkdir /tmp/k
$ cd /tmp/k
$ date > df1.txt
$ date > df2.txt
$ kernel-filesystem-monitor-daemon-cat -v watch .
setting up watch for:.
setting up watches
calling run
event on wd:1 . filename:df5.txt
CLOSE URL:./df5.txt
event on wd:1 . filename:df5.txt
DELETE_FILE URL:./df5.txt

If you specify a directory to monitor with a full filesystem path, kfsmd also monitors existing and newly created subdirectories by default. You can use the ignorepfx argument to limit this recursive monitoring by explicitly telling kfsmd not to monitor some subdirectories. In the next example, which uses ignorepfx, I created two subdirectories inside /tmp/k: junk1 and subdir1. kfsmd reported the create and delete events for both directories, but because of the ignorepfx argument it did not monitor the /tmp/k/junk1 subdirectory itself, so files I created in that directory were not reported. Note that because ignorepfx is matched as a path prefix, specifying just /tmp/k/junk is enough to exclude the subdirectory junk1.

$ kernel-filesystem-monitor-daemon-cat -v \
watch /tmp/k ignorepfx /tmp/k/junk

event on wd:1 /tmp/k filename:junk1
CREATE URL:/tmp/k/junk1
event on wd:1 /tmp/k filename:subdir1
CREATE URL:/tmp/k/subdir1
should adding monitor for:subdir1
event on wd:2 /tmp/k/subdir1 filename:subfileA.txt
CREATE URL:/tmp/k/subdir1/subfileA.txt
event on wd:2 /tmp/k/subdir1 filename:subfileA.txt
CLOSE URL:/tmp/k/subdir1/subfileA.txt
event on wd:2 /tmp/k/subdir1 filename:subfileA.txt
DELETE URL:/tmp/k/subdir1/subfileA.txt
event on wd:1 /tmp/k filename:subdir1
DELETE URL:/tmp/k/subdir1
event on wd:1 /tmp/k filename:junk1
DELETE URL:/tmp/k/junk1
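
The prefix rule can be illustrated with a small shell sketch. It is not taken from kfsmd's source: the function name has_ignored_prefix is hypothetical, and it only demonstrates plain string-prefix matching, on the assumption that ignorepfx compares paths in this way.

```shell
#!/bin/sh
# Illustration only, not kfsmd's own code.
# Succeeds (exit status 0) when PATH_TO_CHECK starts with PREFIX.
# Usage: has_ignored_prefix PATH_TO_CHECK PREFIX
has_ignored_prefix() {
    path=$1
    prefix=$2
    case "$path" in
        "$prefix"*) return 0 ;;   # quoted prefix is matched literally
        *)          return 1 ;;
    esac
}

# The prefix /tmp/k/junk covers /tmp/k/junk1 but not /tmp/k/subdir1.
has_ignored_prefix /tmp/k/junk1 /tmp/k/junk && echo "junk1 is ignored"
has_ignored_prefix /tmp/k/subdir1 /tmp/k/junk || echo "subdir1 is monitored"
```

A prefix of this kind would also cover siblings such as /tmp/k/junkyard, so a longer prefix is the safer choice when directory names overlap.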

You can see that filesystem changes reported by kfsmd follow a regular format. The primary report line has the form "EVENT_TYPE URL:path", where the event type is what happened to the file and the URL: string directly precedes the file path being reported. This structure makes it convenient to pipe the output of the kfsmd-cat command into a script that performs some action for you automatically when files change.

The following script uses Perl to print the paths of files that are deleted in any monitored directory. The pwd command on the first line makes the paths reported by kfsmd absolute. The kfsmd-cat command produces output similar to that shown above, and the Perl code massages that output into a particular format; it could just as well execute a command whenever a deletion happens. The script ignores all lines that do not start with DELET. Lines that report a file deletion then have the prefix string "anything...URL:" stripped off, so only the file path is printed to the console.

$ kernel-filesystem-monitor-daemon-cat watch `pwd` \
| perl -ne '{ if( /^DELET/ ) { s/.*URL://g; print; } }'
/tmp/k/df5.txt
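
The same idea extends to every event type. The sketch below assumes only the "EVENT_TYPE URL:path" line format visible in the sessions above; the function name split_events is hypothetical. It turns each report line into a tab-separated record and drops the "event on wd:" and setup lines.

```shell
#!/bin/sh
# Read kfsmd-cat style output on stdin and print "EVENT_TYPE<TAB>path"
# for every line of the form "EVENT_TYPE URL:path". Other lines are dropped.
split_events() {
    awk '/^[A-Z_]+ URL:/ {
        type = $1                        # first word is the event type
        path = $0
        sub(/^[A-Z_]+ URL:/, "", path)   # keep everything after "URL:"
        printf "%s\t%s\n", type, path
    }'
}

# Possible use:
#   kernel-filesystem-monitor-daemon-cat watch "$(pwd)" | split_events
```

Because the path is taken from the whole line rather than from a whitespace-split field, file names containing spaces should come through intact.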

A second invocation of kfsmd-cat, shown below, sends an email message whenever a file is deleted in the current directory.

$ kernel-filesystem-monitor-daemon-cat watch `pwd` \
| kfsmd-sendemail.pl

The Perl script kfsmd-sendemail.pl is shown below. The three lines which you might have to change to use this yourself are listed at the top of the script; the FromAddress and ToAddress should be modified to suit your local environment.

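#!/usr/bin/perl -n
# Reads kfsmd-cat output on stdin; the -n switch runs this body once per line.

# Settings you may need to change for your environment.
$Mailer = "| /usr/sbin/sendmail -t";
$FromAddress = 'ben@localhost';
$ToAddress = 'ben@localhost';

# Act only on deletion reports such as "DELETE_FILE URL:./df5.txt".
if( /^DELETE_/ ) {
    s/.*URL://g;    # strip everything up to and including "URL:"
    chomp;
    $url=$_;
    $now=`date`;
    open( MAIL, $Mailer ) or die "Cannot start mailer: $!";
    print MAIL <<THE_EMAIL;
From: $FromAddress
To: $ToAddress
Subject: KFSMD: A file was deleted

The file: $url
Was deleted at $now
THE_EMAIL
    close MAIL;
}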

Logging with kfsmd

To log filesystem events into a PostgreSQL database, use the kfsmd-postgresql command. Before using this command you must set up the database with the postgresql-schema.sql script, as in the first command shown below. Note that the database setup command only needs to be run once. The kfsmd-postgresql daemon runs in the background by default. You can set up watches with it in the same manner as for the kfsmd-cat command, though you must also specify the database host and database name.

$ cat postgresql-schema.sql | psql -h myPostgreSQLServer
$ kernel-filesystem-monitor-daemon-postgresql \
-h myPostgreSQLServer \
-d kernel_filesystem_monitor_daemon_postgresql watch `pwd`

With the command above, filesystem changes are logged to the PostgreSQL database called kernel_filesystem_monitor_daemon_postgresql on the server myPostgreSQLServer. You can query the database using SQL as shown below.

$ psql -h myPostgreSQLServer
# \c kernel_filesystem_monitor_daemon_postgresql
# select * from dirs d, events e where d.pwd = e.pwd order by time;
 pwd |  url   | id | mask | pwd |            time            |   name
-----+--------+----+------+-----+----------------------------+----------
   1 | /tmp/k |  2 |  512 |   1 | 2008-01-03 13:26:39.913456 | df11.txt
   1 | /tmp/k |  1 |    8 |   1 | 2008-01-03 13:26:39.913456 | df11.txt
   1 | /tmp/k |  3 |    8 |   1 | 2008-01-03 13:26:41.917512 | df21.txt
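
The mask column is numeric. The values 8 and 512 shown here coincide with the Linux inotify bits IN_CLOSE_WRITE and IN_DELETE, so the sketch below decodes a mask on the assumption that kfsmd stores raw inotify event masks. The article does not confirm this, and the function name decode_mask is hypothetical.

```shell
#!/bin/sh
# Assumption: the mask column holds Linux inotify event bits.
# Print the names of the bits set in MASK, joined with "|".
# Usage: decode_mask MASK
decode_mask() {
    mask=$1
    names=
    for pair in 1:IN_ACCESS 2:IN_MODIFY 4:IN_ATTRIB 8:IN_CLOSE_WRITE \
                16:IN_CLOSE_NOWRITE 32:IN_OPEN 64:IN_MOVED_FROM \
                128:IN_MOVED_TO 256:IN_CREATE 512:IN_DELETE \
                1024:IN_DELETE_SELF 2048:IN_MOVE_SELF; do
        bit=${pair%%:*}     # numeric value before the colon
        name=${pair#*:}     # symbolic name after the colon
        if [ $(( mask & bit )) -ne 0 ]; then
            names="${names:+$names|}$name"
        fi
    done
    printf '%s\n' "${names:-UNKNOWN}"
}

# Possible use:
#   decode_mask 512
```

If that assumption holds, the first row above records a deletion of df11.txt and the other two rows record files being closed after writing.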

There are many situations where kfsmd is the right tool for the job. For example, if you are editing a file and wish to publish it to a remote server automatically each time it is saved, you can use kfsmd-cat and, when a CLOSE event is detected, execute a little script to rsync the file to the server. If you have a long-running task and wish to know when it has completed, just monitor for a filesystem change that occurs at the end of the process, such as when a file download or a build completes.
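
The publish-on-save idea might look like the following sketch. It assumes only the "CLOSE URL:path" line format shown earlier; the on_close and publish names and the remote destination are hypothetical placeholders.

```shell
#!/bin/sh
# Run the given command once for each "CLOSE URL:path" line on stdin,
# passing the path as the command's last argument.
on_close() {
    while IFS= read -r line; do
        case "$line" in
            "CLOSE URL:"*)
                # stdin is redirected so the command cannot swallow
                # the rest of the event stream
                "$@" "${line#CLOSE URL:}" </dev/null
                ;;
        esac
    done
}

# Possible use (hypothetical remote destination):
#   publish() { rsync -a "$1" user@example.org:public_html/; }
#   kernel-filesystem-monitor-daemon-cat watch "$(pwd)" | on_close publish
```

Redirecting stdin matters here because commands that go through ssh, as rsync often does, may otherwise read the remaining events from the pipe.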

The kfsmd-postgresql and kfsmd-stldb4 commands allow you to easily record filesystem changes in a database, which is great for auditing what happened and when.
