Automatically process new files with fsniper

Author: Ben Martin

fsniper lets you monitor specified directories and execute scripts on any new files that are created in them. Because fsniper uses inotify to monitor its directories, the actions you define are executed as soon as filesystem changes happen. This makes fsniper both more immediate than an hourly cron job and more efficient.

One obvious use for automatically processing files as they are placed in a directory is to classify the files that you download from the Web. In fact, this is the first example that the fsniper Web site gives.

Another use case presents itself in the context of security. Suppose you have a collection of media files that you share read-only on a file server. Because the share is read-only, users cannot add new files to it. With fsniper, you can also share an “upload” directory and have scripts automatically move uploaded files across to the read-only share. A misbehaving client machine could still write random files to the upload directory, but the scripts that fsniper calls give you control over what happens to them. For example, your move scripts might check that the file names are sane and that the file types are valid, accepting only valid JPEG files rather than arbitrary ones. At the very least, your move scripts might refuse to overwrite existing files, or check the old version of a file into a revision control system before overwriting it.
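The checks described above could be sketched as a small handler script. This is only an illustration under stated assumptions: the destination directory /srv/media, the DEST_DIR variable, and the function names are all hypothetical, and the JPEG test relies on the file command's MIME detection.

```shell
#!/bin/sh
# Hypothetical move script: accept only sanely named JPEG files and never
# overwrite anything in the destination. DEST_DIR is an assumed location.
DEST_DIR="${DEST_DIR:-/srv/media}"

# Succeeds only for plain names: no leading dot or dash, and nothing
# outside letters, digits, dot, underscore, and dash.
is_sane_name() {
    case "$1" in
        ''|.*|-*) return 1 ;;
        *[!A-Za-z0-9._-]*) return 1 ;;
    esac
    return 0
}

# Succeeds if file(1) reports the content as JPEG.
is_jpeg() {
    [ "$(file -b --mime-type -- "$1")" = "image/jpeg" ]
}

# Moves file $1 into directory $2 unless a file of that name already exists.
move_no_clobber() {
    dest="$2/$(basename -- "$1")"
    if [ -e "$dest" ]; then
        return 1
    fi
    mv -- "$1" "$dest"
}

handle_upload() {
    is_sane_name "$(basename -- "$1")" || return 1
    is_jpeg "$1" || return 1
    move_no_clobber "$1" "$DEST_DIR"
}

# Run only when a path is passed as an argument, as fsniper does.
if [ "$#" -gt 0 ]; then
    handle_upload "$1"
fi
```

The existence check followed by mv is not atomic, so a production script would need a stricter approach if several uploads can race for the same name.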

As an added bonus, because fsniper uses a script that you create to perform the file move, you can change the file protection, log the new file to a database, send an email, or update an RSS newsfeed so that interested parties know about the new file right away. The scripts used by fsniper can also use sudo and other features to execute actions on the files that the users themselves cannot perform, such as changing the owner and group of a file before moving it.

fsniper is not packaged for openSUSE 11, Ubuntu Hardy, or Fedora 9, but you can build it from source using the normal ./configure && make && sudo make install process. fsniper uses libmagic to determine what sort of data is in a file, so you’ll need to first install the file-devel package if you’re running Fedora 9 or openSUSE 11, or libmagic-dev for Hardy.

If you execute fsniper at this stage you will get an error message telling you that the configuration file ~/.config/fsniper/config was not found, and fsniper will exit. The fsniper tarball includes an example configuration file, example.conf, to help you get up and running. The format of the configuration file is nested scopes delimited by {} characters. The outermost scopes define watches. The first scope inside a watch defines the directory to watch, and the innermost scope defines match patterns that are executed whenever something changes in the directory that the watch is observing.

An example like the one below, from the example.conf file, may help visualize the pattern. First a watch scope is defined, and the ~/drop scope gives the path that this watch will observe. Inside the ~/drop scope, the image/* scope will be executed on files that have been written to and closed in the ~/drop directory and that have a MIME type starting with image/. To make things clearer, I’ll refer to the outermost scope as the watch scope, the next one in, which defines the directory path, as the directory scope, and the innermost as the match scope.

watch {
    # watch the ~/drop directory for new files
    ~/drop {
        # matches any mimetype beginning with image/
        image/* {
            # %% is replaced with the filename of the new file
            handler = echo found an image: %%
        }
        ...
    }
}

The directory scope is not recursive by default, so if you created a new image file inside ~/drop/foo, the above match scope would not run on it. You can change this by setting recurse = true as the first setting for a directory scope.
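As a rough sketch of where the recurse setting goes, the shell function below writes a minimal configuration with recurse = true as the first line of the directory scope. The function name write_recursive_config is made up for this example, and the configuration carries no comments because of the comment-parsing problems described later in this article.

```shell
#!/bin/sh
# Writes a minimal fsniper configuration to the path given as $1.
# The directory scope (~/drop) starts with recurse = true, so files
# created in subdirectories such as ~/drop/foo are matched as well.
write_recursive_config() {
    cat > "$1" <<'EOF'
watch {
    ~/drop {
        recurse = true
        image/* {
            handler = echo found an image: %%
        }
    }
}
EOF
}
```

Calling write_recursive_config ~/.config/fsniper/config would replace any existing configuration, so try it on a scratch path first.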

The handler setting in a match scope defines what command to execute if the match scope succeeds. In the above example it displays the path on the terminal that you started fsniper in if any new images are found. fsniper replaces the %% variable with the path to the file that the watch found. You can also use %f if you just want the file name, and %d if you only want the path of the directory that contains the file. If your handler line does not include any of these special path characters, fsniper will append the path to the file as the final argument to whatever command you specify.
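To make the three placeholders concrete, the following sketch prints what each one would stand for, given a sample path. It only mimics the description above using basename and dirname; it is not fsniper's own code, and details such as whether %d carries a trailing slash are assumptions.

```shell
#!/bin/sh
# Prints the values that %%, %f, and %d would stand for, given the full
# path of a new file in $1. Purely illustrative.
show_placeholders() {
    path=$1
    printf '%%%% -> %s\n' "$path"
    printf '%%f -> %s\n' "$(basename -- "$path")"
    printf '%%d -> %s\n' "$(dirname -- "$path")"
}

show_placeholders /home/user/drop/photo.jpg
# %% -> /home/user/drop/photo.jpg
# %f -> photo.jpg
# %d -> /home/user/drop
```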

While the handler defined in the above match scope is a command to execute, you might like to use multiple commands in scripts and call them from your match scopes. This separates the functionality from the configuration and allows you to define more complex commands or use Perl scripts. fsniper prepends ~/.config/fsniper/scripts to your $PATH before executing the match scope, so you can use simple script names and not bother specifying the full path for each custom script you use, as long as you add new scripts to ~/.config/fsniper/scripts.
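One way to set this up is sketched below. The script name log-image, the FSNIPER_LOG variable, and the default log location are all invented for the example; only the scripts directory comes from fsniper itself.

```shell
#!/bin/sh
# Installs a hypothetical handler script called log-image into the
# directory given as $1 and marks it executable.
install_log_image() {
    scripts_dir=$1
    mkdir -p "$scripts_dir"
    cat > "$scripts_dir/log-image" <<'EOF'
#!/bin/sh
# Appends a timestamped line for the file passed as the first argument.
# The log location is an assumption; override it with FSNIPER_LOG.
printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" \
    >> "${FSNIPER_LOG:-$HOME/fsniper-images.log}"
EOF
    chmod +x "$scripts_dir/log-image"
}
```

After running install_log_image ~/.config/fsniper/scripts, a match scope could refer to the script by name alone, for example handler = log-image %%.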

You might wonder why you define the command inside a handler setting instead of directly inside the match scope itself. It’s because fsniper allows you to chain multiple handlers together for a match scope. If the return code of a handler is zero, then fsniper assumes that the handler has fully taken care of the file and does not execute any subsequent handlers inside the same match scope. If the return code is 2, fsniper assumes that the handler wants to handle the event but cannot do so right now. This is handy in cases dealing with networked computers, because the handler might need to contact another machine that is temporarily down. If the handler returns a value other than 0 or 2, fsniper executes the next handler in the match scope. This way you can have a script that is very selective about the file it wants to handle, and if the file is not right the script can just return 1 to have fsniper use the next handler for this file.
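The return-code convention might look like this in a handler script. Everything specific here is an assumption made for illustration: the host name backup.example.com, the incoming/ target directory, the choice of ping and scp, and the function names.

```shell
#!/bin/sh
# Hypothetical handler illustrating the three outcomes described above.
REMOTE_HOST="${REMOTE_HOST:-backup.example.com}"   # assumed host name

# This handler is selective: it only wants .jpg and .jpeg files.
wants_file() {
    case "$1" in
        *.jpg|*.jpeg) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if the remote machine answers one ping within two seconds.
remote_is_up() {
    ping -c 1 -W 2 "$REMOTE_HOST" >/dev/null 2>&1
}

# Copies the file to the remote machine; scp is just one possible choice.
send_file() {
    scp -q "$1" "$REMOTE_HOST:incoming/"
}

handle_event() {
    wants_file "$1" || return 1   # not ours: fsniper moves to the next handler
    remote_is_up    || return 2   # ours, but the remote host is down right now
    send_file "$1"  || return 2   # the transfer failed: cannot handle it now
    return 0                      # fully handled: later handlers are skipped
}

# Run only when a path is passed as an argument, as fsniper does.
if [ "$#" -gt 0 ]; then
    handle_event "$1"
    exit $?
fi
```

Keeping the selection test first means that files this handler does not care about fall through to the next handler without any network traffic.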

The order in which you specify your handler lines in a match scope makes a difference, because the return code of any handler affects whether the next one is executed. The match scopes are also executed top to bottom in the order you specify them. Only one match scope is executed for an event, so it is best to place the most specific match scopes at the top of your directory scope and the most generic ones at the bottom.
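A configuration that follows this advice might be laid out as in the sketch below. It assumes that a full MIME type such as image/jpeg is accepted as a match scope and that archive-jpeg is a script of your own in the scripts directory; neither is taken from the fsniper documentation. Comments are left out of the configuration itself.

```shell
#!/bin/sh
# Writes an example configuration to the path given as $1. The specific
# image/jpeg scope comes before the generic image/* scope, and it chains
# two handlers: the second runs only if archive-jpeg (a hypothetical
# script) returns something other than 0 or 2.
write_ordered_config() {
    cat > "$1" <<'EOF'
watch {
    ~/drop {
        image/jpeg {
            handler = archive-jpeg %%
            handler = echo unhandled jpeg: %%
        }
        image/* {
            handler = echo found an image: %%
        }
    }
}
EOF
}
```

With this layout a new JPEG is offered only to the image/jpeg scope, while other image types fall through to image/*.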

The doc/doc.txt file includes information about using fsniper with a Firefox download directory. The complication is that Firefox creates both a zero-byte file and a file.part file which is used to contain the data as it is downloaded. You have to set up fsniper so that your script will be executed only when the file.part is renamed to file after the download is complete.

As a word of warning, I found that comments inside match scopes caused fsniper to crash on startup, and comments above match scopes caused fsniper not to execute the directory scope correctly. I tried this with tab-indented and non-indented comments; the only way to get the correct behavior was to strip out the comments completely.

With fsniper you can set up automatic file processing without any delays. As an added bonus, these processes can be executed with greater privileges than those the user who writes the files has. For example, you could create a special user on a file server that has the ability to change file ownership. In any event, file processing becomes a matter of following the rules once you have set up your fsniper directories and scripts.
