Author: Ben Martin
Libferris is a virtual filesystem with index and search capabilities that allows you to geotag files. One reason the name libferris was chosen is that “ferris” sounds similar to VFS when said as “vee-fsss.” A major advantage of using libferris to geotag is that you can tag anything that the virtual filesystem knows about. Examples of things that can be mounted and geotagged include email messages in Evolution, Web sites, file:// URLs, a tuple in a PostgreSQL database, and an arbitrary triple in an RDF store. Libferris handles mounting automatically; for instance, when you access an evolution:// URL, libferris will contact the Evolution data server and mount your mail.
Different applications refer to geotags by different names. In Google Earth they are called placemarks; in libferris they are special emblems. In this article I use the three terms placemark, emblem, and geotag interchangeably.
Setting up geotags
The easiest way to set up your geotags in libferris is to use Google Earth. Open Google Earth and create placemarks at locations of interest. As an ongoing example, I’ll assume we are tagging files with two locations: the Zwinger Palace and Albert Platz in Dresden, Germany.
Libferris provides a special Filesystem in User Space (FUSE) mountpoint designed for Google Earth integration. To use it, begin by running the ferris-mount-etagere-as-kml.sh script, which will create a virtual file in the ~/ferrisfuse/etagere-as-kml directory. To export your placemarks from Google Earth into libferris, right-click on “My Places” in Google Earth and save over the virtual KML file in ~/ferrisfuse/etagere-as-kml. If you add more placemarks later on, you can save over this file again and libferris will import your new locations, leaving your existing geotags intact.
There are many ways to geotag your files with libferris. The fmedallion command line tool can associate your geotags with files. The ego file manager presents your geotags in a side panel and includes a type buffer to allow tagging multiple files in one action.
The following shell commands tag two files, zwinger_1.jpg and dresden_albert_platz_1.jpg. The file names are not important once the files are geotagged. Note that because geotags are implemented as special emblems, we use the --emblem-name option to identify them.
$ fmedallion --add --emblem-name "Albert Platz" dresden_albert_platz_1.jpg
$ fmedallion --add --emblem-name "Zwinger" zwinger_1.jpg
$ fmedallion --list zwinger_1.jpg
Zwinger
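If you have many files for one location, the same fmedallion invocation can be wrapped in a loop. The following is a minimal sketch, not part of libferris: the tag_files function and the FMEDALLION override variable are hypothetical names introduced here, and the only libferris options used are the --add and --emblem-name options shown above.

```shell
# Hypothetical helper (not shipped with libferris): apply one geotag
# emblem to every file given on the command line.
# FMEDALLION may be set to another command, such as "echo", for a dry run.
FMEDALLION="${FMEDALLION:-fmedallion}"

tag_files() {
    emblem="$1"
    shift
    for f in "$@"; do
        # Same invocation as in the article, one file per call.
        "$FMEDALLION" --add --emblem-name "$emblem" "$f"
    done
}

# Example (file names are placeholders):
#   tag_files "Zwinger" zwinger_1.jpg zwinger_2.jpg
```

Setting FMEDALLION=echo before calling tag_files prints the arguments each call would receive instead of tagging anything, which is a cheap way to check a batch before committing to it.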
Figure: Ego tagging
The figure shows geotagging with ego. The geotag emblems are located under the “libferris/libferris-geospatial” emblem in the side panel. In ego you can enter a special type buffer mode using the Scheme code (hk-start-interactive-emblem), which will complete your emblem name when you press Tab. All your emblem names are potential candidates for tab completion. For example, you are likely to have only one emblem starting with “Zw,” so typing Zw and then Tab would complete to Zwinger.
The type buffer mode also supports disambiguation of geotags. For example, a typical arrangement of geotags is to create a hierarchy in Google Earth, with a top level for countries, each of which contains states or cities as subfolders. It might follow then that the Berlin geotag contains a Hauptbahnhof (central railway station) geotag, and the Dresden geotag also has a Hauptbahnhof geotag. The placemark hierarchy might look like:
Germany
-- Berlin
-- -- Hauptbahnhof
-- Dresden
-- -- Hauptbahnhof
Many cities have a central station, and once you have arranged your placemarks into a hierarchy you will want to call each station placemark simply “Hauptbahnhof,” because its parent placemark (Dresden, for example) already indicates which city it belongs to. This arrangement leads to many geotags having the same name. Using type buffer mode in the above example, typing Haupt and then Tab would display the options “Hauptbahnhof (Berlin)” and “Hauptbahnhof (Dresden),” allowing you to explicitly select the station you want.
Index and search
Libferris handles its indexing using a plugin system. Different plugins provide indexing with different flexibility and performance tradeoffs. For example, a plugin might sacrifice some performance in order to better support dynamic updates to the index. As of libferris version 1.1.97 there are index plugins for PostgreSQL, Lucene, LDAP, ODBC, Redland (RDF), Xapian, Beagle, and Yahoo!, as well as federations and arbitrary external programs. A federation allows a group of indexes to be used together to resolve a query.
To get started with indexing and libferris, I’ll assume you’re using the PostgreSQL index plugin. There are two types of indexes that libferris can maintain: full text and metadata. For historical reasons, metadata indexes are also referred to as extended attribute (EA) indexes. The following instructions will set up both indexes in a single PostgreSQL database and link the two together within libferris. By linking the index types together, you can perform full text and metadata searches in a single query.
To begin, you must run the ferris-setup-template-findex-database.sh script as the database administrator to set up some template databases. Then, as a normal user, run the ferris-recreate-primary-fulltext-and-eaindex-as-postgresql.sh script, giving the name of the database in PostgreSQL you wish to use for your libferris index as the first and only argument. The script will warn you that the named database will be dropped and recreated and ask for a confirmation.
The template databases are required because libferris takes advantage of PL/pgSQL and Generalized Search Trees (GiST). Adding these to a database requires administrator access. Once a template is set up with them, normal users can create new databases from that template.
Tell libferris to immediately index the two example files with the following command:
$ feaindexadd zwinger_1.jpg dresden_albert_platz_1.jpg
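To index a whole directory of photos rather than naming each file, the command above can be combined with find. This is only a sketch: the index_images function and the FEAINDEXADD override variable are hypothetical, and it assumes that feaindexadd accepts any number of file arguments in one call, which the article shows only for two files.

```shell
# Hypothetical helper: add every .jpg file under a directory to the index.
# FEAINDEXADD may be set to another command, such as "echo", for a dry run.
FEAINDEXADD="${FEAINDEXADD:-feaindexadd}"

index_images() {
    dir="$1"
    # "-exec ... {} +" passes many paths to a single invocation,
    # mirroring the multi-file call shown in the article.
    find "$dir" -type f -name '*.jpg' -exec "$FEAINDEXADD" {} +
}

# Example (path is a placeholder):
#   index_images /tmp/geotag
```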
You can now retrieve files using their geotags. The first command below will find all your files that are tagged with “Zwinger.” The second command will find files that are geotagged with any placemarks within 50 kilometers of the Zwinger.
$ feaindexquery "(emblem:has-Zwinger==1)"
Found 1 matches at the following locations:
file:///tmp/geotag/zwinger_1.jpg
$ feaindexquery "(ferris-near-Zwinger<=50km)"
Found 2 matches at the following locations:
file:///tmp/geotag/dresden_albert_platz_1.jpg
file:///tmp/geotag/zwinger_1.jpg
$ ferrisls -lh "eaq://(emblem:has-Zwinger==1)"
-rw-r----- ben ben 838.1k 06 Aug 19 02:02 file:///tmp/geotag/zwinger_1.jpg
The basic search syntax is of the form metadata name, comparison operation, and value, all within parentheses. Since libferris implements geotags using its emblem system, searching for files with a geotag foo uses the emblem:has-foo metadata name. The “ferris-” prefix is specifically reserved to allow embedded full-text searching and other special interpretations. In the case of the second query, the “ferris-near-” prefix is used to name a geotag to find files whose tags are in the vicinity of the specified location. The last command shown uses the eaq:// virtual filesystem to create a virtual directory with the results of the query.
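Because the query strings follow a fixed pattern, small shell functions can build them for you. The function names below are hypothetical and exist only for illustration; the strings they produce are exactly the forms used in the commands above. The sketch assumes emblem names without spaces, since the article does not show how a name such as “Albert Platz” should be written inside a query.

```shell
# Hypothetical helpers that print query strings in the forms shown above.

# "(emblem:has-NAME==1)": files tagged with the geotag NAME.
has_geotag_query() {
    printf '(emblem:has-%s==1)\n' "$1"
}

# "(ferris-near-NAME<=DISTkm)": files tagged within DIST km of NAME.
near_geotag_query() {
    printf '(ferris-near-%s<=%skm)\n' "$1" "$2"
}

# "eaq://QUERY": the virtual directory holding the results of QUERY.
eaq_url() {
    printf 'eaq://%s\n' "$1"
}

# Examples:
#   feaindexquery "$(near_geotag_query Zwinger 50)"
#   ferrisls -lh "$(eaq_url "$(has_geotag_query Zwinger)")"
```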
Exposing the results of a search as a virtual filesystem allows you to perform searches with programs that do not specifically support them. For example, you can use the above eaq://(emblem:has-Zwinger==1) filesystem with ferriscp to copy the results of a search to a DVD. The eaq:// filesystem does not support creating new files, because files in that filesystem must match the query, which is not easy to guarantee for newly created files. Consider an eaq:// filesystem that has tuples from PostgreSQL and email messages from Evolution in it; what would it mean to create “a new file” in this directory? This read-only distinction is the only major one between an eaq:// filesystem and any other filesystem in libferris.
By default, the distance units are kilometers, so you can omit the km unit. Libferris 1.1.98 will support the miles unit name as well; for example, (ferris-near-Zwinger<=50miles). You can also use the “d” unit, which interprets the distance as a digital (decimal-degree) longitude delta. The main use for the “d” unit is when you have a distance as a decimal-degree longitude or latitude delta from a mapping program. One degree of latitude is about 111 km; one degree of longitude is about 111 km at the equator and shrinks toward the poles.
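The unit arithmetic is easy to script. The sketch below uses the article's figure of roughly 111 km per degree, which is a good approximation for latitude but overstates the length of a degree of longitude away from the equator, and the standard 1.609344 km per mile. Both function names are hypothetical.

```shell
# Hypothetical helpers for converting between the distance units above.

# Kilometers to degrees, using the approximation 1 degree = 111 km.
km_to_degrees() {
    LC_ALL=C awk -v km="$1" 'BEGIN { printf "%.3f\n", km / 111 }'
}

# Miles to kilometers (1 mile = 1.609344 km), useful with libferris
# versions that do not yet accept the miles unit.
miles_to_km() {
    LC_ALL=C awk -v mi="$1" 'BEGIN { printf "%.1f\n", mi * 1.609344 }'
}

# Examples:
#   km_to_degrees 50    prints 0.450
#   miles_to_km 50      prints 80.5
```

For instance, a 50 mile radius could be written as "(ferris-near-Zwinger<=$(miles_to_km 50))", relying on kilometers being the default unit.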
Queries for geotags can be mixed with other query types. For example, you can query for files that changed within a given date range and have a given geotag, or files that match a full text search and are located within 100 km of a given geotag. The “&” prefix can be thought of as combining all the following subqueries using a set intersection.
$ feaindexquery "(&(mtime>=last month)(emblem:has-foo==1))"
$ feaindexquery "(&(ferris-fulltext-search==ticket opera)(ferris-near-Sydney<=100))"
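Since the “&” form is just the subqueries written one after another inside an outer pair of parentheses, combining conditions can be scripted. The and_query function below is a hypothetical helper, not a libferris tool; it only concatenates strings and does not validate the subqueries.

```shell
# Hypothetical helper: wrap any number of parenthesized subqueries in
# a single "(&...)" query, so results must match all of them.
and_query() {
    q='(&'
    for sub in "$@"; do
        q="$q$sub"
    done
    printf '%s)\n' "$q"
}

# Example:
#   feaindexquery "$(and_query '(mtime>=last month)' '(emblem:has-foo==1)')"
```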
The figure shows ego displaying the results of a geotag search as a directory. Right-clicking on any file can run a geospatial query, in this case for image files within 20 kilometers of the Zwinger.
Showing and searching for geotagged files in Google Earth
To open a geotagged file in Google Earth, use the ferris-open-google-earth-for-context.pl Perl script, supplying the file name as the first argument. Once Google Earth opens, it will be centered on a placemark for the file you opened. Clicking on the placemark will allow you to perform a search using libferris for files, images, and videos either with that geotag or with any geotag within a given distance of the current file.
The interface for applications to interact with Google Earth is very coarse. The above-mentioned ability to start a filesystem search from Google Earth makes use of a special browser script for Google Earth.
Libferris ships with a libferris-googleearth-bouncer.pl script, which interprets special URLs as filesystem searches. Such URLs begin with the prefix file:///LIBFERRIS-SEARCH and trigger a command to be sent to the ego file manager. To allow users to customize how libferris searches are performed, this script is executed from ~/bin, which means that you should copy it from /usr/local/bin to ~/bin.
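The copy step can be done by hand or with a few lines of shell. The install_bouncer function below is a hypothetical convenience wrapper; its default paths are the ones named above, and it additionally makes the copy executable for your user, which is an assumption about what Google Earth needs in order to run it.

```shell
# Hypothetical helper: copy the bouncer script into a per-user bin
# directory. Defaults match the paths named in the article.
install_bouncer() {
    src="${1:-/usr/local/bin/libferris-googleearth-bouncer.pl}"
    dest_dir="${2:-$HOME/bin}"
    mkdir -p "$dest_dir" &&
    cp "$src" "$dest_dir/" &&
    # Assumption: the copy must be executable to be launched.
    chmod u+x "$dest_dir/$(basename "$src")"
}

# Example:
#   install_bouncer
```

Once the copy is in ~/bin, you can edit it to change how searches are carried out without touching the system-wide version.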
There is a screen animation at the top of the eye candy section of the libferris Web site that shows how to search for files with Google Earth and libferris.
For more information on indexing with libferris, see my Linux Journal article on indexing. For more information on things that libferris can mount and its command line tools, see a second Linux Journal article. Happy geotagging!