July 19, 2004

Wrangle digital photos with imgSeek

Author: Nathan Willis

Back in the Bad Old Days we kept our photos crammed into shoeboxes in the closet, to be pulled out once every few years for a halfhearted attempt at assembling an album. With the onset of the digital era, that should be a thing of the past, right? Yet most of us have simply replaced the shoeboxes with overcrowded folders on our PCs, and because our digital cameras tend to slap on unhelpful names like DCS00032.JPG, we still have to browse through them all manually to find the ones that interest us. But one particularly good open source program, imgSeek, can help you get organized.

You can use imgSeek simply to browse through all the images on your system, but it's when you add images to the program's internal database that the fun really begins. ImgSeek allows you to search through your images based on their content and appearance.

imgSeek is available as a source tarball and as RPM and .deb binary packages. It requires that you have Qt installed.

Before you start imgSeek, move all of your images into one directory. Some people advise building a nested system of directories to keep different collections of photos together, but that just duplicates the work that the software is there to do for you. Programs like imgSeek are meant to provide a layer of abstraction above the actual files, in much the same way that music library programs do for MP3 collections. In the long run, you'll be better served by keeping everything in one location, and you'll enjoy the advantage of having the system alert you to duplicate files.

I also recommend sticking with a simple roll-number-frame-number scheme for file names, and leaving all the sorting and categorizing to the computer. You can use the roll number assigned by the photo lab for all images scanned from film, or just start at 0000001 and assign roll numbers yourself. Get into the habit now, and it will become second nature. ImgSeek has an excellent bulk-renaming wizard under the Tools menu, so you can import your photos first, and then straighten out the names.
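If you'd rather script the renaming than use the wizard, the roll-number-frame-number idea can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not imgSeek's own wizard: the function names, the seven-digit roll and four-digit frame widths, and the hyphen separator are all assumptions of mine.

```python
from pathlib import Path


def roll_frame_names(filenames, roll, start_frame=1):
    """Map each original name to a roll-frame name such as 0000001-0001.jpg.

    The digit widths and the hyphen are illustrative choices, not
    something imgSeek prescribes. Files are numbered in sorted order.
    """
    plan = {}
    for frame, name in enumerate(sorted(filenames), start=start_frame):
        suffix = Path(name).suffix.lower()
        plan[name] = f"{roll:07d}-{frame:04d}{suffix}"
    return plan


def apply_plan(directory, plan):
    """Rename files on disk, refusing to overwrite an existing file."""
    directory = Path(directory)
    for old, new in plan.items():
        target = directory / new
        if target.exists():
            raise FileExistsError(target)
        (directory / old).rename(target)
```

Building the plan separately from applying it lets you inspect the mapping before anything on disk changes.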

Figure: Looking through multiple volumes at once

So, you've gathered all your pictures into one place, and given the file names the once-over; it's time to fire up imgSeek. The first thing you'll see is a set of four tabs: Browse, Add, Search, and Options. Browse, as you might expect, lets you scroll through thumbnails of your pictures. Options is similarly self-explanatory. From the Add tab, you can specify the directory to look under and import your images in bulk, complete with a range of options to help you grab just a few, if you desire.

imgSeek allows you to maintain multiple distinct "volumes" of images, and each volume can have multiple subgroups. When adding images to a volume, click the checkbox to make imgSeek automatically extract image metadata, because that's what you'll be searching on when it comes time to find a photograph. imgSeek keeps track of metadata internally, which means that you can keep your images anywhere, including on a network share that is also accessed by Windows PCs and Macs.

Getting your pictures into imgSeek is no challenge, but of course the real test of an image-management program is how easily it lets you find one. For this we'll visit the Search tab.

Never metadata I didn't like

The Search tab lets you look for images in several fashions -- by content, keyword, and group. The Keyword option lets you search through the images' metadata.

There are two distinct types of metadata tags commonly found in image files. You're probably familiar with Exchangeable Image File Format (EXIF), since many digital cameras and scanners automatically tag images with it. EXIF tags hold data like file creation date, colorspace, shutter speed, aperture, and horizontal and vertical dimensions -- all of which are attributes that can be determined automatically by a hardware device.

What's more interesting to us humans, however, is the information for photographer credit, copyright reference, location information, caption, and an extensible set of keywords. EXIF tags only allow you a single "comment" field to hold all of this semantic information. The International Press Telecommunications Council's (IPTC) Information Interchange Model (IIM) does a better job of holding these tags. Someone has to assign this metadata, but that's a worthwhile exercise: If you're using this software at the office, your boss is a lot more likely to say "pull up all of our photos of Tony in Madrid and let me pick one out for this report" than he is to ask for all of your 2360x3678 photos taken at 1/250 second.

Figure: imgSeek reads and indexes both EXIF and IPTC metadata tags

imgSeek imports both kinds of tags into its internal database, allowing you to build compound queries on the metadata. Currently, imgSeek is one of only a handful of Linux desktop apps that can read IPTC metadata; if you know of another program that can read, or more importantly write, IPTC data, please drop me a line and tell me about it. For now, imgSeek does not write new tags out to the image files; it just indexes the information they already contain. Program author Ricardo Niederberger Cabral says tag writing is on his to-do list for future releases.
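To make "compound query" concrete, here is a rough Python sketch of the idea: several conditions, some on EXIF-style fields and some on IPTC-style keywords, that must all hold at once. It says nothing about how imgSeek stores or queries its database; the index layout, the field names (`keywords`, `width`, `exposure`), and both functions are hypothetical.

```python
def matches(record, **criteria):
    """True if the record satisfies every criterion.

    A criterion may be a plain value (tested for equality) or a callable
    predicate. The hypothetical "keywords" field is treated as a set: the
    record must carry every requested keyword.
    """
    for field, wanted in criteria.items():
        value = record.get(field)
        if field == "keywords":
            if not set(wanted) <= set(value or ()):
                return False
        elif callable(wanted):
            if value is None or not wanted(value):
                return False
        elif value != wanted:
            return False
    return True


def search(index, **criteria):
    """Return the sorted names of all images whose metadata matches."""
    return sorted(name for name, rec in index.items() if matches(rec, **criteria))


# A made-up index mixing IPTC-style keywords with EXIF-style attributes.
index = {
    "0000001-0001.jpg": {"keywords": ["Tony", "Madrid"], "width": 2360, "exposure": 1 / 250},
    "0000001-0002.jpg": {"keywords": ["Tony", "Lisbon"], "width": 2360, "exposure": 1 / 60},
    "0000002-0001.jpg": {"keywords": ["Madrid"], "width": 1024, "exposure": 1 / 250},
}

tony_in_madrid = search(index, keywords=["Tony", "Madrid"])
wide_and_fast = search(index, width=lambda w: w >= 2000, exposure=1 / 250)
```

The boss's request from earlier ("photos of Tony in Madrid") is the first query; the second is the less likely one about dimensions and shutter speed.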

Tell me what I'm looking at

What if you don't remember the location or the date of the picture that you're looking for, but you can picture it clear as day? imgSeek can help you.

When imgSeek added your pictures to its collection, it extracted the metadata tags and cached thumbnails for easy viewing, but it also calculated the wavelet transform of the image, which is sort of a hash of the visual information in the picture. And that gives imgSeek the unique ability to search through files based on their appearance.
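For the curious, here is a small Python sketch of what "a hash of the visual information" can look like. I'm assuming the Haar wavelet, the simplest one, and a tiny grayscale grid whose sides are powers of two; this is an illustration of the general technique, not imgSeek's actual code, and the `signature` helper with its `keep` parameter is my own invention.

```python
def haar_1d(values):
    """Full 1D Haar decomposition of a list whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details], using
    plain (unnormalized) averaging and differencing for readability.
    """
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        averages = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        details = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = averages + details
        n = half
    return out


def haar_2d(pixels):
    """Standard 2D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in pixels]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(coeffs, keep=3):
    """Reduce a coefficient grid to (average, {position: sign}).

    Only the `keep` largest-magnitude detail coefficients are retained,
    and only their signs, which is what makes the result hash-like.
    """
    details = [
        (abs(v), r, c, v)
        for r, row in enumerate(coeffs)
        for c, v in enumerate(row)
        if (r, c) != (0, 0)
    ]
    details.sort(reverse=True)
    top = {(r, c): (1 if v > 0 else -1) for _, r, c, v in details[:keep] if v != 0}
    return coeffs[0][0], top


print(haar_1d([9, 7, 3, 5]))  # [6.0, 2.0, 1.0, -1.0]
```

The first coefficient is the overall brightness; the rest record where the image changes, from broad strokes down to fine detail. Two pictures with the same broad strokes end up with similar signatures even if the pixels differ.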

For instance, you can select any image in the browser, right-click on it, and from the context menu choose "Query for similar images." This query will return every image in your collection with a similar wavelet transform, to an adjustable threshold level.
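The adjustable threshold can be pictured with a short Python sketch. It assumes each image has already been reduced to a signature mapping coefficient positions to signs; the scoring rule (fraction of positions that agree) and the function names are hypothetical simplifications, not imgSeek's real metric.

```python
def similarity(sig_a, sig_b):
    """Fraction of retained coefficients agreeing in position and sign.

    Returns a score from 0.0 (nothing in common) to 1.0 (identical).
    """
    if not sig_a or not sig_b:
        return 0.0
    shared = sum(1 for pos, sign in sig_a.items() if sig_b.get(pos) == sign)
    return shared / max(len(sig_a), len(sig_b))


def query_similar(target_sig, collection, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = ((name, similarity(target_sig, sig)) for name, sig in collection.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Made-up signatures: {(row, column): sign of the wavelet coefficient}.
target = {(0, 1): 1, (1, 0): -1, (1, 1): 1, (2, 0): 1}
collection = {
    "beach.jpg": {(0, 1): 1, (1, 0): -1, (1, 1): 1, (2, 0): -1},
    "forest.jpg": {(0, 1): -1, (1, 0): 1, (3, 3): 1, (2, 0): 1},
    "sunset.jpg": {(0, 1): 1, (1, 0): -1, (1, 1): 1, (2, 0): 1},
}

strict = query_similar(target, collection, threshold=0.5)
loose = query_similar(target, collection, threshold=0.2)
```

Lowering the threshold lets weaker matches through, which is the trade-off you are making when you move the slider.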

Figure: Rough-sketch a picture and the 'Search by Image Content' pane will find it

You can find that pesky image that you remember, even though you have no idea what or where it is. Next to the Keyword search tab we've been using is the tab labeled "by Image content." Here you can make a sketch based on memory in a rectangular drawing area, and imgSeek will populate a list of the best possible matches from your image collection, dynamically updating it as you continue to draw. You can also adjust the fuzziness of this search to tweak how many matches it shows you. You don't have a full GIMP's worth of drawing tools, but you're not here to create a masterpiece, you're just trying to find one you already made.

This is a feature shared by no other image management program I have seen, and it is unbelievably useful. According to Niederberger, it was this idea that inspired the project.

A word of caution - the drawing area tool is pickier about color than it is about shape. A couple of times, I've looked for an image with the drawing area and have been frustrated by my inability to find it, only to discover that my recollection of the colors was a bit off.

This isn't, strictly speaking, imgSeek's fault; our memory for color is imperfect, and there are multiple technical hurdles (such as differing color spaces) to contend with. When I asked Niederberger about this issue, he said that adjusting the sensitivity to color was no problem. It's nice to know the next release might resolve my woes.

imgSeek has more abilities than I have room to cover here. Download a copy and start right-clicking, and you will find that the author has thought of nearly everything you would want to do from a context menu. Look through the help system for more on the program's batch processing capabilities.

If you've ever been muddled or confused by your ever-growing collection of digital photos (at home or at work), you will find imgSeek a valuable tool.

