February 15, 2008

Free software and 2-D barcodes

Author: Nathan Willis

You've probably seen them: black and white checkerboard-patterned matrices printed on labels and other real-world objects that you can optically scan with a cameraphone or other handheld device to extract an encoded message. But although 2-D barcodes (as they are known) are getting more common, working with them is still a bumpy road for the Linux and free software crowd. Fortunately, several options exist for reading and generating them with open source software.

Unfortunately, there are dozens of 2-D barcode formats in the wild. From a technical standpoint, they fall into two general groups: stacked linear and matrix. Stacked linear codes are like traditional 1-D barcodes (a la UPC codes on retail box packaging), but with multiple lines of information. Visually, the black and white lines are elongated, so they look similar to 1-D barcodes. Matrix codes, on the other hand, tend to use square or circular dots instead of lines to encode their information. The most common matrix formats use square grids containing black and white squares, but there are plenty of variations -- including hexagonal, angular, and multicolor patterns.

A feature common in most matrix code formats is orientation-independence: the matrix contains markers that code-reading software can use to determine which direction in the scanned image is up. On the downside, reading a matrix code usually requires a camera that can capture the entire matrix at once. In contrast, many stacked linear codes can be read one line at a time, with cheaper laser scanners.

The barcode entry at Wikipedia contains a list of many current and historical 2-D barcode formats, but is light on detail. A better in-depth guide is available from Russ Adams, and includes important information such as the patent status and openness of each format. The majority of formats are proprietary and under patent control of a single company. In many cases, the company in question sells an all-in-one inventory tracking, watermarking, or data archiving solution based around the format. Xerox's Dataglyph, for example, is designed to blend into the background of a printed document, to facilitate hiding tracking codes in direct marketing materials and "anonymous" surveys.

Several of the oldest codes have outlived their patent terms, including the stacked linear Code 16K and Code 49, and UPS's matrix code MaxiCode (recognizable by the circular "bullseye" pattern in the center). Others, such as Aztec Code (featuring a square "bullseye" in the center) and the newer Supercode and Ultracode formats, are patented, but have published specifications that you can purchase from various standards organizations, most notably the Association for Automatic Identification and Mobility (AIM).

Since the standards approval process is often long and intellectual property requirements vary from organization to organization, whether or not you can implement one of these patented standards without purchasing a patent license varies from case to case.

Fortunately, the three most common formats have simpler terms of use, and they are better supported by free software. PDF417 is the granddaddy of the bunch -- a stacked linear format that you will find on many shipping labels, including those of the US Postal Service and FedEx. Despite the similarity in names, PDF417 is not related to Adobe's Portable Document Format. A related standard, MicroPDF417, is a simplification of the format that uses a reduced character set and can be used to generate smaller codes when space is at a premium. In 2004, PDF417's creator, Symbol Technologies, issued a press release declaring the standard to be in the public domain.

The formats you actually care about

As popular as PDF417 is for industrial usage, the trendiest formats today are without a doubt Data Matrix (DM) and Quick Response (QR) Code; if you see a 2-D barcode on an advertisement, beckoning you to snap a picture of it with your cameraphone, chances are it is one of these two formats.

Both DM and QR Code are black and white matrix codes that appear in square arrays, and use square tiles to encode information. You can distinguish between them by their orientation patterns. DM blocks have a solid black border on two adjacent sides (forming an "L"), and an alternating black and white border on the other two sides. QR Code arrays have no border, but feature prominent square bullseyes, one in each of three corners.
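To make the visual distinction concrete, here is a minimal Python sketch that classifies a symbol from its border and corner patterns. It assumes the image has already been sampled into a square grid of modules (1 for dark, 0 for light) and is rotated only by multiples of 90 degrees, which a real reader cannot take for granted. The function names are hypothetical, and the 7x7 finder layout is the standard QR Code "bullseye" (a dark ring, a light ring, and a 3x3 dark center).

```python
# Toy classifier for a pre-sampled module grid: 1 = dark, 0 = light.
# All helper names are illustrative, not taken from any barcode library.

# 7x7 QR Code finder pattern: dark border, light ring, 3x3 dark center.
FINDER = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]

# Corner without a finder pattern -> clockwise quarter turns from upright.
# An upright QR Code has finders everywhere except the bottom-right corner.
MISSING_TO_TURNS = {
    "bottom-right": 0,
    "bottom-left": 1,
    "top-left": 2,
    "top-right": 3,
}


def rotate_cw(grid):
    """Return a copy of a square grid rotated 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def has_finder(grid, top, left):
    """True if the 7x7 block starting at (top, left) matches FINDER."""
    return all(
        grid[top + r][left + c] == FINDER[r][c]
        for r in range(7)
        for c in range(7)
    )


def finder_corners(grid):
    """Return the names of the corners that hold a finder pattern."""
    n = len(grid)
    corners = {
        "top-left": (0, 0),
        "top-right": (0, n - 7),
        "bottom-left": (n - 7, 0),
        "bottom-right": (n - 7, n - 7),
    }
    return {name for name, (r, c) in corners.items() if has_finder(grid, r, c)}


def qr_quarter_turns(grid):
    """Clockwise quarter turns from upright, or None if not QR-like."""
    found = finder_corners(grid)
    if len(found) != 3:
        return None
    (missing,) = set(MISSING_TO_TURNS) - found
    return MISSING_TO_TURNS[missing]


def alternates(cells):
    """True if every pair of neighboring cells differs."""
    return all(a != b for a, b in zip(cells, cells[1:]))


def has_dm_border(grid):
    """Upright Data Matrix check: solid left and bottom, alternating top and right."""
    left = [row[0] for row in grid]
    right = [row[-1] for row in grid]
    return all(left) and all(grid[-1]) and alternates(grid[0]) and alternates(right)


def blank_qr(n=21):
    """Build an n x n grid holding only the three finder patterns (no data)."""
    grid = [[0] * n for _ in range(n)]
    for top, left in ((0, 0), (0, n - 7), (n - 7, 0)):
        for r in range(7):
            for c in range(7):
                grid[top + r][left + c] = FINDER[r][c]
    return grid
```

Calling qr_quarter_turns(rotate_cw(blank_qr())) reports one quarter turn, which is the orientation information a decoder needs before it can read the data modules. A production reader would also have to locate the symbol in the photo and correct for perspective, which this sketch leaves out.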

DM was created by a subsidiary of engineering giant Siemens, and is commonly used to mark integrated circuits and other small electronic components. The original patent on the format has expired. QR Code was created and patented by Japanese firm Denso Wave, which publicly advertises that it will not exercise its patent rights. Originally used in manufacturing, QR Code is popular in Japan, and is making its way around the globe.

Since both DM and QR Code are matrix formats, they are easy to read with increasingly ubiquitous cameraphones, and their compact size makes it easy to encode URLs and email addresses that can be handed off automatically to a mobile browser or mail client.

Like most formats, DM and QR Code arrays can vary in size depending on the size of the encoded content. Both are very compact. While a PDF417 symbol can encode around 1,800 alphanumeric characters, a DM symbol can encode 2,335, and a QR Code symbol up to 4,296. They also have flexible error-correction available, which can make the resulting symbols more resistant to scratches, marks, and smudges on the printed page.
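As a rough illustration of those ceilings, the following Python sketch reports which of the three formats could hold a message of a given length. The limits are the approximate alphanumeric figures quoted above; the function name is hypothetical, and real capacity drops further depending on the characters used and the error-correction level chosen.

```python
# Approximate maximum alphanumeric characters per symbol, as quoted in the
# article. Real limits depend on character set and error-correction level.
MAX_ALPHANUMERIC = {
    "PDF417": 1800,
    "Data Matrix": 2335,
    "QR Code": 4296,
}


def formats_that_fit(message):
    """Return the formats whose quoted ceiling can hold the message."""
    length = len(message)
    return [name for name, limit in MAX_ALPHANUMERIC.items() if length <= limit]
```

A typical URL fits comfortably in all three, while a 2,000-character message would already rule out PDF417 by these numbers.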

Makin' codes

How can 2-D barcodes help you? Well, suppose you want to generate a code that translates to a link to your Web site, for example, that you can then paint on your roof or stitch into your T-shirt. The simplest way to do so is with one of the plentiful free online utilities. Many are provided by companies like Kaywa and Nokia that are interested in propagating cellphone-readable codes, but others, like invx.com and the Barcode Writer in Pure PostScript, are noncommercial.

The majority of the Web alternatives focus exclusively on DM and QR Code, although a few branch out. For example, both the Pure PostScript Writer and Tec-it can produce MaxiCode, which is a relative rarity. If you try more than one symbol generator, you will soon discover that the same string can be encoded in several different ways even with a single format, thanks to the variety of error-correction options.

If you find yourself generating new codes regularly, you will appreciate that GUI apps for desktop Linux are beginning to implement barcode-generation features. Barcode4J is an open source Java app that generates both 1-D and 2-D barcodes, including PDF417 and DM. Jaxo provides a free-but-closed-source Java app that can generate DM, QR Code, PDF417, and Aztec Code symbols, and provide control over many of the formats' optional features. Snapmaze.com offers a Firefox plugin that instantly generates QR Codes for the page you are looking at. And both Scribus and KBarcode now include wrappers around the Pure PostScript barcode writer, and can output DM and QR Code arrays suitable for printing.

On the command line, there are still other options, but most of the active open source projects in this realm are libraries. Some include simple CLI tools, but none offers generation features beyond those of the graphical programs mentioned above. Still, if you are a developer looking to incorporate barcode generation into your app, check out libdmtx, pdf417_encode, and qrencode for DM, PDF417, and QR Code, respectively, and Zint for a library that tackles all three and more.
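For scripted generation, a small Python wrapper around the qrencode command-line tool might look like the sketch below. It assumes qrencode is installed and on your PATH, and uses its -o (output file) and -l (error-correction level) options; the function names and file-naming scheme are hypothetical. Writing the same string at each of QR Code's four levels (L, M, Q, and H, which recover roughly 7, 15, 25, and 30 percent of damaged data) is an easy way to see why one message can produce several different-looking symbols.

```python
import shutil
import subprocess

# QR Code error-correction levels and the approximate share of the symbol
# that can be damaged and still be recovered.
QR_LEVELS = {"L": 0.07, "M": 0.15, "Q": 0.25, "H": 0.30}


def qrencode_command(text, out_path, level="M"):
    """Build the argument list for one qrencode invocation."""
    if level not in QR_LEVELS:
        raise ValueError(f"unknown error-correction level: {level!r}")
    return ["qrencode", "-o", out_path, "-l", level, text]


def write_all_levels(text, stem):
    """Write one PNG per level; return the paths written.

    Returns an empty list if the qrencode tool is not installed.
    """
    if shutil.which("qrencode") is None:
        return []
    paths = []
    for level in QR_LEVELS:
        path = f"{stem}-{level}.png"
        subprocess.run(qrencode_command(text, path, level), check=True)
        paths.append(path)
    return paths
```

For example, write_all_levels("http://www.example.com/", "mysite") would produce mysite-L.png through mysite-H.png, with the higher levels generally yielding denser symbols for the same text.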

Readin' codes

While there are barcode generating options to spare, the free software scene for barcode recognition is not quite as rosy. For starters, most existing DM and QR Code scanning applications are built for mobile phones, simply because barcodes exist to encode information for offline distribution.

If you have a Symbian device or another Java Micro Edition (J2ME)-capable phone, you can get a free (but closed source) barcode scanner without looking far. Many of the same companies that provide QR Code and DM generation Web pages also release their own QR Code and DM reader apps for mobile phones at no charge. Between the free readers from Kaywa, Nokia, UpCode, i-nigma, and NeoReader, you should be able to find support for almost any J2ME phone. But make sure that your reader supports DM and QR Code, and that it does not restrict what you can do with the decoded content; some companies provide free reader software that can only be used to hook back into their own, closed networks.

A few open source J2ME reader projects have appeared in recent years, but did not survive long enough to release any working packages -- dmsymbian and pda-barcode, for example. QRcode (not to be confused with qrencode) is still active but is limited to QR Code symbols only. Its "reader" application is provided only as an example, and support may be tricky if you don't understand Japanese.

The most promising development at present is ZXing, an open source, Java-based code reader that has a working release available. ZXing's Get The Reader page has links to the downloadable reader package -- including a mobile Web link that can download the reader directly to your phone -- and lists the models supported by the current release. Most are Sony Ericsson and Nokia models, and require a phone that can run unsigned applications.

ZXing's Java reader will run on desktop Linux, although it is not yet a fleshed-out user-friendly app. As was the case with generating DM and QR Code, if you have small volumes of barcodes you need to read, the simplest route is to use one of the many free Web apps.

More problematic is that at present there is no free barcode-reading application for any of the handheld Linux platforms -- OpenMoko, Maemo, GPE, or the various Linux-based commercial phones. When these platforms acquire a robust Java stack, they will gain access to all of the J2ME-based barcode readers available for proprietary phone platforms -- even if they are not officially supported. The ZXing project says it is working on a build for Google's proposed Android phone platform (which will have integrated Java), but of course there is no telling when Android-based handsets will come to market, nor how much of Android will make it into other mobile Linux software stacks.

The road ahead

Reports from Japan say that QR Code has exploded in popularity there, thanks to its easy integration with advertising, and 2-D barcodes may similarly take off in the rest of the world in the coming months and years. But Japan's mobile device culture is more advanced than that of many other parts of the globe (micropayments at vending machines, for example), and such technologies have yet to break into the mainstream in other markets.

The good news is that the DM and QR Code formats are public, and that open source projects like ZXing and Barcode4J can use them. 2-D barcode usage may be a tricky proposition for Linux and free software users today, but if the codes grow into a truly mainstream technology, all the tools and know-how to take advantage of them are in place.

