November 28, 2006

Linux perks up the Espresso Book Machine

Author: Michael Stutz

When he was an editor in the 1950s, Jason Epstein made the paperback book ubiquitous. Now he's about to do the same for a Linux-powered printing press.

In the book business, the great expense is the supply-chain logistics: printing books, shipping them, warehousing them, sending batches to bookstores and on to customers -- every step along the way extracts a fee. Epstein's new company, On Demand Books, is using Linux to develop the Espresso Book Machine (EBM), a $100,000 device that takes distance out of the equation by printing the books right at the point of sale.

"The design philosophy for the machine is that it's fully automatic -- you literally push a button on a Web browser and five minutes later a book pops out," says On Demand Books CTO Thor Sigvaldason. The machine takes a book stored as a PDF file -- which may eventually come to mean anything that you can find on Google -- and prints, trims, and binds it into a perfect-bound paperback book in about five minutes.

Today two working EBM 1.0 prototypes exist. The first is at the InfoShop bookstore of the World Bank in Washington, DC, and the second has just been installed at the Bibliotheca Alexandrina in Egypt.

These beta models are hybrids of Linux and proprietary software, including a PC/104 embedded computer running Windows 2000. But in the first production model -- the EBM 1.5 that's set to debut at the New York Public Library this winter -- the embedded Windows computer is coming out.

"The only reason that thing exists is to talk to all of the servo motors and the hydraulics and pneumatics and compressors -- all the little activators and switches and everything else that manipulates the paper through the thing and applies glue and everything else," Sigvaldason says. "Most of those things in industrial applications expect a serial bus to PC/104 daughter card. It's really hard to find real industrial embedded systems that aren't severely wedded to a Windows-on-PC/104 architecture."

The solution he found is to use a product such as an Advantech ADAM that he can talk to from Linux with what's essentially a Telnet protocol.

"It's a little box about the size of an iPod and in one end plugs an Ethernet cable and on the other end there are eight terminal screws where you can do analog I/O or digital I/O," he says. "You can turn switches on and off and you can read sensors. And they just talk in TCP/IP protocol on the inside, so the whole computer subpanel gets all ripped up, replaced with about eight of those. And then it's all TCP/IP."

Sigvaldason says that the EBM's user interface is completely LAMP-based. "It's Apache serving up PHP pages that are talking to a MySQL database." A core background daemon, written in C++, handles the operations: it asks the machine if it's ready for print jobs and queries the orders database to see if anyone has asked for a book to be made. The daemon also handles calls to gs to resize and convert the content as needed. "We're using a lot of PSutils -- basically pstops for pagination routines, and lots of custom Ghostscript," he says.
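
The daemon's basic cycle, as described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the real daemon is written in C++ against MySQL, while this sketch uses Python's built-in sqlite3 module as a stand-in, and the orders table layout plus the machine_ready and print_book callables are hypothetical names invented for the example.

```python
import sqlite3


def next_order(conn):
    """Return (id, pdf_path) of the oldest queued order, or None."""
    return conn.execute(
        "SELECT id, pdf_path FROM orders WHERE status = 'queued' "
        "ORDER BY id LIMIT 1"
    ).fetchone()


def poll_once(conn, machine_ready, print_book):
    """Run one pass of the loop: check the machine, then the order queue.

    machine_ready and print_book are hypothetical callables standing in
    for the hardware status check and the convert-and-print step.
    Returns the id of the order that was printed, or None if idle.
    """
    if not machine_ready():
        return None
    order = next_order(conn)
    if order is None:
        return None
    order_id, pdf_path = order
    print_book(pdf_path)
    conn.execute("UPDATE orders SET status = 'done' WHERE id = ?", (order_id,))
    conn.commit()
    return order_id


def demo():
    """Queue two orders in an in-memory database and poll three times."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, pdf_path TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (pdf_path, status) VALUES (?, 'queued')",
        [("a.pdf",), ("b.pdf",)],
    )
    printed = []
    results = [poll_once(conn, lambda: True, printed.append) for _ in range(3)]
    return results, printed
```

A long-running daemon would call poll_once repeatedly with a short sleep between passes; the conversion work handed to gs would live inside whatever print_book does.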

Eventually the machines will be networked, so that they can share content between them; the protocol they wrote in-house to do it is loosely based on DAAP (Digital Audio Access Protocol), which Apple's iTunes uses for content sharing.

"And then the next extension from that is talking to other collections of content, which of course we would be insane to design as some unbelievably proprietary, closed protocol," Sigvaldason says. "We obviously want any collection of content to be able to talk to our API or us to talk to their API in as open a way as possible, because the more content, the more there is use for these machines. The only caveat in there is that once you start talking about in-copyright content, then we need a PKI [public key infrastructure] so that we can insure the person delivering is who they say they are and, you know, actually may have some rights to that content."

Despite the usual glitches to iron out, the company has experienced no major difficulties in its use of FOSS.

"We never had any problems down to the point where we had to go to the Ghostscript source code or go to pstops source code and fix it, but there were little things," Sigvaldason says. In Egypt, for example, the version of Ghostscript they use on the EBM 1.0 gave them problems generating duplex pages, so he ultimately had to install Ghostscript 7. "You take the exact same chain of flags and steps and everything and you use Ghostscript 7 and it works fine," he says. "Same driver, exact same flags, same printer, same PPD [PostScript Printer Description], same CUPS daemon actually sending the print job, [and it] would duplex fine. So to have a working version, we have to have two copies of Ghostscript."

There were also problems in dealing with paper size defaults, which have been a common setup problem on Linux systems in general.

"The most commonly distributed version of pstops always seems to think about the world in A4 pages," says Sigvaldason. "When we were doing pagination for letter-size pages, we always had to add a step where we paginate with pstops, and we really end up with an A4 page, but then use Ghostscript with a forced media flag to munge it back into letter."

Sigvaldason says that he chose Linux as the primary development and implementation platform out of familiarity.

"But having said that," he adds, "these things have to operate headless -- the user just pushes a button and a book comes out. And trying to do that kind of headless, server-side PDF or PostScript manipulation on any other [operating system] -- even on a Mac, let alone a Windows box -- you'd end up with Ghostscript anyway, so it was kind of a no-brainer.

"In fact, I'm constantly amazed when I go to trade shows," he adds. "We were at Graph Expo in Chicago a month and a half ago, which is like the printing technology trade show. Even among all the hardware exhibitors there are all these little booths, all this commercial software for manipulating PDFs! It's like its own niche industry, and I felt like running through the aisles screaming, 'Just use Ghostscript!' You might actually have to pay attention to the manual page for a couple of hours to understand what you can do, but there's this huge industry around PDFs which doesn't make any sense to me."
