April 28, 2004

Quick and dirty typesetting with APT

Author: Scott Nesbitt

If you need a markup language to create nicely formatted documents, Linux has plenty of them to choose from -- DocBook, TeX and LaTeX, Lout, the roff family, and of course (X)HTML and XML. So do we really need another? I didn't think so, until I ran across Almost Plain Text (APT), a simple system for marking up text in which most of the formatting is done using indentation and ordinary keyboard characters. Using APT's command-line formatting engine, you can output APT documents to PostScript, PDF, LaTeX, and HTML.

APT is the closest you can get to having a markup language without markup. While other languages use explicit commands to format text, APT instead uses spaces, indents, and keyboard symbols. This pseudo-markup denotes headings, body text, lists, tables, links, and font formatting. For example, in APT a document's section headings are always left-justified. You can specify up to five levels of section headings using asterisks. Body text needs to be indented at least four spaces, and lists are indented and denoted by an asterisk or number.
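To make that description concrete, here is a rough sketch of a small APT source file, written out from a shell function so you can paste it into a terminal. The function name write_sample_apt and the file name sample.apt are made up for illustration, and the exact spacing is my reading of the rules described above rather than an authoritative reference, so check it against the APT user guide before relying on it.

```shell
#!/bin/sh
# write_sample_apt is a hypothetical helper: it writes a tiny APT source
# file to the path given as its first argument.
# Assumed markup, following the description in this article:
#   - section headings start in the first column
#   - asterisks in front of a heading mark a lower-level section
#   - body text and list items are indented
write_sample_apt() {
    cat > "$1" <<'EOF'
A Top-Level Section

    Body text is indented, so it reads as an ordinary paragraph.

* A Second-Level Section

    Another indented paragraph, followed by a bulleted list.

      * First item

      * Second item
EOF
}

# Example: create sample.apt in the current directory.
write_sample_apt sample.apt
```

Open the resulting file in your editor and you will see there is nothing in it that looks like a tag or a command.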

If APT is so limited, why use it? (For that matter, why would someone want to use a markup language instead of a WYSIWYG application?) The best reason is that APT is simple -- simple to learn, simple to use, and the converter is simple to run. You can be up and running with APT within 20 to 30 minutes. By comparison, it took me that long just to figure out how to structure a basic DocBook file. APT produces clean output that can be exchanged and processed by anyone. APT is also a good introduction to markup for Linux newbies.

Preparing a document in APT is easy -- far easier than with any other markup language or even a word processor. You just type text in your favorite editor and add the appropriate indents, spaces, and symbols where needed. You will quickly find that adding the necessary markup is far more intuitive than it is with most other markup languages.

You typeset a document with APT using a utility called aptconvert, which is a command-line Java tool. The archive containing the software weighs in at less than 1 MB. Compare that to a typical TeX distribution or set of DocBook tools, each of which runs to several megabytes.

aptconvert transforms your source file to HTML, Rich Text Format (RTF), LaTeX, PostScript, PDF, or DocBook XML. With LaTeX, you can have aptconvert output either a LaTeX file or a compiled LaTeX DVI file. Of all the supported output formats, HTML has the most output options. You can, for example, convert your APT document to HTML 4.01 or XHTML, and you can split a long HTML file up into multiple pages complete with navigation icons.

Running aptconvert at the command line can be a bit daunting. There are a number of options for each output format. Here's an example of an aptconvert command line to output an RTF file:

aptconvert -pi rtf topmargin 2.5 -pi rtf bottommargin 2.5 \
    -pi rtf leftmargin 2.5 -pi rtf rightmargin 2.5 \
    -pi rtf fontsize 11 -pi rtf spacing 12 \
    myDoc.rtf myDoc.apt

It can be difficult to remember these options, and continually typing them at the command line quickly becomes tedious. You're better off encapsulating the options that you use for each format in a script.
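A minimal sketch of such a script for RTF output follows. It reuses only the options from the example command above. The function name apt2rtf and the APTCONVERT variable are my own inventions, not part of aptconvert, and I'm assuming that aptconvert is on your PATH and that your source files end in .apt.

```shell
#!/bin/sh
# Hypothetical wrapper around the RTF example from this article.
# APTCONVERT names the converter to run; it defaults to "aptconvert"
# and can be overridden if the tool is installed under another name.
APTCONVERT="${APTCONVERT:-aptconvert}"

# apt2rtf SOURCE.apt
# Converts SOURCE.apt to SOURCE.rtf with 2.5 margins all around,
# font size 11, and spacing 12 (the values used in the example above).
apt2rtf() {
    src="$1"
    out="${src%.apt}.rtf"    # strip a trailing .apt, then append .rtf
    "$APTCONVERT" \
        -pi rtf topmargin 2.5 \
        -pi rtf bottommargin 2.5 \
        -pi rtf leftmargin 2.5 \
        -pi rtf rightmargin 2.5 \
        -pi rtf fontsize 11 \
        -pi rtf spacing 12 \
        "$out" "$src"
}

# Usage, once this file has been sourced into your shell:
#   apt2rtf myDoc.apt
```

You could write a similar function for each output format you use regularly, so the only thing left to type is the name of the source file.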

While investigating APT, I authored about a dozen documents and converted them to all the supported formats. I found that, overall, the quality of the output produced by the aptconvert utility is quite high. The HTML, for example, validates using HTML Tidy. I was able to transform the DocBook files using my DocBook tools without any problems. And the RTF files open in applications like StarOffice, OpenOffice.org, AbiWord, and ApplixWare with little or no loss in formatting.

I didn't like the LaTeX source files that aptconvert generated, however. Instead of using standard LaTeX markup, aptconvert creates a number of new commands and environments to simulate normal LaTeX -- for example, \begin{plist} instead of \begin{itemize} to denote the start of a bulleted list. While this is perfectly acceptable, it also increases the size of the source file and makes it difficult to edit the LaTeX file later on.

The default look of the hard-copy output is based on the standard LaTeX article class, which is traditionally used for short documents like submissions to scholarly journals. A LaTeX article doesn't have a particularly dynamic layout; it uses one-inch margins all around and fully-justified text.

While the layout of a typeset APT document is simple, it is also effective. The look and feel of a document generated from an APT source file is more than adequate for most purposes. However, there is generally not a lot you can do with the output from APT. The exceptions are HTML and LaTeX documents. If you generate an HTML file, you can use a Cascading Style Sheet to define the look and feel of the document. If you're familiar with LaTeX, you can specify a different document class to use, as well as an alternative font package. Instead of the article class that APT defaults to, you can, for example, use the hc, hitec, or refman document classes with Palatino or Utopia fonts to change the look of your documents.


APT is far from perfect. Simplicity is APT's strength, but it's also APT's biggest drawback. When you output to PDF, DVI, PostScript, and RTF, you're stuck with the overall look and feel of a LaTeX article. For many people, this look and feel is fine, but some people I know hate the appearance of all LaTeX documents and refuse to read them.

APT is best suited for creating short articles or reports. For anything longer than 15 or 20 pages, I suggest using LaTeX, DocBook, or a word processor. While you can add images to an APT document, APT lacks support for equations.

While the markup is easy to use and learn, creating tables is a bit of a chore. In APT, tables are constructed using asterisks, dashes, and pipes. You have to experiment with the width of the table cells to get them right. On top of that, you do not have much control over the formatting of the table. Depending upon the output format you choose, the cells may not be the correct width or may appear quite cluttered.

An explanation of each APT markup element is beyond the scope of this article. However, the APT user guide describes the markup in fairly good detail. I have also written an APT quick reference guide.


I'm something of a LaTeX and DocBook fanatic, and I cannot see anyone like me adopting APT as their main markup language. However, if you want to quickly put together a document and do not want to fiddle with a higher-level markup language or deal with a word processor, APT can't be beaten. APT is truly quick and dirty typesetting, and it does a great job of it.

Scott Nesbitt is a writer based in Toronto, Canada, who spends too much time playing with markup languages.
