April 29, 2008

Producing visually pleasant documents from plain text with reStructuredText and rst2a

Author: Nikos Kouremenos

reStructuredText is a lightweight markup language intended to be highly readable in the source format. With it, you can produce beautiful HTML, PDF, XML, and even S5 documents from plain text files.

reStructuredText is part of Docutils, an open source text processing system that converts plain text documentation into more useful formats. Docutils is written in Python, and you will find a package for it in most Linux distributions; you can also install it from source under Linux and Microsoft Windows.

reStructuredText was first designed as a markup syntax for Python docstrings and other documentation domains. The goal was to be readable and simple, yet powerful enough for non-trivial use. It has since evolved into a more general-purpose markup language. It ships with scripts that convert the source to other formats, including:

  • rst2html.py, for exporting to an HTML document which you can put online;
  • rst2latex.py, for exporting to LaTeX, which you can later use to produce PDF and PostScript files;
  • rst2s5.py, for exporting to S5, which you can use to produce a Web-based presentation; and
  • rst2xml.py, for exporting to Docutils-native XML output, which later can be transformed with standard XML tools such as XSLT processors into arbitrary final forms.

Running any of these scripts is simple. Each requires a specially formatted text file as input. Let's examine an example file we'll call demo.txt:

=================
Title
=================

Section
=================

* this is a list item
* a list item can span more than one line -
  just align the continuing text after the
  bullet and whitespace
* another list item

To process this document, run the command rst2html.py demo.txt > demo.html. The resulting HTML (shown here without the embedded CSS, so we can focus on the markup) looks like this:

<div class="document" id="title">
<h1 class="title">Title</h1>
<h2 class="subtitle" id="section">Section</h2>
<ul class="simple">
<li>this is a list item</li>
<li>a list item can span more than one line -
just align the continuing text after the
bullet and whitespace</li>
<li>another list item</li>
</ul>
</div>
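
Because Docutils is a Python library, the same conversion can also be driven from Python instead of the command line. The following is a minimal sketch, assuming Docutils is installed and importable; the helper name rst_to_html_body is made up for illustration. It calls publish_parts from docutils.core and returns only the body fragment, without the page head or CSS. The exact markup may differ between Docutils versions.

```python
# The demo document from above, as a Python string.
DEMO_RST = """\
=================
Title
=================

Section
=================

* this is a list item
* a list item can span more than one line -
  just align the continuing text after the
  bullet and whitespace
* another list item
"""


def rst_to_html_body(source):
    """Return the HTML body fragment for a reStructuredText string.

    Hypothetical helper for illustration; not part of Docutils itself.
    """
    # Imported lazily so this module still loads where Docutils is absent.
    from docutils.core import publish_parts

    parts = publish_parts(source=source, writer_name="html")
    # "html_body" holds the document title and body, without <head> or CSS.
    return parts["html_body"]


if __name__ == "__main__":
    print(rst_to_html_body(DEMO_RST))
```

Saving this as a script and running it should print a fragment similar to the HTML listing above.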

All the scripts accept extra parameters that let you customize the produced document to a great extent. When I wanted to produce an S5 slideshow, I was pleased to find that the rst2s5.py script lets you use themes other than the default, so the final result can meet your aesthetic needs too.
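
The command-line parameters correspond to Docutils settings, which can also be passed from Python. As a rough sketch, assuming Docutils is installed, the function below turns off stylesheet embedding (the embed_stylesheet setting, which the --link-stylesheet option controls on the command line), so the page links to its CSS instead of inlining it. The function name rst_to_linked_html is hypothetical.

```python
def rst_to_linked_html(source):
    """Render reStructuredText to a full HTML page that links to its CSS.

    Hypothetical helper for illustration; not part of Docutils itself.
    """
    # Imported lazily so this module still loads where Docutils is absent.
    from docutils.core import publish_string

    return publish_string(
        source=source,
        writer_name="html",
        settings_overrides={
            # Emit a <link> to the stylesheet instead of inlining it.
            "embed_stylesheet": False,
            # Return a str rather than encoded bytes.
            "output_encoding": "unicode",
        },
    )


if __name__ == "__main__":
    print(rst_to_linked_html("Title\n=====\n\nSome text.\n"))
```

Other settings can be overridden the same way; the Docutils configuration documentation lists the names each writer accepts.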

To produce a PDF document, you first have to produce a LaTeX file, then pass it as input to LaTeX, which will produce a PS and a PDF file. In other words, this is not straightforward. You can either write a shell script to do the post-processing for you, or use an external script called rst2pdf.py, which converts your reStructuredText file directly to PDF using ReportLab, an open source PDF library written in Python. Rst2pdf is under active development and works well, so expect it to ship with the other official scripts soon.
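
The two-step route can be wrapped in a small helper script. The sketch below is one possible version, written in Python rather than shell. It assumes that rst2latex.py and pdflatex are both on your PATH (pdflatex is my choice of LaTeX front end here, and on some installations the converter is named rst2latex without the extension); the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def pdf_commands(rst_path):
    """Build the commands for the reStructuredText -> LaTeX -> PDF route.

    Assumes rst2latex.py and pdflatex are on PATH; adjust the names if
    your installation differs.
    """
    source = Path(rst_path)
    tex = source.with_suffix(".tex")
    return [
        # Step 1: reStructuredText to LaTeX.
        ["rst2latex.py", str(source), str(tex)],
        # Step 2: LaTeX to PDF, written next to the .tex file.
        [
            "pdflatex",
            "-interaction=nonstopmode",
            "-output-directory=" + str(tex.parent),
            str(tex),
        ],
    ]


def build_pdf(rst_path):
    """Run both steps in order and return the expected PDF path."""
    for command in pdf_commands(rst_path):
        # check=True stops the pipeline as soon as one step fails.
        subprocess.run(command, check=True)
    return Path(rst_path).with_suffix(".pdf")


if __name__ == "__main__":
    print(build_pdf("demo.txt"))
```

Keeping the command construction separate from the execution makes it easy to inspect what would be run before actually running it.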

You might also find rst2odt.py useful for generating OpenDocument Text files. I haven't tested this one, but it seems to be under active development.

Using rst2a.com

It's nice to have your own scripts and customized parameters to produce output that pleases you, but what if you want to quickly produce a PDF or HTML document from a reStructuredText file and you left your scripts at home? For this situation the creators of reStructuredText recently created rst2a.com. It lets you either enter text in the reStructuredText format or upload a reStructuredText file, choose the output format and style you want, and get the resulting file in a few seconds.

Rst2a.com's main goal is to lower the barrier to entry for using reStructuredText. If you choose to create a new reStructuredText document on the fly, a legend on the right of the page helps you learn and remember the most common syntax; for example, the syntax for making text bold or italic. The legend is handy for beginners, but I would like to see a more interactive AJAX editor that would act as a simple front end and produce reStructuredText for you. Rst2a.com also provides a rendering API web service, which makes it simple to add reStructuredText support to other Web applications.

reStructuredText is a great way to produce visually pleasing HTML and other documents from simple plain text files. It's trivial to use for simple tasks, but more complex tasks may require users to dig into the documentation or look for examples. Indeed, when first learning the tool, users may find themselves investing more time in document creation than they would have with a WYSIWYG editor. Once they have learned it, however, the advantages of writing documents in plain text and producing a wide variety of formats from the same source become clear.
