September 27, 2007

Export Writer documents into any wiki format

Author: Dmitri Popov

One of the most welcome additions to OpenOffice.org 2.3 is a new export filter that allows you to save Writer documents as MediaWiki-formatted pages. That's all fine and dandy if you are using MediaWiki, but what about other wiki systems? The answer to this question comes in the form of the OpenOffice2UniWakka export filter. While it's designed to work with the UniWakka wiki, with a bit of hacking you can adapt it to other wiki systems as well, even if you are not familiar with XML and XSLT.

The OpenOffice2UniWakka filter itself consists of the OD2UniWakka.xslt file (or OOo2UniWakka.xslt, if you are still using OpenOffice.org 1.x.x). You can think of this file as a simple conversion program written in XSLT (Extensible Stylesheet Language Transformations), "an XML-based language used for the transformation of XML documents into other XML or human-readable documents." Since the content of an .odt document is stored as an XML file, XSLT is a perfect tool for transforming Writer documents into wiki-formatted text files.
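
To make this concrete, here is a trimmed, illustrative sketch of how a sentence with one bold word might be stored in the content.xml file inside an .odt archive. The style name T1 is only an example of the names Writer generates for direct formatting, the namespace declarations on the root element are omitted, and a real file carries many more attributes:

<office:automatic-styles>
   <!-- an automatically generated text style that records the bold formatting -->
   <style:style style:name="T1" style:family="text">
      <style:text-properties fo:font-weight="bold"/>
   </style:style>
</office:automatic-styles>
<office:body>
   <office:text>
      <!-- the bold word is a text:span that refers to the style by name -->
      <text:p text:style-name="Standard">This is <text:span text:style-name="T1">bold</text:span></text:p>
   </office:text>
</office:body>

An export filter therefore has two jobs: look up what a span's style means, and write the span's text wrapped in the corresponding wiki markup.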

To start hacking the OpenOffice2UniWakka filter, download its latest version from UniWakka's main page and unpack it, then open the OD2UniWakka.xslt file in a text editor. Let's start with something simple: changing the transformation rules for basic formatting such as bold, italic, and underline. The filter defines three variables that capture the formatting of a given text style: font-style, font-weight, and font-underline:

<xsl:variable name="font-style" select="//office:automatic-styles/style:style[@style:name=$style]/style:text-properties/@fo:font-style"/>
<xsl:variable name="font-weight" select="//office:automatic-styles/style:style[@style:name=$style]/style:text-properties/@fo:font-weight"/>
<xsl:variable name="font-underline" select="//office:automatic-styles/style:style[@style:name=$style]/style:text-properties/@style:text-underline-style"/>

It then tests these variables for each text fragment and wraps the fragment in the matching wiki markup. For example, this statement handles text that is bold and italic but not underlined:

<xsl:when test="$font-weight='bold' and $font-style='italic' and $font-underline='none'">**//</xsl:when>

This means that all you have to do is locate every statement that starts with <xsl:when test= followed by a condition on these variables, and replace the existing markup with the markup your wiki uses.
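
As a hedged illustration, suppose the target wiki is MoinMoin, which marks italic text with two apostrophes and bold text with three, so bold italic text opens with five. The statement shown above would then look like this; the condition is copied unchanged from the filter, and only the markup differs:

<xsl:when test="$font-weight='bold' and $font-style='italic' and $font-underline='none'">'''''</xsl:when>

The same replacement has to be made wherever the filter writes the closing markup; otherwise the opening and closing markers will not match.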

The XSLT file also contains a separate section called "my styles" (it starts with the <!-- my styles --> comment) where you can specify custom transformation rules. If you take a closer look at the default transformation rules, you'll quickly notice that each rule has the following structure:

<xsl:template match="//text:span[@text:style-name='emph']">
   <xsl:text disable-output-escaping="yes">**</xsl:text>
   <xsl:apply-templates/>
   <xsl:text disable-output-escaping="yes">**</xsl:text>
</xsl:template>

You don't need a degree in rocket science to figure out how the rule works: it simply finds text with a specific style and wraps it in the appropriate wiki markup, so the sentence "This is bold," with the last word formatted in the emph style, is converted into "This is **bold**." Knowing that, you can quickly add your own rules. If you are using, for example, DokuWiki, you can create a Writer template containing custom styles like dw-bold, dw-italic, dw-underlined, and so on. You can then specify transformation rules for each style in the "my styles" section of the OD2UniWakka.xslt file:

<!-- DokuWiki marks italic text with two slashes; two apostrophes would produce monospaced text -->
<xsl:template match="//text:span[@text:style-name='dw-italic']">
   <xsl:text disable-output-escaping="yes">//</xsl:text>
   <xsl:apply-templates/>
   <xsl:text disable-output-escaping="yes">//</xsl:text>
</xsl:template>
<xsl:template match="//text:span[@text:style-name='dw-bold']">
   <xsl:text disable-output-escaping="yes">**</xsl:text>
   <xsl:apply-templates/>
   <xsl:text disable-output-escaping="yes">**</xsl:text>
</xsl:template>

To make use of the custom transformation rules, you need to create a Writer template containing the specified styles (e.g. dw-italic, dw-bold, and dw-underlined) and use those styles to format text.
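
The dw-underlined style mentioned above does not have a rule yet. A minimal sketch, assuming DokuWiki's double-underscore markup for underlined text, follows the same pattern as the other rules and goes into the same "my styles" section:

<!-- wrap text formatted with the dw-underlined character style in DokuWiki underline markup -->
<xsl:template match="//text:span[@text:style-name='dw-underlined']">
   <xsl:text disable-output-escaping="yes">__</xsl:text>
   <xsl:apply-templates/>
   <xsl:text disable-output-escaping="yes">__</xsl:text>
</xsl:template>

Because each rule matches on the style name alone, adding support for another custom style is a matter of copying this block and changing the style name and the two markup strings.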

Once you've figured out the basics, you can tweak the rest of the OD2UniWakka.xslt file. The final step of the process is to add the modified OD2UniWakka filter to OpenOffice.org. To do this, choose Tools -> XML Filter Settings and press the New button. Enter the filter's name in the Filter name field (e.g. "OD2Wiki") and select OpenOffice.org Writer from the Application drop-down list. In the Name of file type field, enter the name of the filter as it will appear in the list of available formats (e.g. "wiki") and enter "txt" in the File extension field. Switch to the Transformation tab, and enter the path to the OD2UniWakka.xslt file in the XSLT for Export field. Press OK, then Close, and the filter is ready to go.

If you want to add the filter to multiple OpenOffice.org installations, or you want to share it with other users, you can export it as a .jar package. To do this, choose Tools -> XML Filter Settings, select the filter, and press the Save as Package button. You and other users can then install the package by simply pressing the Open Package button.
