July 14, 2008

Two handy MediaWiki extensions

Author: Peder Halseide

Here are two powerful tools for your MediaWiki installation. One helps you populate your wiki quickly from data in a spreadsheet. The other creates PDF ebooks, complete with tables of contents and page numbers, with a single click from your wiki.

MediaWiki, the open source software behind sites such as Wikipedia.org, is not just a wiki but a complete content management system for Web sites and intranets. If you have installed MediaWiki, though, you are probably familiar with the challenge of importing content from non-MediaWiki sources. A GPL-licensed Perl script called csv2wiki can help you convert and upload large amounts of content into your wiki. To work with csv2wiki, you need to have Perl installed (Windows users can use ActivePerl). csv2wiki uses another Perl script, wiki.pl, to log in and edit wiki articles.

To create content with csv2wiki, you need to do two things. First, prepare your content in spreadsheet format, with appropriate column headers. Second, create a template to tell your wiki article how to lay out and format the content. The column headers in the spreadsheet will correspond 1:1 with your wiki template.

For example, if you are importing a dictionary with pictures, your spreadsheet (let's call it tasty.csv) might look like this:

Term       Definition          Image          Tag
Apple      A delicious fruit.  apple.png      Fruit
Walnut     A tasty nut.        walnut.jpg     Nut
Asparagus  A tasty vegetable.  asparagus.gif  Vegetable
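
If your data starts out in a script rather than in a spreadsheet program, a file like tasty.csv can be produced with a few lines of Python. The following is a minimal sketch using the standard csv module; the function name write_tasty_csv is hypothetical, and the rows are simply the sample data shown above.

```python
import csv

# Sample data from the table above. The header row names the columns
# that the wiki template is expected to correspond to.
HEADERS = ["Term", "Definition", "Image", "Tag"]
ROWS = [
    ["Apple", "A delicious fruit.", "apple.png", "Fruit"],
    ["Walnut", "A tasty nut.", "walnut.jpg", "Nut"],
    ["Asparagus", "A tasty vegetable.", "asparagus.gif", "Vegetable"],
]


def write_tasty_csv(path):
    """Write the header row followed by the data rows as comma-separated values."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADERS)
        writer.writerows(ROWS)


if __name__ == "__main__":
    write_tasty_csv("tasty.csv")
```

Letting the csv module do the writing means that any field containing a comma or a quotation mark is quoted correctly, which is easy to get wrong when building the lines by hand.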

Your wiki template (call it Template:Tasty) might look like this:

===Definition and Picture===

The final part of the csv2wiki tool is the job file, which contains all the information necessary to update your wiki from the content of a single CSV file. Fields in the job file include the source file; the wiki URL, username, and password; the separator (which defaults to a comma, but can be a tab or any regular expression); the column that holds the article title; and the template name. One tricky point is that columns are numbered from zero, so the title field in the example above is 0, not 1, even though the term is in the first column of the spreadsheet. To import the example CSV file above, the job file (let's call it tasty.txt) would look like this:

csv: tasty.csv
wiki: http://example.com/wiki/index.php
user: username
pass: password
title: 0
template: Tasty

To run the import, use the command perl csv2wiki.pl tasty.txt
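
To make the zero-based title setting concrete, here is a small Python sketch that reads a job file in the "key: value" layout shown above and picks the article title out of each CSV row. It is an illustration only, not csv2wiki's actual Perl code: the function names parse_job and article_titles are hypothetical, and the sketch assumes that the first CSV row holds the column headers and that the separator is a comma.

```python
import csv
import io


def parse_job(text):
    """Parse 'key: value' lines into a dict, skipping blank lines."""
    job = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, _, value = line.partition(":")
        job[key.strip()] = value.strip()
    return job


def article_titles(job, csv_text):
    """Return one title per data row, using the zero-based 'title' column."""
    title_column = int(job["title"])
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # assumption: the first row is the header row
    return [row[title_column] for row in reader]


JOB_TEXT = """csv: tasty.csv
wiki: http://example.com/wiki/index.php
user: username
pass: password
title: 0
template: Tasty
"""

CSV_TEXT = """Term,Definition,Image,Tag
Apple,A delicious fruit.,apple.png,Fruit
Walnut,A tasty nut.,walnut.jpg,Nut
"""

if __name__ == "__main__":
    job = parse_job(JOB_TEXT)
    print(article_titles(job, CSV_TEXT))  # prints ['Apple', 'Walnut']
```

With title set to 0, the Term column supplies the article names; setting it to 3 would name the articles after the Tag column instead.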

Using this tool, you can quickly import content from a variety of sources, provided you have converted it into a CSV file first. In my testing, csv2wiki uploaded more than 4,000 articles from a spreadsheet containing more than 20 columns in about an hour.

csv2wiki is available only from its author at Organicdesign.co.nz, or from WikiExpert.com. Due to the power of this tool, you must provide a compelling case in order to get a user ID that lets you download the script, but once you do, it's free. Full documentation and support are available to those willing to pay for them.


Our second tool is Pdf_Book, a MediaWiki extension that can create a PDF book, complete with an automatically generated table of contents and page numbers, from a category or list of articles on your wiki. If you want to create a PDF from only a single article, you should use the extension PDF Export instead, which appears as a link in your wiki's toolbox for easy conversion of any given article.

Pdf_Book includes images embedded in your articles and supports markup that forces custom page breaks. On the downside, it is a fairly complex extension to install and configure, but the payoff makes it worth your time. To use either the Pdf_Book or PDF Export extension on your wiki, you first need to install HTMLDOC, which does all the grunt work of converting an HTML document into PDF. You may also need to increase the amount of memory allocated to PHP in your php.ini file; the extension's author recommends at least 64MB. Finally, be aware that the extension does not pull in content from subcategories. Deeper levels of the book's structure currently come only from heading levels within articles.

Once you have the extension installed, you can create a PDF book from any category simply by appending &action=pdfbook to the end of the URL of that category: for example, http://www.example.com/wiki/index.php?title=Category:Test&action=pdfbook. You can do this manually, or you can create a template that generates the link, and include that template in your category pages to give users a link to download the contents of an entire category as a PDF ebook with a single click.
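
If you generate these links from a script rather than from a wiki template, the URL can be assembled with Python's standard urllib.parse module. This is a minimal sketch; the function name pdfbook_url is hypothetical, and it assumes your wiki uses the index.php?title= URL style shown in the example above.

```python
from urllib.parse import urlencode


def pdfbook_url(index_php_url, category):
    """Build a category URL with action=pdfbook appended."""
    # MediaWiki page titles use underscores in place of spaces.
    title = "Category:" + category.replace(" ", "_")
    # safe=":" keeps the namespace colon readable instead of encoding it as %3A.
    query = urlencode({"title": title, "action": "pdfbook"}, safe=":")
    return f"{index_php_url}?{query}"


if __name__ == "__main__":
    print(pdfbook_url("http://www.example.com/wiki/index.php", "Test"))
    # prints http://www.example.com/wiki/index.php?title=Category:Test&action=pdfbook
```

Using urlencode rather than plain string concatenation ensures that category names containing characters such as ampersands are escaped properly.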

With csv2wiki and Pdf_Book, your wiki can quickly grow in size, and you can impress your users (or your boss) with powerful features that let you turn wiki content into manuals or ebooks for download or distribution as PDF files.

Every Monday we highlight a different extension, plugin, or add-on. Write an article of fewer than 1,000 words telling us about one that you use and how it makes your work easier, along with tips for getting the most out of it. If we publish it, we'll pay you $100. (Send us a query first to be sure we haven't already published a story on your chosen topic recently or have one in hand.)

