September 16, 2004

Formatting documents with OpenOffice.org Writer macros

Author: Michał Kosmulski

Around the time OpenOffice.org 1.1 RC was released, I was migrating a small company from Corel WordPerfect to OpenOffice.org. OpenOffice.org by itself does not support reading or writing WordPerfect files, but a tool called wpd2sxw can convert WordPerfect files to OpenOffice.org format (SXW). After conversion with wpd2sxw, which was rather good but had problems with some formatting features, I applied macros to documents based on different templates to make more than 2,000 converted documents look very similar to the original WordPerfect files they were generated from. This article presents some macro "building blocks" you can use to modify a document's formatting or to generate well-formatted documents from plain text files.

OpenOffice.org uses StarBasic as its macro language. I won't go into details of the language itself, but the examples given here should be easy to understand if you have some programming experience. You can find tutorials and general information regarding StarBasic macros, along with information on OpenOffice.org scripting, elsewhere on the Web.

You can edit and run macros in OpenOffice.org through
the Macro dialog box (Tools->Macros->Macro...), which is more or less self-explanatory.
For brevity, the macros listed in this article usually don't declare variables they
use, so they won't work with Option Explicit. Generally, they were designed
for OpenOffice.org 1.1, but most also work in 1.0 and should work with
different versions of StarOffice as well.

Changing page size

When I converted my WordPerfect documents, wpd2sxw failed to save the page size in the newly created OpenOffice.org files. Converted documents all defaulted to Letter, while A4
was used in the original files. This macro sets a document's default page size
to A4 (210x297 mm):

Sub SetPageSizeA4(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	oStyle = oDoc.StyleFamilies.getByName("PageStyles").getByName("Default")
	' units of 1/1000 cm
	oStyle.Width = 21000
	oStyle.Height = 29700
End Sub

Simple as it is, this example demonstrates the extensive use of styles in OpenOffice.org.
Here, setting a page's property is accomplished by finding the corresponding style
object and modifying that style's properties.
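
The same lookup can be extended to every page style in a document. The sketch below is not one of the original migration macros: the name SetAllPageStylesA4 is made up for illustration, and it assumes the page style family supports index access (getCount and getByIndex). Landscape styles are skipped so that their orientation is not disturbed.

Sub SetAllPageStylesA4(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	oStyles = oDoc.StyleFamilies.getByName("PageStyles")
	For i = 0 To oStyles.getCount() - 1
		oStyle = oStyles.getByIndex(i)
		' leave landscape styles alone
		If Not oStyle.IsLandscape Then
			' units of 1/1000 cm
			oStyle.Width = 21000
			oStyle.Height = 29700
		End If
	Next i
End Sub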

Changing page and paragraph margin size

WordPerfect handles margins slightly differently from OpenOffice.org. In
OpenOffice.org, margin sizes are a page property -- paragraph indentation has to be
used in order to have paragraphs with different spacing from the left page edge.
In contrast, WordPerfect assigns margin sizes to paragraphs -- there are no
separate entities for page and paragraph margins. Thus, for example,
the effect of having a paragraph start 4 centimeters from the left page edge can be
represented in exactly one way in WordPerfect (margin = 4 cm for that
paragraph), but in many ways in OpenOffice.org (e.g. 2 cm page margin + 2 cm
paragraph margin or any other combination that sums to 4 cm).

wpd2sxw chose to use both margin types for converted documents. Even though all
paragraphs had exactly the same left margin size (4 cm), in converted documents page
margins stayed at the default 2 cm while the other 2 cm were paragraph margins.
Of course, I wanted to have the more natural 4 cm page margins and no additional
paragraph margins. The macro below reads the paragraph margin sizes from the first
paragraph of the document, sets the margins of all paragraphs to zero, and enlarges
the page margins by the same amounts to compensate.

An extra quirk is that wpd2sxw assumed that default page margins in
WordPerfect and OpenOffice.org would be the same -- which was not quite right.
WordPerfect's default happened to be 2.5 cm while OpenOffice's default was 2 cm.
I didn't bother to find out how to fix that in the general case; I just added an
extra 0.5 cm to the result in order to get the same margin sizes in converted
documents as in the original files (but I removed that code from the macro shown here).

Upper and lower margin sizes got lost in the conversion completely, so the best
thing I could do was to set them to values based on left and right margin sizes.

Here's the code:

Sub FixMargins(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	oText = oDoc.Text
	' A view cursor object represents the caret visible on screen.
	' In contrast, text cursors are objects used for manipulating text ranges.
	oCursor = oText.createTextCursor()
	' read paragraph margin sizes
	lMarg = oCursor.ParaLeftMargin
	rMarg = oCursor.ParaRightMargin
	' select all and set paragraph margin to 0
	oCursor.gotoStart(false)
	oCursor.gotoEnd(true)
	oCursor.ParaLeftMargin = 0
	oCursor.ParaRightMargin = 0
	' increase page margins to compensate for zeroing paragraph margins
	oStyle = oDoc.StyleFamilies.getByName("PageStyles").getByName("Default")
	oStyle.LeftMargin = oStyle.LeftMargin + lMarg
	oStyle.RightMargin = oStyle.RightMargin + rMarg
	' all information about top and bottom margins was lost, so we just copy
	' left and right margin sizes
	oStyle.TopMargin = oStyle.LeftMargin
	oStyle.BottomMargin = oStyle.RightMargin
End Sub
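
The 0.5 cm correction mentioned above is not part of FixMargins. A minimal sketch of such a correction, which is not the author's original code, might look like the following. The name AddMarginOffset is an illustrative assumption, and the 500-unit offset simply restates the 2.5 cm versus 2 cm difference described earlier.

Sub AddMarginOffset(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	oStyle = oDoc.StyleFamilies.getByName("PageStyles").getByName("Default")
	' 500 units of 1/1000 cm = 0.5 cm, the difference between the
	' WordPerfect (2.5 cm) and OpenOffice.org (2 cm) default margins
	extra = 500
	oStyle.LeftMargin = oStyle.LeftMargin + extra
	oStyle.RightMargin = oStyle.RightMargin + extra
End Sub

If it is run after FixMargins, the top and bottom margins set there stay unchanged.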

Setting page headers and footers

Page headers didn't make it into converted documents, either (current versions
of wpd2sxw are capable of converting them correctly, but older versions were
not). The macro below adds a header with some text and a few fields (page number
and total page count) to the page. Adding footers would be very similar.

This macro also shows how to set basic text attributes such as font name and
size or the language for spell-checking and hyphenation.

Sub AddHeader(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	' turn on headers for default page style
	oStyle = oDoc.StyleFamilies.getByName("PageStyles").getByName("Default")
	oStyle.HeaderIsOn = True
	oStyle.HeaderIsShared = True
	oStyle.HeaderHeight = 500 ' 0.5 cm - header height
	oStyle.HeaderBodyDistance = 0 ' 0 cm - distance from page text area to header
	' decrease the top margin by 0.5 cm (the header's height)
	oPageStyle = oDoc.StyleFamilies.getByName("PageStyles").getByName("Default")
	oPageStyle.TopMargin = oPageStyle.TopMargin - 500
	' Add some regular text. In OpenOffice.org 1.1, newly created headers have a centered
	' tabstop at the center of header area and a right tab stop at the right margin.
	' In OpenOffice.org 1.0, one would have to set tab stops in the macro,
	' otherwise default tab stops of 1.25 cm would be used. Tab is Chr(09).
	oStyle.HeaderText.SetString("MyCompany header text - left" & Chr(09) _
		& "centered text")
	' add some fields
	oCursor = oStyle.HeaderText.createTextCursor()
	oCursor.GotoEnd(false)
	oStyle.HeaderText.insertString(oCursor, Chr(09) & "Page ", false)
	' insert "page number" field
	oField = oDoc.createInstance("com.sun.star.text.TextField.PageNumber")
	oField.NumberingType = 4 ' magic constant: 4=Arabic numbers
	oField.SubType = 1 ' another magic constant (use current page number)
	oStyle.HeaderText.insertTextContent(oCursor, oField, False)
	' more regular text
	oStyle.HeaderText.insertString(oCursor, " of ", false)
	' insert "total pages" field
	oField = oDoc.createInstance("com.sun.star.text.TextField.PageCount")
	oField.NumberingType = 4 ' as above
	oStyle.HeaderText.insertTextContent(oCursor, oField, False)
	' Now set some formatting (reuse cursor object)
	oCursor.gotoStart(false)
	oCursor.gotoEnd(true)
	oCursor.CharHeight = 6 ' font size in points
	oCursor.CharFontName = "Times New Roman" ' font name
	' set text locale to "no locale" (empty locale object) in order to turn off
	' any spell checking in the header
	Dim aLocale As New com.sun.star.lang.Locale
	' Uncomment the lines below to use a specific locale, e.g. US English, instead
	' aLocale.Language = "en"
	' aLocale.Country = "US"
	oCursor.CharLocale = aLocale
End Sub
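
Footers use the same page style properties with "Footer" in place of "Header". The following is a minimal sketch under that assumption rather than a macro from the original migration; the name AddFooter and the footer text are placeholders.

Sub AddFooter(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	oStyle = oDoc.StyleFamilies.getByName("PageStyles").getByName("Default")
	' turn on footers for default page style
	oStyle.FooterIsOn = True
	oStyle.FooterIsShared = True
	oStyle.FooterHeight = 500 ' 0.5 cm - footer height
	oStyle.FooterBodyDistance = 0 ' 0 cm - distance from page text area to footer
	' decrease the bottom margin by 0.5 cm (the footer's height)
	oStyle.BottomMargin = oStyle.BottomMargin - 500
	' placeholder text; fields can be added as in AddHeader
	oStyle.FooterText.setString("MyCompany footer text")
End Sub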

Setting text alignment

While left, right, and centered text alignment were translated correctly, justified
paragraphs were converted to left-aligned text. Let's consider an imaginary document
template that consists of a centered header and justified text
below. Let's suppose the document header contains the company's name and address
and its last line is the phone number. The macro should set only the document text's
alignment to justified, while leaving the header as is. It should also be able to
tell where the header ends even if the text contains an extra space or the
phone number has been changed.

One solution is to use a regular expression to find the last line of the document
header (the one containing the phone number), then loop over all paragraphs below,
setting paragraph alignment to justified:

Sub SetJustified(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	' Find the line with phone numbers (last line of header text)
	Descriptor = oDoc.createSearchDescriptor()
	Descriptor.SearchRegularExpression = true
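	' the hyphen is placed last in the character class so that it is
	' taken literally and not as part of a range
	Descriptor.SearchString = "^(([tT]el|[Ff]ax)[0-9, .:;()/+-]*)+$"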
	oCursor = oDoc.FindFirst(Descriptor)
	if isnull(oCursor) then exit sub ' not found
	' loop over all following paragraphs; gotoNextParagraph returns false
	' when there are no more paragraphs, so the header line itself is
	' never modified
	while oCursor.gotoNextParagraph(false)
		' Magic constant: 2 = justified
		oCursor.ParaAdjust = 2
	wend
End Sub
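
The numeric "magic constants" used in these macros have names in the OpenOffice.org API, and StarBasic can refer to them by their fully qualified names. The sketch below assumes your version of StarBasic resolves such names; JustifyCurrentParagraph is an illustrative name, not one of the original macros.

Sub JustifyCurrentParagraph(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	' create a text cursor at the position of the view cursor
	viewCursor = oDoc.currentController.getViewCursor()
	oCursor = oDoc.Text.createTextCursorByRange(viewCursor.getStart())
	' equivalent to oCursor.ParaAdjust = 2
	oCursor.ParaAdjust = com.sun.star.style.ParagraphAdjust.BLOCK
End Sub

In the same way, com.sun.star.style.NumberingType.ARABIC stands for 4, com.sun.star.text.PageNumberType.CURRENT for 1, and com.sun.star.style.TabAlign.RIGHT for 2.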

Setting tab stops

Setting non-standard tab stops is also easy. This macro creates a right tab stop
(with leading dots) and makes it the only tab stop defined for the current paragraph.

Sub SetTabStops(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	' create a text cursor (used for manipulating text) at the position of
	' view cursor (the one visible on screen)
	viewCursor = oDoc.currentController.getViewCursor()
	oCursor = oDoc.Text.createTextCursorByRange(viewCursor.getStart())
	' add tab stop
	Dim tabs as new com.sun.star.style.TabStop
	tabs.position = 25000 ' 25 cm from left - more than A4 page's width
	tabs.alignment = 2 ' magic constant: 2 = right tab
	tabs.FillChar = Asc(".")
	' tab stops need to be put in an array - in this case it contains
	' only one element
	oCursor.ParaTabStops = Array(tabs)
End Sub
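
Several tab stops can be set by putting more than one element in the array. As a sketch, the macro below defines a centered and a right tab stop, similar to the ones OpenOffice.org 1.1 creates in new headers. The name SetTwoTabStops and the positions are assumptions: they suit a text area 17 cm wide (A4 with 2 cm left and right margins).

Sub SetTwoTabStops(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	viewCursor = oDoc.currentController.getViewCursor()
	oCursor = oDoc.Text.createTextCursorByRange(viewCursor.getStart())
	' centered tab stop in the middle of the text area
	Dim centerTab as new com.sun.star.style.TabStop
	centerTab.position = 8500 ' 8.5 cm from left
	centerTab.alignment = 1 ' magic constant: 1 = centered tab
	centerTab.FillChar = Asc(" ")
	' right tab stop at the right edge of the text area
	Dim rightTab as new com.sun.star.style.TabStop
	rightTab.position = 17000 ' 17 cm from left
	rightTab.alignment = 2 ' magic constant: 2 = right tab
	rightTab.FillChar = Asc(" ")
	oCursor.ParaTabStops = Array(centerTab, rightTab)
End Sub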

Adding horizontal lines

In order to add a horizontal line below a paragraph of text, one needs to set
the paragraph's border style for the bottom border to a solid line. That's
exactly what the macro below does to the current paragraph.

Sub AddLine(optional doc)
	oDoc = IIf(IsMissing(doc), ThisComponent, doc)
	' create text cursor as in previous example
	viewCursor = oDoc.currentController.getViewCursor()
	oCursor = oDoc.Text.createTextCursorByRange(viewCursor.getStart())
	' create a line object
	Dim lHor as New com.sun.star.table.BorderLine
	lHor.OuterLineWidth = 35
	lHor.LineDistance = 0
	' and assign it to the paragraph
	oCursor.BottomBorder = lHor
	oCursor.BottomBorderDistance = 0
	oCursor.TopBorderDistance = 0
	oCursor.LeftBorderDistance = 0
	oCursor.RightBorderDistance = 0
End Sub

Modifying multiple documents

Of course, if we want to use macros for fixing documents broken by converting
from another format, we must be able to apply a macro to many
documents easily. The macro below reads a list of file names from a text file
(which can be generated easily using find) and runs a few macros
on each of those files:

Sub ApplyMacroToFiles()
	Dim fileName as String
	fileName = "/home/user/file-list.txt"
	Dim iNum as integer
	Dim currentFile as String
	Dim oDoc as Object

	If FileExists(fileName) Then
		ON ERROR GOTO FileError
		iNum = FreeFile
		OPEN fileName for input as #iNum
		while not EOF(iNum)
			' read in file name
			LINE INPUT #iNum, currentFile
			' open document in OpenOffice
			Dim NoArgs() ' empty array
			oDoc = StarDesktop.loadComponentFromURL(ConvertToURL(currentFile), _
				"_blank",0,NoArgs())
			if not isnull(oDoc) then
				' do something
				AddHeader(oDoc)
				SetJustified(oDoc)
				' save and close
				oDoc.Store
				oDoc.Dispose
			end if
		Wend
		CLOSE #iNum
FileError:
		If Err <> 0 Then
			Msgbox("Error processing " & currentFile, 64, "Error")
			exit sub
		End If
	Else
		Msgbox("File " & fileName & " does not exist.", 64, "Error")
		exit sub
	End If
End Sub
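
If all documents sit in a single directory, the list file can be avoided by letting StarBasic enumerate the files itself. The following is a sketch of that alternative, not the approach used in the migration: the directory name is hypothetical, subdirectories are not visited, and it assumes the Dir function accepts a wildcard pattern on your platform.

Sub ApplyMacroToDirectory()
	Dim dirName as String
	dirName = "/home/user/converted/" ' hypothetical directory
	Dim currentFile as String
	Dim oDoc as Object
	Dim NoArgs() ' empty array
	' first matching file name; an empty string means no (more) files
	currentFile = Dir(dirName & "*.sxw", 0)
	while currentFile <> ""
		oDoc = StarDesktop.loadComponentFromURL( _
			ConvertToURL(dirName & currentFile), "_blank", 0, NoArgs())
		if not isnull(oDoc) then
			' do something
			SetPageSizeA4(oDoc)
			' save and close
			oDoc.Store
			oDoc.Dispose
		end if
		' next matching file name
		currentFile = Dir
	wend
End Sub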

Adding macros to OpenOffice menus

Using the Tools->Configure... dialog box, you can add items that run macros
to OpenOffice.org menus. The name of the menu item added
is the name of the subroutine which is executed -- which doesn't look very good
and can be confusing. Fortunately, you can change menu item names to arbitrary
text by editing XML configuration files.

The menu layout for Writer is stored in
~/OpenOffice.orgN.N.N/user/config/soffice.cfg/writermenubar.xml.
After creating menu items for your macros using the Configure... dialog, find the
corresponding menu items in the above-mentioned file and change their
menu:label properties to more human-readable strings.
All strings must use UTF-8 encoding. Note that this file should not be edited
while OpenOffice.org is running -- on exiting, OpenOffice.org will overwrite the modified
menu layout with its previous version. Also, if you use the OpenOffice.org Quickstarter,
you will need to close it and restart OpenOffice.org in order for changes in the
XML file to take effect.

Conclusion

StarBasic macros saved the day by providing a way of automatically modifying
multiple documents and fixing formatting lost during conversion of files from
another format. Thanks to the macros, deficiencies in the converting utility didn't
ruin the whole migration plan. If the migration took place today, there would
be less need for correcting errors in formatting, thanks to the advancement of
wpd2sxw.

Of course, OpenOffice.org macros aren't limited to just dealing with
text formatting. They are very helpful in customizing the office suite. Unfortunately,
sometimes using macros is also necessary for adding functionality many other
packages offer built-in (e.g. word-count statistics and binding characters to custom
keys). Macros also allow automating frequently performed actions. The basics of the macro language are
easy to learn, and with all the resources and examples available on the Web,
writing your own macros isn't very hard.

Michał Kosmulski is a student at Warsaw University
and Warsaw University of Technology.

An updated version of this article is available at the author's Web site.
