September 13, 2006

OOoBasic crash course: Replacement therapy

Author: Dmitri Popov

In a perfect world everyone would write in standard English and all publications would use a universal style guide. In the real world, however, you have to deal with different versions of English (British, American, Australian, etc.), and every publication has its own set of writing guidelines. If you write for several markets, things can get pretty complicated. But instead of wasting time on language idiosyncrasies, you can let an OOoBasic macro do the donkey work. Let's create a macro that converts from British English to US English. You can easily modify it later for other text conversion purposes.

The building block of this macro is a "replacement" object (a replace descriptor, created with the document's createReplaceDescriptor method), which has the properties necessary to perform the replacement operation. If you assign such an object to the oSearch variable, the properties to set are oSearch.SearchString, oSearch.ReplaceString, and oSearch.SearchWords. The first two define the search and replacement strings, while the last one specifies whether the search should find only whole words. To see how this works in practice, let's create a simple macro that replaces all occurrences of the word 'colour' with its US English equivalent 'color':

  Sub ColourToColor()
  Dim oDoc As Object
  Dim oSearch As Object, nTimes As Long
  oDoc = ThisComponent
  oSearch = oDoc.createReplaceDescriptor
  with oSearch
    .SearchString = "colour"
    .ReplaceString = "color"
    .SearchWords = true
  end with
  nTimes = oDoc.replaceAll(oSearch)
  End Sub

As you may notice, the oSearch object's properties are set using the With statement, which lets you set several properties of the same object in a compact and elegant form. And the entire replacement magic is done using just a single line of code: nTimes = oDoc.replaceAll(oSearch). The replaceAll method returns the number of replacements it made, which the macro stores in the nTimes variable.
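The macro above stores the result of replaceAll in nTimes but never shows it. As a minimal sketch, the variant below reports that number in a message box; the macro name ColourToColorWithCount and the message text are made up for illustration.

```basic
Sub ColourToColorWithCount()
  Dim oDoc As Object
  Dim oSearch As Object
  Dim nTimes As Long
  oDoc = ThisComponent
  oSearch = oDoc.createReplaceDescriptor
  With oSearch
    .SearchString = "colour"
    .ReplaceString = "color"
    .SearchWords = True   ' whole words only
  End With
  ' replaceAll returns the number of replacements it made
  nTimes = oDoc.replaceAll(oSearch)
  MsgBox "Replaced " & nTimes & " occurrence(s) of 'colour'."
End Sub
```

Running it on a document that no longer contains 'colour' should simply report zero replacements, which makes it a quick way to check that the macro has already been applied.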

The next step is to tweak the macro so it can replace not just a single word, but multiple words. One way to do this is to use arrays. For this macro, we need two arrays: the UK array containing a list of British words, and the US array with a list of their US equivalents. Here is an improved version of the macro:

  Sub BritishToUS()
  Dim oDoc As Object
  Dim oSearch As Object, nTimes As Long
  Dim UK As Variant, US As Variant, I As Long
  ' Parallel lists: UK(I) is replaced with US(I)
  UK = Array("colour","favourite", "website", "ise")
  US = Array("color", "favorite", "Web site", "ize")
  oDoc = ThisComponent
  oSearch = oDoc.createReplaceDescriptor
  ' Run one replacement pass per word pair
  For I = lBound(UK()) to uBound(UK())
   with oSearch
    .SearchString = UK(I)
    .ReplaceString = US(I)
    ' Match parts of words too, so the "ise" ending is found
    .SearchWords = false
   end with
   nTimes = oDoc.replaceAll(oSearch)
  Next I
  MsgBox ("All done!")
  End Sub

Several things in this code deserve closer examination. First of all, the macro uses a For...Next loop, which repeats a block of code once for each value of a counter. In this case, the loop runs the replacement operation once for every item in the UK array. Every array in OOoBasic has a lower and an upper bound, and the lBound and uBound functions return the lowest and the highest index of the array. The For I = lBound(UK()) to uBound(UK()) statement basically means that the replacement operation starts at the first item of the UK array and runs until it has processed the last one. The .SearchString is set to the current item of the UK array, while the .ReplaceString is set to the item at the same position in the US array, so both arrays must list their words in the same order. Notice also that the .SearchWords property is set to false: "ise" is only part of a word, and a whole-word search would never find the -ise endings.
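Because the loop pairs the two arrays by position, a missing entry in either list would shift every later replacement. The following is only a sketch of how lBound and uBound can be used to check the lists before any text is changed; the macro name CheckWordLists and the shortened word lists are made up for illustration.

```basic
Sub CheckWordLists()
  Dim UK As Variant, US As Variant
  Dim I As Long
  Dim sList As String
  UK = Array("colour", "favourite", "website")
  US = Array("color", "favorite", "Web site")
  ' Both lists must end at the same index, otherwise the pairs do not line up
  If UBound(UK()) <> UBound(US()) Then
    MsgBox "The UK and US lists have different lengths."
    Exit Sub
  End If
  ' Walk from the lowest to the highest index and collect the pairs
  For I = LBound(UK()) To UBound(UK())
    sList = sList & I & ": " & UK(I) & " -> " & US(I) & Chr(10)
  Next I
  MsgBox sList
End Sub
```

Arrays built with the Array function start at index 0, so the message box numbers the three pairs 0, 1, and 2.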

Speaking of endings, treating them correctly can be quite tricky. The problem is that there are words that contain the "ise" string in the middle ("bisect" is one such word), and the macro must leave these words untouched. A quick-and-dirty fix is to enable searching with regular expressions and specify new search strings. In this case, you want to replace only endings, that is, strings that sit at the very end of a word. In OpenOffice.org's regular expressions, \> matches the end of a word, so to find the -ise and -ised endings, use the ise\> and ised\> regular expressions. Here is the final version of the macro with regular expressions enabled (the .SearchRegularExpression property is set to true):

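  Sub BritishToUS()
  Dim oDoc As Object
  Dim oSearch As Object, nTimes As Long
  Dim UK As Variant, US As Variant, I As Long
  ' Parallel lists: UK(I) is a regular expression, US(I) its replacement
  ' \> matches the end of a word, so only endings are replaced
  UK = Array("colour","favourite", "website", "ise\>", "ised\>")
  US = Array("color", "favorite", "Web site", "ize", "ized")
  oDoc = ThisComponent
  oSearch = oDoc.createReplaceDescriptor
  ' Run one replacement pass per pair
  For I = lBound(UK()) to uBound(UK())
   with oSearch
    .SearchString = UK(I)
    .ReplaceString = US(I)
    .SearchWords = false
    ' Treat the search string as a regular expression
    .SearchRegularExpression = true
   end with
   nTimes = oDoc.replaceAll(oSearch)
  Next I
  MsgBox ("All done!")
  End Sub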
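Keep in mind that the ise\> expression also matches ordinary words that end in -ise, such as "wise" or "promise", so it can be worth checking how many matches a document contains before replacing anything. The sketch below assumes a search descriptor (createSearchDescriptor) combined with the document's findAll method, which finds text without changing it; the macro name CountIseEndings is made up for illustration.

```basic
Sub CountIseEndings()
  Dim oDoc As Object, oSearch As Object, oFound As Object
  oDoc = ThisComponent
  ' A search descriptor only finds text; it has no ReplaceString
  oSearch = oDoc.createSearchDescriptor
  With oSearch
    .SearchString = "ise\>"
    .SearchRegularExpression = True
    .SearchWords = False
  End With
  ' findAll returns a collection of all matching text ranges
  oFound = oDoc.findAll(oSearch)
  MsgBox "Found " & oFound.getCount() & " word(s) ending in -ise."
End Sub
```

If the count is higher than expected, you can step through the matches with the regular Find & Replace dialog before running the BritishToUS macro.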

Now you know how to perform the replacement magic and use regular expressions to solve complex replacement cases. That's all for today -- class dismissed.

Huge thanks to JohnV from OOoForum for helping the author with this macro.

Dmitri Popov is a freelance writer whose articles have appeared in Russian, British, German, and Danish computer magazines.
