February 23, 2005

Translating With OmegaT

Author: Dmitri Popov

Although computers have yet to take over the business of
language translation (if they ever will!), they have become a common part of the translation process. Many professional translators use
computer-assisted translation (CAT) tools such as TRADOS,
Déjà Vu, and WordFast. A lesser-known but excellent open source CAT application called OmegaT can help as well.

Before you begin exploring OmegaT yourself, you should
understand how it, or any CAT tool, works.
OmegaT is a so-called translation memory application; that is, it
doesn't translate texts for you. Instead, it stores pieces of text
(called 'segments') and their corresponding translations in a file
called a 'translation memory.' During translation, OmegaT divides the
source text into segments. When you select a segment for translation, OmegaT scans the translation memory for possible matches and displays any it finds in a separate window. You can then insert the closest matching translation into the text.
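
To make the idea concrete, here is a minimal Python sketch of a translation memory. It is not OmegaT's code: the blank-line segmentation rule, the function names, and the sample sentences are all assumptions made for illustration.

```python
import re

def split_into_segments(text):
    """Split text into paragraph-level segments (blank-line separated).

    Assumption: one paragraph is one segment. Real CAT tools apply
    their own segmentation rules.
    """
    parts = re.split(r"\n\s*\n", text.strip())
    # Collapse line breaks inside a paragraph into single spaces.
    return [" ".join(p.split()) for p in parts if p.strip()]

# A translation memory modelled as a plain mapping:
# source segment -> stored translation (sample data, not from OmegaT).
memory = {
    "Good morning.": "Bonjour.",
    "Thank you for your order.": "Merci pour votre commande.",
}

def lookup(segment, tm):
    """Return the stored translation for an identical segment, or None."""
    return tm.get(segment)

document = "Good morning.\n\nThank you for\nyour order.\n\nSee you soon."
segments = split_into_segments(document)
for seg in segments:
    print(seg, "->", lookup(seg, memory))
```

Segments that are already in the memory come back with their stored translation; anything new comes back as None and has to be translated by hand.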

There are two types of matches: exact and fuzzy. If the segment in the
current text is identical to the one stored in the translation memory,
you have an exact match. In the real world, however, you rarely have
exact matches -- some words or forms in a selected segment vary from the
segment in the translation memory. Luckily, OmegaT supports partial
matches, which in translation lingo are called 'fuzzy matches.' This
means that OmegaT can find segments in the translation memory that are
not identical to, but are similar to, the one in the current text.
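
The article does not say how OmegaT scores similarity, so the following Python sketch swaps in the standard library's difflib.SequenceMatcher purely as a stand-in. The 0.6 threshold, the limit of five results, and the sample memory are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a score between 0.0 and 1.0; 1.0 means identical text
    (ignoring case). This is difflib's ratio, not OmegaT's own metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_matches(segment, tm, threshold=0.6, limit=5):
    """Return up to `limit` (score, source, translation) tuples,
    best match first, keeping only scores at or above `threshold`."""
    scored = [(similarity(segment, src), src, tgt) for src, tgt in tm.items()]
    scored = [m for m in scored if m[0] >= threshold]
    scored.sort(key=lambda m: m[0], reverse=True)
    return scored[:limit]

# Sample memory (hypothetical data).
memory = {
    "The red balloon is large.": "Le ballon rouge est grand.",
    "Please send the invoice.": "Veuillez envoyer la facture.",
}

for score, src, tgt in fuzzy_matches("The red balloons are large.", memory):
    print(f"{score:.2f}  {src}  ->  {tgt}")
```

An exact match is simply the special case where the score is 1.0; everything below that, down to the threshold, is a fuzzy match that the translator has to review and adjust.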

Before OmegaT can be really useful, you
have to use it for some time in order to build up a usable translation
memory. The good news is that OmegaT works with translation memories in
TMX (Translation Memory eXchange) format, which is supported by almost
every CAT tool on the market, so you can easily use existing translation
memories and exchange memories with other users.
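
Because TMX is plain XML, a translation memory can be inspected with a few lines of code. The Python sketch below reads a tiny hand-written sample; the sample file and the read_tmx helper are illustrative assumptions, not output from OmegaT. It accepts both the xml:lang attribute and the older lang attribute on each translation unit variant, since TMX versions differ on this point.

```python
import xml.etree.ElementTree as ET

# Fully qualified name of the xml:lang attribute as ElementTree reports it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A hand-written sample, not a file produced by OmegaT.
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="0" segtype="paragraph"
          o-tmf="none" adminlang="EN-US" srclang="FR-FR" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="FR-FR"><seg>Bonjour le monde.</seg></tuv>
      <tuv xml:lang="EN-US"><seg>Hello, world.</seg></tuv>
    </tu>
  </body>
</tmx>
"""

def read_tmx(data, source_lang, target_lang):
    """Return {source segment: target segment} for one language pair.

    `data` is the TMX document as bytes. Language codes are compared
    case-insensitively.
    """
    root = ET.fromstring(data)
    pairs = {}
    for tu in root.iter("tu"):
        by_lang = {}
        for tuv in tu.findall("tuv"):
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None:
                # itertext() also picks up text nested in inline tags.
                by_lang[lang.upper()] = "".join(seg.itertext())
        src = by_lang.get(source_lang.upper())
        tgt = by_lang.get(target_lang.upper())
        if src is not None and tgt is not None:
            pairs[src] = tgt
    return pairs

print(read_tmx(SAMPLE_TMX.encode("utf-8"), "FR-FR", "EN-US"))
```

Each tu element is one translation unit, and each tuv inside it holds the same segment in one language, which is why a memory built in one tool can be reused in another.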

OmegaT advantages

While OmegaT lacks some of the advanced features of commercial
packages, it has quite a few strong points.

  • OmegaT is a Java-based application, which means that it can run
    on Windows, Linux, and Mac OS X. Most commercial CAT tools are
    available for Windows only.
  • OmegaT can work with multiple translation memories. You can
    easily combine several translation memories and use them for a
    particular project.
  • Tools such as TRADOS or WordFast are tied to Microsoft Word, meaning
    if you want to use them you have to use Word. OmegaT is a standalone
    application, which doesn't dictate which text processor to use
    (although OpenOffice.org significantly increases its usefulness).
  • OmegaT is distributed under the GPL, free of charge. Since many CAT
    tools are aimed at professional translators, they tend to be expensive.

Installing and using OmegaT

OmegaT requires the free Java Runtime
Environment (JRE), so make sure it is installed on your computer first. Once the JRE is in place, download OmegaT and install it.

Now it's time to create your first translation project:

  1. Launch OmegaT.
  2. Create a new project by selecting File > Create New Project
    from the menu bar.
  3. Choose a directory for your project, give it a name, and save it.
  4. You will then see a dialogue window that allows
    you to specify directories and language settings for the project. Use
    the default folder paths unless you have a very good reason to change
    them. Enter the language and locale codes for your source and target
    languages -- for example, FR-FR for French (France) and EN-US for English
    (United States). Press OK when done. (The actual codes are not particularly important unless you
    intend to exchange translation memories with other users. In that case
    you should use the ISO language codes.)
  5. OmegaT will create a project folder (also called 'project root
    directory') containing five subdirectories: /glossary, /source,
    /omegat, /target, and /tm.
  6. Quit OmegaT by choosing File > Quit.

Now you have to add the documents you want to translate (also called
'source documents') to your project. OmegaT supports plain text, HTML,
XHTML, StarOffice, and OpenOffice.org formats (including Writer, Calc,
and Impress). To add source files in these formats to the project, put
them into the /source folder.
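
If you add files to projects often, the copying step is easy to script. The Python sketch below assumes the five-folder layout described above; it does not create a project (OmegaT does that), and the helper names, the sample project, and the sample file are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

# The five subdirectories the article says OmegaT creates.
PROJECT_SUBDIRS = ("glossary", "source", "omegat", "target", "tm")

def missing_subdirs(project_root):
    """Return the expected subdirectories that do not exist yet."""
    root = Path(project_root)
    return [name for name in PROJECT_SUBDIRS if not (root / name).is_dir()]

def add_source_documents(project_root, documents):
    """Copy documents into the project's source folder and return the
    paths of the copies."""
    source_dir = Path(project_root) / "source"
    if not source_dir.is_dir():
        raise FileNotFoundError(f"no source folder in {project_root}")
    copied = []
    for doc in documents:
        dest = source_dir / Path(doc).name
        shutil.copy2(doc, dest)
        copied.append(dest)
    return copied

# Demonstration in a throwaway directory that imitates a project folder.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my-project"
    for name in PROJECT_SUBDIRS:
        (project / name).mkdir(parents=True)
    letter = Path(tmp) / "letter.txt"
    letter.write_text("Bonjour le monde.\n", encoding="utf-8")
    print(missing_subdirs(project))
    print([p.name for p in add_source_documents(project, [letter])])
```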

The good thing about OmegaT is that you can work with several documents
in different formats within one project; that is, one project can
contain Writer documents, HTML pages, Impress presentations, and so on.
Although OmegaT cannot work with Microsoft Office documents directly,
you can convert them into OpenOffice.org formats, translate them, and
then convert them back to the original formats.

Now that the project is populated with the
necessary files, you are ready to do some translation work. Launch OmegaT, and choose File > Open. Point to the project folder and double-click on the omegat.project file inside it. If your project
contains more than one source file, OmegaT opens the Project Files
window, where you can choose the document you want. In the Project
Files window you can also see the number of segments in each document,
which can come in handy when you have to estimate the amount of work.

Once you have opened a document you can begin translating it. The translation process using OmegaT is straightforward:

  1. Place the cursor in the target field of the first segment,
    between the tags <segment 0001> and <end segment>.
  2. Type in your translation and delete the original text. Press
    Enter to confirm the translation and to jump to the next segment.
    Repeat this process with each segment until you have translated all of
    the text.
  3. Select File > Save, then File > Compile. The Compile command
    does two important things: it creates a translated version of the source
    document (target text), and generates a translation memory.
  4. Quit OmegaT by choosing File > Quit.

You will find your translated file in the /target folder, and the
updated (or new) translation memory in the /tm folder. The translation memory can then be used with any other translation project.

Using OmegaT with existing translation memories

If you translate a text using an existing translation memory, OmegaT
will display possible matches for the active segment in the upper part
of the Match and Glossary Viewer window. OmegaT can display up to five
fuzzy matches; pick the closest one with the
Select Fuzzy Match # command in the Edit menu. You can then paste the selected match into the active segment using Edit > Insert Translation
(inserts the match at the cursor position) or Edit > Overwrite
Translation (substitutes it for the active segment text).

OmegaT also allows you to search translation memories and project
files. Choose Edit > Search
Translation Memory to open the Search dialogue window, enter the word
or phrase you wish to search for in the Search for field, and press

OmegaT supports keyword and exact searches. Keyword searches find text
fragments containing all the search words, similar to a search
using the AND operator (red AND balloons). Keyword searches can only
find whole words. An exact search finds text fragments that contain the
exact matches of the search term, which can be one word or a phrase. Exact searches can search source and target segments in the current project, translation memories (files in
/tm), and any file in a format OmegaT can read, in any selected
directory and subdirectories.
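
The difference between the two search modes is easiest to see side by side. The Python sketch below is only an illustration of the behaviour described above, not OmegaT's implementation; ignoring letter case is an assumption, as are the function names and sample segments.

```python
import re

def words(text):
    """Return the set of whole words in text, lowercased."""
    return set(re.findall(r"\w+", text.lower()))

def keyword_search(query, segments):
    """Find segments containing ALL query words, in any order.
    Only whole words count, so 'balloon' does not match 'balloons'."""
    wanted = words(query)
    return [s for s in segments if wanted and wanted <= words(s)]

def exact_search(query, segments):
    """Find segments containing the query exactly as typed
    (a single word or a whole phrase)."""
    needle = query.lower()
    return [s for s in segments if needle and needle in s.lower()]

# Sample segments (hypothetical data).
segments = [
    "Red balloons are fun.",
    "The balloons are red and blue.",
    "A red balloon.",
]

print(keyword_search("red balloons", segments))
print(exact_search("red balloons", segments))
```

With this sample data the keyword search returns the first two segments, because both contain the words 'red' and 'balloons' somewhere, while the exact search returns only the first, where the phrase appears as typed.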

We've just touched upon the basics of OmegaT, and there is much more to
the application than meets the eye. If you want to get the most out of
using OmegaT, be sure to read its documentation. Even if
you only need to translate a business letter or a product sheet every
now and then, you can benefit from using OmegaT.

Dmitri Popov is a freelance
contributor and an avid OpenOffice.org user. His articles have appeared
in Russian, British, and Danish computer magazines.
