February 15, 2006

ENLASO releases open source localization tools for Okapi

Author: Corinne McKay

The growing array of open source tools available to translators and localization professionals got a little bigger recently, when ENLASO Corp. began to port a set of its localization tools to the open source Okapi Framework, which is released under the GNU Lesser General Public License (LGPL).

Okapi, which runs on Windows .Net but not on Mono, includes three applications: Rainbow, a .Net application that provides a graphical user interface to Okapi's utilities and filters; Olifant, a .Net application for creating and managing translation memories; and Tikal, a .Net console application for executing utilities from a DOS command line. Okapi was originally available as freeware, and made the move to open source in late 2005.

Yves Savourel, the Okapi Framework's lead developer, identifies three main parts that make up the Okapi Framework: interface specifications, components, and applications. At present, all of Okapi's implementations are written in C#. In addition to the Rainbow, Olifant, and Tikal applications, Savourel says Okapi includes useful utilities such as a text extraction utility that pulls translatable text out of input files into formats such as RTF or XLIFF, a text merging utility that returns XLIFF-formatted files to their original format, and a text rewriting utility that extracts and merges translatable text in one step.

"For example," Savourel says, "you can pseudo-translate the text, or remove it all (so you can compare two files with just the codes). All utilities are fed from filters. Currently there are five filters implemented: for PO files, for Properties files, for .Net resources, for Wordfast TM, and for Trados Text TM. All that is for the last release, but we have already more filters and utilities under development for the next one."
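To make the pseudo-translation idea concrete, here is a minimal sketch in Python. It is not Okapi's code (Okapi is written in C#) and does not use any Okapi API; the function names, the accent mapping, and the set of protected inline codes are all assumptions chosen for illustration. It rewrites the values of a Properties-style file while leaving keys, comments, and placeholder codes untouched.

```python
import re

# Hypothetical mapping for illustration; any visibly different characters would do.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîõüÀÉÎÕÜ")

# Inline codes to protect (an assumed, deliberately small set):
# {0}-style placeholders and printf-style %s / %d.
CODE = re.compile(r"(\{\d+\}|%[sd])")


def pseudo_translate(text):
    """Accent the translatable runs of text, leaving inline codes untouched."""
    parts = CODE.split(text)
    # With one capture group, re.split places the matched codes at odd indices.
    return "".join(
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    )


def pseudo_translate_properties(lines):
    """Apply pseudo_translate to the value of each key=value line."""
    out = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "!")) or "=" not in line:
            out.append(line)  # blank lines, comments, and non-entries pass through
            continue
        key, value = line.split("=", 1)
        out.append(key + "=" + pseudo_translate(value))
    return out


if __name__ == "__main__":
    sample = ["# main window", "open.label=Open file {0}", "count.label=%d items"]
    for line in pseudo_translate_properties(sample):
        print(line)
```

A real filter would also have to handle escapes, line continuations, and the ":" separator that Properties files allow; the sketch skips all of that to keep the rewrite step itself visible.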

Savourel says ENLASO's decision to offer Okapi under an open source license "provides us with a more diverse and larger set of testers, allows others to participate and work on parts of the toolset which are maybe less urgent for us and for which we have less time to work on, and ensures better continuity in the development process. If people working on the tools leave the company, they can still continue their participation afterward. When needed, Okapi allows us to provide our customers with solutions based on non-proprietary software."

Other translation and localization tools have made the move from proprietary licenses to open source. For example, Lionbridge released the code to ForeignDesk in November 2001, and it has enjoyed a great deal of user enthusiasm -- but not a great deal of development activity. However, Savourel says he is confident that development on the Okapi Framework will continue at a healthy pace.

New features, such as a Machine Translation query interface using Google's translation engine, have already been added to the latest release. "People are understandably cautious about offering to help out," Savourel says. "We have to prove that our framework is not just a way to use open source as a marketing tool, and that's fine."

In future releases of the framework, Savourel says he hopes to see features such as a segmentation component that supports Segmentation Rules eXchange (SRX) and is compatible with the segmentation rules used by major commercial translation tools, a regular expressions-based script filter, "and many more little and big ideas." Savourel also notes that users and developers won't have to wait as long to have access to new features.

"One of the perks of open source tools is that you don't have to wait for releases to see them. You can always download the full source code of the project and get a look at the latest 'development' version."

Translation memory creation and management is big business, and the Okapi framework's release comes at an interesting time in the translation technology industry. Last summer, SDL, a translation tool vendor as well as a translation and localization provider, purchased Trados, the market-leading translation memory software provider, and announced plans to merge its own tool, SDLX, and Trados into one application in the future.

Many translation technology watchers saw the purchase as potentially opening the field to smaller translation software providers. Freelance translators have for years lamented the relatively high cost of these tools ($695-$895 for freelance editions), and the as-yet-undefined tool that will result from the union of Trados and SDLX is making some translators wary of investing in either one.

According to the Okapi site, the Okapi toolkit isn't meant to replace commercial translation or localization software. "Rather, this framework aims simply at helping localization-related applications to thrive in an environment where they can interact, and where the users can choose the most appropriate tool for the task at hand." However, Savourel theorizes that its various components appeal to a wide range of users, and since Okapi is licensed under the GNU LGPL, it can be used to create commercial software; ENLASO itself has kept open the possibility of developing proprietary tools that use Okapi.

"An application like Olifant (a TM manager) is more oriented toward translators, project managers and to some degree localization engineers. The low-level components like the filters are more for power-users or engineers, people who write scripts to automate some of the localization process or testing and QA. And obviously any part of the Framework can also be reused in large applications, so developers can also find something to utilize. Utilities like the Encoding Conversion can be handy for just about anyone, even outside the localization and translation industry."

Now that the Okapi Framework is available to the public under an open source license, Savourel hopes to "get some help with developing, testing, and documenting. This will allow us to go a little further with the tools, to provide new functionalities that we may not have been able to tackle alone. There are a lot of things to do before we run out of work."

