Linux.com

Feature

A Festival of speech synthesis for Linux

By Rohit Girhotra on June 21, 2005 (8:00:00 AM)


As information technology becomes more pervasive, the issue of communication between information-processing machines and people becomes increasingly important. Up to now such communication has been almost entirely by means of video screens. Speech, which is by far the most widely used and natural means of communication between people, is an obvious possible substitute. However, this deceptively simple means of exchanging information is, in fact, extremely complicated. The Festival Speech Synthesis System aims to make things a little easier on interface developers.

Speech synthesis -- automatic generation of human speech waveforms without directly using a human voice -- has been under development for decades. Speech synthesizers, often called text-to-speech (TTS) synthesizer systems, can be implemented in either software or hardware. The first commercial speech synthesis systems were mostly hardware-based, and their development process was time-consuming and expensive. Since computers have become more powerful, most synthesizers today are software-based. Software-based systems are easy to configure and update, and much less expensive than their hardware counterparts.

You can find a wide array of software tools for speech synthesis, ranging from commercial products to software for download over the Internet, with varying kinds of licensing. Some commercially available TTS systems include:

Recently, the speech research community has been turning toward open source software, as exemplified by toolkits such as CSLU toolkit, the ISIP Automatic Speech Recognition toolkit, and the Edinburgh speech tools, all of which can help your computer find its voice.

There are many advantages to using open source software for research work. Frequently a researcher is faced with a tool that almost does the task at hand, but needs some tweaking. Having access to the source code allows the researcher, at least in theory, to make the needed modifications. But mere openness is not a guarantee of flexibility. In order for a tool to be flexible, it must have well-defined programming interfaces -- otherwise, extensions and modifications will be hard to develop and maintain -- and it must be interoperable with other tools.

Festival Speech Synthesis System is one such tool. Festival grew out of the need for a unifying, flexible, and extensible tool for research and educational purposes at The Centre for Speech Technology Research (CSTR) at University of Edinburgh.

Festival is a free, portable, extensible, language-independent, run-time speech synthesis engine for various platforms that has been under development since 1999. Primary authors of the C++ system include Alan W Black, Paul Taylor, and Richard Caley. Festival is a part of the Festvox project that aims to make the building of new synthetic voices more systematic and better documented, making it possible for anyone to build a new voice.

Festival offers developers a basic framework for building speech synthesis systems, and includes various demo modules. It offers text-to-speech through a number of APIs: from shell level, through a Scheme command interpreter, as a C++ library, from Java, and even via an Emacs interface. Though Festival is multi-lingual (currently English, Welsh, and Spanish), support for English is the most advanced. The system uses Edinburgh Speech Tools for its underlying architecture and has a Scheme-based (SIOD) command interpreter for control.

The Festival Speech Synthesis System was designed to target three classes of speech synthesis users:

  • Speech synthesis researchers, who may use Festival for developing and testing new speech synthesis methods;
  • Speech application developers, who are developing language systems and wish to include synthesis output, such as different voices, specific phrasing, and dialog types; and
  • End-users, with systems that take text and generate speech, requiring little configuration from users.

Taking stock of the Festival Speech Synthesis System

Want to try Festival version 1.95 on your Linux system? First, ensure that you have the latest working version of the C++ (gcc) compiler installed on your system. Most of the problems people have had in installing Festival have been due to an incomplete or bad compiler installation. Also make sure that your sound card is configured and working correctly.

To install Festival you will need to download the following source packages from the Festival download page:

  • festival-1.95-beta.tar.gz -- The core festival package
  • speech_tools-1.2.95-beta.tar.gz -- The Edinburgh Speech tools library
  • festlex_OALD.tar.gz -- The lexicon distribution
  • festlex_POSLEX.tar.gz -- The lexicon distribution
  • festvox_rablpc16k.tar.gz -- The speech database

You will find several other packages available for download at the Web site, but you won't need them unless you wish to add support for more voices to the basic TTS system.

Having downloaded all the above packages, log in as root, change to the directory where you downloaded the packages, and issue the command tar -xvzf package_name for each one to unpack it. After the unpacking, your current directory will contain the subdirectories speech_tools/ and festival/.
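
If you would rather not type the tar command once per archive, the unpacking step can be scripted. The following is only a minimal sketch: unpack_all is a hypothetical helper name, not something shipped with Festival, and it assumes that all the downloaded .tar.gz files sit in a single directory.

```shell
# unpack_all DIR: extract every .tar.gz archive found in DIR, in place.
# Hypothetical helper for illustration; not part of the Festival distribution.
unpack_all() {
    dir=${1:-.}
    for archive in "$dir"/*.tar.gz; do
        # If nothing matched, the glob stays literal; skip it.
        [ -e "$archive" ] || continue
        echo "Unpacking $archive"
        tar -xzf "$archive" -C "$dir" || return 1
    done
}
```

Calling unpack_all with no argument works on the current directory, and the function stops at the first archive that fails to extract.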

Next, you need to compile the source files. Change to the speech_tools directory and issue the commands:

./configure
make

Then change to the festival directory and issue the same commands.
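
The two builds can also be run in one go. This is a rough sketch rather than an official build script: build_all is a made-up name, and the only things it assumes are the speech_tools/ and festival/ directory names and the ./configure and make steps described above.

```shell
# build_all TOPDIR: run ./configure and make in speech_tools, then in festival,
# in the same order as the manual steps. Hypothetical helper for illustration.
build_all() {
    top=${1:-.}
    for pkg in speech_tools festival; do
        echo "Building $pkg"
        # The subshell keeps the cd from affecting the caller's directory.
        ( cd "$top/$pkg" && ./configure && make ) || {
            echo "Build failed in $pkg" >&2
            return 1
        }
    done
}
```

Stopping at the first failure matters here, since a broken speech_tools build is worth fixing before you spend time compiling Festival itself.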

That's it! The Festival Speech Synthesis System is now installed on your Linux box.

Using Festival

There are various ways you can use Festival. To get into the Interactive Festival Console, type festival at the shell prompt. You should find yourself at a prompt like the one below:

festival>

Your speech synthesis system is now ready to accept input. To get your system to talk to you, try out the following command:

festival> (SayText "type the text you want to hear over here")

The parentheses are required here, and the text to be spoken must be enclosed in double quotes.
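
Because the text has to sit inside double quotes, any double quote in the text itself needs escaping before it reaches Festival. The sketch below builds a SayText command from arbitrary text at the shell level. The names say_expr and say are hypothetical, and the sketch assumes that Festival's Scheme reader accepts backslash-escaped quotes and that the festival binary on your PATH supports reading commands from standard input with --pipe.

```shell
# say_expr TEXT: print a Festival (SayText "...") command for TEXT,
# backslash-escaping any backslashes and double quotes inside it.
say_expr() {
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '(SayText "%s")\n' "$escaped"
}

# say TEXT: send the generated command to Festival on standard input.
# Assumes festival is on PATH and that its --pipe mode is available.
say() {
    say_expr "$1" | festival --pipe
}
```

For example, say 'He said "hello" to me' would hand Festival a single well-formed SayText command instead of a string that ends early at the first inner quote.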

If you have a text file with something in it that you want to hear, use the command:

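festival> (tts "filename" nil)

Replace filename with the relative path to your file, keeping the double quotes, and make sure that the file is a plain ASCII text file. The second argument names a text mode; nil leaves that choice to Festival's defaults. You can use the Tab key here for automatic file name completion.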

You can also have Festival read a plain ASCII text file without entering the console at all, by calling it from the shell prompt:

festival --tts filename
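
Since the input is expected to be plain ASCII, it can be worth checking files before handing them over, especially when reading several in a row. This is a minimal sketch under that assumption: is_ascii and speak_files are hypothetical names, and the check simply looks for any byte above 0x7F.

```shell
# is_ascii FILE: succeed only if FILE contains no bytes above 0x7F.
is_ascii() {
    # Delete every 7-bit byte; anything left over is non-ASCII.
    leftover=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c)
    [ "$((leftover))" -eq 0 ]
}

# speak_files FILE...: read each plain ASCII file aloud, skipping the rest.
# Assumes the festival binary is on PATH.
speak_files() {
    for f in "$@"; do
        if is_ascii "$f"; then
            festival --tts "$f"
        else
            echo "Skipping $f: not plain ASCII" >&2
        fi
    done
}
```

Running speak_files chapter1.txt chapter2.txt reads the files one after another and reports any it skipped on standard error.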

For more information on using Festival, check out the man pages or type help at the festival prompt to see a list of useful commands. More documentation is also available in texinfo and HTML format on the project's site.

Flite and other open-source TTS alternatives

An alternative TTS engine is Flite (Festival-lite), a small, fast run-time binary speech synthesis engine. Flite was designed for embedded systems like PDAs as well as large server installations, which must serve synthesis to many ports. It was written in ANSI C, and is designed to be portable to almost any platform.

Other freely available open-source TTS systems are:

  • MBROLA, a freely available diphone concatenation system
  • Gnuspeech, an extensible TTS package based on real-time, articulatory, speech-synthesis-by-rules
  • FreeTTS, written entirely in Java, based upon Flite
  • Epos, a rule-driven TTS system primarily designed to serve as a research tool


Comments

on A Festival of speech synthesis for Linux

Note: Comments are owned by the poster. We are not responsible for their content.

use a package

Posted by: JelleB on June 22, 2005 12:55 AM
All(most) linux distro's have been using a package manager for a very long time. Installing software outside of the package manager fucks up the system. Even if you have different needs than the packager, you can still use a package if you recompile it. On my debian testing system I have these hits on festival (I doubt it will be much different on other distros):

$>apt-cache search festival
eflite - Festival-Lite based emacspeak speech server
festival - general multi-lingual speech synthesis system
festival-dev - development kit for the Festival speech synthesis system
festival-doc - Documentation for Festival
festival-freebsoft-utils - Festival extensions and utilities
festlex-cmu - CMU dictionary for Festival
festlex-poslex - Part of speech lexicons and ngram from English
festvox-kallpc16k - American English male speaker for festival, 16khz sample rate
festvox-kallpc8k - American English male speaker for festival, 8khz sample rate
festvox-kdlpc16k - American English male speaker for festival, 16khz sample rate
festvox-kdlpc8k - American English male speaker for festival, 8khz sample rate

So the advice is simple: use your distro's package manager. But that leaves less page filler for the author, so we'll keep seeing this ./configure make make install advice some more probably.

#

Re:use a package

Posted by: Anonymous Coward on June 22, 2005 04:47 PM
All(most) linux distro's have been using a package manager for a very long time. Installing software outside of the package manager fucks up the system.

Can you elaborate on this? This is a rather questionable claim, based on my experience.

#

Re:use a package

Posted by: Anonymous Coward on June 24, 2005 06:23 AM

Can you elaborate on this? This is a rather questionable claim, based on my experience.



I'm not the parent poster. I'm a software developer and install non-packaged software all the time. Having said that I do prefer well-built (not half-assed) packages with accurate dependencies on my current distribution of choice (Debian/Ubuntu).



The main reason many people like packages is that installing non-packaged software, particularly large complex software like Festival, is dangerous.



Because non-packaged software is, almost by definition, non-standard for your system there is a much higher risk that files will get written to the wrong place, important files will get overwritten, the installation will be faulty and/or it will be painful and time consuming to uninstall.



In the case of Festival problems might be hooking into the audio device (/dev/audio, /dev/audio0, /dev/dsp, symlinks in /dev/, esd, alsa, what?), where it is installed (/opt, /usr/local/, ~/, some disk with no space, what?), where the documentation is put (/usr/share/man, /usr/share/doc, /usr/info, /opt/doc, what?) whether the install is root or per-user, whether there are library version interdependencies, whether a system daemon with security implications is installed, where config files are put, where error log messages end up etc. etc.



When a software release is packaged all these questions are likely to have been reviewed by the package creator and reasonable defaults for the particular platform chosen. A package manager like Synaptic allows software and its dependencies to be installed and uninstalled in seconds, non-packaged software is simply more variable, risky and potentially time consuming.



For non-packaged software I'm reasonably happy if it installs everything in /usr/local/ with appropriate ownership and protections and documents in the topmost README what the system dependencies are (e.g. /dev/audio) so I can quickly check them manually if necessary.



It's when software installs start modifying things outside of /usr/local/ that I start to get antsy. Strictly speaking packages should be installed in /opt/ however the de facto standard on Linux is /usr/local/ and I'm happy to stay with that because libraries, documentation etc. have standard installation directories in /usr/local/ unlike in /opt/.

#

Re:use a package NOT

Posted by: Anonymous Coward on June 23, 2005 05:26 AM
Festival needs to install on many more systems than just Linux, so keep the tarball. It sounds like JelleB can make a package on their own, and should. If you can't understand the flexibility and leverage the configure/make packager delivers, maybe you should avoid compiling altogether.

#

Speech recognition?

Posted by: Anonymous Coward on June 23, 2005 04:27 AM
Text-to-speech is great but, Festival has had that problem licked for some years now. It works quite well, is free and extensible. Despite its extensibility, very few people are working on it even just adding voices/diphones. I think this is primarily due to the fact that Festival already works so well that there is little need for people to improve it.

But, speech recognition is another story entirely and, in my opinion, a much more interesting and useful one. But, there does not seem to be anything happening in the speech recognition arena on Linux. Several old/dead projects were started but, these all relied on IBM's ViaVoice and it seems that IBM have discontinued ViaVoice on Linux. This is not only a great shame but also surprising given IBM's recent infatuation with Linux.

Can anyone provide a link to a good and, most importantly, current Linux speech recognition project?

#

Cepstral Text-to-Speech

Posted by: Cepstral on June 28, 2005 04:17 AM
I found it ironic that Cepstral was the one Text-to-Speech company not mentioned in this article. Ironic because their co-founder, Dr. Black, was a principal author of Festival. Furthermore, Cepstral is deeply committed to the Linux platform. Their voices work well with Asterisk, the primary Linux telephony application.

Cepstral's high-quality voices have been ported to many OS platforms (Linux, Mac OS X, Solaris, Embedded Linux, Windows, WinCE and Palm.)

I think at $30 each, Cepstral's David and Diane voices are the best value on the market. (http://www.cepstral.com/)

#

10 years from now

Posted by: Anonymous Coward on June 21, 2005 06:55 PM
I wonder what computer speech will look like 10 years from now...

#

This story has been archived. Comments can no longer be posted.



 