Using portable, multi-OS sound systems

In the first part of this article, we talked about several different sound systems and APIs that were available for Linux and its various desktop environments. Many developers, however, need to write applications that will work across multiple environments, including different operating systems. Here’s how we can accomplish this.

The way to achieve multi-system portability is to add one more layer on top of the existing ones: a higher-level API that detects which sound systems are available and uses them. Such an abstraction library makes the developer’s life easier and hides the lower levels from the program entirely. A few projects have tried to realize this idea:

libao: This library is available on most Linux distributions and works with many sound systems, ranked by priority: 30 for ALSA; 20 for OSS, Irix, and Sun; 10 for esd and aRTs; and 0 for the NULL and file outputs. The back end looks for a usable sound system starting at the highest priority and works downward until one is found.
The only real problems with libao are that its API lacks power and that it supports only sound output. libao is therefore useful for applications that need simple sound output without having to care about the underlying sound system.
CSL: The Common Sound Layer seems to be an almost dead project now (its latest news dates from 2001) and supports only aRTs and OSS, but it supports them in full duplex, with more options and with latency management.
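
The priority-ordered lookup described for libao can be sketched in a few lines of C. This is only an illustration of the idea, not libao's actual code: the `backend_t` type, the probe functions, and `select_backend` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend descriptor: none of these names come from libao. */
typedef struct {
    const char *name;
    int priority;           /* higher value is tried first */
    int (*probe)(void);     /* returns nonzero if the sound system is usable */
} backend_t;

/* Stand-in probes; a real one would try to open the device or server. */
int probe_unavailable(void) { return 0; }
int probe_available(void)   { return 1; }

/* Return the usable backend with the highest priority, or NULL if none
 * is usable. On equal priority, the earlier entry in the list wins. */
const backend_t *select_backend(const backend_t *list, size_t count)
{
    const backend_t *best = NULL;
    size_t i;

    assert(list != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (best != NULL && list[i].priority <= best->priority)
            continue;       /* cannot beat the current candidate */
        if (list[i].probe())
            best = &list[i];
    }
    return best;
}
```

With a list where the priority-30 entry fails its probe and the priority-20 entry succeeds, `select_backend` returns the priority-20 entry, which mirrors the fallback from ALSA to OSS described above.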

PortAudio: the holy grail?

PortAudio is a free (under an MIT-style license, so GPL-compatible) and powerful cross-platform audio library that works on Windows, Macintosh (OS 8, 9, and X), Linux, FreeBSD, Solaris, SGI, and BeOS. The latest stable version (V18) supports an impressive number of sound systems: Windows DirectSound, Windows MME, Macintosh SoundMgr for OS 7-9 and CARBON, Core Audio for OS X, OSS, ASIO for Mac and Windows, Silicon Graphics Irix, and BeOS. It works with the same interrupt-driven method as JACK, but an extra utility called PABLIO also lets a programmer access the audio stream as a file by writing to a FIFO that is read by the callback. Notes on compilation can be found in the PortAudio Tutorial. Here is a short example:

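#include <stdio.h>
#include "portaudio.h"

/* Called by PortAudio each time it needs framesPerBuffer more frames. */
static int myCallback(void *inputBuffer, void *outputBuffer,
                      unsigned long framesPerBuffer, PaTimestamp outTime, void *userData)
{
    float *out = (float *) outputBuffer;
    float *in  = (float *) inputBuffer;
    float leftInput, rightInput;
    unsigned long i;

    (void) outTime;             /* Unused in this example. */
    (void) userData;
    if (inputBuffer == NULL) return 0;

    /* Read input buffer, process data, and fill output buffer. */
    for (i = 0; i < framesPerBuffer; i++)
    {
        leftInput = *in++;      /* Get interleaved samples from input buffer. */
        rightInput = *in++;
        *out++ = leftInput;     /* L output treatment: pass-through placeholder, put your processing here */
        *out++ = rightInput;    /* R output treatment: pass-through placeholder, put your processing here */
    }
    return 0;
}

int main(void)
{
    PortAudioStream *stream;
    PaError err;

    err = Pa_Initialize();
    if (err != paNoError) goto error;
    err = Pa_OpenDefaultStream(
        &stream,
        2, 2,               /* stereo input and output */
        paFloat32, 44100.0, /* PA can use different data types; here, 32-bit floats */
        64, 0,              /* 64 frames per buffer, let PA determine numBuffers */
        myCallback, NULL);
    if (err != paNoError) goto error;
    err = Pa_StartStream(stream);
    if (err != paNoError) goto error;
    Pa_Sleep(10000);        /* Sleep for 10 seconds while processing. */
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
    return 0;

error:
    /* Report the failure and release whatever PortAudio had set up. */
    fprintf(stderr, "PortAudio error: %s\n", Pa_GetErrorText(err));
    Pa_Terminate();
    return 1;
}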
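
The per-frame work inside a callback like the one above can be kept in a plain function that knows nothing about PortAudio, which makes it easy to test on its own. The following is a minimal sketch under that assumption: `process_stereo` and its `gain` parameter are hypothetical names chosen for illustration, and the "treatment" here is just a volume change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PortAudio: applies a gain to
 * interleaved stereo frames laid out as L0 R0 L1 R1 ... */
void process_stereo(const float *in, float *out,
                    unsigned long frames, float gain)
{
    unsigned long i;

    assert(out != NULL);
    for (i = 0; i < frames; i++) {
        if (in == NULL) {               /* no input: emit silence */
            *out++ = 0.0f;
            *out++ = 0.0f;
        } else {
            *out++ = *in++ * gain;      /* left sample */
            *out++ = *in++ * gain;      /* right sample */
        }
    }
}
```

A callback could then cast its two buffers to `float *` and pass them, together with `framesPerBuffer`, straight to this function. Writing silence when there is no input is a design choice of this sketch; it keeps the output buffer from carrying stale data.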

The underlying systems are completely invisible to the programmer, and the API is simple yet capable. PortAudio looks almost perfect, with one catch: what about ALSA or JACK support? The latest stable version, V18, was released in 2001. Since then, the PortAudio developers have decided to write a brand-new support library and API (version 2) with many improvements, including support for ALSA and JACK. This development version, V19, is already usable and its API is frozen. So even though many things are not yet fully working and some V18 back ends have not been ported over (especially the Mac ones), you can already use it in your applications.
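
The file-like access that PABLIO layers over the callback relies on a FIFO: the application writes samples in, and the callback reads them out. A generic single-writer, single-reader ring buffer along those lines might look like the sketch below. It is not PABLIO's code; `sample_fifo`, `fifo_write`, `fifo_read`, and the capacity are assumptions made for this example, and it uses C11 atomics to keep the two sides consistent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FIFO_CAPACITY 1024u     /* must be a power of two */

/* Hypothetical FIFO type for one writer thread and one reader thread. */
typedef struct {
    float data[FIFO_CAPACITY];
    atomic_size_t head;         /* advanced only by the reader */
    atomic_size_t tail;         /* advanced only by the writer */
} sample_fifo;

void fifo_init(sample_fifo *f)
{
    assert((FIFO_CAPACITY & (FIFO_CAPACITY - 1u)) == 0u);
    atomic_init(&f->head, 0);
    atomic_init(&f->tail, 0);
}

/* Application side: copy in up to n samples; returns how many fit. */
size_t fifo_write(sample_fifo *f, const float *src, size_t n)
{
    size_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&f->head, memory_order_acquire);
    size_t space = FIFO_CAPACITY - (tail - head);
    size_t i;

    if (n > space) n = space;
    for (i = 0; i < n; i++)
        f->data[(tail + i) & (FIFO_CAPACITY - 1u)] = src[i];
    atomic_store_explicit(&f->tail, tail + n, memory_order_release);
    return n;
}

/* Callback side: fill dst with n samples, padding with silence on
 * underrun; returns how many samples really came from the FIFO. */
size_t fifo_read(sample_fifo *f, float *dst, size_t n)
{
    size_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);
    size_t avail = tail - head;
    size_t got = n < avail ? n : avail;
    size_t i;

    for (i = 0; i < got; i++)
        dst[i] = f->data[(head + i) & (FIFO_CAPACITY - 1u)];
    for (; i < n; i++)
        dst[i] = 0.0f;
    atomic_store_explicit(&f->head, head + got, memory_order_release);
    return got;
}
```

Neither function blocks, which matters on the callback side: an audio callback that waits for data risks missing its deadline, so this sketch pads with silence instead.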

**Summary table**

| System | Type | Multiplexing/full duplex | API | Design | Latency | Portability | Other good/bad points |
|---|---|---|---|---|---|---|---|
| OSS/Free | Driver | None | Simple, lacks power | Device file access | Low, blocking I/O | Unix | Deprecated |
| ALSA | Driver | Potential SBSM and FD | Powerful | File-like/Callback | Low | Linux | |
| ESD | SS | SBSM and FD | Simple | File-like | Medium | E/GNOME | N.T., popular |
| aRTs | SS | SBSM and FD (buggy?) | Simple | File-like | High | KDE | N.T., good sound |
| JACK | SS | SBSM and FD | Powerful | Callback | Low | POSIX | So much to say! |
| libao | Abs. Lib. | Output only | Simple | File-like | | OSS, ALSA, esd, NAS, and Unixes | No JACK or Win |
| PortAudio | Abs. Lib. | SBSM and FD | Powerful | File-like/Callback | | Too many | No ALSA or JACK |
| PA V19 | Abs. Lib. | SBSM and FD | Powerful | File-like/Callback | | Like V18 plus ALSA and JACK | Designed but unfinished |

Conclusion

My goal in these articles was to give an overview of the audio systems available to programmers who wish to use sound in their applications. I’ve omitted some sound systems that were too specific (NAS, and OpenAL by Loki Software) or that are part of a bigger project whose linking would make the binary too heavy (SDL and GStreamer).

I think that ALSA and JACK are the future of sound under Linux, and PortAudio V19 will certainly be a safe choice for programmers seeking compatibility with other systems. That’s why I suggest that programmers use the power of these systems unless they have specific needs that other systems would serve better.

Vincenot has been a Linux user for eight years, and is currently a student at University Louis Pasteur in Strasbourg.
