CLI Magic: Transform your audio files with SoX

Author: Shashank Sharma

Sound eXchange (SoX) is a command-line sound sample translator. This Swiss Army knife of sound tools can be used to convert file formats of your audio files, and to apply sound effects such as echo, fade-in/out, and chorus to jazz up your music with just a few keystrokes.

If you don’t already have SoX installed on your system, the source is available on the project page. Fedora and Debian users can get it with yum and apt-get respectively.

SoX supports more than 20 formats; for a list of supported file formats, run sox -h. Converting files is as simple as sox file.mp3 new.ogg.
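Since the command only needs an input name and an output name, converting a whole directory is mostly a matter of deriving the new file names. The following is a minimal sketch, assuming a POSIX shell. ogg_name is a hypothetical helper, not part of SoX, and the loop only prints the commands so you can check them before running anything.

```shell
# Hypothetical helper: swap a file's extension for .ogg.
ogg_name() {
  printf '%s\n' "${1%.*}.ogg"
}

# Dry run: print one conversion command per MP3 in the current directory.
for f in *.mp3; do
  [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
  printf 'sox "%s" "%s"\n' "$f" "$(ogg_name "$f")"
done
```

Once the printed commands look right, they can be pasted back into the shell or piped to sh.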

SoX can work with both self-describing audio file formats and header-less data files. The self-describing formats contain a header that specifies all the characteristics of the audio file, such as audio sample rate, data size, encoding, and number of channels, and can additionally contain information such as a copyright notice or a description of the audio file. The header-less data, also known as raw data, provides no information about the audio. To determine what SoX can tell you about the file, use a command like sox filename.wav -e stat.

Basic effects

While converting between formats does come in handy at times, I use SoX primarily to apply effects to my music collection. The syntax for applying effects to self-describing formats is sox inputfile outputfile effects. Use sox OPTIONS inputfile outputfile effects if you are working with raw data, where OPTIONS describes the sample rate, number of channels, and other such characteristics. By default, SoX does not overwrite the input file with any new effects, and only one effect can be applied to a file at a time.

To remove parts from either end of an audio file, use the trim effect. This is useful when you want to remove silence from the beginning or end of a song. It accepts two values, start and length. start defines the starting point, and length is the duration of the audio to keep, not a position in the track. Both values are in seconds.

The trim function is a little tricky to understand. Suppose you wish to create a new file with only the first 20 seconds. sox musicfile.wav new.wav trim 0 20 would create new.wav, which is just a 20-second clip from musicfile.wav. The initial zero is the start and 20 is the duration. Similarly, if you want to remove the first 20 seconds of a file and keep the rest, use the command sox musicfile.wav newfile.wav trim 20 740. Here 20 is the start point and 740 is the original length (760 seconds in this example) minus the first 20 seconds. You can determine the length of the track with sox filename.wav -e stat.
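The subtraction for the second argument is easy to get wrong, so it can be left to the shell. This is only a sketch: drop_head_cmd is a hypothetical helper, the file names are the ones used above, and the track length is assumed to be a whole number of seconds. It prints the command rather than running it.

```shell
# Hypothetical helper: print the trim command that drops the first SKIP seconds.
# $1 = total track length in whole seconds, $2 = seconds to skip at the start.
drop_head_cmd() {
  total=$1
  skip=$2
  # trim takes a start point and a duration, so the duration is total - skip.
  echo "sox musicfile.wav newfile.wav trim $skip $((total - skip))"
}

drop_head_cmd 760 20   # prints: sox musicfile.wav newfile.wav trim 20 740
```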

You can add a fade-in or -out effect to a song. The former increases the volume from zero to maximum slowly over a few seconds while the latter decreases the volume before the song ends. You need to know the length of the track to apply a fade-out effect.

The syntax in the man page, fade fade-in-length [ stop-time [ fade-out-length ] ], is a little hard to interpret. Fade-in/out effects can be applied to a file simultaneously. So, sox song.mp3 faded.mp3 fade 5 240 8 creates a fade-in/out effect, where volume increases from zero to maximum in 5 seconds. 240 is the stop time or the total length of the file and 8 is the fade-out length. You don’t need the second and the last value if you only want a fade-in effect. Similarly, setting the first value to zero would create only a fade-out effect.
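The three cases described above (fade-in only, fade-out only, and both) can be laid out side by side. The sketch below assumes the same 240-second song.mp3. fade_cmd is a hypothetical helper that prints the command instead of running it.

```shell
# Hypothetical helper: print a fade command for song.mp3.
# $1 = fade-in seconds, $2 = track length in seconds, $3 = fade-out seconds.
# When only $1 is given, the stop time and fade-out length are left out.
fade_cmd() {
  if [ -z "${2:-}" ]; then
    echo "sox song.mp3 faded.mp3 fade $1"
  else
    echo "sox song.mp3 faded.mp3 fade $1 $2 $3"
  fi
}

fade_cmd 5          # fade-in only:  ... fade 5
fade_cmd 0 240 8    # fade-out only: ... fade 0 240 8
fade_cmd 5 240 8    # both:          ... fade 5 240 8
```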

Other effects

I find it amusing to play with the volume knob, raising and lowering the volume, whenever a friend is driving me someplace. You can get the same effect with vibro, which uses a sine wave as the volume knob. sox sound.wav trouble.wav vibro speed depth creates the effect. Speed is the frequency of the sine wave in Hz, which cannot be more than 30, and depth is the volume level that the sine wave cuts. The default depth is 0.5 and the maximum is 1.0.
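The two limits mentioned above can be checked before SoX ever sees the values. This is a rough sketch, assuming a SoX build that provides vibro as described here. vibro_cmd is a hypothetical helper, and it prints the command rather than running it.

```shell
# Hypothetical helper: validate vibro arguments and print the command.
# $1 = speed in Hz (at most 30), $2 = depth (at most 1.0, defaults to 0.5).
vibro_cmd() {
  speed=$1
  depth=${2:-0.5}
  # awk does the comparison because the values may be fractional.
  if awk -v s="$speed" -v d="$depth" 'BEGIN { exit !(s <= 30 && d <= 1.0) }'; then
    echo "sox sound.wav trouble.wav vibro $speed $depth"
  else
    echo "vibro: speed must be at most 30 Hz and depth at most 1.0" >&2
    return 1
  fi
}

vibro_cmd 5   # prints: sox sound.wav trouble.wav vibro 5 0.5
```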

To speed up or slow down the sound of a file, use speed to modify the pitch and the duration of the file. sox audio.wav new.wav speed factor raises the speed and reduces the time. The default factor, 1.0, makes no change to the audio, while a 3.0 factor would reduce the time to a third and triple the pitch. To change only the pitch of a file without altering its duration, you should use pitch.
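Because the playing time shrinks by the same factor that the speed grows, the new duration is simply the old one divided by the factor. The following sketch works that out for a 240-second file. new_duration is a hypothetical helper, and the 240-second length is an assumption for illustration.

```shell
# Hypothetical helper: playing time in seconds after applying "speed FACTOR".
# $1 = original duration in seconds, $2 = speed factor.
# awk handles the division because shell arithmetic is integer-only.
new_duration() {
  awk -v d="$1" -v f="$2" 'BEGIN { printf "%.1f\n", d / f }'
}

new_duration 240 3.0   # prints 80.0  (a third of the original time)
new_duration 240 0.5   # prints 480.0 (half speed doubles the time)
```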

Effects like chorus, echo, and phaser require mostly the same options. You need to specify gain-in, gain-out, delay, and decay, and some other options. For example, to apply an echo effect, run sox file.wav withecho.wav echo gain-in gain-out delay decay. The gain-in and gain-out are volume levels, delay is specified in milliseconds, and decay sets how loud each echo is relative to gain-in. Using longer delays will give you an “open air concert in the mountains” effect. Sadly, these effects cannot be used on an MP3 file as they provide no encoding support.
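To make the placeholders concrete, here is a sketch with numbers filled in. The values are assumptions picked for illustration, not recommendations. echo_cmd is a hypothetical helper that prints the command so you can compare a short and a long delay before running either.

```shell
# Hypothetical helper: print an echo command for file.wav.
# Arguments follow the effect's own order: gain-in gain-out delay decay.
echo_cmd() {
  gain_in=$1
  gain_out=$2
  delay=$3
  decay=$4
  echo "sox file.wav withecho.wav echo $gain_in $gain_out $delay $decay"
}

echo_cmd 0.8 0.88 60 0.4     # short delay
echo_cmd 0.8 0.88 1000 0.4   # longer delay, for the more open sound
```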

Conclusion

While not as sophisticated as Audacity, SoX has its own advantages. The various effects and their options require some getting used to, but once you read the man page, you are pretty much on track. You’ll need to experiment a little with delay/decay, fade-in/out, and so on, and this is where soxexam, a SoX examples utility, comes in handy. Run man soxexam for detailed examples that explain the various options of several effects.

Shashank Sharma is studying for a degree in computer science. He specializes in writing about free and open source software for new users.