March 27, 2006

Mastering podcasts with Audacity

Author: Johnathon Williams

Open source software makes podcasting easy -- too easy. Listening to a playlist of first-timer podcasts can leave your ears ringing from sudden changes in playback volume. The problem is audio mastering. Recording sound is simple, but mastering that sound -- compressing volume differences, maintaining a decibel ceiling, and similar operations -- is anything but. Fortunately, an open source tool offers everything you need for mastering podcasts and other spoken-word recordings. Audacity is well known among podcasters on all platforms as an editor. Here are some tips and tools for using it to master and adjust volume. They are aimed at podcasters, but they apply to anyone who needs to produce a spoken-word recording under less-than-perfect conditions.

Audacity measures volume levels in decibels, or db for short. A level of 0db is the ceiling; anything above that will sound distorted. When recording a podcast, shoot for a level between -7db and -14db. That range provides a signal that is relatively loud, while leaving a comfortable padding at the top. Moreover, the difference between the loudest and quietest signal is only 7db, making it easy on the ears. If your recording falls below -14db, don't worry. Using the techniques presented in this article, I've been able to salvage recordings with levels as low as -25db.
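For readers who would rather check levels in a script, the arithmetic behind a decibel reading can be sketched in a few lines of Python. This is only an illustration, not how Audacity computes its displays. It assumes samples are floating-point values between -1.0 and 1.0, with 1.0 standing for the 0db ceiling, and the function names are made up for this example.

```python
import math

def peak_db(samples):
    """Return the loudest sample's level in decibels relative to the ceiling.

    Assumes float samples in the range -1.0..1.0, where 1.0 is 0db.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # pure digital silence has no finite level
    return 20 * math.log10(peak)

def in_target_range(level_db, low=-14.0, high=-7.0):
    """True if a level falls inside the -14db to -7db recording range."""
    return low <= level_db <= high

# A short made-up clip whose loudest sample is 0.316 (roughly -10db).
clip = [0.05, -0.316, 0.2, -0.1]
level = peak_db(clip)
print(round(level, 1), in_target_range(level))
```

Because decibels are logarithmic, halving a sample's amplitude lowers its level by about 6db rather than by half.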

Audacity displays sound as a waveform. Volume is represented vertically, with loud sections showing higher peaks than quiet ones. The waveform, however, provides only a general sense of volume levels. To see an exact decibel reading, highlight a small section of the waveform and select Analyze -> Plot Spectrum. Audacity will display a graph that shows the decibel reading at different points on the frequency spectrum. (Decibels are displayed on the left side of the graph.) Running the cursor over the graph will display the peak decibel reading at each point.

Just as important as the decibel level of a sound is the frequency where it occurs. Frequency is represented by the numbers on the bottom of the graph. Lower numbers represent sounds with lower pitch (think James Earl Jones), and higher numbers represent sounds with higher pitch (think Mickey Mouse). In the Plot Spectrum window, the human voice appears strongest on the left side of the graph, between 86 hertz and 3 kilohertz. When gauging decibel readings for a spoken-word recording, that is the range that matters.

Get familiar with the Compressor

Audacity's Compressor is one of its most useful and least understood effects. With the proper settings, it can automatically remove volume differences across an entire recording. In essence, it's an automated version of Audacity's Envelope tool (more on that later). The difference is that applying the Envelope tool over an entire recording can take hours, whereas running the Compressor takes only minutes.

The Compressor is confusing because it's a two-stage effect. First, it reduces all audio that exceeds a given volume. Second, it boosts the volume of the entire selection. Think of the first step as a leveler; it gets rid of unwanted peaks. Then, with the loud portions gone, the second stage raises the volume of everything, eliminating quiet sections. Because of this two-stage process, how you set the effect is crucial to getting effective results from it.

To get started, highlight your recording, and select Effect -> Compressor. The effect window presents you with three settings: Threshold, Ratio, and Attack Time. The first is the most important. Threshold is the number that acts as the decibel ceiling for the Compressor. Anything above this number will be reduced. Anything below will be left alone. You want to choose a number that's low enough to bring your loudest sections within 7db or so of your quietest sections. (If you run the Compressor and nothing in your waveform changes, then you set the Threshold too high.)

Find and remove noise in your recording environment

Among podcasting sins, too much background noise is near the top of the list. To test for noise in your environment, use Audacity to record a sample of dead air. Then, highlight the sample and click Analyze -> Plot Spectrum. The graph you see reveals the decibel level of the ambient noise in the room.

My home office, where I record all of my podcasts, shows a noise level of about -60db. This has been fine for my purposes. A reading of -50db is the loudest noise level I would accept, and even that can be audible. If your environment shows a higher (louder) reading, shush any noisemakers and try another sample. Common culprits include fans, central air, and, unfortunately, desktop computers.
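A rough scripted version of the dead-air test might look like the sketch below. It is an assumption-laden stand-in, not a replacement for Plot Spectrum: it measures the average (RMS) level of the whole sample rather than a per-frequency reading, so its numbers will not match Audacity's graph exactly. The -50db cutoff is borrowed from the readings above and should be treated as a rough guide. The function names are hypothetical.

```python
import math

def rms_db(samples):
    """Average (RMS) level of float samples (-1.0..1.0) in decibels."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square))

def is_quiet_enough(noise_db, limit_db=-50.0):
    """True if the noise level is at or below the chosen limit."""
    return noise_db <= limit_db

# Made-up "dead air" sample: a faint hum with an amplitude of 0.001.
dead_air = [0.001, -0.001, 0.001, -0.001]
noise = rms_db(dead_air)
print(round(noise, 1), is_quiet_enough(noise))
```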

In choosing a Threshold, the first thing I do is use the Plot Spectrum tool to identify the decibel levels throughout my recording. First, I highlight what appear to be the loudest sections (the tallest peaks in the waveform) and take a reading of those. Then, I highlight what appear to be the quietest sections (the shortest peaks in the waveform), and do the same. If the loudest section peaks at -10db, and the quietest at -20db, then I choose a Threshold of -15db, the midpoint between the two. If, however, the loudest section peaks at -5db and the quietest at -25db, then I choose a much lower Threshold of -20db, well below the midpoint, so that more of the loud material gets reduced.

Again, the key to getting good use from the Compressor is setting the Threshold low enough to bring the loud sections closer to the quiet ones. Don't worry about making the recording too quiet, because the second-stage gain boost will bring everything up. This is especially true for recordings that mix very quiet sections with very loud sections. (Podcasters who conduct interviews using VoIP software run into this all the time. The local microphone is often recorded louder than the remote signal.)

The next setting, Ratio, determines how severely the Compressor will reduce signals that exceed the Threshold. A ratio of 2:1 is very gentle. It halves the amount by which a signal exceeds the Threshold, so a peak 10db over the Threshold comes out only 5db over. In general, it's best to use a low ratio when possible, because high ratios sound clipped. But remember, if your recording contains sections that are much louder than your Threshold, then you need a higher ratio, say 6:1.
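To see how Threshold and Ratio interact, it can help to run the numbers. The sketch below assumes the textbook compressor curve, in which the portion of a level above the Threshold is divided by the ratio. Audacity's actual Compressor also reacts over time (see Attack Time), so real results will differ somewhat. The function name is hypothetical; the levels are the two example recordings from the Threshold discussion.

```python
def compress_db(level_db, threshold_db, ratio):
    """Apply a simple static compression curve to one level in decibels.

    Levels at or below the threshold pass through unchanged. For louder
    levels, the excess above the threshold is divided by the ratio.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

# Example 1: peaks at -10db, quiet parts at -20db, Threshold -15db, ratio 2:1.
loud = compress_db(-10, -15, 2)   # -12.5
quiet = compress_db(-20, -15, 2)  # unchanged at -20
print(loud, quiet, loud - quiet)  # spread shrinks from 10db to 7.5db

# Example 2: peaks at -5db, quiet parts at -25db, Threshold -20db, ratio 6:1.
loud = compress_db(-5, -20, 6)    # -17.5
quiet = compress_db(-25, -20, 6)  # unchanged at -25
print(loud, quiet, loud - quiet)  # spread shrinks from 20db to 7.5db
```

In both cases the gap between loud and quiet lands close to the 7db target, but the second recording needs both the lower Threshold and the steeper ratio to get there.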

Attack Time determines how quickly the Compressor activates when the Threshold is exceeded. I like a fast setting of 0.2 seconds. (Some recommend a slower setting of 0.5 seconds.)

Finally, check the box next to "Normalize to 0db." This activates the second-stage volume boost. Click OK, and the Compressor will go to work. If the results are disappointing, just click Edit -> Undo, and try again with different settings. Getting a feel for the Compressor takes some trial and error, but the results are worth it.

Clean up with the Envelope tool

Run the Compressor enough times, and you'll notice an annoying limitation. Normalizing to 0db increases the volume of your recording, but the amount of the increase is limited by the loudest section in your recording. This means that a single leftover spike of -2db will limit the overall increase to only 2db, since the Normalize effect won't let any section exceed 0db. Obviously, this can be a real problem if the rest of your recording is down around -18db.
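The limitation is simple arithmetic, which the following sketch illustrates. It assumes normalization adds one fixed gain to the whole recording, chosen so the loudest peak lands exactly on the ceiling. The function names and the list of section peaks are invented for the example.

```python
def normalize_gain_db(peak_levels_db, ceiling_db=0.0):
    """Gain (in db) that lifts the loudest peak exactly to the ceiling."""
    return ceiling_db - max(peak_levels_db)

def apply_gain_db(peak_levels_db, gain_db):
    """Raise every level by the same number of decibels."""
    return [level + gain_db for level in peak_levels_db]

# Peak levels of four sections; one stray spike sits at -2db.
with_spike = [-18, -18, -2, -18]
gain = normalize_gain_db(with_spike)
print(gain, apply_gain_db(with_spike, gain))  # only 2db of boost is available

# The same recording after the spike has been pulled down to match the rest.
without_spike = [-18, -18, -18, -18]
gain = normalize_gain_db(without_spike)
print(gain, apply_gain_db(without_spike, gain))  # the full 18db is available
```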

Theoretically, the Compressor should remove any such spikes before it applies the volume increase. (Removing those peaks is, after all, its job.) Unfortunately, it sometimes doesn't. In my experience, it can miss very sharp and sudden peaks. The solution is to find any remaining spikes and eliminate them yourself.

Select the Envelope tool (the hourglass-shaped icon in the top left corner). In the waveform, find the first sudden peak that stands out. Click to create control points on either side of the peak. Then, click and drag the points to reduce the volume of the peak. Dragging up increases volume; dragging down decreases it. You can create as many control points as you need to fine-tune the adjustment. When the peak sits level with the rest of the waveform, move to the next and repeat the process.

Amplify to set a new ceiling

With those leftover peaks out of the way, it's time to increase the volume one final time. Select your entire recording, and click Effect -> Amplify. The Amplify window will automatically calculate how much the volume can be increased without breaking 0db and distorting. This is an important final step for most amateur podcasters, since our recordings tend to be on the quiet side.

If the result isn't loud enough, it's probably because the recording still contains unwanted peaks. If you don't have time to smooth them all out with the Envelope tool, you can try running the Amplify effect again, but this time uncheck the box next to "Don't allow clipping." This will allow you to drag the decibel slider in the Amplify window as high as you like. Existing peaks will be pushed above 0db and badly distorted. I recommend this only as a last resort, and only if those peaks are extremely brief in duration. Also, don't stray too far above the slider's recommended setting or the entire recording will distort.
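The tradeoff can be estimated before committing to it. The sketch below is an illustration under assumptions, not Audacity's own calculation: samples are floats between -1.0 and 1.0, a gain in decibels is converted to a multiplier with the usual 10^(db/20) rule, and any sample pushed past 1.0 counts as clipped. All names and sample values are hypothetical.

```python
import math

def max_safe_gain_db(samples):
    """Largest gain (in db) that keeps every sample at or below the ceiling."""
    peak = max(abs(s) for s in samples)
    return -20 * math.log10(peak)

def apply_gain(samples, gain_db):
    """Scale samples by a gain given in decibels."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

def count_clipped(samples):
    """Count samples that exceed the ceiling and would distort."""
    return sum(1 for s in samples if abs(s) > 1.0)

# A quiet made-up recording with one leftover spike at 0.8.
recording = [0.1, 0.12, 0.8, 0.1]
print(round(max_safe_gain_db(recording), 1))  # the spike leaves about 1.9db

# Pushing 12db anyway distorts only the spike...
print(count_clipped(apply_gain(recording, 12)))
# ...but pushing 24db distorts every sample.
print(count_clipped(apply_gain(recording, 24)))
```

A modest overshoot sacrifices only the spike, while a large one distorts everything, which is why it pays to stay close to the recommended setting.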