O'Reilly Network    
 Published on O'Reilly Network (http://www.oreillynet.com/)

Creating Great Audio for the Web

by Steve McCannell

Related Articles

Broadcast 2000 Brings DV Editing to Linux

Broadband Price and Serviceability Top Customer Concerns

Server-Side Considerations for Video Streaming

There are a number of reasons you may want to use multimedia in your business, whether to deliver company-wide updates or to publicize a piece of music. Whatever the reason, you will need to know certain techniques and methods to get your media to your audience.

This article will take you through the steps and techniques used in the recording process, showing you how to polish your recording using the many audio manipulation programs available, how to choose a format in which to present your material, and how to serve it to your users.

Recording 101

There are three things that you will need to record audio to your computer: a microphone, a reasonably fast computer with an input-enabled sound card, and an input source. Well, four actually; you also have to know how to get your recordings into your computer. Take a look at the back of your computer and find where your speakers are plugged in. This is your sound card. You will see other empty jacks on the same card. Plug your source into either the line-in jack or the mic (microphone) jack. Now open whichever audio recording program you use (if you don't have a dedicated recording program, both Windows and Mac OS ship with a basic one). Send some signal to your sound card and press the record button. Easy, wasn't it? If you can't hear your recording on playback, chances are you have the wrong input source selected and just need to switch from microphone to line-in or vice versa.

One of my favorite quotes is "you can't polish a turd." Apply this to your recordings; you can only get out what you put in. So here are some basic things you need to think about before you hit the record button.

[Figures: input level set too low, input level set too high, input level set correctly]
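
The three input-level cases above can also be checked numerically once a take has been captured. The following is only a rough sketch: it assumes the samples have already been loaded as floating-point values between -1.0 and 1.0, and the function names and the -20 dB "too low" threshold are illustrative choices, not figures from any particular recording program.

```python
import math

def peak_dbfs(samples):
    """Return the peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak)

def classify_level(samples, low_db=-20.0, clip_threshold=0.999):
    """Label a take 'too low', 'too high', or 'ok'.

    Both thresholds are assumptions made for this sketch.
    """
    peak = max(abs(s) for s in samples)
    if peak >= clip_threshold:
        return "too high"   # samples are hitting full scale, so clipping is likely
    if peak_dbfs(samples) < low_db:
        return "too low"    # signal sits close to the noise floor
    return "ok"

# One second of a 440 Hz test tone at 8 kHz, at three different gains.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
quiet = [0.05 * s for s in tone]
good = [0.5 * s for s in tone]
hot = [max(-1.0, min(1.0, 1.5 * s)) for s in tone]  # overdriven, then clipped

for name, take in (("quiet", quiet), ("good", good), ("hot", hot)):
    print(name, classify_level(take))
```

A take that is too high cannot be repaired afterward because the clipped peaks are gone, while a take that is too low can be amplified only at the cost of raising the hiss along with it.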

Audio for video

Let's say your CEO is going to give a "state of the company" speech, which will be videotaped and distributed to the various arms of the company, and you are in charge of the recording. Since the words being spoken are just as important as the video that accompanies them, you will want to think about how you will record the audio. The microphones that come with most consumer-grade video cameras are not up to the task; they tend to introduce noise from the camera motor, and speech is usually muffled. The best alternative for recording audio for video is to use an external microphone.

On your camera, you should have a mic-in port. By using this, you bypass the camera's internal microphone, and the input will come from your external mic. Choose a microphone that is up to the task at hand. If you are recording spoken word, you should almost always use either a handheld or a lavalier cardioid microphone.

Hear the difference

From the same vantage point, we recorded spoken word using three different microphones. Take a listen to hear the difference with each one.

 • Internal microphone from camera (Real Player, MP3)

 • External omni-directional microphone (Real Player, MP3)

 • External lavalier microphone (Real Player, MP3)

To prove this point, we recorded using three different microphones with a consumer-grade digital video camera. The first microphone we used was the internal microphone on the camera. Not only do you pick up the room ambience, but you also pick up camera noise and any noise that the videographer may make, such as a nose sniffle or cough. You also have to worry about your distance from the source. If you are too far away from your speaker, your audio will sound very distant (and zooming in won't help). If you're looking to focus on one or two subjects, I wouldn't recommend using this method.

Next we used an omni-directional microphone placed near the speaker. An omni-directional microphone picks up sound from every direction, so you'll notice that the room acoustics are easily picked up with this method, as well as footsteps and creaky floors. This method would work well in the middle of a group discussion, but it is less suitable when there is only one subject speaking.

Finally we used a lavalier microphone with a cardioid polar pattern, attached at the speaker's collar. Cardioid microphones pick up sounds that are directly in front of the microphone's diaphragm, while sound waves coming from the side or behind are picked up at a significantly lower level. You'll notice when you listen to the recording that you do not pick up a lot of sound reflections from the surroundings, the speaker's voice sounds clear, and there is little to no background noise. This is because the speaker's body blocks the reflections coming off the wall behind the speaker, and because the microphone is closer to the sound source, making the recording truer to the speaker's actual voice.

Typical polar pattern of an omni-directional microphone

Typical polar pattern of a cardioid microphone

Graphics courtesy of Audio-Technica
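
The two polar patterns can be approximated with simple formulas. The sketch below uses the idealized textbook models, in which an omni-directional microphone has the same sensitivity at every angle and a cardioid follows 0.5 * (1 + cos(angle)). Real microphones deviate from these curves, especially at high frequencies, and the function names here are just illustrative.

```python
import math

def omni_gain(angle_deg):
    """Idealized omni-directional pattern: equal pickup from every direction."""
    return 1.0

def cardioid_gain(angle_deg):
    """Idealized cardioid pattern: full pickup in front, none directly behind."""
    return 0.5 * (1.0 + math.cos(math.radians(angle_deg)))

def to_db(gain):
    """Convert a linear sensitivity to decibels relative to on-axis pickup."""
    return float("-inf") if gain <= 0 else 20 * math.log10(gain)

# Compare the two patterns in front (0), to the side (90), and behind (180).
for angle in (0, 90, 180):
    print(angle, omni_gain(angle), round(cardioid_gain(angle), 3))
```

In this model, sound arriving from the side of a cardioid microphone is picked up at half the on-axis sensitivity (about 6 dB down), and sound from directly behind is rejected almost entirely, which matches what you hear in the lavalier recording.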


Audio Production Software

Here's a list of some of the many companies producing audio recording/editing programs.

Sonic Foundry - Products for beginners (Sound Forge XP 4.5, comes free with some sound cards) to advanced (Vegas Pro).

Syntrillium - CoolEdit 2000 for simple edits and processing; upgrade to the pro version for a 4 channel multitrack program.

Cakewalk - Wide variety of products for beginning and advanced audio professionals.

Once you have your audio recorded, you will probably want to do a little cleanup. The first thing to do is to play back your recording and check for any unwanted sounds or hiss. Pops and audio spikes can usually be cleaned up by zooming in on the waveform and removing the sections that are causing the problem. Removing hiss is a bigger problem; hiss usually occurs because of poor recording equipment, because the audio wasn't recorded at a proper level, or both. There are a few things you can do. Some audio programs come with hiss or noise reduction plug-ins. I tend to steer away from these, mainly because they have a tendency to chop off a lot of the high-end frequencies and give your audio an unnatural sound. I prefer to use equalization (EQ) to remove hiss. You can do this with a notch filter, which cuts a narrow band around a frequency that you choose, or with a low-pass filter, which lets you select a certain frequency as a "roof"; only frequencies below this roof are allowed to pass.
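
To make the "roof" idea concrete, here is a minimal sketch of a low-pass filter. It is a basic one-pole design, much gentler than the EQ in a commercial audio editor, and the 4000 Hz cutoff, the 44100 Hz sample rate, and the function names are all assumptions chosen for illustration.

```python
import math

def low_pass(samples, cutoff_hz, sample_rate=44100):
    """One-pole low-pass filter: frequencies above cutoff_hz are attenuated."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)          # smoothing factor between 0 and 1
    out = []
    previous = 0.0
    for s in samples:
        # Move part of the way toward the new sample; fast changes get smoothed.
        previous = previous + alpha * (s - previous)
        out.append(previous)
    return out

def rms(samples):
    """Root-mean-square level, a rough measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def sine(freq_hz, count=4410, sample_rate=44100):
    """Generate a test tone (hypothetical helper for this example)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

voice_band = sine(200)      # stands in for speech content
hiss_band = sine(15000)     # stands in for high-frequency hiss

# Ratio of filtered level to original level for each tone.
kept_voice = rms(low_pass(voice_band, 4000)) / rms(voice_band)
kept_hiss = rms(low_pass(hiss_band, 4000)) / rms(hiss_band)
print(round(kept_voice, 2), round(kept_hiss, 2))
```

The low tone passes through nearly untouched while the high tone is cut to a fraction of its level. The trade-off is the same one described above: set the roof too low and you start losing the crispness of the voice along with the hiss.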

You may also have noticed on playback that your recording does not have a consistent signal level; there are loud parts and quiet parts. Chances are that you will want to even these out with some dynamics processing. First, though, you will usually want an even balance between your high and low frequencies, which can easily be achieved with equalization. Once you are satisfied with the tonal quality of the audio, you will probably want to compress it to make the output level more even. In the audio world, compression does not mean squeezing a large amount of data into a smaller file; it is a way to reduce the difference between the peaks and valleys of your audio wave so that the signal sits at a more consistent decibel level. Loud sections are leveled off, which in turn lets you raise the overall volume so that quiet sections come up.
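
The basic idea of audio compression can be sketched in a few lines. This is a deliberately simplified model: it works on each sample's amplitude directly, whereas real compressors track the signal level over time (with attack and release settings) and usually work in decibels. The threshold, ratio, and makeup values are assumptions picked for the example.

```python
import math

def compress(samples, threshold=0.5, ratio=4.0, makeup=1.0):
    """Reduce the part of each sample that exceeds threshold by the given ratio.

    With ratio=4.0, every 4 units of level above the threshold come out as 1.
    makeup is a gain applied afterward to bring the overall level back up.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, s) * makeup)
    return out

# A quiet passage followed by a loud one (values between -1.0 and 1.0).
take = [0.1, -0.1, 0.9, -0.9]
squeezed = compress(take, threshold=0.5, ratio=4.0, makeup=1.5)
print([round(s, 3) for s in squeezed])
```

Before compression the loud samples are nine times the level of the quiet ones; afterward they are six times the level, and the makeup gain has lifted everything, so the quiet passage is easier to hear without the loud passage clipping.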

Presenting your material

Now that you've got your audio all ready for the world, you have to choose from the many formats that are used by different audiences. You will need to answer some of the following questions before you put your material up on your site:

Notes on streaming media

In order to stream RealMedia, your content should reside on a server that runs a licensed copy of RealServer. (Note: You can serve from an HTTP server, but HTTP downloads files without regard to timelines, making clips with timelines more likely to stall.) QuickTime presentations should likewise come from a QuickTime server. Luckily, you don't have to have both types of servers if you present both QuickTime and RealMedia presentations. Apple and RealNetworks recently announced that RealNetworks has licensed Apple intellectual property for streaming digital video and audio over the Internet in the QuickTime format. RealNetworks also announced that its RealServer 8 now supports the delivery of Apple's QuickTime-based content to Apple's QuickTime players and is immediately available for download.

At this point, you may be ready to make your streaming presentation available to the public. As I mentioned before, you can host your presentation on a normal HTTP server if you do not have access to RealServer. I don't recommend this for lengthy or complicated presentations, however, or for clips viewed simultaneously by large groups. If you are serving from an HTTP server, you will need to create a .ram file to point to from your link. This is easily done by opening up a text editor, entering the full URL of the media clip, and saving the file as filename.ram. Then just upload the .ram file onto your server. When the user chooses the link that points to the .ram file, the browser sends a request to the server. The server sends the .ram file, which causes the Web browser to launch RealPlayer. RealPlayer receives the .ram file and requests whichever file the .ram file is pointing to on the web server, and the content gets streamed to the user.
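
If you have many clips, the text-editor step can be scripted. The sketch below simply writes the clip's full URL as the only line of a .ram file, as described above. The function name, the example URL, and the file name are hypothetical.

```python
from pathlib import Path

def write_ram_file(media_url, ram_path):
    """Write a .ram metafile whose only content is the full URL of the clip."""
    path = Path(ram_path)
    path.write_text(media_url.strip() + "\n", encoding="ascii")
    return path

# Example usage (hypothetical URL and file name):
# write_ram_file("http://www.example.com/media/speech.rm", "speech.ram")
```

The link on your web page then points at speech.ram rather than at the media clip itself, so that the browser hands the small text file to RealPlayer and RealPlayer fetches the clip.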

If you are using a RealServer, you will probably want to link to your files using Ramgen. The Ramgen feature automatically launches RealPlayer, eliminating the need to write a separate .ram file. Your web page URL simply points to your media clip or SMIL file on RealServer and includes a ramgen parameter.
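
As a rough illustration, a Ramgen link can be assembled like this. The sketch assumes the commonly documented form in which /ramgen/ appears at the start of the URL path, and it assumes port 8080 for RealServer's HTTP port; both are assumptions to verify against your own server's configuration. The host name and clip path are hypothetical.

```python
def ramgen_url(host, clip_path, port=8080):
    """Build a link that asks RealServer's Ramgen feature to launch RealPlayer.

    Assumes the /ramgen/ prefix and the port number match the server setup.
    """
    return f"http://{host}:{port}/ramgen/{clip_path.lstrip('/')}"

print(ramgen_url("realserver.example.com", "/media/speech.rm"))
```

Because the server generates the pointer on the fly, there is no separate .ram file to keep in sync when clips are renamed or moved.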


With high-bandwidth connections becoming more prevalent, you can bet that web sites will contain more entertainment using multimedia presentations. Now that you know the basics, you should be able to create some media content of your own. The best way to learn is to experiment with your recordings; add some reverb, try different codecs, find out what does what, and pick and choose the techniques that work for you. After all, hands-on experience is where the real learning begins.

Steve McCannell is a writer/producer for the O'Reilly Network and the founder of Lost Dog Found Music.

Copyright © 2009 O'Reilly Media, Inc.