Audio Cleanup and Editing: My process

I’ll probably clean up the exposition in this blog post at a later date, but I know there are a lot of people out there looking for information on what they should look into doing to get the best possible audio out of an edit.

To start, if you happen to have control over your recording environment, there are a few tricks you can do to help yourself as an editor.

Syncing Your Audio

First and foremost, if you have more than one person talking and more than one track, you’ll want to find out what works best for everyone to sync their local recordings up with everyone else’s. For myself, I prefer everyone to hover over the record button as I count down “3…2…1…clicky”, where everyone clicks record on the “clicky” beat. Everyone’s audio should be within 0.5 seconds of one another at that point and easy to line up. Others wait for a certain time to show up on a clock and then all clap at the same time. This is also viable, though I’ve found it to be more work lining things up after the fact.

Sweet, Sweet Background Noise

After everyone’s recordings are going and synced up, you will need to gather some background noise from each microphone location so that you can easily cancel that noise from the entire recording (more on that later). The best way to do this is to have everyone remain quiet for 5-10 seconds all at the same time. No moving things around, no coughing, sneezing, drinking water, or anything like that. You want as clear a background noise as you can get on each track. I tend to even hold my breath during this 5 count since my mic can sometimes pick up my breathing.

Proper Mic Etiquette

I’m not going to go too much in depth here about this topic, but proper mic etiquette is key to making your life easier as an editor. If people have a dynamic mic and aren’t talking somewhat near the mic, they might come across as quiet or faded in the end. If you have a condenser mic, like the Blue Yeti, it’s much easier to pick up all sorts of sounds in the environment, including taking drinks and sometimes breathing if you are too close to the mic. I highly recommend, especially for first-time podcasters, doing a test run of your setup. Record about 10-15 minutes of conversation or discussion, listen back to it, and make adjustments as necessary. This could save you tons of time in the editor’s room later.

Single Track vs Multi Track

Now that you have the basics down for recording, you need to decide whether you’ll record everything on a single track or split the sources across multiple tracks. This is highly dependent on your recording setup, but if everyone is recording remotely, I highly recommend that each person records their own audio on their own computer, and that you then combine those individual tracks into the same project for your edit. Some people record everything in person, and this is where things get tricky. Some devices are really great for this, like the Zoom H6, which lets you record multiple sources on their own tracks while also taking a single source backup track. Other devices simply combine a bunch of sources before plugging into a computer, which then treats them as one input source of audio, so you only get a single track of everyone’s combined audio. This isn’t inherently bad, it just means you lose a bit of control over your tracks, which I can go over at a later time.

Audio Backups

Even if you perfected your recording setups, everyone does everything right, and your audio sounds super great all around, sometimes things happen that are beyond your control. What would you do if you had a power outage and no battery power for one of the tracks? That track might be lost for good as that computer shuts down. What if someone tried to upload their audio for you, thought it uploaded, then deleted the source files? You are going to need backups of some sort. If you are recording remotely, this is really easy. If you want a good free option, until the service goes away, you can go to YouTube, sign up for an account, and create private live events there that you can then record. This will record not only the audio but also the video. It all ends up on a single track, and some post processing is done by YouTube after the recording is finished, but this has been a lifesaver for me in the past.

If you have the money, consider investing in something like Zoom. This is a video conferencing service that is able to actually record everyone’s audio on their own tracks as well as one combined track. The only problem I see with this is that if someone’s internet drops them from the call after the recording starts, Zoom creates a brand new track for them when they rejoin, which will make aligning the audio from the backup more tricky (unless you use the track with all of the sources combined).

It should be understood that this backup won’t be as good of quality as the local recordings, but having sub-par quality audio is definitely better than having no audio at all, especially if you can’t re-record the lost audio easily.

Finally, Donning the Editor’s Hat

Now that we know what we can do ahead of time to save some headaches in the editing room, we can finally dive into my methods for cleaning up audio and getting it ready to go into production.

Cleaning That Sweet, Sweet Background Noise

Some programs are better than others at this step. From what I hear, Audacity has one of the better Noise Reduction effects you can use, and that is free. It is what I use, and it makes my life much easier. But the main concept is really easy to understand. Find that 5-10 seconds of silence you recorded earlier. See how it’s not exactly silent on each track and each track looks a little different? That’s because each room, each microphone, each location has a different baseline of background noise in it. So, the big trick here is to grab that background noise of silence for each individual track, and reduce that noise out of the entire track. I’ll have a better article about it later, specifically for how I do this in Audacity, but for now this article assumes that you know at least the basics on how to do this.

Do you still see relatively constant squiggles in that silence? Try running the process a second time! That should clear out the remaining bits of noise from the rest of the track. I wouldn’t recommend cleaning more than twice, as that might start to degrade the quality of the actual audio we care about.
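If you’re curious why that “silence” looks different on every track, here’s a rough Python sketch that measures how loud a supposedly silent stretch is. It assumes the samples are floats between -1.0 and 1.0, and the function name and sample values are made up for illustration. It only measures the noise floor; it is not how Audacity’s Noise Reduction effect works internally.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (-1.0 to 1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # pure digital silence has no measurable level
    # 10 * log10(mean square) is the same as 20 * log10(RMS amplitude)
    return 10.0 * math.log10(mean_square)


# Two made-up "silent" stretches, standing in for two different rooms
quiet_room = [0.001, -0.001] * 100
noisy_room = [0.01, -0.01] * 100

print(round(rms_dbfs(quiet_room), 1))
print(round(rms_dbfs(noisy_room), 1))
```

The louder the number (closer to 0), the more background noise that room and mic are contributing, and the more the noise reduction has to remove from that track.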

Noise Gating

After getting all the background hums and whirs out of your recording, now you’ll want an easy way to get rid of the little clicks, ticks and icks that sometimes show up throughout a recording process. To do this easily, your best option is to employ something that is called a Noise Gate.

A Noise Gate basically searches for isolated audio chunks that are quieter than a certain threshold and then silences them (or greatly reduces their volume, depending on your need). For me, I make sure to gate all of the tracks to remove anything that is consistently below -40.0 dB. This is generally well below normal talking loudness in well recorded audio, so I don’t worry about it getting rid of anything that I might still need for the edit. I also tell the gate to leave a full second of padding on either side of any audio that goes above that threshold. What that means is that if 0.5 seconds of a quiet sound happens right before a person starts talking, that 0.5 seconds will stay after the gating is done. This helps to retain what I like to call quiet starts to people talking.

Sometimes people will make naturally quiet sounds at the beginning, or especially at the ending, of words or sentences. Being too aggressive with a noise gate can easily cut out valid sounds that make it sound like people just stop talking suddenly, and it can be quite jarring.

It might take some getting used to, depending on the application you’re using this on, but this is a powerful time saving technique. It’s worth noting that Audacity doesn’t have this function built in and you need to download a separate plugin to get it working.
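To make the threshold and padding idea concrete, here’s a minimal Python sketch of a gate. It assumes float samples between -1.0 and 1.0 and checks each sample on its own, which is much cruder than a real gate plugin (those usually follow a smoothed level and fade in and out). The function and parameter names are my own for illustration, not any plugin’s settings.

```python
def noise_gate(samples, sample_rate, threshold_db=-40.0, pad_seconds=1.0):
    """Silence everything that is not within pad_seconds of a sample at or above the threshold."""
    threshold = 10.0 ** (threshold_db / 20.0)  # -40 dB is an amplitude of 0.01
    pad = int(pad_seconds * sample_rate)
    n = len(samples)
    keep = [False] * n

    # Forward pass: keep audio for `pad` samples after each loud sample
    last_loud = None
    for i in range(n):
        if abs(samples[i]) >= threshold:
            last_loud = i
        if last_loud is not None and i - last_loud <= pad:
            keep[i] = True

    # Backward pass: keep audio for `pad` samples before each loud sample
    next_loud = None
    for i in range(n - 1, -1, -1):
        if abs(samples[i]) >= threshold:
            next_loud = i
        if next_loud is not None and next_loud - i <= pad:
            keep[i] = True

    return [s if k else 0.0 for s, k in zip(samples, keep)]
```

With a tiny padding value the quiet lead-in gets chopped right at the first loud sample, which is exactly the jarring cutoff described above. A generous padding keeps those quiet starts and endings intact.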

Lining Things Up

Now that the audio is (mostly) cleaned on each track, you want to make sure that the audio lines up correctly with one another. If you all started recording at the same moment, you might not have any work to do. If you all started recording at different times, make sure to line things up before you begin your actual edit. Just slide the full tracks around until the lineup markers (claps, snaps, etc) are all in the same location on each track. This should be an obvious step, but I’m including it here for completion’s sake.
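Sliding tracks around by eye works fine, but the idea can also be sketched in Python: find the first loud spike (the clap or snap) on each track, then trim the start of each track so the spikes land on the same sample. This assumes float samples between -1.0 and 1.0 and that the marker is the first thing on each track to cross the threshold. The function names and the 0.5 threshold are hypothetical.

```python
def marker_index(samples, threshold=0.5):
    """Return the index of the first sample whose level reaches the threshold."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    raise ValueError("no sample reaches the threshold")


def align_tracks(tracks, threshold=0.5):
    """Trim the start of each track so every marker sits at the same index."""
    markers = [marker_index(t, threshold) for t in tracks]
    earliest = min(markers)
    # A track whose marker arrives later loses the extra lead-in at its start
    return [t[m - earliest:] for t, m in zip(tracks, markers)]
```

After this, every track’s clap sits at the same position, which is the same result as dragging the full tracks until the markers line up.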

Keyboard Shortcuts!

Ok, now we are ready to edit for content. Every audio editing software is different, but they all generally do the same things. It might be a good idea to get used to the terminology for the software you’re using, but this article will assume that you already know what tools you’ll be using for editing (cutting in place vs silencing). I highly recommend setting up shortcuts for ALL of the common things you’ll be doing during an edit. For instance, this is my setup:

  • “a” : Cut in place selected audio (meaning delete it and leave the rest of the track alone)
  • “Spacebar” : Play/Pause
  • SHIFT + “Spacebar” : Play-at-speed (meaning, it’ll playback the audio at whatever speed I have that set to in Audacity)
  • “F1” : Selection Tool (Selects audio with the mouse)
  • “F5” : Move Tool (moves detached chunks of audio around)
  • CTRL + ALT + “j” : Detach at silences (in the highlighted audio, anywhere that the audio is 100% silent, it will cut those silences in place, creating a lot of little islands of waveforms I can then move around separately)
  • CTRL + SHIFT + “e” : Export audio
  • CTRL + ALT + SHIFT + “e” : Export Selected Audio

And that’s about it. Everything there is all I need to do my basic edits. For the more advanced effects, I simply go to the effect manually from the menu. By doing this, I can keep my left hand on the keyboard at all times and my right hand on my mouse at all times. I never have to lift my hands off those positions, which means there’s less chance of repetitive stress injuries to my wrists and whatnot. Plus, these shortcuts have dramatically increased my productivity overall.

Ums and Uhs and Mouth Sounds™

One of the banes of most podcasters is the filler sounds that we all tend to make. Ums and Uhs tend to plague most recordings, but I wouldn’t fret too much about getting rid of every single one of them. The general rule that I use is that if an undesirable sound is completely isolated from other waveforms, just highlight that sucker and silence it or cut it in place. If an Um or Uh is connected to other waveforms? Might as well just leave it, because these sounds do tend to sound natural in conversation. It really depends on what you want out of your final product, but if you’re doing just a basic edit for an Actual Play or Discussion podcast, and not for, say, an audio drama or a highly produced AP, then it would be quite safe, and probably preferred, to keep those attached ums and uhs. If it’s not too glaring, you might even be able to keep some of the detached ones. It’s entirely up to how you want the final edit to sound.

On the other end of the spectrum are mouth sounds. These are little pops and clicks and squicks that can make a lot of people cringe with the shivers. Some people like them, and I’m not about to yuck anyone’s yum, but a lot of people do not like them. If people are too dehydrated when recording, every time they open their mouth, they will produce a little pop sound right before they talk. You’ll get used to everyone’s mouth sounds in waveform mode as you edit them more and more, so you’ll be able to see them before you even get to hear them, but it’s worth noting that you can spend a lot of time on getting rid of this annoyingly prevalent detail.

Deleting Can Be Dangerous

One thing to watch out for is highlighting a specific portion of one track out of your many tracks, and then hitting the delete key. What this generally does is delete the audio at that point and shift everything to the right of it back to the start of your deleted selection. This will completely throw off your tracks being synced up.

To avoid this, you can make sure to highlight every track at the point where you want to delete, this way all tracks shift the same distance, keeping everything synced. Alternatively, just cut in place or silence what you don’t want, and you can automatically adjust the gaps of silence later.
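Here’s a tiny Python sketch of the difference, treating a track as a plain list of samples. The function names are made up for illustration. The point is that deleting makes the track shorter (so everything after the cut moves earlier), while silencing keeps every remaining sample where it was.

```python
def delete_range(samples, start, end):
    """Remove samples[start:end]; everything after the cut shifts earlier."""
    return samples[:start] + samples[end:]


def silence_range(samples, start, end):
    """Replace samples[start:end] with zeros; nothing else moves."""
    return samples[:start] + [0.0] * (end - start) + samples[end:]


track = [0.1, 0.2, 0.3, 0.4, 0.5]
print(delete_range(track, 1, 3))   # shorter track, later audio has moved
print(silence_range(track, 1, 3))  # same length, later audio stays put
```

If only one track out of several gets the first treatment, its audio after the cut no longer lines up with the others.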

Editing at a Faster Speed

Remember my Play-at-speed shortcut from before? This is a life saver. I can generally kick up my playback speed to 1.5-1.75x the normal speed, making everyone sound like chipmunks, but allowing me to get through the edit much faster. I must stress, though, this might only be effective if you are already used to the voices and the little quirks of their waveforms (what waveforms are garbage background noise or table noises vs what waveforms are actual sounds coming from the person themselves that you may want to keep). You have to be paying attention at these faster speeds, though, because things come up fast. I tend to edit a lot for quality of audio, so that means I’m looking for those garbage waveforms to silence. I will even highlight them ahead of time as the conversation flows, and then pause the playback after I verify that, yes, it’s bad audio that I can silence. Once it is silenced, I simply click where I left off and SHIFT + Spacebar to get moving again. It’s very quick.

Final Details

Now everything is just a matter of going through the motions. If something doesn’t sound quite right, slow things down to normal speed, listen to that chunk again, and isolate what’s wrong and silence it. Sometimes this could be weird pops, the twang of a microphone being bumped or whatever else. Sometimes this happens during someone talking, and there’s little you can do about it.

But if it happens right after someone talks, you are in luck. Just zoom in on the waveform where the odd sound happens, select from the moment it starts happening while leaving as much of the talking as possible, and then silence the odd sound. What this might end up doing is creating a dramatic cutoff of that person’s audio. In that case, you’ll want to make use of the fade effect. Select the last fractions of a second before the cutoff point, then fade the audio out from 100% to 0%. This will give a more natural taper off curve of the person’s audio, and it’ll sound less like an edit and more like it was recorded that way.
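For anyone wondering what a fade from 100% to 0% actually does to the audio, here’s a minimal Python sketch of a straight-line fade over a selection. It assumes float samples and a selection of at least two samples. The function name is hypothetical, and an editor’s built-in fade may use a different curve.

```python
def fade_out(samples, start, end):
    """Linearly scale samples[start:end] from full volume down to silence."""
    length = end - start
    if length < 2:
        raise ValueError("selection must cover at least two samples")
    faded = list(samples)
    for offset in range(length):
        # Gain goes from 1.0 on the first selected sample to 0.0 on the last
        gain = 1.0 - offset / (length - 1)
        faded[start + offset] = samples[start + offset] * gain
    return faded
```

Run on the last fraction of a second before the cutoff, this turns the hard edge into a gentle taper instead of an abrupt stop.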

That’s just one example of the odd things you will encounter from time to time as an audio editor. You’ll find a lot of it requires some finesse and creative use of tools you are already familiar with to take care of, so don’t be afraid to experiment. Just save your work before you do!

Trim Silences

Once you are done with your base edit, if you left all those silent portions alone and didn’t tighten up those gaps manually, you might want to consider letting your audio editor do this work for you. At the very least, in Audacity, you just need to highlight your whole edit, select Truncate Silence from the Effect menu and set how big of a silence you want to keep. For discussion podcasts and certain APs, I have found 0.5 to 0.7 second gaps to be very effective in making the conversations pop. Sometimes this isn’t so effective, especially if you’re dealing with something like Actual Plays or Audio Dramas where you need to have a more orchestrated approach to the flow of the conversations. Sometimes gaps of silence can speak volumes, so maybe closing these gaps yourself would be a better option in those cases.
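The basic idea of shortening long silences can be sketched in a few lines of Python. This version assumes a single mixed-down track of float samples and only treats exactly 0.0 as silence, which is simpler than what an editor does (those typically use a loudness threshold). The function and parameter names are my own for illustration.

```python
def truncate_silence(samples, sample_rate, max_gap_seconds=0.5):
    """Shorten every run of silent (0.0) samples to at most max_gap_seconds."""
    max_gap = int(max_gap_seconds * sample_rate)
    out = []
    run = 0  # length of the current run of silent samples
    for s in samples:
        if s == 0.0:
            run += 1
            if run <= max_gap:
                out.append(s)  # still within the allowed gap, keep it
        else:
            run = 0
            out.append(s)
    return out
```

Gaps shorter than the limit are left alone, so short natural pauses survive and only the long dead air gets tightened. On a multi-track project the same cut would have to be applied to every track at once to keep them synced.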


Finally, you are done with editing all of your talking tracks. All of the audio is in the right place and all the strange mouth sounds are gone. Now we’re ready to add in music and whatever else on top, right? Well, not exactly. First, you’re dealing with everyone’s audio being at different volumes, among other technical audio issues. That might sound okay with headphones on, but once you move to a car or speaker on the phone or whatever, you may lose some voices here and there.

That’s where Levelator comes in. Levelator normalizes all of your audio so that it doesn’t clip beyond the -1.0 dB point, but also makes everything roughly the same loudness automatically. Seriously, it’s magical. You just need to export your talking tracks as one .wav track (32-bit is preferred), then you open up Levelator (which is a separate download), then you just drag your exported .wav file onto Levelator and then wait.

It’s that easy. Once it is done, it creates an .output.wav file named after your input file. Take this output file and import it back into your audio project and line up the beginning of this track with the beginning of all your other tracks, then mute all your other tracks and keep them there for reference (this is a personal preference). At this point, now you can add your intro music, outro music, whatever have you, and you’ll know that your talky talky audio will be extremely easy to hear in almost all environments and situations.
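If the -1.0 dB figure feels abstract, here’s a small Python sketch of what a peak ceiling means in terms of raw sample values. To be clear, this is plain peak normalization that I’m using as an illustration, not Levelator’s actual algorithm (which also evens out loudness over time). It assumes float samples between -1.0 and 1.0, and the function names are made up.

```python
def db_to_gain(db):
    """Convert a level in dB to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def normalize_peak(samples, ceiling_db=-1.0):
    """Scale the whole track so its loudest sample sits exactly at ceiling_db."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # nothing to scale in pure silence
    scale = db_to_gain(ceiling_db) / peak
    return [s * scale for s in samples]
```

A ceiling of -1.0 dB works out to an amplitude of roughly 0.89, so the loudest moment lands just under full scale with a little headroom left over.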

That’s the Basics!

And, that’s it for the basics. That’s my general workflow for editing audio and making it sound good in the end. This guide might be missing some details here and there, but if you know your way around a sound editor at least a little bit, it will most certainly help with your workflow. Speaking of, here’s a quick list of the main points above:

  • Remove Background Noise
  • Noise Gate
  • Line Up Audio
  • Edit Content
  • Trim Silences
  • Levelator
  • Final Touches