Really, truly understanding compressors

I have read and watched many, many explanations of compressors over the years and even made my own, but the thing I find lacking in most of these explanations (including mine!) is the why. Let’s try to fix that. Today I’m going to talk about not just what each of the controls does, but how each of them affects your sound and how best to use them as a content creator.

Threshold and ratio

Just to back up a bit, a compressor is a tool that controls the dynamic range of an audio signal, dynamic range being the difference between the loudest and quietest parts of the signal. It does this by reducing the level of the loudest parts of the signal, as determined by the threshold and ratio. These are the two most important controls, but also the easiest to understand. To put it simply, threshold controls which parts of your signal will be compressed and ratio controls how much compression is applied. A lower threshold means more of the signal will be affected while a higher ratio means it will be compressed more.

For example, if you set the threshold and ratio high (let’s call it a -10 dB threshold and a 5:1 ratio), then you will not be applying compression very often, but when you do, you will be applying a lot of it. Push the ratio higher still (10:1 and up) and this is, in essence, what a limiter is. This could be useful if you don’t like the compressed “broadcast” sound or just generally don’t need to control the dynamic range much, but you still want to make sure that you don’t clip because of more extreme transients (the loud spikes at the beginning of sounds, especially words with hard consonants).

Waveform with no compression — No compression applied. Those big peaks are the transients, yikes!

Conversely, maybe you want to set the threshold and ratio lower, say -30 dB and 2:1. With this, you would be applying compression to most if not all of your signal, but you would be applying less of it. Because of how ratios work, though, you’d still be applying quite a bit to the loudest parts. As a quick bit of math, a peak at -5 dB would be reduced to -9 dB in the first example (5 dB over the threshold becomes 1 dB over), but it would be reduced to -17.5 dB in the second example (25 dB over the threshold becomes 12.5 dB over). This means that you can still control those sharp peaks, but you’ll also be evening out the rest of the signal for an overall more consistent level. Some people don’t like this compressed sound, though, as it is generally less natural sounding.
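To make the threshold and ratio arithmetic concrete, here is a minimal Python sketch of the static math only (no attack or release), assuming a hard knee. The function name compress_db is made up for illustration and is not taken from any particular plugin.

```python
def compress_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Output level of a hard-knee compressor for a steady input level (all in dB)."""
    if level_db <= threshold_db:
        return level_db  # below the threshold nothing changes
    # above the threshold, every `ratio` dB of input yields only 1 dB of output
    return threshold_db + (level_db - threshold_db) / ratio


print(compress_db(-5.0, -10.0, 5.0))   # high threshold, high ratio: -9.0
print(compress_db(-5.0, -30.0, 2.0))   # low threshold, low ratio: -17.5
print(compress_db(-40.0, -30.0, 2.0))  # below the threshold, untouched: -40.0
```

The same -5 dB peak ends up lower with the gentler 2:1 ratio simply because it sits so much further above the -30 dB threshold.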

As a side note, some compressors have a compression control instead of a threshold control. These are functionally the same; turning up the compression simply lowers the threshold. The end result is the same, it’s just important to know which direction you need to turn the knob.

Attack

Attack controls how long it takes to reach full compression after the signal level exceeds the threshold. I’m going to reiterate that because I have had this wrong for many years: attack is how long it takes to reach full compression, but compression starts to kick in as soon as the threshold is crossed. Simple enough, but why would you want a short or long attack time? Generally speaking, a longer attack time will sound more natural. This is because a longer attack allows some of the initial transients through. On the human voice, for example, 5-15 ms is usually long enough to allow spikes from hard consonants to pass unaffected, which sounds more natural.

However, because a longer attack time allows some transients through, it is less effective at preventing clipping. This is why most limiters have very fast attack times, sometimes under 1 ms. You might think that you should just always use a fast attack time, but such a setting does not usually sound very natural (and sometimes sounds downright bad) and can also introduce distortion, particularly when you are trying to apply a lot of compression. Some compressors will handle this better than others, so unfortunately, this is a control that you will just need to experiment with to find what works best for you.
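As a rough illustration of why the attack time decides how much of a transient gets through, here is a small Python sketch. It assumes the compressor eases into gain reduction exponentially, with the attack setting acting as the time constant. Real compressors define and implement attack in different ways, so treat the numbers as ballpark only. The function name is hypothetical.

```python
import math


def reduction_reached(elapsed_ms: float, attack_ms: float) -> float:
    """Fraction of the target gain reduction applied `elapsed_ms` after the
    threshold is crossed, ASSUMING exponential (one-pole) smoothing in which
    the attack setting is the time constant."""
    return 1.0 - math.exp(-elapsed_ms / attack_ms)


# How much of the gain reduction is in place 1 ms into a hard consonant?
for attack_ms in (0.05, 1.0, 10.0):
    fraction = reduction_reached(1.0, attack_ms)
    print(f"{attack_ms} ms attack: {fraction:.0%} of the reduction after 1 ms")
```

Under that assumption, a 0.05 ms attack has essentially clamped down within the first millisecond, while a 10 ms attack has applied only about a tenth of the reduction, so most of the spike passes through.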

Waveform of compressed audio with fast attack time — Compressed with a 0.05 ms attack time. Much better!

What if you want the best of both worlds? A very common trick is to use two compressors, one with a higher threshold and ratio and a very fast attack to catch just the high peaks and another with a slower attack and lower threshold and ratio to even out the rest of the signal. Elysia compressors also have an “Auto fast” setting that allows the attack to normally be set higher, but automatically speed up as needed.

It’s also worth mentioning that the attack time is further modified by both the type of compressor and the knee setting. I’ll discuss those more later, but just know that some compressors are capable of faster attack times than others while some have a smoother attack curve than others. This is why recording studios have traditionally had many different compressors on hand, though in the digital age, a single plugin is often capable of many kinds of compression.

Release

Release time is basically attack time in reverse: as soon as the signal level drops below the threshold, it will start returning to the normal, uncompressed level at a rate specified by the release time. Unlike attack, which does this on a more logarithmic curve, this is generally done in a linear fashion for release (though some compressors can also do it logarithmically).

In practical terms, the release time can control how natural or compressed the signal sounds, how much the sound “pumps,” and how much of the room reflections (“reverb”) you can hear. A very fast release time will technically result in an overall more even average signal level, but it will sound very uneven as the sound “pumps” up and down rapidly. Conversely, a very slow release produces a much more natural sound, but a less compressed sound overall. It works sort of like just turning the volume down generally rather than bothering with compression at all. The trick, then, is to find the balance between the two. You want to set it low if you want it to sound compressed or want a more consistent signal level, but if it’s too low, it may sound extremely unnatural and it can emphasize the ambient room reflections (which most creators do not want).

As with attack, you will need to experiment to find a sound that works for what you want to achieve. For my part, I have found 75-100 ms to be a good balance that avoids unwanted room reflections while also not sounding too compressed. However, like attack, release is also further modified by the type of compressor in use, with many analog compressors simply having fixed attack and release times.

Other controls

Knee, RMS, and compressor type/style

These controls are at least somewhat related, so I’m grouping them here. Knee is pretty straightforward. Think of it like adding a fade: normally, when the signal crosses the threshold, compression switches on at the full ratio all at once (a “hard” knee); by softening/increasing the knee value, the ratio is instead phased in gradually as the signal approaches and passes the threshold. As you might imagine, this helps make for a more natural sound. Some compressors, especially digital ones, have controls for this, but many others simply have a fixed knee. It’s not something to worry about too much, but if you have a control for it, you might as well change it and see if you like it better.
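For the curious, one common textbook way to model a soft knee is a quadratic blend centered on the threshold. The Python sketch below assumes that model. Actual compressors may shape their knees differently, and the function name and example settings are illustrative only.

```python
def compress_soft_knee_db(level_db: float, threshold_db: float,
                          ratio: float, knee_db: float) -> float:
    """Static compressor curve (dB in, dB out) with a quadratic soft knee
    that is `knee_db` wide and centered on the threshold."""
    over = level_db - threshold_db
    if knee_db > 0 and abs(over) <= knee_db / 2:
        # inside the knee: blend smoothly from 1:1 into the full ratio
        return level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    if over > 0:
        return threshold_db + over / ratio  # past the knee: full ratio
    return level_db                         # below the knee: untouched


# Signal exactly at a -20 dB threshold, 4:1 ratio
print(compress_soft_knee_db(-20.0, -20.0, 4.0, 0.0))   # hard knee: -20.0
print(compress_soft_knee_db(-20.0, -20.0, 4.0, 10.0))  # 10 dB knee: -20.9375
# Well above the knee, both settings agree
print(compress_soft_knee_db(-10.0, -20.0, 4.0, 10.0))  # -17.5
```

With the soft knee, a little gain reduction is already happening at (and slightly below) the threshold, which is where the gentler onset comes from.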

The type or style of compressor, as mentioned earlier, can have a big effect on the overall sound of the compressor. These are typically named for their analog equivalents, so you might see names like optical, VCA, FET, or more descriptive terms like clean, classic, etc. These are all different technologies used in the analog world with varying consequences on the sound. An optical compressor, for example, uses an actual light and light sensor to determine the signal level, which means it reacts much slower to changes in level. They are often more natural and smooth sounding, but ill-suited for dealing with fast transients, making them a great choice for vocals (as long as you are mindful of the transients). FET compressors, on the other hand, use a field-effect transistor (FET) instead of light, and a transistor can respond to level changes far more quickly. This allows for much faster attack times, so they are very well suited for peak limiting. The legendary 1176 is a FET compressor, for example, and has often been used in combination with the LA-2A, an optical compressor, for vocals.

The RMS, or root mean square, control on compressors like ReaComp is a handy way to approximate different styles of compressors. Essentially what RMS means is that the signal level is averaged over time when determining whether or not it is above the threshold. If set to 0, no averaging is done, so the compressor can react immediately to level changes. If set higher, the compressor will work more like an optical one, sounding smoother but reacting more slowly to level changes.
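To see why averaging makes a compressor less twitchy, here is a toy Python comparison of peak and RMS level over one detection window. The sample values are invented for illustration, and this is only the level measurement, not how any specific compressor implements its detector.

```python
import math


def peak_db(samples: list[float]) -> float:
    """Level of the single loudest sample in the window, in dB."""
    return 20 * math.log10(max(abs(s) for s in samples))


def rms_db(samples: list[float]) -> float:
    """Root-mean-square level of the whole window, in dB."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


# A quiet window (amplitude 0.1, i.e. -20 dB) containing one full-scale spike
window = [0.1] * 99 + [1.0]

print(f"peak: {peak_db(window):.1f} dB")  # the spike alone sets the reading
print(f"rms:  {rms_db(window):.1f} dB")   # the spike only nudges the average
```

A detector with no averaging sees the spike at full scale, while the averaged reading moves up by only about 3 dB from the -20 dB background, so a threshold in between would be crossed in one case and not the other.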

In general, I wouldn’t worry about these too much. If you are really curious, researching the different types of analog compressor would go a long way toward understanding what to expect out of each of these; otherwise, if they are options available to you, you might as well try them and see what you like.

Hold

Hold time is a delay before the release starts. With hold time set to 0 ms, the release time will start as soon as the signal goes under the threshold; if set higher, the signal will continue being compressed even after it goes under the threshold until the hold time ends. Only after the hold time has ended will the release begin, assuming the signal is still under the threshold. In general, there is not much reason to mess with this and most analog compressors don’t even have a hold control. As always, feel free to experiment, but this can generally be left at 0 ms and ignored.

Range/gain reduction limit

This control sets the maximum amount of gain reduction that can be applied. If you don’t want too much compression applied, you can limit it with this, but many compressors do not even have this option.

Lookahead

As you might imagine, this allows the compressor to “look ahead” and see if the threshold will be exceeded before it actually happens. Naturally, this adds latency equivalent to the amount of lookahead time, so you probably don’t want to use this when streaming. And for obvious reasons, it is only available with digital compressors.

Final thoughts

Over the years, compressors have become increasingly flexible and also offer a lot of visual feedback. Unfortunately, these added controls usually make them harder to use for newcomers. It’s easy to say, “Just play with the controls until it sounds right,” but without guidance about what to expect out of the controls, it’s hard to know where to begin. I hope that this guide has helped you understand not only what each control does, but also why you might adjust them.

Audio basics for streamers: a follow-up

It has been several years since I wrote my audio basics posts and published a series of videos on the subject and while they still have a lot of good information, I have also learned some things since then. Nothing that fundamentally changes anything I said previously, mind you; that’s all still good information. These are just some little tips and tricks I’ve picked up along the way that I thought I would share.

Mic choice matters

But probably not as much as you’ve been led to believe. You’ve probably heard, even from many intelligent audio producers, that dynamic mics are less sensitive than condenser mics (which is true) and that means they are better at background noise rejection. They are not. If you take two mics with vastly different sensitivities and match their output level, all sounds entering the front of the mic will have exactly the same output level. Sensitivity is not magic, it only measures the output voltage of the mic when presented with a given input level.
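Here is a small Python sketch of that gain-matching argument. The sensitivities and sound levels are hypothetical numbers, and the model deliberately ignores polar pattern, frequency response, and the mic’s own self-noise. It only shows that sensitivity shifts everything by the same amount.

```python
def output_dbv(spl_db: float, sensitivity_dbv: float, gain_db: float) -> float:
    """Output level for a sound arriving at `spl_db`, where `sensitivity_dbv`
    is the mic's output in dBV for a 94 dB SPL (1 Pa) reference tone."""
    return sensitivity_dbv + (spl_db - 94.0) + gain_db


def matched_levels(sensitivity_dbv: float, voice_spl: float = 80.0,
                   noise_spl: float = 50.0, target_dbv: float = -20.0):
    """Set the gain so the voice lands on `target_dbv`, then report
    (gain needed, voice level, background level)."""
    gain_db = target_dbv - output_dbv(voice_spl, sensitivity_dbv, 0.0)
    return (gain_db,
            output_dbv(voice_spl, sensitivity_dbv, gain_db),
            output_dbv(noise_spl, sensitivity_dbv, gain_db))


# Hypothetical low-sensitivity dynamic vs. high-sensitivity condenser
for name, sensitivity in (("dynamic", -59.0), ("condenser", -37.0)):
    gain, voice, noise = matched_levels(sensitivity)
    print(f"{name}: gain {gain:+.0f} dB, voice {voice:.0f} dBV, noise {noise:.0f} dBV")
```

Once the voice is matched, the background lands at the same level on both mics. The less sensitive mic just needed more gain to get there.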

Notice I said the front of the mic, though. That’s because the relevant factor here is not the sensitivity but rather the polar pattern. Just because a mic is cardioid does not mean that it is the same as other cardioid mics. Cardioid is a very broad category, so let’s look at two examples.

Here is the polar pattern graph for the SM7B. If the image doesn’t load for some reason, it was taken from the official user guide, available here. This is a cardioid polar pattern (that refers to the heart shape), but you’ll notice that as the frequency increases, the polar pattern tightens. That is, the mic becomes increasingly directional at higher frequencies. This is useful because room reflections tend to occur mostly in the higher frequencies, so this suggests that it is good at rejecting those reflections that enter the mic from the side.

This is the polar pattern graph for the Shure MV7+, also helpfully taken from Shure’s web site (thank you, Shure, for having useful and convenient specs). Unlike the SM7B, you’ll notice that the polar pattern does not narrow at higher frequencies. You’ll also notice that, like many other cardioid mics, it picks up a bit of sound behind the mic. This means that not only will it pick up more reflections from the sides but also from the rear. This is especially problematic if you position the mic in front of your monitor as it will pick up the reflections of your voice bouncing off your monitor.

Now, does this mean you should go out and buy an SM7B? Probably not. It just means that you may have more work to do when battling room reflections and that you will need to be cognizant of the polar pattern when positioning your mic. For example, don’t put the MV7+ directly in front of your monitor or have the back aimed at your keyboard if you don’t want people hearing that. You will also want to consider how the polar pattern interacts with any acoustic treatment in your room (or lack thereof).

Mic positioning matters

And not just regarding the polar pattern, as mentioned above. Mic position is something you will need to experiment with to figure out what works best for you. People believe the SM7B magically removes background noise not because it does, but because it enforces good mic technique. It is designed in such a way that people want to be close to it. Not only that, but because it’s now a status symbol, people want to show it off, so they aren’t trying to hide it off camera. But you can do the same thing with any mic! Put it close to your mouth, 2-3″ away, and you’ll find that those room reflections start disappearing. This is because your voice is now significantly louder than those background sounds. The closer the mic is to your mouth, the better this signal-to-noise ratio (SNR) will be. Don’t be afraid to have your mic on camera!
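For a rough sense of scale, the Python sketch below estimates how much louder your direct voice gets as the mic moves closer. It assumes simple inverse-square falloff (your mouth as a point source in open space) and that the room and background noise reaching the mic stay about the same, which is only an approximation at very short distances. The function name is hypothetical.

```python
import math


def direct_level_change_db(old_distance_in: float, new_distance_in: float) -> float:
    """Change in direct-sound level when the mic moves from one distance to
    another, ASSUMING inverse-square falloff (6 dB per halving of distance)."""
    return 20 * math.log10(old_distance_in / new_distance_in)


# Moving the mic from 12 inches to 3 inches from your mouth
print(f"{direct_level_change_db(12, 3):+.1f} dB")  # about +12 dB
```

Since the background barely changes, those extra decibels of voice are roughly the improvement in SNR, and you can turn the preamp gain down by the same amount.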

The downside of bringing the mic that close is that it might get in your way and plosives might be more of a problem. In that case, experiment with moving the mic off to the side or down below your mouth (or both). Make sure you record some tests! As you move the mic around, you’ll find that the resulting tone changes. You may like the changes or you may not, so find a good position that balances SNR, tone, and plosive rejection.

Interface selection

There are honestly only a handful of things that really matter when it comes to an audio interface:

Are the preamps flat/neutral?
Do the preamps have a low noise floor?
Are the drivers reliable?
Is the headphone amp flat/neutral?
Does the headphone amp have a low noise floor?
What software features are included?

Most of those things are technical specifications that are pretty easy to look up; just be aware that many companies have misleading specs that really don’t tell the whole story. Check independent reviews, especially ones with technical measurements, to be sure. However, the software features can be very relevant for streamers. I don’t care much for onboard signal processing (DSP) or for a software mixer utility that has that since all of that can just be done in OBS, but many interfaces include software mixers with additional virtual outputs that can completely negate the need for utilities like VoiceMeeter. Audient (depending on the interface) and Elgato both have these features, but they are not alone. This is handy because it means I don’t need any additional software, they aren’t adding much (if any) latency, and there’s no quality loss. I like to use these to route my alerts and stream music to outputs 3-4 so I can hear them and they go to the stream, but they don’t go to my recordings. You could also use them to route voice chat to a separate channel for mixing. If your interface doesn’t have a virtual mixer, you can still use VoiceMeeter, so don’t feel like you need to buy something new, but if you’re in the market for a new interface anyway, you might want to look for this feature.

Gain staging and levels generally

I said in my previous post on setting levels that you want to set your preamp gain such that your loudest possible sound doesn’t clip and that is still true. What this actually means for you will depend on how dynamic you tend to be. If your voice doesn’t vary much in volume, you can set the gain higher and your post-processing workflow will be a bit easier; if, like me, you are all over the place, you’ll have to set your gain lower and do more work to level it out later.

However, when you are checking your levels to make sure they don’t clip, there is an important quirk of OBS that you will need to know. I won’t get too technical here, but the short version is that when you downmix to mono (which you should), OBS will compensate by lowering the level by 6 dB. This is to account for the increase in level that would occur if the same signal was on both the left and right channels, but for most of us using a mono mic, there is no signal on the other channel. The end result is that the mic’s output level is 6 dB lower than it should be and will never appear to clip in OBS even when it is clipping at the mic preamp stage. So when you set up your mic in OBS, in the advanced audio properties, set the balance all the way left (assuming you are on the “left” input), check the mono checkbox, and then go to the filters and add a gain filter with a 6 dB boost. All other processing will happen after this boost. Now you will see correct levels in OBS and can accurately gauge whether or not you are clipping.
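The 6 dB figure falls out of simple arithmetic if you assume the mono downmix averages the left and right channels. That is an assumption about how the downmix behaves, not something taken from the OBS source code, and the helper names below are hypothetical.

```python
import math


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to dB."""
    return 20 * math.log10(amplitude)


def downmix_to_mono(left: float, right: float) -> float:
    """ASSUMED downmix: the average of the two channels."""
    return (left + right) / 2


mic = 0.5  # a mic peak at about -6 dB

same_on_both = downmix_to_mono(mic, mic)   # mic signal present on both channels
left_only = downmix_to_mono(mic, 0.0)      # mono mic on the left input only
restored = left_only * 10 ** (6 / 20)      # what a +6 dB gain filter does

print(f"original:       {to_db(mic):.2f} dB")
print(f"both channels:  {to_db(same_on_both):.2f} dB")
print(f"left only:      {to_db(left_only):.2f} dB")
print(f"after +6 dB:    {to_db(restored):.2f} dB")
```

With the signal on one channel only, the averaged result is about 6 dB below the real level, and the +6 dB gain filter brings the meter back to within a few hundredths of a dB of the truth.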

Compression: sometimes more is less

As I mentioned above, I have a very dynamic voice. Most of the time, my input level is around -20 dB (or lower), but if I get particularly excited, I will hit all the way up to -0.5 dB. As a result, maintaining a consistent output level without sounding overly compressed has been a bit of a challenge. If you find yourself in the same boat, let me introduce you to the concept of… just adding more compression. Sort of. Sometimes.

One way to deal with the problem is to just crank up your compressor, but this tends to result in an unnatural sound that affects the entire signal. You may also find that no matter how much compression you add, you really can’t tame those extreme peaks. The solution is to just use two compressors, one to tame the extreme peaks and the second for general compression (and technically a third that is a limiter, just in case). As with all things, the following settings are just guidelines; you will need to adjust them for your own situation.

The first compressor comes after the 6 dB boost, noise gate, and EQ, and is intended to tame the highest peaks. In my case, I set the threshold to -17 dB, the ratio to an aggressive 10:1, and the attack as fast as it will get (0.005 ms for the compressor I’m using). There is no makeup gain on this compressor. The threshold was set so that my normal speaking does not hit this compressor at all, only the parts where I am yelling are affected. Having a fast attack means that the spikes from hard consonants can’t get through and the 10:1 ratio means that these very loud sections get squashed down to be more in line with my average signal level.

Now that the extreme peaks are tamed, I run the signal through a second, far milder compressor. This one is set with a -36 dB threshold so that nearly everything is affected, a mild 3:1 ratio, and an attack of 8 ms to intentionally allow hard consonants to go through as that sounds more natural (this is a trick I only learned recently). I then boost the signal by 19 dB to bring it up to a reasonable listening level (in OBS, this is toward the upper end of the yellow meter segment).

Finally, the signal goes through a hard limiter set to a -1 dB limit so that if all of that compression is somehow not enough, the signal still cannot exceed -1 dB (the recommended ceiling for most digital content). This probably sounds like a lot of compression, but keep in mind that most of the time only the second compressor is engaged. The end result is a very even output level that still sounds fairly natural.
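To see what this chain does to different input levels, here is a Python sketch of the static math using the settings above. It assumes hard knees and ignores attack, release, the noise gate, and EQ entirely, so it only describes where steady levels end up. The function names are made up for illustration.

```python
def compress_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Output level of a hard-knee compressor for a steady input level (dB)."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def chain_db(level_db: float) -> float:
    """Level after the whole chain, for a level arriving at the first compressor."""
    level_db = compress_db(level_db, -17.0, 10.0)  # peak tamer, no makeup gain
    level_db = compress_db(level_db, -36.0, 3.0)   # mild general compressor
    level_db += 19.0                               # makeup gain
    return min(level_db, -1.0)                     # hard limiter ceiling


for level in (-30.0, -20.0, -10.0, -0.5):
    print(f"in {level:6.1f} dB -> out {chain_db(level):6.1f} dB")
```

In this simplified model, inputs spanning almost 30 dB (-30 to -0.5 dB) come out within about 5 dB of each other (roughly -15 to -10 dB), and the limiter never has to engage.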

That’s it!

Hopefully these are some useful new tips for you. Like I said, nothing about the original posts or videos has changed, but I have found these to be helpful tips to consider as I navigate my own streaming journey.

I was wrong about the SM7B… sort of

I have always said that you don’t need to buy an expensive mic to sound good on stream and I still stand by that, but for a long time, I really didn’t think the SM7B was particularly special. I thought it was mostly hype, bragging rights, and really more about the look than actually being a good choice for streaming. It’s big, it’s expensive, it requires a lot of gain, and it just doesn’t sound great without tweaking… or so I thought.

Why get one?

I have used (and still own) many different mics, but the one that I settled on as my streaming mic is the Neat Worker Bee (the original model). I like this mic because it’s cheap, it sounds great, and I like how it looks. For my voice in particular, I can use this mic without any EQ applied and it sounds almost exactly how I want it to sound. The first few videos in my Minecraft series use this mic and I was using it for streams, but I ran into a few issues that got me thinking about an upgrade:

My entire house has hardwood floors and effectively no insulation, so sound bounces around a lot. Even with 12 Elgato Wave panels on the walls, there’s audible room reverb, especially with the door open so the sound can escape into the hall and other rooms.
When I’m streaming Phasmophobia and playing with my wife, whose office is next to mine, even with a noise gate the mic will pick her up.
My mouse, which I like a lot and don’t want to swap out, has fairly loud clicks that the mic picks up. I waste a lot of time trying to edit them out of videos.
This probably seems stupid, but the Worker Bee’s capsule is recessed compared to the body and when in the shock mount, it is set back even farther. This means that my chest is constantly bumping into the shock mount just trying to get close enough to the capsule to sound good.

Given how much I like the sound of the Worker Bee, I was really loath to buy anything else only to have it sound worse (as many other mics have). But the SM7B would seem to solve all of my issues, according to internet legend, so I started doing some research. I don’t think it makes a lot of sense to buy things I don’t need, so I wanted to be sure. Before taking the dive, I pulled out my trusty SM57 because, as it happens, it is easy to EQ an SM57 or SM58 to sound “close enough” to an SM7B. I learned that it sounds nothing like the MV7, which I had previously purchased thinking it would be a cheaper SM7B, and really nothing like I thought it would generally. This also taught me that while I can EQ those mics to sound like an SM7B, without EQ, the SM57 is far too bright for my taste. Knowledge acquired, I decided to give it a try. Here’s what I learned.

Findings

First, a lot of people describe the SM7B as sounding terrible without processing, but I don’t find that to be true at all. I guess I just like mics with a flat frequency response. It is, unsurprisingly, not as bright as the Worker Bee (which is a condenser with a slightly boosted top end), but a slight 2 dB high shelf boost at 3.5 kHz makes it sound quite nice. What did surprise me is that I also wound up boosting the low end just a little bit (about 1 dB). I assumed it would be overly boomy or muddy, but it really isn’t. Sure, I can still get right up on it to make it boomy, but at a normal distance, it’s quite flat. It may not get you the exaggerated broadcast sound you’ve come to believe is desirable right out of the box, but it’s a great starting point for crafting your own sound.

Second, if you read around the internet, you will discover that the SM7B is somehow magic at completely rejecting all room reverb and that this is due to it being a dynamic mic with extremely low sensitivity. It is a dynamic mic, but it is neither magic nor does that (or its sensitivity) have anything to do with it. Dynamic mics generally have thicker diaphragms, which means that they move less in reaction to sound waves, which means they output less signal and have lower sensitivity, but all this means in practice is that they require more amplification. It also (generally) means that they are worse at picking up high frequencies. And yet the SM7B is, seemingly, good at rejecting room reverb. Why? I suspect it has to do with the polar pattern and, in particular, the way the polar pattern changes depending on the frequency (you can view the graphs on this page). The higher the frequency, the more directional the mic is, becoming essentially hypercardioid at frequencies above 6 kHz. Most of what you hear from “room reverb” is in the higher frequencies, so when you have a mic that is both less sensitive to higher frequencies and that rejects a lot of off-axis high frequencies, you get less reverb. In contrast, the Worker Bee has a very wide cardioid pattern at all frequencies, which is why I struggled with it. However, its big brother the King Bee has a polar pattern very similar to the SM7B, as do plenty of other, cheaper mics. Interestingly, the SM57 and SM58 do not have narrowing polar patterns like the SM7B, so they may be more susceptible to picking up room reverb.
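To put rough numbers on “more directional,” the Python sketch below compares idealized textbook cardioid and hypercardioid patterns. Real mics, the SM7B included, only approximate these shapes and vary with frequency, so these are not measurements of any actual microphone. The function name is hypothetical.

```python
import math


def pattern_db(angle_deg: float, omni_part: float) -> float:
    """Pickup relative to on-axis (in dB) for an IDEAL first-order pattern:
    response = omni_part + (1 - omni_part) * cos(angle).
    omni_part = 0.5 is a cardioid, 0.25 a hypercardioid."""
    response = omni_part + (1 - omni_part) * math.cos(math.radians(angle_deg))
    if response == 0:
        return float("-inf")  # a perfect null
    return 20 * math.log10(abs(response))


for name, omni_part in (("cardioid", 0.5), ("hypercardioid", 0.25)):
    side = pattern_db(90, omni_part)
    rear = pattern_db(180, omni_part)
    print(f"{name}: side {side:.1f} dB, rear {rear:.1f} dB")
```

In this idealized model, sound from the side is about 6 dB down on a cardioid and about 12 dB down on a hypercardioid. The trade-off is that the hypercardioid picks up a little from directly behind, where the ideal cardioid has its null.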

There is also a cost to this in the form of off-axis coloration. With the Worker Bee, I can angle it 45 degrees off the corner of my mouth to deal with plosives while still getting great sound; with the SM7B, if I do this, there is a noticeable (to me) change in tone. The more I angle the mic, the duller and muddier it becomes (though the change is subtle). This is not necessarily a problem as you may prefer that tone, but it’s something to be aware of when positioning this (or any similar) mic. In my case, I find that keeping the mic in front of me but angled up at my mouth provides the best sound.

Third, yes, it is a low-output mic, but most audio interfaces these days have preamps with enough gain to handle it just fine. Unless you have particularly noisy preamps, you are unlikely to need any sort of booster. I do have a FetHead (originally purchased to solve a noise issue with the Revelator io44) and I did end up using it as the EVO 8 interface I use has a bit of noise at high gain. It’s unlikely to be an issue in a streaming setting, but I had the FetHead, so why not?

Finally, and least importantly (probably), is the form factor. As I mentioned, the Worker Bee’s shock mount was constantly bumping into my chest. I could’ve moved it up and angled it differently, but then it would be in the way of my monitor. The SM7B attaches to the mic arm with a yoke in the center of the body, with the capsule sticking out a few inches from there. That allows me to keep the boom arm and the mic body out of my way while still getting the capsule close to my mouth. It needs to be said that an SM57 would accomplish the same thing, but I already knew I didn’t like the tone of that mic.

Final thoughts and recommendations

Before you ask, yes, I am keeping the mic. But should you get one? Probably not, at least not right away. I am a firm believer that you shouldn’t invest a lot of money (and this mic is a lot of money) in your stream until you are certain you want to keep doing it. You should wait until you have enough income from the stream to pay for the mic and then, if you still want it, go ahead and buy it. There are plenty of cheaper ways to get good audio, as covered in other posts here, in my videos, and in plenty of other videos. And yes, I realize I’m being a hypocrite, but I have a full-time job that pays me well, so I can afford it. But it is a good mic and just applying EQ to a cheap mic will not magically turn it into an SM7B. There are many factors that contribute to its legendary status and I now understand why it is a worthwhile purchase.