RECORDING : About Hearing 1.0

About Hearing 

© 2000 by David Moulton

Why Do We Do This Stuff?


We’re all in this crazy business because we love music.  And most of us who have gravitated into the recording part of this crazy business have done so because we are similarly hooked on sound.  We really and truly get off on the stuff.  We like what it does to us. We wallow in the sensory luxury of the really spectacular sonic magic that comes out of loudspeakers.  

Speaking for myself, I have always been fascinated by the “sound character” of all sorts of environments, machines, and other incidental noise sources, as well as musical instruments.  It seems to me that I perceive things more in terms of their sound than their visual appearance or odor.  I have occasionally even toyed with the idea of writing a detective novel where most of the descriptions would be auditory rather than visual, as in (in my first noir attempt): “After she slammed the door on her way out, I sank into what was left of stillness in my harshly echoing office, until beneath the rising surface of quiet I could once again hear the insistent rush of traffic below the windows, occasional horn stings and the rapid shrilling of a prowl car bulling its way through pedestrians, red lights and a gridlock as tangled as my mind.  The reverberations of her anger slowly tailed off into urban night noise, Fiona Apple in my inner ear with a passing boomvan keeping time.”  Oh well.  Even though Spenser has nothing to fear from me, I hope you get the idea.  Things and emotions can be perceived in terms of how they sound.

The point is, I hear my world as much as I see it, and usually I am acutely aware of how it sounds at a perceptual level that seems to me to be more conscious than what the average person experiences. There was a stretch, for instance, when I first really got into reverb, during which I felt like I spent all my time hearing the spaces between notes and had the damnedest time keeping the notes themselves in mind.  Clients would be moaning about distortion on the guitar track and all I could hear was the really interesting spatial double decay of the snare hit from the combination of the overheads and the guitar mics!  This is, of course, one of the curses of being a recording engineer.  

I suspect that for many of our end-users, much of the emotional impact of sound is mostly pre-conscious and that they are not as aware of the effect that the sound of a given situation is having upon them.  And that has some interesting implications for people in our line of work, which I’ll get to in a minute.  Meanwhile, I also suspect that many of you are like me, or else you wouldn’t be reading this.  All this sound stuff, these emotional meanings that sound has for us, are central elements in our recording craft.  Mostly we take all this for granted.  

The reason I’m telling you all this is to illuminate just how clear, powerful and effective our sense of hearing is.  But what I really want to discuss is how that hearing system works, and how it is that it works so well that we aren’t even aware of it working.  And once we know this stuff, we can really expand our craft as recording engineers.  Avanti!  

So How Do We Hear This Stuff?

Our perception of sound is so easy, so seamless, so unequivocal and clear, that it doesn’t seem like there is much to it.  Guy/gal makes a cool noise, we hear it.  Cool!  What could be simpler?  

Naturally, when we try to make a recording of said guy/gal’s sound and play it back out of loudspeakers we run into a little trouble, as you may have noticed.  Guy/gal makes a cool noise, we record it, play it back.  Not quite so cool.  Why is that?  The obvious answer that most of us like to fall back on is that our equipment isn’t good enough.  And so there’s a lot of blather about accuracy going around these days, and we worry about our gear.  We reason that if the gear was “really accurate,” why, it’d sound exactly like that soprano didgeridoo we’re trying to overdub!  

There is another explanation, however.  What we perceive isn’t exactly what went in our ears.  In fact, it isn’t like what went in our ears at all!  What we consciously “hear” is far removed from the physical stimulus called “sound waves” that entered our ears.  And because there is such a huge metamorphosis between our ears and our mind, it isn’t reasonable to assume that just because we think we’ve made a really physically “accurate” recording of that soprano didgeridoo, we have made a recording that is accurate for our perception.  

Just because we’ve used “really accurate” microphones, consoles, recorders and speakers doesn’t guarantee a whole lot, it turns out.  We need to consider our hearing mechanism in a little more detail, and understand a bit more about what it is doing.  And, along the way, perhaps we need to redefine what we mean by “accurate.”  

So, I’d like to devote a couple of articles over the next several months to the human auditory system and how “the way it works” affects our work as recording engineers and producers.  As I mentioned earlier, most people get their emotional hits from music and sound at a pre-conscious level.  And that gives us some powerful mojo.  If we can get control of the raw sonic materials that generate those emotions, why, we can really get our listeners going and they’ll never even have a clue as to why!  Maybe we can even come to rule the world!   

So What’s Really Going On Here?

You all know the basics about hearing.  You know, for instance, about the holes on each side of our heads, with the funny-looking flaps and the microphones at the inner end of those holes.  The holes are called ears, the flaps are called pinnae and the microphones are called eardrums.  We also know that, somehow, the air pressure change detected by those microphones gets sent to our brain, and that what came into our TWO ears gets combined so that we can figure out where the sound is coming from.  Probably, you also know some other stuff, such as that the limits of our hearing run from 20 Hertz to 20,000 Hertz, that bats and dogs hear much higher, that cats hear softer, yada yada.  

If you’ve done your reading a little more carefully, you may know that the softest sound we can hear is called 0 dB Sound Pressure Level and that 120 dB SPL is the loudest sound we can stand, sort of.  You may have heard stuff like 1% distortion is inaudible, but also heard that there are guys ‘n gals out there that can hear .001% distortion.  You may even know the implication of those numbers.  
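Those dB SPL numbers come straight from a simple logarithmic formula.  Here’s a minimal sketch in Python, assuming the standard reference pressure of 20 micropascals (the nominal threshold of hearing at 1 kHz):

```python
import math

P_REF = 20e-6  # reference pressure in pascals: the nominal threshold of hearing

def spl_db(pressure_pa):
    """Convert an RMS sound pressure in pascals to dB SPL."""
    return 20.0 * math.log10(pressure_pa / P_REF)

print(spl_db(20e-6))  # threshold of hearing: 0 dB SPL
print(spl_db(20.0))   # roughly the most we can stand: 120 dB SPL
```

Notice what that 0-to-120 dB range means physically: a pressure ratio of a million to one, all handled by the same pair of ears.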

But what isn’t really discussed, or perhaps even considered, is how and why our auditory system works the way it does AS A SYSTEM.  We gloss over the difficult bit about how the sound gets from our eardrums to our brain (“Like, it just goes there, man.  You know, like, through the nerves.  Yeah, that’s right, it goes through the nerves, which are just like Monster Cable!”).  Nor do we really consider how, or why, it evolved as it did, other than a tired line or two about needing to be able, as cave-persons, to localize the saber-toothed tiger just before being converted into deviled ham.  

The Auditory System  

So let’s look at the big picture (er, sound), for a second.  We’ve got the mechanical sensing system, called the ears.  Then there is a transducing system in the inner portion of each ear that converts the detected mechanical motion into neurological impulses.  These impulses get sent to a portion of the brain called the auditory cortex via bundles of auditory nerves (which deserve some serious consideration all by themselves) and a series of intermediate stages in the nervous system and brain.  During this transmission process a lot happens to transform the neural information that was sent from the ears. Auditory neural impulses from the two ears get integrated together (exactly how, we don’t know).  These auditory neural impulses are also sent to the central nervous system as information to act upon and react to.  Sensations of pitch, loudness and direction are extracted and/or derived from this auditory neural information.  Finally, the evolved and transformed neural information is sent to the frontal lobes of the brain for perceptual activities like speech processing, identification, memorizing, and conscious perception – all the easy fun stuff that we know and love so well and so much.  It is an extraordinarily complex system, and it does not yield to simplistic explanations about how we so easily and seamlessly perceive our beloved soprano didgeridoo.  

As Observed By Zork-11 From Betelgeuse IV

So let’s look at it from another direction.  What is this system trying to accomplish?  Let’s consider it from the perspective of a visiting alien trying to figure this out.  First off, the physics of it is this: the auditory system is detecting the short-term pressure emissions given off by other organisms and the environment in general as a by-product of their regular activities, over a fairly broad range of frequencies and almost the entire linear range of pressures possible in the gas medium (air) in which we live.  The system permits us to detect, localize, identify and (sometimes) communicate with a few of these other organisms (we call them Humans and Golden Retrievers) in a three-dimensional space around us.  In addition, the system permits us to detect, localize and identify the environment as well.  

Think of it.  We live in this transparent gas called air, and we are really good at detecting short-term patterns of very slight pressure changes in this gas over a huge set of ranges.  And not only do we use this pressure-variation detection ability to determine what is going on around us, we also make up and generate little pressure-variation patterns in this gas just for the fun of it (which we call music)!  And we exchange precious metals between us in return for the fun of detecting such “cute” patterns of gas pressure-variation!  Whoa!  

Zork-11 is impressed!

A List Of The Kinds Of Stuff We Hear

Let’s make a list.  

We detect stuff happening around us, from all directions, all the time.  

We quickly and easily identify the nature of the stuff happening, to the point where we can casually tell apart such subtly different things as the pressure-variations generated respectively by the footsteps of Igor and Samantha, as they walk around in the same room we are in.  

We quickly (and mostly subconsciously) detect the presence of environmental features of our space, like walls, ceilings, etc., even though they don’t generate gas pressure variations.  We do this by detecting the REFLECTIONS of the pressure variations generated by the footfalls of Igor and/or Samantha.  Amazing!  
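To put a rough number on that trick, here’s a back-of-the-envelope sketch in Python (the room geometry is invented for illustration): a single wall bounce simply travels a longer path than the direct sound, and that extra delay is the raw material from which we infer “there’s a wall over there.”

```python
SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 degrees C

def wall_reflection_delay_ms(direct_path_m, source_to_wall_m, wall_to_listener_m):
    """Extra arrival time (ms) of a single wall bounce, relative to the direct sound."""
    extra_path_m = (source_to_wall_m + wall_to_listener_m) - direct_path_m
    return 1000.0 * extra_path_m / SPEED_OF_SOUND

# Igor's footstep 2 m away, also bouncing off a wall 3 m past him and 5 m back to us
print(round(wall_reflection_delay_ms(2.0, 3.0, 5.0), 1))  # about 17.5 ms later
```

A few extra metres of path turn into tens of milliseconds of delay – and our hearing system reads those delays as geometry, without any conscious effort on our part.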

And when either Igor or Samantha (not their real names, by the way) generates pressure-variations using the cute-looking orifice located at the center of their respective heads, we have learned to quickly come to understand that it is time to (a) put out the garbage, (b) get the car fixed, or perhaps on a good day (c) join Igor or Samantha in ingesting a liquid derived from fermented fruits and/or grains.  We call this “communication through the use of spoken language.”  It is a totally amazing trick!  

And finally, in a truly remarkable and almost totally incomprehensible way for our alien visitor and new friend Zork-11, those of us called recording engineers spend our lives fooling around with all these tiny gas pressure-variations for fun, and on a few rare occasions, for precious metals as well!  Like totally awesomely amazing!  Really!!  

And Now, Some Questions

Obviously, it is not the gas pressure variations that we are interested in.  It is, rather, a complex range of information and emotions that is CARRIED BY SUCH PRESSURE VARIATIONS that is what we are really interested in transmitting and perceiving.  

It is useful to ask some questions about that information, such as:

– What are the features that allow us to distinguish one sound from another?  

– How do we distinguish between the sounds of sources and sounds of reflections?  

– How do we extract the sense of multiple pitches from a single complex wave?  

– Why don’t we do this for all complex waves?  

– How come we don’t get hopelessly confused by all the reflections from the environment?  Come to think of it, how come we barely even notice them?  

– How can we recognize Samantha’s voice on the telephone, when it has been band-limited to a very small percentage of her original sound?  

– What makes it possible for us to “hear” that a room has been freshly painted, or that we’ve added a sofa?  
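That telephone question deserves a concrete number.  Here’s a tiny sketch, using the classic analog telephone passband of roughly 300 to 3,400 Hz against our nominal 20 Hz to 20 kHz hearing range:

```python
# Plain-old-telephone passband vs. nominal hearing range, both in Hz
hearing_lo, hearing_hi = 20.0, 20000.0
phone_lo, phone_hi = 300.0, 3400.0

# fraction of the linear frequency range that survives the phone line
fraction = (phone_hi - phone_lo) / (hearing_hi - hearing_lo)
print(round(fraction * 100, 1))  # about 15.5 percent
```

On a linear scale the phone keeps only about a sixth of Samantha’s spectrum (and on a log/octave scale, more like a third) – yet we recognize her voice instantly.  That alone tells you perception is doing something far cleverer than simple waveform matching.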

The fact that all these things occur so naturally and effortlessly in our perception obscures the complexity of the underlying system.  When we make recordings, the mechanical ears that we use, microphones, don’t have that same complex underlying system.  As a result, they lose a lot.  When we play the resulting audio signals back through loudspeakers, they lose even more.  Physically speaking, it is not a pretty picture!  That it works at all is miraculous.  

If we’re gonna be good, really good, at our craft of recording engineering, it behooves us to get a handle on how us humans perceive this stuff, and then to do what we can to make the sounds our loudspeakers put out as useful, informative and entertaining as possible for the human auditory systems possessed by our clients and their fans.   

This involves some hard thinking.  Most of the operation of the hearing system is concealed from us (that’s why I call it pre-conscious).  So we have to work on our ability to infer what is going on by observing the relationship between what we perceive and what we know by physical measurement is happening.  A lot of it, when we get into it, is pretty spooky.  

Hard thinking is mainly a process of asking hard questions.  For instance, the process of neural transmission from the ear to the brain isn’t instantaneous – in fact it takes something like 5 – 10 milliseconds!  So we’re always perceiving a delayed version of what happened.  Given that that is so, how can musicians play together?  And how come we don’t notice the delay?  
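One way to see why that delay doesn’t wreck ensemble playing: sound in air is already slow, so musicians cope with delays of this size all the time just by standing apart from each other.  A quick sketch (taking the speed of sound as 343 m/s):

```python
SPEED_OF_SOUND = 343.0  # metres per second in air

def delay_to_distance_m(delay_ms):
    """How far sound travels through air during a given delay."""
    return SPEED_OF_SOUND * delay_ms / 1000.0

# a 5-10 ms neural delay is acoustically like standing a few metres farther away
print(round(delay_to_distance_m(5.0), 1))   # 1.7 m
print(round(delay_to_distance_m(10.0), 1))  # 3.4 m
```

In other words, the neural lag is no worse than the acoustic lag between two players a few metres apart on stage – a delay musicians absorb without even noticing.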

Another puzzler from the same dismal swamp: in a small room we don’t perceive the early reflections of a sound source as separate events if they arrive within roughly 40 milliseconds of the direct sound – this is part of what is called the Precedence Effect.  How come?  Is this part of what we call masking, where one sound artifact conceals another?  And speaking of masking, did you know that under certain circumstances a sound can be masked by another sound that comes AFTER it?  How can this be?  
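Here’s a deliberately crude model of that fusion, just to make the idea concrete.  The 40 ms figure is only an approximate echo threshold – it varies with the material (shorter for clicks, longer for speech and music) – but the basic behavior looks like this:

```python
ECHO_THRESHOLD_MS = 40.0  # rough fusion limit; varies with the signal

def heard_as(reflection_delay_ms):
    """Crude precedence-effect model: is a reflection fused or heard as an echo?"""
    if reflection_delay_ms < ECHO_THRESHOLD_MS:
        return "fused with the direct sound"
    return "heard as a discrete echo"

print(heard_as(15.0))  # fused with the direct sound
print(heard_as(60.0))  # heard as a discrete echo
```

The real system is nothing like this tidy, of course – the fused reflections still change the perceived timbre and loudness – but the single-threshold picture is a useful first approximation.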

A BIG question from the realm of stereophony has to do with the phantom image.  How come there is one?  Why don’t we get phantom images from two violins playing the same note?  Why does this only seem to happen with loudspeakers?  Zork-11 doesn’t get it!  

How can we hear chords?  Why don’t we hear overtones as chords?  Why don’t we hear barometric changes (they’re gas pressure-variations too, you know)?  How come reverb doesn’t confuse the hell out of us?  How can we actually like the stuff?  Why isn’t an anechoic chamber the best place to play music?  
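That overtone question is worth a concrete look, because physically the overtones of one note are spaced very much like a chord.  A tiny sketch, using A2 at 110 Hz:

```python
fundamental_hz = 110.0  # A2 on a bass or guitar
overtones_hz = [fundamental_hz * n for n in range(1, 7)]
print(overtones_hz)  # [110.0, 220.0, 330.0, 440.0, 550.0, 660.0]
# 330 Hz sits a perfect fifth above 220 Hz, 550 Hz a major third above 440 Hz:
# chord-like spacing, and yet we fuse the whole stack into one pitch
```

So the raw waveform of a single note contains frequencies arranged like a chord, and the auditory system fuses them into one pitch anyway – while happily hearing three separate instruments as a chord.  Why it does one and not the other is exactly the kind of hard question worth chewing on.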

As we begin to pull the answers to these questions together, using what we know about the origins of the hearing mechanism and what it needed to do to help us survive in a Darwinian world of natural selection, we can begin to build up a little bit more robust and sensible understanding of what is really going on with our hearing, and how to use that knowledge for fun and profit.  

Next month, we’ll take a hard look at what I call the Audio Window, the physical ranges within which we perceive sound.  We’ll also consider how analog and digital audio are fitted to those ranges.  Ain’t science fun?

Thanks for listening.  


Dave Moulton is a recent Grammy nominee and author of Total Recording and Golden Ears.  
