Re: Cording

Another Editorial from a few years ago.
High fidelity recording….  frankly, it's a bigger can of worms than just about anyone wants to think about.
Audiophiles take it for granted that because we have only two ears, two signal channels must theoretically be able to reproduce exactly what we hear, provided the channels are applied correctly.  Unfortunately, the two-ear/two-channel logic is false, and only occasionally leads to really good listening experiences.  The problem arises because human ears are not just fixed points in space, to be simply fed from fixed microphones.  Rather, human hearing works in a complex way, with the ears and head cooperating to understand the timbre and direction of impinging sounds.  In other words, our ears process the local sound fields near to them, not just fixed points; they are moving, intelligent, directional sensors.  These varying local fields must be recreated as functions of time in order to properly convince the hearing system of a virtual reality. 
In one very reasonable analysis, recording/playback is essentially a spatial-sampling problem.  For reliable reproduction, many sample locations (channels) are required in a given room.  Further, if one samples the recording room at an inter-microphone distance greater than the spatial Nyquist limit, that is, more than half the wavelength of the shortest acoustic waves one needs to reproduce, one gets stuck with spatial aliasing during playback.  (Not frequency aliasing, spatial aliasing… the existence of spurious artifacts in the geometry of the reproduced sound field.)  This is neither negligible nor trivial.  In fact, some of us believe it is an important underlying reason why recordings tend to sound non-live, and just don’t get much better as one tweaks the tonal spectrum, increases the dynamic range or lowers the distortion beyond reasonable levels of accuracy.  Add to this spatial aliasing problem the differing boundary conditions between recording and playback environments, and true “fidelity” becomes an extremely elusive goal.
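To put rough numbers on the spatial-sampling argument, here is a minimal sketch.  It assumes the usual sampling criterion (microphone spacing no greater than half the shortest wavelength of interest) and a speed of sound of 343 m/s; the function names are made up for illustration and are not from any audio library.

```python
# Rough spatial-sampling arithmetic. Assumptions (not from the editorial):
# speed of sound of 343 m/s, and the half-wavelength spacing criterion.
SPEED_OF_SOUND_M_S = 343.0


def max_mic_spacing_m(f_max_hz, c=SPEED_OF_SOUND_M_S):
    """Largest microphone spacing (meters) that avoids spatial aliasing
    up to f_max_hz, taken as half the shortest wavelength."""
    if f_max_hz <= 0:
        raise ValueError("f_max_hz must be positive")
    shortest_wavelength_m = c / f_max_hz
    return shortest_wavelength_m / 2.0


def aliasing_onset_hz(spacing_m, c=SPEED_OF_SOUND_M_S):
    """Frequency above which a given spacing (meters) starts to alias spatially."""
    if spacing_m <= 0:
        raise ValueError("spacing_m must be positive")
    return c / (2.0 * spacing_m)


if __name__ == "__main__":
    print(f"Spacing needed for 20 kHz: {max_mic_spacing_m(20000.0) * 1000:.2f} mm")
    # 0.17 m is only an illustrative spacing, roughly an ear-to-ear distance.
    print(f"Aliasing onset at 0.17 m spacing: {aliasing_onset_hz(0.17):.0f} Hz")
```

Under these assumptions, covering the full audio band would call for microphones less than a centimeter apart, while a spacing of 0.17 m starts to alias at about 1 kHz.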
It’s not just difficult to achieve true fidelity, it is almost impossible to properly define it.  The anachronistic notion of a perfect microphone connected to a perfect speaker is quaint, it is easy to understand, but it is essentially flawed.  In reality, there is no perfect one-to-one mapping between a sound field and any finite number of v(t) electrical signals.  This renders all recordings spatially ambiguous.  Transfer-function-like descriptions of recorded fidelity (e.g., “frequency response”) are non-operative when mathematically irreversible changes of signal dimensionality are involved, i.e., when one or more microphones are used.  Microphones collapse four dimensions of information, P(x,y,z,t), into two dimensions of signal, v(t).  Transfer functions, of course, can be correctly applied to amps, CD players and such, only because the mathematical dimensionality of the signal is not changed.  The 2-D signal remains a 2-D signal.  When ears, loudspeakers or microphones come onto the scene, one is forced to employ non-transform metrics of recorded quality.  I realize this flies in the face of some very entrenched ideas, but: the frequency response of a speaker is undefined!
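The many-to-one nature of that collapse can be shown with a toy case.  The sketch below is an illustration under stated assumptions, not a model of any real microphone: it assumes an ideal omnidirectional point microphone at the origin, unit-amplitude 1 kHz plane waves in two dimensions, and hypothetical helper names.  Two waves arriving from directions 90 degrees apart give the same v(t) at the microphone, even though the two sound fields differ everywhere else.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value


def plane_wave_pressure(x, y, t, freq_hz, direction_deg, c=SPEED_OF_SOUND_M_S):
    """Pressure at point (x, y) and time t of a unit-amplitude plane wave
    travelling along direction_deg (2-D, free field)."""
    k = 2.0 * math.pi * freq_hz / c          # wavenumber in rad/m
    theta = math.radians(direction_deg)
    kx, ky = k * math.cos(theta), k * math.sin(theta)
    return math.cos(kx * x + ky * y - 2.0 * math.pi * freq_hz * t)


def mic_signal(direction_deg, times, freq_hz=1000.0, mic_xy=(0.0, 0.0)):
    """v(t) seen by an ideal omnidirectional point microphone at mic_xy."""
    mx, my = mic_xy
    return [plane_wave_pressure(mx, my, t, freq_hz, direction_deg) for t in times]


# 10 ms of samples at 48 kHz.
times = [n / 48000.0 for n in range(480)]
from_front = mic_signal(0.0, times)
from_side = mic_signal(90.0, times)

# Identical at the microphone...
same_at_mic = all(
    math.isclose(a, b, abs_tol=1e-12) for a, b in zip(from_front, from_side)
)

# ...but the two fields disagree 10 cm away from it.
differ_elsewhere = not math.isclose(
    plane_wave_pressure(0.1, 0.0, 0.0, 1000.0, 0.0),
    plane_wave_pressure(0.1, 0.0, 0.0, 1000.0, 90.0),
    abs_tol=1e-6,
)

if __name__ == "__main__":
    print("same v(t) at the microphone:", same_at_mic)
    print("fields differ away from the microphone:", differ_elsewhere)
```

Since two distinct fields map to one signal, no operation applied to v(t) alone can tell them apart, which is the irreversibility the paragraph above describes.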
It is possible to pick some arbitrary functions to map between the recording and playback space, while ignoring the inherently underdetermined nature of these functions.  For example, the average power spectrum with such-and-such time constants, or an anechoic impulse response at one meter, can be optimized by design engineers, and we all call it a day.  The audio industry, of course, then argues ad infinitum about which of these functions (if any) best fits the human cognitive process, ignoring how profoundly reductionistic they all are.  This is essentially the situation we have in the audio industry today.  In fact, it is the best of what we have today.  Unfortunately, many in the industry, and many of the industry’s consumers, don’t understand what a profound and inadequate compromise such “specifications” as frequency response or distortion really are, when it comes to capturing and perceiving acoustic events.  It is certainly better for all that objective specifications do exist and are used.  The flip side is that specifications become a tool of stagnation, even in the hands of the well-intentioned.
So where to go next?  I have in the past called recording “lossy compression.”  We take an informationally complex signal, we ignore aspects of it believed to be non-essential, and we encode the remaining data.  That is exactly what happens in the recording process, regardless of the number of bits, channels or sample rate involved.  Compression starts at the microphone.  Even if the recording medium itself has perfect electrical fidelity, it can at best preserve only that fraction of the information that the microphones send to it.  OK, reductionism is always necessary in life, and it can be done in sensible ways.  The trick is to toss the right data based on an understanding of the perceptual system, instead of the current trial-and-error machinations that comprise “mic placement” and “speaker positioning.”  There are a few people thinking this way, working on acoustic “wavefield synthesis” and “auralization.”  Good stuff.  But far, far from the kind of record/playback schema that are presently entrenched.
[2014SEP30:  Edited for minor corrections.]


  • Philosophil  On September 15, 2013 at 6:41 am

    Another problem, I suggest, is that many think of sound as if it were some kind of ‘thing’ or ‘object’ like a guitar or drum where the goal of a recording is simply to capture and replicate the ‘sound’ of that guitar or that drum. But sound seems to be much more complex than that.

    First, unlike ‘things’ or ‘objects,’ which we tend to think of as being where and what they are independent of their context or relations, sound seems to be inherently relational in that ‘what’ it is cannot be separated from the relations within which it is bounded and constituted. Understood in this respect, sound is inherently contextual and varies, literally, from context to context. This doesn’t mean that sound is not real, only that its reality is more complex than that of a simple, non-relational ‘thing’ (assuming it even makes sense to speak of non-relational things).

    Second, I would also suggest that sound, as something experienced, is also complex insofar as it is never experienced ‘directly’ as some kind of raw, primitive percept, but is instead the end result of a synthetic, ‘interpretive’ process. Thus what we call sound is always processed in some way. This introduces a plethora of complications for the naive audiophile, for it means that conditions such as psychological expectations can play a significant role in the experience of sound. This doesn’t mean that sound is reductively subjective, but it does make the task of distinguishing the ‘subjective’ and ‘objective’ elements of sound more problematic.

    Thanks again.

    • Ken Kantor  On January 27, 2014 at 11:03 pm

      Thanks for this. I fully agree!
