Category Archives: Technology and Theory


Like Factoids, but not….



Audio technology is different from any other type of engineering I know of. In normal engineering, we start with a certain problem and then try to invent a solution to it. But in audio, most of the time, we start with an invention and then try to discover what it might be good for. However, just because one has a cool invention does not mean that the invention is useful, or any improvement to the art.


Stereo Imaging. (Reply to an AK question.)

Technically speaking, “Stereo Imaging” is generally considered to consist of three components: localization, spaciousness and envelopment.

Localization: The positioning of each sound source in space. (The perceived width of the source itself is called, “localization blur.”)

Spaciousness: The perceived width of the sound field, including all sources.

Envelopment: The sense that the listener is immersed in a 3D sound field.

Each of these components depends on somewhat different factors, and the factors can often be somewhat mutually exclusive. For example, it can be difficult for a speaker that creates a sense of envelopment to excel at localization, unless certain very strict conditions are met. Like anything having to do with human perception, it is impossible to give a hard and fast hierarchy of importance to the factors that lead audiophiles to proclaim a certain stereo pair of speakers to be good at "imaging." Aside from the recording and any post-processing (how it was made, and how it interacts with a given speaker and setup), which obviously have make-or-break importance, here is a list off the top of my head:

– symmetrical location of the speakers with respect to the listener.

– the impulse response of the speakers.

– precise frequency response matching between the speakers.

– the angle subtended by the speakers WRT the listener.

– the presence or absence of nearby reflecting objects and boundaries, both to the speaker and the listener, along with their level of absorption at various frequencies.

– the acoustical reflectivity of the floor, and/or the vertical polar response of the speakers.

– the horizontal polar response of the speaker.

Anyone who thinks that wires or amps (etc.) have a significant effect on imaging is factually mistaken. The only component besides the speakers that can have even a tiny influence on imaging is the cartridge.

Forget Amp Clipping. It’s Over!!

#57, Today, 02:30 AM
Posted by ken kantor (Northern California; AudioKarma member since Nov 2006)

Originally Posted by TerryS:

This question has been floating around for a few decades. I didn’t expect it would be answered this week (if at all). But I still enjoy the discussion.


The question was answered decades ago. It's just that some audio hobbyists don't want to accept the answer, and go to great lengths to avoid doing so.

This is a pattern that emerges in various topics related to hifi. It would seem that audiophiles are rather stubborn, and also hold their individual perceptions and opinions as infallible.

Playing dice with the Universe.


Small Differences, V0.0547

A short AudioKarma post from this morning:

So true!

If only this stuff were as easy as "Just Listen!" or "Measurements Never Lie!"

The reality is that firmly establishing or disproving a subtle audible difference can be a bit complicated and time consuming. However, the flip side is that a difference that small is unlikely to seriously compromise the enjoyment of the system.

Also, with speakers at least, I would estimate that 9 times out of 10, neither alternative can be definitively claimed as being, “more correct.”
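As one concrete illustration of "complicated and time consuming": the binomial statistics of a blind ABX test. Even a listener who hears a real difference about 70% of the time needs dozens of trials before chance can be ruled out. (A sketch using only the standard library; the function name and trial counts are illustrative, not a testing protocol.)

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: the probability of getting at least
    `correct` right answers in `trials` pure guesses (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# A listener who is right ~70% of the time:
print(abx_p_value(7, 10))    # 7/10 correct: not significant (~0.17)
print(abx_p_value(14, 20))   # 14/20 correct: borderline (~0.058)
print(abx_p_value(21, 30))   # 21/30 correct: significant (~0.021)
```

Ten quick trials prove nothing either way; a subtle difference takes a long session, and a null result takes even more care to interpret.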


Lightweight Article on a Heavyweight Subject.

A Ditty on Loudspeaker Placement.

Toe them outward slightly to help the image mesh into a continuum. It is also worth experimenting with moving one speaker a few inches left or right. (If one speaker is closer to a side wall than the other, that is the one to experiment with.) In terms of distance, you should ideally sit back roughly 1.3 times the distance between the speakers.

I realize the above sounds counterintuitive, but it will help. The root issue is the width of the human head: the geometry must land on the proper multiples of that dimension, which is why small changes in position mean a lot.
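The numbers can be sketched directly; the 1.3 ratio is from the text above, while the function name and the 2.5 m example spacing are illustrative:

```python
from math import atan, degrees

def listening_geometry(spacing_m: float, ratio: float = 1.3):
    """Recommended listening distance (ratio * spacing, per the post)
    and the angle the speaker pair subtends at that distance."""
    distance = ratio * spacing_m
    subtended = degrees(2 * atan((spacing_m / 2) / distance))
    return distance, subtended

d, a = listening_geometry(2.5)   # speakers 2.5 m apart
print(f"sit back {d:.2f} m; pair subtends {a:.1f} degrees")
```

For 2.5 m spacing this puts the listener about 3.25 m back, with the pair subtending roughly 42 degrees, a bit wider than the classic equilateral 60-degree setup.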

NHT VT-2 Comments (Reply to AK inquiry.)

Flipping the switch on the VT-2 does two things.

1- In “audio” mode, the main lobe of the speakers is tilted inward, in order to partially simulate the NHT 21-degree schtick. In “video” mode, the primary axis is directly forward from the speaker baffle, as is more conventional. This changes the direct/reflected ratio to give the audio mode more focus/sweet spot and the video mode more coverage/spaciousness.

2- In “audio” mode, the crossover attempts to maintain a textbook transition between the drivers, leading to a very clean impulse response. In “video” mode, the frequency bands of the drivers overlap slightly. While having almost no effect on the frequency response or tonal balance, this makes the impulse response less sharp, and creates the kind of stereo imaging I believe is more appropriate for video use, with widened sources and less specific lateralization.

I always liked the VT-2’s personally. They had a solidity, dynamics and power handling that seemed to work well for rock music, good acoustic jazz and, of course, soundtracks. It was also a time when NHT was walking down new roads of exploration, treating the temporal and spatial response of speakers with as much thought and care as frequency response, trying to get control of aspects of the sound that had generally been left to trial and error in a design.


Dummy Load #001

Acoustic Suspension Issues.

From the Madisound BBS, a few years ago. 

I have a somewhat different take on your question about acoustic suspension speakers than others who have replied so far, FWIW. I believe that most sophisticated designers are by now aware of the situations where acoustic suspension speakers would make the best choice, and where a different approach might be preferable.

Efficiency seems to have dropped off the average customer’s radar screen. Power is now so cheap, and the ultimate difference in max SPL between AS and vented is small enough, that I don’t think this is a major issue these days. I can’t back that up with hard data, but it is common industry wisdom.

Also, it is not as simple as using any woofer that happens to work in a sealed box. If the mechanical suspension is contributing most of the spring force, it isn’t AS; it’s more like an infinite baffle. To get all the characteristics of a proper AS design, the box compliance must dominate. How important this is depends on how good your mechanical suspension is, and how linear your motor is. (I’m not debating the merits, just clarifying the terminology.)
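The "box compliance must dominate" condition can be put in standard closed-box (Thiele-Small) terms: the compliance ratio alpha = Vas/Vb measures how much of the restoring force the air spring supplies, and a true AS design wants alpha well above 1. A minimal sketch with hypothetical driver numbers (not the parameters of any speaker mentioned here):

```python
from math import sqrt

def closed_box(fs_hz: float, qts: float, vas_l: float, vb_l: float):
    """Closed-box system resonance and Q from driver Thiele-Small
    parameters. alpha = Vas/Vb is the compliance ratio: the air
    spring dominates the driver's own suspension when alpha >> 1."""
    alpha = vas_l / vb_l
    fc = fs_hz * sqrt(1 + alpha)    # system resonance
    qtc = qts * sqrt(1 + alpha)     # system Q
    return alpha, fc, qtc

# Hypothetical high-compliance woofer: Fs = 18 Hz, Qts = 0.35, Vas = 280 L
alpha, fc, qtc = closed_box(18, 0.35, 280, 70)   # in a 70 L sealed box
print(alpha, round(fc, 1), round(qtc, 2))        # alpha = 4: the box dominates
```

With alpha = 4, the air spring supplies 80% of the total stiffness, so the alignment is set by the (well-controlled) box rather than the (lossy, nonlinear) mechanical suspension.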

The problem is that manufacturing a proper acoustic suspension woofer is a real pain. Few driver companies are anxious to put them in their standard line. In turn, this means that a designer/brand wishing to make an acoustic suspension speaker has to devote extra effort towards engineering a full-custom driver, and paying the premium that this entails.

To summarize a complex issue, the high compliance of acoustic suspension woofers makes them ill-suited to automated production. It’s difficult to handle very soft parts, keep surrounds in shape as you glue them, maintain positions exactly as the glue dries, etc. Also, on a production line, it’s not easy to rapidly test an Fs below 20 Hz. Data acquisition time and ambient vibrations are both culprits here.
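The data-acquisition point can be made concrete: resolving a resonance to within a fraction of a hertz requires an observation window of roughly the reciprocal of that resolution, which gets slow at infrasonic frequencies. A back-of-envelope sketch (the figures are illustrative, not a production-test spec):

```python
def acquisition_time_s(fs_hz: float, resolution_hz: float = 0.5,
                       cycles: int = 10) -> float:
    """Rough lower bound on per-driver test time: a window of at
    least 1/resolution seconds to resolve Fs to +/- resolution,
    and at least `cycles` full cycles for a stable reading."""
    return max(1.0 / resolution_hz, cycles / fs_hz)

print(acquisition_time_s(18))                      # ~2 s at +/- 0.5 Hz
print(acquisition_time_s(18, resolution_hz=0.25))  # ~4 s at +/- 0.25 Hz
```

A few seconds per driver sounds trivial until it is multiplied across an entire production line, and a long window also gives ambient vibration more opportunity to corrupt the measurement.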

(After I chose to design the 1259 woofer for the NHT 3.3 as an acoustic suspension, it took a great deal of arm-twisting to get any of our suppliers to even quote on a driver with an Fs below 25 Hz. The lack of AS woofer availability was one reason I decided to make the 1259 available to DIY.)

There’s one more factor that I personally think inclines speaker designers away from AS. Let me try to explain:

With an AS design, there is one, and only one, box that will lead to the target response. There is no way to tweak the response of the system to "tune" things after the fact (besides active EQ, and a very small effect from stuffing). If the production driver does not match the prototype, or if the designer wants to fuss with the results at the last minute… too bad. This means that AS design is less forgiving of error or uncertainty. On the other hand, many designers like to tweak the bass response with port tuning, long after the drivers have been placed on order. Given a typical woofer production lead time of several months, this can be significant.
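The "one, and only one, box" point falls straight out of closed-box (Thiele-Small) theory: Qtc = Qts * sqrt(1 + Vas/Vb), so a chosen target Qtc fixes Vb exactly, with no port to retune afterward. A sketch with hypothetical driver numbers:

```python
def required_box_volume(qts: float, vas_l: float, qtc_target: float) -> float:
    """The single sealed-box volume that yields a target system Q:
    Qtc = Qts * sqrt(1 + Vas/Vb)  =>  Vb = Vas / ((Qtc/Qts)**2 - 1)."""
    ratio = (qtc_target / qts) ** 2
    if ratio <= 1:
        raise ValueError("target Qtc must exceed the driver's Qts")
    return vas_l / (ratio - 1)

# Hypothetical driver: Qts = 0.35, Vas = 280 L, aiming for Qtc = 0.707
print(round(required_box_volume(0.35, 280, 0.707), 1))   # ~90.9 L, no choice
```

If the production driver comes in with a different Qts or Vas than the prototype, that single correct volume moves, and a cabinet already in tooling cannot follow it.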

Sorry if this was too much information….


Re: Cording

Another Editorial from a few years ago.
High fidelity recording…. frankly, it’s a bigger can of worms than just about anyone wants to think about.
Audiophiles take it for granted that because we have only two ears, two signal channels must theoretically be able to reproduce exactly what we hear, provided the channels are applied correctly.  Unfortunately, the two-ear/two-channel logic is false, and only occasionally leads to really good listening experiences.  The problem arises because human ears are not just fixed points in space, to be simply fed from fixed microphones.  Rather, human hearing works in a complex way, with the ears and head cooperating to understand the timbre and direction of impinging sounds.  In other words, our ears process the local sound fields near to them, not just fixed points; they are moving, intelligent, directional sensors.  These varying local fields must be recreated as functions of time in order to properly convince the hearing system of a virtual reality. 
In one very reasonable analysis, recording/playback is essentially a spatial-sampling problem.  For reliable reproduction, many sample locations (channels) are required in a given room.  Further, if one samples the recording room at an inter-microphone distance greater than half the shortest acoustic wavelength one needs to reproduce (the spatial Nyquist limit), one gets stuck with spatial aliasing during playback.  (Not frequency aliasing, spatial aliasing… the existence of spurious artifacts in the geometry of the reproduced sound field.)  This is neither negligible nor trivial.  In fact, some of us believe it is an important underlying reason why recordings tend to sound non-live, and just don’t get much better as one tweaks the tonal spectrum, increases the dynamic range or lowers the distortion beyond reasonable levels of accuracy.  Add to this spatial aliasing problem the differing boundary conditions between recording and playback environments, and true “fidelity” becomes an extremely elusive goal. 
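To attach numbers to the spatial-sampling argument: avoiding spatial aliasing requires adjacent microphones to be spaced closer than half the shortest wavelength to be captured. A minimal sketch (assuming c = 343 m/s, the speed of sound in air at roughly 20 C):

```python
def max_mic_spacing_cm(f_max_hz: float, c: float = 343.0) -> float:
    """Spatial-Nyquist spacing: microphones must sit closer than half
    the shortest wavelength of interest, d < lambda/2 = c / (2*f)."""
    return 100.0 * c / (2.0 * f_max_hz)

for f in (1000, 5000, 20000):
    print(f, "Hz ->", round(max_mic_spacing_cm(f), 2), "cm")
```

At 20 kHz the required spacing falls below a centimeter, which is why a sound field faithfully sampled across a whole room would need an enormous number of channels, and why a stereo pair is nowhere close.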
It’s not just difficult to achieve true fidelity; it is almost impossible to properly define it.  The anachronistic notion of a perfect microphone connected to a perfect speaker is quaint, and it is easy to understand, but it is essentially flawed.  In reality, there is no perfect one-to-one mapping between a sound field and any finite number of v(t) electrical signals.  This renders all recordings spatially ambiguous.  Transfer-function-like descriptions of recorded fidelity (e.g., “frequency response”) are non-operative when mathematically irreversible changes of signal dimensionality are involved, i.e., when one or more microphones are used.  Microphones collapse four dimensions of information, P(x,y,z,t), into two dimensions of signal, v(t).  Transfer functions, of course, can be correctly applied to amps and CD players and such, only because the mathematical dimensionality of the signal is not changed: the 2-D signal remains a 2-D signal.  When ears, loudspeakers or microphones come onto the scene, one is forced to employ non-transform metrics of recorded quality.  I realize this flies in the face of some very entrenched ideas, but: the frequency response of a speaker is undefined!
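One way to see the irreversibility: two different source layouts can produce identical microphone signals, so no processing of v(t) can recover the field that produced it. A toy free-field model (point sources with 1/r spreading and pure delay; all geometry and numbers are illustrative):

```python
import numpy as np

C = 343.0                       # speed of sound, m/s
FS = 48000                      # sample rate, Hz
t = np.arange(0, 0.01, 1 / FS)
s = np.sin(2 * np.pi * 440 * t)  # a 440 Hz source waveform

def mic_signal(sources):
    """Toy free-field mic: each (amplitude, distance) point source
    contributes amp * s(t - r/C) / r, with the delay applied as a
    whole-sample shift."""
    out = np.zeros_like(t)
    for amp, r in sources:
        delay = int(round(FS * r / C))
        out[delay:] += amp * s[:len(t) - delay] / r
    return out

# Two unit sources, both 2 m from the mic but on opposite sides...
v1 = mic_signal([(1.0, 2.0), (1.0, 2.0)])
# ...produce exactly the same v(t) as one double-strength source:
v2 = mic_signal([(2.0, 2.0)])
print(np.allclose(v1, v2))      # True: the field is not recoverable from v(t)
```

Distinct four-dimensional pressure fields collapse onto the same one-channel signal; that many-to-one mapping is the spatial ambiguity the paragraph above describes.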
It is possible to pick some arbitrary functions to map between the recording and playback spaces, while ignoring the inherently underdetermined nature of these functions.  For example, the average power spectrum with such and such time constants, or an anechoic impulse response at one meter, can be optimized by design engineers, and we all call it a day.  The audio industry, of course, then argues ad infinitum about which of these functions (if any) best fits the human cognitive process, ignoring how profoundly reductionistic they all are.  This is essentially the situation we have in the audio industry today.  In fact, it is the best of what we have today.  Unfortunately, many in the industry, and many of the industry’s consumers, don’t understand what a profound and inadequate compromise such “specifications” as frequency response or distortion really are, when it comes to capturing and perceiving acoustic events.  It is certainly better for all that objective specifications do exist and are used.  The flip side is that specifications become a tool of stagnation, even in the hands of the well-intentioned. 
So where to go next?  I have in the past called recording “lossy compression.”  We take an informationally complex signal, we ignore aspects of it believed to be non-essential, and we encode the remaining data.  That is exactly what happens in the recording process, regardless of the number of bits, channels or sample rate involved.  Compression starts at the microphone.  Even if the recording medium itself has perfect electrical fidelity, it can at best preserve only that fraction of the information that the microphones send to it.  OK, reductionism is always necessary in life, and it can be done in sensible ways.  The trick is to toss the right data based on an understanding of the perceptual system, instead of the current trial-and-error machinations that comprise “mic placement” and “speaker positioning.”   There are a few people thinking this way, working on acoustic “wavefield synthesis” and “auralization.”  Good stuff.  But far, far from the kind of record/playback schemata that are presently entrenched.   
[2014SEP30:  Edited for minor corrections.]