Music, Not Sound: Why High-Resolution Music Is a Marketing Ploy


High-resolution music has been in the news over the past few days. Neil Young’s Pono, recently announced, is a new music player designed to play high-resolution music files.[1] Pono will also have a music store; users will be able to buy high-resolution music files and sync them to the Pono Player, in a process that could be as seamless as using iTunes and an iPod.

High-resolution music files cost more than other digital downloads, and cost more than CDs as well. But are they worth the money? Can you hear the difference between a CD and a high-resolution music file?

The answer is most likely no. While there may be a small number of people who have the necessary audio equipment and good enough ears to hear this difference, those people are few and far between. Most people cannot even tell the difference between a high-bit rate MP3 or AAC file and a CD, let alone a high-resolution file.[2]

But digital music purveyors market high-resolution music in an attempt to make purchasers think that they are special, that they may, indeed, be one of the few people who can hear the difference between CDs and high-resolution audio files.

So what exactly is high-resolution music? Why couldn’t it sound better than CDs? And why doesn’t it? You can’t test the subjective experiences of listeners, so how much of that experience is just an expensive placebo effect?

Some Terminology

For any discussion of high-resolution music, it’s important to clear up some terminology. When you see high-resolution music files, you may see them described as, for example, 24/96. This means the music in the files is 24-bit, and 96 kHz. While high-resolution music comes in a number of different levels of quality[3], I’m going to focus here on the most common high-resolution files, which are 24/96.

Let’s begin by explaining the specifications for audio CDs. The Red Book standard[4] specifies not only how CDs are manufactured, but also how recorded music is formatted for them. Audio CDs contain two-channel linear PCM audio[5] at 16-bit and 44,100 Hz; this is commonly abbreviated as 16/44.1. There are two elements here: the bit depth, which is 16-bit, and the sample rate, which is 44,100 Hz.[6]

Bit depth affects the dynamic range of music as well as the signal-to-noise ratio. The dynamic range of music is the difference between the softest and loudest parts of the music. A good example of music with a very broad dynamic range is Mahler’s third symphony. Listen to the final movement, and you’ll hear some very soft sounds as well as some extremely loud ones. Or listen to Led Zeppelin’s Stairway to Heaven; it starts with a soft acoustic guitar and builds up to a fuzz-box crescendo.

The bit depth is essentially the number of variations a recording can choose from in a given slice of time. 16-bit audio allows for a range of 65,536 possible levels; 24-bit audio increases that to 16,777,216 levels. However, between the threshold of hearing and the threshold of pain, humans cannot distinguish enough of these volume differences for this to be noticeable.[7]
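To put numbers on those levels, here is a minimal Python sketch. The function names are my own, and the decibel figure uses the common rule of thumb of roughly 6 dB per bit for ideal linear PCM; treat it as a theoretical ceiling, not a measurement of any real recording.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct values a sample can take at this bit depth."""
    return 2 ** bit_depth

def theoretical_dynamic_range_db(bit_depth):
    """Ratio between the largest and smallest step, in decibels (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(bits, quantization_levels(bits), round(theoretical_dynamic_range_db(bits), 1))
```

Under these assumptions, 16 bits works out to about 96 dB and 24 bits to about 144 dB.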

The second number in our pair is the sample rate: this is the number of “slices” of audio that are made per second, and it is measured in Hz (Hertz). 44.1 kHz means that the music is sampled 44,100 times a second; 96 kHz means it is sampled 96,000 times a second. The sample rate primarily affects the range of frequencies that can be reproduced by a digital music file.

And the combination of the two determines the size of audio files. A CD can contain up to around 80 minutes of music, but if it were encoded at a higher bit depth or sample rate, it would contain less. A four-minute piece of music on a CD takes up about 42.3 MB; at 256 kbps (AAC or MP3), it takes up about 7.5 MB. But jump to a 24/96 file and it is around 138 MB, though lossless compression can shrink it by about 1/3 to 1/2.
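The arithmetic behind those sizes is simple enough to check yourself. This is a rough sketch with function names of my own choosing; it assumes two-channel uncompressed PCM, decimal megabytes (1 MB = 1,000,000 bytes), and a constant bit rate for the lossy file, and it ignores file headers and metadata.

```python
def pcm_size_bytes(bit_depth, sample_rate_hz, seconds, channels=2):
    """Size of uncompressed PCM audio: bytes per sample x samples per second x channels x duration."""
    return (bit_depth // 8) * sample_rate_hz * channels * seconds

def lossy_size_bytes(bit_rate_kbps, seconds):
    """Size of a constant-bit-rate file: bits per second converted to bytes, times duration."""
    return bit_rate_kbps * 1000 // 8 * seconds

FOUR_MINUTES = 240  # seconds

print(pcm_size_bytes(16, 44_100, FOUR_MINUTES) / 1e6)  # CD quality, in MB
print(pcm_size_bytes(24, 96_000, FOUR_MINUTES) / 1e6)  # 24/96, in MB
print(lossy_size_bytes(256, FOUR_MINUTES) / 1e6)       # 256 kbps AAC or MP3, in MB
```

Real files will differ a little from these numbers, since encoders use variable bit rates and files carry tags and artwork.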

Is Bigger Better?

This is where the marketing comes in: bigger is always better. It could seem logical that higher numbers would result in better sounding music, but this isn’t the case. Let’s take the bit depth. 24-bit music, according to the marketing department, sounds better than 16-bit music. Yet 16 bits are more than enough to cover what human beings can hear.[8] Too broad a dynamic range can be harmful; if you set the volume to hear the quiet parts of the music, the loud sections could burst your speakers, and hurt your ears.

And that sample rate? Interestingly, CDs use a sample rate, as we saw above, of 44,100 Hz; not a random number at all. This number was chosen because the highest frequency that humans can hear is around 20,000 Hz. According to the Nyquist theorem[9], the sample rate of music must be at least twice the maximum frequency that humans can hear. Since it’s best to leave a little bit of wiggle room, audio engineers took 20,000 Hz, multiplied it by two, and then added a bit of padding, just in case.[10] Most of us don’t even hear up to 20,000 Hz, and, as we age, our hearing deteriorates. I can’t hear above around 12,000 Hz; you can test your hearing here.

Yet high-resolution audio files at 96 kHz can reproduce sounds up to around 48,000 Hz. Dogs can hear sounds that high; but not humans. In fact, it’s very likely that your stereo system cannot reproduce sounds at such levels. Most standard stereo equipment reproduces sounds from 20 to 20,000 Hz. So for ultrasonic sounds to be reproduced, every element of the audio chain needs to be able to reproduce these sounds. If your amplifier can go up to 40,000 Hz, but your speakers or headphones cannot, no amount of voodoo or magic can make high frequencies audible.
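The relationship between sample rate and the highest reproducible frequency can be sketched in a few lines of Python. The function names are mine, and the 20,000 Hz hearing limit is the round figure used above, not a precise one. The last function is an idealized illustration of what happens to a tone above half the sample rate if nothing filters it out first: it "folds" back down to a lower frequency.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent: half the sample rate."""
    return sample_rate_hz / 2

def padding_above_hearing(sample_rate_hz, hearing_limit_hz=20_000):
    """How much the sample rate exceeds twice the assumed limit of human hearing."""
    return sample_rate_hz - 2 * hearing_limit_hz

def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone would appear after sampling (idealized)."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

for rate in (44_100, 96_000, 192_000):
    print(rate, nyquist_frequency(rate), padding_above_hearing(rate))
```

With these assumptions, a CD tops out at 22,050 Hz, with 4,100 Hz of padding over the 40,000 Hz minimum.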

While it is certainly possible to have stereo equipment that can reproduce ultrasonic frequencies, you’ll never hear them. Worse, very high sample rate music files can actually cause distortion. As an article on xiph.org[11] says, “If the same transducer reproduces ultrasonics along with audible content, any nonlinearity will shift some of the ultrasonic content down into the audible range as an uncontrolled spray of intermodulation distortion products covering the entire audible spectrum.” There are a lot of $10 words in that sentence, but what they mean is that very high sample rates (in this case, 24/192) can actually make music sound worse: intermodulation distortion can occur when the ultrasonics intrude on audible frequencies.
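Here is a toy Python illustration of that effect. Everything in it is an assumption made for the sake of the example: two inaudible tones at 30 kHz and 33 kHz, and a hypothetical "transducer" modeled as a simple second-order nonlinearity (output = input + 0.1 × input²). Real speakers and amplifiers are far more complicated, so read this as a sketch of the principle, not a model of any actual equipment.

```python
import cmath
import math

SAMPLE_RATE = 192_000  # Hz
N = 1_920              # 10 ms of audio; every tone below completes whole cycles in it

def tone_amplitude(signal, freq_hz):
    """Estimate the amplitude of one frequency by correlating with a complex sinusoid."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
              for n, s in enumerate(signal))
    return 2 * abs(acc) / len(signal)

def two_ultrasonic_tones(f1_hz=30_000, f2_hz=33_000):
    """Two equal-level tones, both well above the range of human hearing."""
    return [math.sin(2 * math.pi * f1_hz * n / SAMPLE_RATE)
            + math.sin(2 * math.pi * f2_hz * n / SAMPLE_RATE)
            for n in range(N)]

def nonlinear_transducer(signal, k=0.1):
    """Hypothetical imperfect playback device: adds a small squared term."""
    return [s + k * s * s for s in signal]

clean = two_ultrasonic_tones()
distorted = nonlinear_transducer(clean)

# 33 kHz - 30 kHz = 3 kHz, squarely in the audible range.
audible_before = tone_amplitude(clean, 3_000)
audible_after = tone_amplitude(distorted, 3_000)
print(audible_before, audible_after)
```

The clean signal has essentially nothing at 3 kHz; after the nonlinearity, a 3 kHz difference tone appears that was never in the recording.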

On top of that, hardly anyone can distinguish music at high sample rates from CDs. A number of blind studies have proven this, time and time again.[12]

“Music as It Was Intended to Be Heard”

One of the biggest marketing arguments for high-resolution music files is that “this is how music was intended to be heard.” Pono Music says, “[Musicians] want their music heard and experienced the way they brought it to life with great care and commitment, in the studio.”[13] This is how the music was recorded; this is how engineers heard it when they edited the music. Therefore, this must be better.

Two elements separate the recording studio – or, more correctly, the engineer’s control room – and home listening spaces. First, control rooms have high-quality monitors (speakers) which are neutral, and which are designed to provide the best possible audio fidelity. Second, control rooms are completely soundproof rooms with no parallel surfaces and completely absorbent walls. Again, they are designed to have no obstacles to reproducing the music as it was recorded. But you won’t have that at home, unless you have a very expensive listening room (and there are some people who go to this expense).

Some websites sell high-resolution files under the moniker “studio masters.” And, in fact, these files are studio masters; what engineers used in the studio. But that doesn’t mean that these are files that we should use when listening to music, and it certainly doesn’t mean that they’ll sound the same on home audio systems.

There is a very simple reason why engineers use high bit depths and sample rates when recording music. Digital music involves a lot of calculations; when you make changes to music, with equalization, speed changes, etc., you are multiplying and dividing numbers. When mixing and mastering an album, an engineer performs thousands of operations to alter sound. Each one of these calculations — to simplify — leads to numbers being rounded off. The bigger the numbers, the less of a chance there is for rounding errors to affect the music. But this doesn’t mean that we, as listeners, need the same types of files. We don’t manipulate these files; we may change volume, or even use some subtle EQ, but that’s it.
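A simplified Python sketch makes the rounding argument concrete. The names and the chain of 100 gain changes are hypothetical, and real audio software often works in floating point rather than rounding to a fixed bit depth after every step, so this only illustrates why extra bits give engineers headroom for processing.

```python
import math

def quantize(x, bits):
    """Round a sample in the range -1..1 to the nearest value available at this bit depth."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def max_error_after_processing(bits, gains, samples):
    """Apply each gain in turn, rounding after every step, and report the worst drift."""
    worst = 0.0
    for x in samples:
        ideal = x                    # what exact arithmetic would give
        stored = quantize(x, bits)   # what the fixed-bit-depth file holds
        for g in gains:
            ideal *= g
            stored = quantize(stored * g, bits)
        worst = max(worst, abs(stored - ideal))
    return worst

GAINS = [0.9, 1.1, 0.8, 1.25] * 25                        # 100 hypothetical level changes
SAMPLES = [0.5 * math.sin(0.1 * n) for n in range(1000)]  # a made-up test signal

err16 = max_error_after_processing(16, GAINS, SAMPLES)
err24 = max_error_after_processing(24, GAINS, SAMPLES)
print(err16, err24)
```

The accumulated error at 24 bits comes out far smaller than at 16 bits, which matters when you perform thousands of operations, and not when you simply press play.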

In some ways, suggesting that listeners need studio masters is akin to saying that instead of eating sausages, we should get all of the ingredients put together ourselves. Nevertheless, you will find many vocal audiophiles will provide a number of reasons why they need to listen to music files that contain sounds that they simply cannot hear.

However, if someone really wants to provide “music as it was intended to be heard,” they’d do a lot better to look at the mastering process that’s been destroying music in recent decades. In what is colloquially known as “the loudness wars,” music producers, prodded by record labels, use dynamic compression to increase the overall volume of music, making it sound horrendous. Since, in general, the louder of two songs sounds better, or brighter, when you compare them, producers have been cranking up the volume to make their songs stand out. But string together an album’s worth of overly loud tracks, and it’s fatiguing. It’s a war of attrition, and our ears are the losers. No high-resolution files will make this music sound better, ever.[14]

Also, mastering is often done by someone other than the recording engineer, and someone who may not have been involved in the recording process. So is this music truly the way the artists and engineers intended you to hear it?

Listen Better

As I said in the title of this article: music, not sound. There is a small minority of music listeners who are obsessed by the idea of obtaining “perfect” sound. They go to great lengths, and great expense, to try and reproduce the sound that one hears in a concert hall. By focusing on sound quality alone, it can be easy to neglect the music. Such people may get frustrated if the music doesn’t sound good enough, and find it hard to become immersed in great music.

I’m a music fan. What I want most of all is good music. Some of my best listening experiences have come on tinny record players or booming car stereos. If the music is good, then the sound quality is less important. This said, without getting obsessive, there are a number of ways you can make your music sound better without maxing out your credit card.

For portable listening, start by getting rid of those white earbuds that came bundled with your iPod or iPhone. Get better earbuds, or get proper headphones. With headphones, you get what you pay for, up to a few hundred dollars. After that price point, it gets a bit iffy.

If you listen to music on your computer, get rid of those little desktop speakers and hook up a real stereo. I strongly recommend getting a good DAC (a digital-to-analog converter) because the sound card in your computer is probably not great. (Though no DAC will help if your amplifier and speakers are poor.) I have a DAC between my Mac and my amplifier; I find that it does make a difference, providing a more detailed soundstage.

And if you’re listening to digital music — you’re reading this article, so I assume you are — make sure it is at sufficiently high bit rates. Apple’s iTunes Store sells music at 256 kbps, which, for nearly everyone, is indistinguishable from uncompressed music. If you use MP3 files, go for 320 kbps; it should sound just as good as CDs as well.

But unless you’re willing to spend as much money on your stereo system as you do on your car, and set up an acoustically-controlled room, there is simply no way that high-resolution files will make any difference to the music you listen to. Lots of people try to convince you that there is a difference, but most of these people simply want to take your money. And you have to ask yourself: of the ones who aren’t asking for your money, how many are desperately seeking validation for the very large sums of money they’ve spent on something modern science tells us they cannot hear?


  1. http://www.mcelhearn.com/whats-the-point-of-pono-and-why-are-ponos-numbers-bogus/  ↩

  2. I consider high bitrates to be at least 256 kbps for AAC or 320 kbps (or VBR V0) for MP3 files. Check whether you can hear the difference: http://www.mcelhearn.com/can-you-really-tell-the-difference-between-music-at-different-bit-rates/  ↩

  3. The most common high-resolution music files are 24/48, 24/88.2, and 24/96. Pono will offer files up to 24/192, and some companies sell files up to 24/384.  ↩

  4. https://en.wikipedia.org/wiki/Compact_Disc_Digital_Audio  ↩

  5. Linear pulse-code modulation: https://en.wikipedia.org/wiki/Pulse-code_modulation.  ↩

  6. One must not confuse bit depth and bit rate, which is used to describe how much data is in a music file per second. For example, 256 kbps means that there are 256,000 bits of data per second of music.  ↩

  7. See Is Bits Really Bits?. And, how about a test? Check whether you can hear the difference between music at 16 bits, and the same music reduced to only 8 bits: The 16-bit v/s 8-bit Blind Listening Test. I got 7 out of 10 when I did the test; that’s better than random.  ↩

  8. Dynamic range is quite complicated. See this article for more detailed information than you probably want.  ↩

  9. https://en.wikipedia.org/wiki/Nyquist_frequency  ↩

  10. There are also some other technical reasons why that specific sample rate was chosen. “Professional video recorders were originally used to prepare CD master tapes because they were the only recorders capable of handling the high bandwidth requirements of digital audio signals. Because 16-bit digital audio signals (and error correction) were encoded as a video signal, the sampling frequency had to relate to television standards’ line and field rate, storing a few samples per scan line. […] With three samples per line, 490 x 30 x 3 = 44.1 kHz, it is just right. […] Therefore, 44.1 kHz became the universal sampling frequency for CD master tapes. Because sampling-frequency conversion was difficult, and 44.1 kHz was appropriate, the same sampling frequency was used for finished disks.” Principles of Digital Audio, Sixth Edition, Ken C. Pohlmann. (Amazon.com, Amazon UK)  ↩

  11. https://people.xiph.org/xiphmont/demo/neil-young.html  ↩

  12. See, for example, The Emperor’s New Sample Rate.  ↩

  13. http://www.ponomusic.com/#faq Or Try for yourself.  ↩

  14. See The Future of Music and, for a more technical explanation, ‘Dynamic Range’ & The Loudness War. And The Dynamic Range Database is a list of more than 50,000 albums, showing their relative loudness.  ↩




19 replies
  1. William Allbrook says:

    Great article Kirk – I concur. One thing – would you like to comment on Mastered for iTunes – I was fascinated but unsure what to make of this – https://www.apple.com/uk/itunes/mastered-for-itunes/ I would love to compare the following on my HiFi – a CD converted to ALAC with the music I downloaded from the iTunes Store which is “Mastered for iTunes’. The music which as you say is the most important thing is – WomanChild https://itunes.apple.com/gb/album/womanchild/id618793978 – sounds awesome!

    Keep it coming!

    • Kirk McElhearn says:

      Yes, that’s certainly something to look into. I’ll have to check with some music that I can compare. One of the risks is that if the MfiT files are even a slightly different volume, it’s very hard to do a valid comparison.

  2. dtoub says:

    I completely agree, Kirk. Thanks. I’ve never been able to distinguish between an AIFF file and an AAC file, except by file size or extension.

  3. Sergio Garcia says:

    Thanks for bringing up all these good points, especially the quality of the recording to start with, rather than the delivery output. As an example, I was looking for a copy of Quadrophenia to buy a few days ago, and in one of the ‘remastered deluxe’ editions on Amazon (http://www.amazon.co.uk/Quadrophenia-Who/dp/B005DMNS4A/) the first reviewer is disappointed and advises “Stick with an original vinyl copy or if you can find one the original german polydor cd which remains the best sounding digital version”, while the second reviewer can’t tell the versions apart at all…

    Now I just wonder if the 2013 ‘remastered for iTunes’ version might actually be superior to all previous CD versions (excepting maybe that mythical German recording) despite the ‘lower quality format’, just because the remastering has been taken seriously this time…

    • Kirk McElhearn says:

      Part of the problem with music these days – at least for classic albums – is that there are multiple remasters, and it’s hard to know which sounds good. The Beatles are a good example: there have been a number of remasters over the years, and some fans like certain versions, others like different ones. The only thing to do is to do some research; or just pick any of them, since it’s the music that counts. :-)

      As for the Mastered for iTunes version, you’d need to know which source was used.

  4. Scott Atkinson says:

    The fallback position, lately, for folks who don’t like compressed files is that having the originals in at least 16/44.1 preserves their options; they can transcode without fear of losing quality, so if, say, there are no more iPods some day and all the other audio players in the world drop aac support they’re all set. Also, if the mysterious missing link that “proves” cd quality really is audibly different from a high bitrate compressed file emerges, they’re covered.

    I think it’s hard letting go of that, because once you do – for some people – it feels like the walls are going up, like your listening future isn’t unlimited, It’s the same impulse that leads people to over buy music or books or whatever-your-poison is because at some point down the road you may want to hear it or read it or see it, even though that point often doesn’t arrive.

    Personally, I’m not great at this part, but I try these days to enjoy what I have, and not fret about what I’m missing or will never hear. It’s why I mailed off the several disc King Crimson “Court Of The Crimson King” set this morning. It had the album, the album in 24/96, the album in alternate takes, a needle drop of the mono version (which was all I really liked) and a 5.1 mix. I don’t even like King Crimson very much, and never really listened to the set, but as I was getting ready to ship it, I thought “What if I want to hear the high-res version some day?” I finished the bubble wrap and headed to the post office.

    • Kirk McElhearn says:

      I didn’t know that had been released in mono. I love that album; I’ve loved it since I first heard it in the mid-70s. I’ll have to look for that. I don’t care about the 5.1 either; I don’t have a surround system.

    • Antonio says:

      Such “mysterious link” is actually reference 12 in this article. They claim that in a well set up ABX test people recognized the higher sampling rate source at a rate close to 50%, the same as flipping a coin or as a deaf person. But one subgroup did “even worse” (according to them): women only preferred the higher sampling rate source at a rate of 37%. But as anybody minimally trained in statistics will tell you, the worst result in an ABX test is 50%. 37% is the same as 100-37 = 63%. In layman terms, women in that experiment (which may or may not represent the general population) can hear the difference between different sampling rate sources, all else being kept equal. It just happens they prefer the lower sampling rate, but as far as being able to tell the difference, they are to a great degree. No surprise the reviewers of the paper didn’t catch it, math literacy is as rare as flying pigs.

      • Kirk McElhearn says:

        I agree that that is interesting. I wonder if it’s not some kind of bias that the researchers didn’t count on. Let’s be honest: most of the people who care about this issue are men. It’s possible that the women doing the test weren’t approaching it in the same way. But it is intriguing to see that kind of number, in either direction.

      • Scott Atkinson says:

        Correct me if I’m wrong, but if 63 percent represents the ability of women to distinguish one from the other, that isn’t much better than 6 of 10, and that’s not really much better than a guess. Obviously, if what it really means is that, say, 80 or 90 percent of women can distinguish the two and of that group, six in 10 have a preference, that’s different.

        s.

        • Antonio says:

          Of course what “much better” is is subjective. Would you say it’s much better and 70% or 80%? That’s a question of effect size and its practical impact. I am talking about the specific statement that there is no perceptual difference for different bit rates above the CD level. From a statistical standpoint, without the sample size it’s hard to make a calculation, but the number 63 suggests that there were 100 observations or it has been falsified (because 63 and 100 have no common factors). It’s extremely unlikely that such a result could be obtained if there were no perceivable difference between the two sounds. Try to throw a coin 100 times and get more than 63 or less than 37. That has a probability about 0.006. That doesn’t speak to the size of the effect, which could be very small and not worth $1 of your money. But the evidence shows that the sampling difference can be perceived by some people. Not very often and with the opposite preference than was expected, but if you accept the study was performed well, which I can’t because I haven’t read the original report, what the study shows is that sampling differences are perceivable. The article you cite draws, surprisingly, the opposite conclusion.

  5. Scott Atkinson says:

    Completely OT, but I have a cloudy relationship with the music of both KC and the Dead. There are things about both bands I like – “Red,” for instance, but both manage to annoy me too. Usually it has to do with vocals/lyrics.

    Anyway, I probably didn’t make my point well earlier. I’m trying to say that it seems to me that people who obsess about lossless and high res are sacrificing listening pleasure today, however imperfect, for some kind of ideal of listening that doesn’t really exist. I’ve never heard an extreme top end system, but I’d be willing to bet that no matter how much money you spend, no matter how glorious your speakers are, you couldn’t reliably tell the difference between a high bit rate lossy file and lossless/high res. I think the issue here is the limits of our ears and how we process sound – and those are hard limits.

  6. Martin says:

    I can hear something when I play the 20.000 Hz file, but it’s very quiet. Don’t you hear *anything* when you play that?
    I’m 38 years old.

    • Kirk McElhearn says:

      I’d be surprised if you actually hear a 20 kHz tone. You might be hearing harmonic distortion at a lower frequency. What are you listening through? Speakers or headphones? This said, it’s entirely possible that you can hear it; some people can.

  7. Karen says:

    I remain to be convinced that music at bit depth/sample rates above that of CD is worth bothering with. There is undoubtedly a desire to tell yourself that it sounds better, but I don’t feel that I can actually distinguish between them.

    However, for me the main issue is that I’d like to be able to purchase online music at 16/44.1. I can certainly hear the difference between a low bit rate MP3 and the same track ripped from CD at 16/44.1 on at least some types of material. If Pono and high-bit rate music push us away from 256/320kbps tracks then I welcome them.

  8. PierOz says:

    Dear Kirk, it is a very complicated issue that you are addressing here, involving expertise in many domains such as acoustic, psycho-acoustic, music cognition, signal processing, sound engineering, neuro-physiology and more. When dealing with our perception of music in general and of the extreme high and low frequencies in particular as well as the making of devices to records and playback sounds, one needs to keep an open mind.
    You have a fair point when you’re stating that most people can’t hear above 20Khz (and under 20Hz) so we shouldn’t bother with high resolution files because they won’t sound better than cd, but it is a very simplistic answer to (again) a very complicated question.
    No one can argue, considering current research against the first point, but we can argue against the second (that high resolution files can’t sound better than cds). It is not because we cannot hear ultrasounds and infrasounds that they don’t have an effect on our perception. That has been shown with infrasounds, and recent research is also showing what is called an “hypersonic effect” with frequencies above 20Khz on the brain. That effect could explain why people, although they don’t hear a difference between high resolution files (but it should be said that many articles are conflicting on that matter, for example in the same journal that is mentioned in The Emperor’s new sample rates, the aes, an article by A. Pras (2010) concludes in opposition to the article by Meyer and Moran), like high resolution files better. This domain of research is in its infancy, so they might discover more things about our perception of ver high frequencies, or might not…but you can’t brush off the possibility of an effect based on one argument or because you consider this bogus.

    Audiophiles might be delusional, and so what, if people want to feed their delusion with big money why not? Others buy cars that can go to speeds they will never need. Also it is not illogical to think that with higher sample rates the resulting signal will be closer to the original signal (analogue master or newly recorded music), whether we can hear the difference or not is another matter.
    As for the marketing ploy, well yes it is all about marketing, selling music and audio equipment has always been about marketing since the first gramophone were introduced and the first music labels created to bring content and help selling more gramophones.
    I think it is up to the people to make their choice. You are advising to get a good DAC, well it’s half the way to being able to listen to high resolution files, since most of them can deal with 24bit/96Khz files. Then you need a good pair of cans, professional models such as AKG K701, or the more expensive k712, or Shure RH1840 can go up to between 30Khz and 39Khz. For exemple a DAC Magic XS and a AKG K701 would be around £250/300. No need to spend tens of thousands of pounds. And then people can see if they like high res files better or not.

    With sentences such as this one: “But unless you’re willing to spend as much money on your stereo system as you do on your car, and set up an acoustically-controlled room, there is simply no way that high-resolution files will make any difference to the music you listen to”, you sound like an audiophile preaching for the opposite argument, because in reality we don’t know.

