Friday 11 April 2014

ANALYSIS: A Comparison of DSD Encoders & Decoders (KORG AudioGate, JRiver MC, Weiss Saracon)



Hello guys & gals, I've seen the question asked of comparing various DSD conversion programs on message boards over the years but have never seen someone try to "compare and contrast" with objective analysis. Let's at least give it a try here. I don't promise unequivocal answers, but hopefully a decent stab at it :-).

Remember that any conversion between DSD and PCM is a "lossy" process. Therefore, it is of course preferable to keep PCM-sourced recordings in PCM and DSD recordings in DSD if possible. There will be some compromise in accuracy each time a conversion happens. Even though the bitrates for DSD64 and 24/96 PCM may be similar, the modulation technique used to represent the resultant sound wave is different (as per the above image). The question of course is how much difference there is and whether it's quantifiable.
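To make the difference in representation a bit more concrete, here is a minimal Python sketch. It works out the raw per-channel bitrates and runs a first-order 1-bit delta-sigma modulator on a constant input. This is an illustration only: real DSD encoders use much higher-order modulators (Saracon's default is 8th order, for instance), and the names `delta_sigma_1bit` and `bit_density_to_level` are my own placeholders, not anything taken from these programs.

```python
# Raw per-channel bitrates (bits per second).
DSD64_RATE = 64 * 44_100        # 2,822,400 one-bit samples per second
PCM_24_96_RATE = 24 * 96_000    # 2,304,000 bits per second


def delta_sigma_1bit(samples):
    """First-order 1-bit delta-sigma modulator (illustrative, not a real DSD encoder).

    Input samples must lie within -1.0..+1.0. The output is a list of 0/1
    bits whose density of ones tracks the input level.
    """
    integrator = 0.0
    feedback = 0.0  # previous output expressed as -1.0 or +1.0
    bits = []
    for x in samples:
        integrator += x - feedback
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def bit_density_to_level(bits):
    """Crude 'decoder': average the bitstream and map 0..1 density to -1..+1."""
    return 2.0 * sum(bits) / len(bits) - 1.0


bits = delta_sigma_1bit([0.25] * 10_000)
recovered = bit_density_to_level(bits)  # close to 0.25, but not exactly 0.25
```

The recovered level only approximates the input, which is the "lossy" part in miniature: the information is carried in the density of ones rather than in a multi-bit sample value, and getting a sample value back requires filtering.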

This question of conversion is important because as I have discussed before, many if not most DSD releases have gone through some kind of conversion for the flexibility and ease of editing in the PCM domain. The most blatant examples of conversion are the ones sourced from 44/48kHz material, but I'm sure many others are from 96/192kHz origin but they would not be easy to differentiate from a DSD original.

I. Procedure:


I do not have access to DSD recording gear but I can convert recorded PCM to DSD and back again to see what the conversion process does. For example, the PCM test signal from RightMark Audio Analyzer can be sent through the conversion process and we can see what happens to it to get an idea of the amount of degradation. For these tests, I chose to use the 24/96 test signal which I feel is a very reasonable hi-resolution specification exceeding DSD64 in a number of resolution domains. I know that 88.2kHz may be better since the 2.8224MHz DSD64 sample rate is an integer multiple of it, but I figure these days in a high resolution studio, 24/96 is probably the standard and is common in high-resolution HDTracks and Blu-Ray audio releases.
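The integer-ratio point is easy to check with a few lines of Python. This is just arithmetic on the nominal sample rates; `decimation_ratio` is a name I made up for the sketch.

```python
from fractions import Fraction

DSD64_HZ = 64 * 44_100  # 2,822,400 Hz, usually written as 2.8MHz


def decimation_ratio(pcm_rate_hz):
    """Exact ratio between the DSD64 rate and a PCM sample rate."""
    return Fraction(DSD64_HZ, pcm_rate_hz)


ratio_88 = decimation_ratio(88_200)  # Fraction(32, 1): a clean integer ratio
ratio_96 = decimation_ratio(96_000)  # Fraction(147, 5): needs fractional resampling
```

So going to 24/96 asks the converter to handle a 147:5 ratio rather than a simple divide-by-32, which is one more thing each program has to get right.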

Here then is the general procedure:
- Take the 24/96 RightMark test signal.
- Convert the PCM to DSD using the various encoders.
- Reconvert the DSD file back into 24/96 PCM using each program.
- Analyse effects of the 2-way conversion and differences between the programs.

I decided to use 3 commonly available conversion programs for this test - something free, something a consumer can afford, and finally the professional "standard" made available to me thanks to a friend who runs a studio. This will result in a total of 9 final 24/96 WAV files to "measure" with the RightMark software (3 PCM-to-DSD encoding x 3 DSD-to-PCM conversion). The 3 software programs used for conversion are:

1. KORG AudioGate 2.3.3. This software is available free. All it takes to run conversions is access to your Twitter account so the software tweets each time a conversion takes place. Small price to pay for the ability to do the conversion I suppose. I used the default DSD encoding (DSDIFF / Stereo Interleaved / 2.8MHz / 1-bit) and decoding to PCM (WAV / Stereo Interleaved / 96kHz / 24-bit) parameters. I noticed that AudioGate applies a +6dB gain with the DSD to PCM conversion (by convention, -6dB DSD is equivalent to 0dBFS PCM; it's not uncommon for this standard to be ignored, in which case the +6dB gain can result in clipping).

2. JRiver Media Center 19.0.117. I've used this program before to test the PCM-to-DSD conversion playback last year. You can also save the resultant PCM --> DSD and the converse DSD --> PCM conversion files as well. The DSD --> PCM conversion happens in 24/352.8 so I used the best resampler I have - iZotope RX 3 - to convert back to 24/96 for final analysis using a steep filter at 48kHz. There is also no +6dB gain applied so the default volume of the PCM output file is softer than with AudioGate and Saracon at default settings.

3. Weiss Saracon 01.61-27. The standard DSD <--> PCM conversion package used by a number of places like Channel Classics, many HDTracks releases, Pentatone... Again, I just used the default settings for conversion to DSD (dff, CRFB 8th Order, 0 gain, 2.8224MHz, Auto channel mode, Smart Interleave, Enable Stabilizer). Likewise the conversion back to PCM was with default settings (WAV, 24-bit fixed point, TPDF dither, 96.0kHz, +6 dB gain, Smart Interleave).
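Since the +6dB gain setting comes up with both AudioGate and Saracon, here is a small sketch of what it means numerically. It assumes peak levels are expressed in dB relative to DSD full scale and that anything ending up above 0dBFS in the PCM file clips; the function names are hypothetical and none of this is taken from the programs themselves.

```python
def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear multiplier."""
    return 10 ** (gain_db / 20)


def clips_after_gain(peak_db, gain_db):
    """True if a peak (in dB relative to full scale) exceeds 0dBFS after gain."""
    return peak_db + gain_db > 0.0


multiplier = db_to_linear(6.0)             # about 1.995, i.e. roughly double
safe = clips_after_gain(-7.0, 6.0)         # False: peak ends up at -1dBFS
too_hot = clips_after_gain(-3.0, 6.0)      # True: peak would land at +3dBFS
```

A DSD file that stays below -6dB converts cleanly with the +6dB gain; one that runs hotter needs less gain (as with JRiver's default) to avoid clipping.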

II. Result:

As usual, I'm going to present the data as summary charts to start. There are 3 DSD encoders and the same 3 can be used to decode DSD, so let's just present them organized by the encoder used. When I say something like "AudioGate then JRiver", I'm referring to the use of AudioGate as the DSD encoder, then using JRiver to do the conversion back to high-resolution PCM to be analyzed by RightMark (remember, for JRiver's case, I also used iZotope RX 3 to resample from 24/352 --> 24/96).
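For anyone wanting to repeat this, the nine encoder/decoder pairings can be enumerated in a couple of lines. The labels follow the "X then Y" naming used above; the list and variable names are just for illustration.

```python
from itertools import product

PROGRAMS = ["AudioGate", "JRiver", "Saracon"]

# Every encoder paired with every decoder, including itself: 3 x 3 = 9 files.
pairs = [f"{encoder} then {decoder}"
         for encoder, decoder in product(PROGRAMS, repeat=2)]
```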

AudioGate as DSD encoder.
JRiver as DSD encoder.
Weiss Saracon as DSD encoder.
As you can see, the first column in each table is the 24/96 RightMark PCM test signal with no conversion done. These would be the ideal numbers if one could measure a perfect DAC/ADC setup or in this case the results of perfect conversion.

The rest of the columns reflect what happens to the 24/96 PCM test signal as it goes through the DSD conversion and decoding steps. Remember that RightMark is analyzing the audible 20Hz to 20kHz spectrum only. As you know, DSD64 conversion adds quite a lot of ultrasonic noise if left unfiltered and this would result in some poor noise levels and lower dynamic range if frequencies >20kHz were analysed.

Indeed, various amounts of distortion and imperfections can be seen. On the whole, it's far from bad though. At worst, the cumulative noise level is still down below -120dB and dynamic range >120dB with each of these encoder/decoder pairs.
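To put the 120dB figure in perspective, here is a rough conversion between dynamic range and bit depth. It uses the textbook relation for an ideal quantizer driven by a full-scale sine wave (about 6.02dB per bit plus 1.76dB), which is an idealization and not how RightMark derives its numbers.

```python
import math

DB_PER_BIT = 20 * math.log10(2)       # about 6.02dB
SINE_OFFSET_DB = 10 * math.log10(1.5) # about 1.76dB


def ideal_dynamic_range_db(bits):
    """Theoretical dynamic range of an ideal N-bit quantizer (full-scale sine)."""
    return bits * DB_PER_BIT + SINE_OFFSET_DB


def equivalent_bits(dynamic_range_db):
    """Inverse of the above: how many ideal bits a given dynamic range implies."""
    return (dynamic_range_db - SINE_OFFSET_DB) / DB_PER_BIT


dr_16 = ideal_dynamic_range_db(16)   # about 98dB
dr_24 = ideal_dynamic_range_db(24)   # about 146dB
bits_120 = equivalent_bits(120.0)    # a bit under 20 bits
```

In other words, even the worst encoder/decoder pair here still behaves like something close to a 20-bit system within the audible band, comfortably beyond 16-bit CD.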

Comparatively, you can see the free KORG AudioGate encoder table above seemed to have the worst results in terms of noise level irrespective of what other software was used to convert back to PCM. This is followed by JRiver and then Saracon puts out some very fine numbers.

There's a similar tendency when comparing the DSD-to-PCM decoder used. In general, the JRiver and Saracon DSD-to-PCM conversions (columns 3 & 4) resulted in better measurements of noise level and dynamic range than AudioGate (column 2).

Let's now have a look at some individual graphs to see what's going on - here's using AudioGate to encode PCM-to-DSD:
Frequency Response
Notice the different software used to convert DSD back to PCM all have different low pass filters. As expected, PCM (white) is flat all the way to 48kHz. AudioGate (green) uses a very weak filter and is only attenuated by <1dB at 48kHz, followed by Saracon. JRiver at the default "Safe" setting has a steep 24kHz 48dB/octave slope applied as noted here (you can change this if you want up to 30kHz cutoff, 50kHz cutoff, or filter turned OFF).
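As a back-of-envelope check on what a 48dB/octave slope means, the sketch below estimates attenuation from the slope alone. It assumes an idealized filter that is flat up to the cutoff and then falls at exactly the stated rate; real filters (JRiver's included) have a rounded knee, so treat these as rough figures. The function name is my own.

```python
import math


def asymptotic_attenuation_db(freq_hz, cutoff_hz, slope_db_per_octave):
    """Attenuation of an idealized low-pass filter above its cutoff."""
    if freq_hz <= cutoff_hz:
        return 0.0
    octaves_above = math.log2(freq_hz / cutoff_hz)
    return slope_db_per_octave * octaves_above


at_30k = asymptotic_attenuation_db(30_000, 24_000, 48)  # about 15dB
at_48k = asymptotic_attenuation_db(48_000, 24_000, 48)  # one octave up: 48dB
```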

Noise Level
There's deviation from the PCM noise floor using AudioGate DSD conversion as you can see. AudioGate is noisier at converting PCM to DSD than the other programs (as will be evidenced later). The noise floor also isn't as smooth as the others (interesting notches at 10kHz and 20kHz).

In comparison, let's have a look at the JRiver PCM-to-DSD encoding:
Frequency Response
Noise Level
The frequency response curves are similar to the AudioGate DSD encoding representing the respective low-pass filter settings of the DSD-to-PCM converters. The main difference is with the noise level. As you can see, JRiver as DSD encoder is able to maintain a very clean noise floor essentially equivalent to 24-bit PCM until about 13kHz before rising - and this low noise floor is maintained by JRiver and Saracon when reconverting back to PCM. The AudioGate DSD-to-PCM conversion in comparison has a higher noise floor throughout the audible spectrum - perhaps a higher level of dithering is being applied?

Finally, let's look at Saracon used as the PCM-to-DSD encoder:
Frequency Response
Noise Level
A clean PCM-like noise floor all the way to 20kHz is achievable after going through Saracon DSD encoding but this quickly increases thereafter. Again, AudioGate conversion to PCM results in a higher noise floor which I speculate is due to stronger dithering.

III. Conclusion:

Since DSD <--> PCM isn't a straightforward process (like say resampling in PCM), as expected, at a "microscopic level", conversion software does make a difference in resolution.

What is much harder to quantify is audibility. Those frequency response, noise floor, distortion, crosstalk results are all below what I believe are human thresholds of audibility and overall there is minimal change to the 24/96 PCM original signal within the audible frequency range. Remember, the results I show here are with both conversion to DSD and back again to PCM, not just a single conversion step. Yet, I have seen commenters on-line insisting that the conversion results in audible deterioration in sound (even with just a single step like DSD --> PCM).

Looking at these 3 software programs, we can say with some certainty from an objective perspective that Saracon PCM-to-DSD transcoding maintains the lowest noise floor from 20Hz to 20kHz. JRiver is also very good in this respect, while AudioGate's results are less accurate but obviously still very good and of questionable audible significance given that the difference is still below the measured noise floor of all except maybe the very best DACs.

Of course, one has to pay big bucks for Saracon compared to the free AudioGate software!

As for DSD-to-PCM conversion, the main difference appears to be where each program has decided to put the low-pass filter to remove DSD's ultrasonic noise. Of the 3, JRiver has the most conservative low-pass filter at 24kHz (with small notable effect beginning around 20kHz) by default. Saracon allows a bit more to pass through up to around 30kHz, and AudioGate allows essentially everything to pass through up to 48kHz with 24/96 sampling. The only other difference seems to be a stronger dithering algorithm (I'm guessing here) with AudioGate such that the noise floor is marginally higher than the others. Again, we're looking at differences way way down in the noise floor so it really should not be an issue. I think the real question is where you think the low-pass filter should be set for DSD64 material (ie. at what point is recorded ultrasonic signal drowned out by noise and not worth keeping?)

From what I see here, I'm quite happy that Saracon is used in most commercial releases I've come across for DSD-to-PCM conversion. Within the 20Hz to 20kHz audible spectrum, it does appear to be the best even though I highly doubt one could go wrong with any of these. Just remember that the steep low-pass filter in Saracon means there's nothing above ~40kHz and therefore no point buying a Saracon DSD converted file above 96kHz (88kHz is all that's needed).
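The "88kHz is all that's needed" remark is just the Nyquist limit at work. A quick sketch, taking the ~40kHz figure above as the bandwidth to preserve and using a list of common sample rates that I picked for illustration:

```python
COMMON_RATES_HZ = (44_100, 48_000, 88_200, 96_000, 176_400, 192_000)


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given PCM sample rate can represent."""
    return sample_rate_hz / 2


def lowest_sufficient_rate(bandwidth_hz, candidates=COMMON_RATES_HZ):
    """Smallest candidate rate whose Nyquist limit covers the bandwidth."""
    for rate in sorted(candidates):
        if nyquist_hz(rate) >= bandwidth_hz:
            return rate
    return None


needed = lowest_sufficient_rate(40_000)  # 88_200, since 44.1kHz >= 40kHz
```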

Over the years, I've listened to original DSD and compared it to PCM conversions at 24/88 using Saracon and AudioGate, with output levels matched as best I could (using the TEAC UD-501, never tried formal ABX or blinding). IMO, it's tough to assess since you can't instantaneously switch from DSD to PCM. The PCM converted files sound good to me and I would not hesitate to archive the DSD64 library as 24/88. Whatever difference there is has always been subtle at best (despite claims from the DSD faithful that somehow DSD sounds much better). I suppose it's possible that different DAC devices could also sound different depending on PCM or DSD input.

Has anyone out there done an ABX or other controlled listening test with DSD-to-PCM conversion? Would love to hear of your experience and preference... 

---------------

Rant of the week...
In the high-fidelity audio world we've often discussed the ills of severe dynamic range compression (DRC). I'm just going to go on my soapbox for a couple minutes and complain also about the ills of DRC for soundtracks these days... Notice how LOUD TV shows have become lately? A couple years ago, I tried watching NBC's Hannibal. Not only was the pacing terrible, meant for folks with ADHD, but the audio was so annoyingly grating that I could not tolerate more than 3 episodes. (I don't know if the series improved after those 3 episodes...)

More recently, I've become annoyed by the recent Cosmos: A Spacetime Odyssey hosted by Neil deGrasse Tyson playing on Fox and National Geographic Channel. I mean... COME ON PEOPLE! This is a science program. This is a documentary (with some science fiction entertainment thrown in). WHY DOES IT HAVE TO BE SO LOUD? It's like there's no subtlety left... No opportunity to whisper... No opportunity to wonder... No opportunity to enjoy the eye-candy of some excellent CGI graphics without the blaring of some "majestic" soundtrack through many parts of the show. Aren't the ideas being presented supposed to be what it's all about? Yet at times, the narration gets muddled by the background audio.

While I can still enjoy Cosmos 2014 with my kids for the topical presentation, I'm left wondering how much better it could have been to allow the dialogue to take center stage and the background soundtrack to accentuate the emotional impact instead of being ridiculously front-and-center as if I'm supposed to watch this program on a tiny smartphone screen on the subway (maybe that's the target audience!). As usual, it's hard to know who to blame - is it the sound engineers working on this series behind the mixing console or the folks manning the TV station transmitting the signal running it through their compressor? Unfortunate.

I'll end with a quote from Carl Sagan. Certainly worth contemplating when reading comments posted on the Internet in general... (Not just as audiophiles.)

"We live in a society exquisitely dependent on science and technology, in which hardly anyone knows anything about science and technology." Carl Sagan (1989) [good article BTW]

I wonder what Mr. Sagan would think about the current state of affairs regarding the level of understanding of science in our society today. I suspect if he were still alive (he died in 1996), he'd be impressed by the access to information and interconnectedness we have these days through the Internet. That's not necessarily saying a lot though about the level of understanding.


Still a great read after all these years... Originally published 1980.

Saturday 5 April 2014

MEASUREMENTS: Nexus 7 to Audioengine D3 (A "Kinda Portable" Audiophile Playback)

Okay, for fun, I thought I'd grab a few measurements of something... Somewhat... "Portable" :-)



What you have here is my Nexus 7 tablet connected to an "on the go" cable (5" male microUSB to female standard USB, off eBay - pack of 3 for $10) --> Audioengine D3 DAC/amp --> Sennheiser HD800. Unfortunately, the D3 would not power up consistently when plugged into the Nexus 5 smartphone, which is a shame since that would have been even more portable! Looks like the D3 demands more power than the Nexus 5's USB port can deliver.

I got USB Audio Player Pro software for the Android in order to get the USB DAC working. Unfortunately USB Audio Class 1/2 devices are not supported by Android by default. You can see the basic interface above (I happened to be playing some Bruno Mars Unorthodox Jukebox). It recognized the D3 without a hitch. Connected to the HD800, it sounds the same to me as it did with the other machines I tested the Audioengine D3 with last time, so there's no need to go into any subjective evaluation here.

I was more interested in whether the objective data showed any difference between this "mobile" set-up compared to the laptops / desktops.

RightMark Results:

Remember the clipping at 100% with the Audioengine D3. All the measurements are done with hardware volume attenuated to 92%.

16/44:

Frequency Response
Noise Level
THD
IMD
24/96:

Frequency Response
Noise Level
THD
IMD
The first column in the summary tables is the Nexus 7 + Audioengine D3. The second column is the ASUS Taichi laptop connected to the Audioengine D3. Finally the 3rd column is the Nexus 7 natively without using the external USB DAC.

As you can see, there is no substantial difference whether the ASUS laptop/ultrabook or Nexus 7 was used in either the 16/44 or 24/96 test case. The DAC determines the final audio output, not the source "transport" device.

The Audioengine D3 is substantially better as a DAC of course. As you can see, the Nexus 7 + D3 easily outclasses the native Nexus 7 audio output off the headphone jack. It has a flatter frequency response with lower noise floor measurable even at 16/44. The difference is more evident with a 24-bit audio signal... The 24/96 frequency response demonstrates that the native Nexus 7 in fact is incapable of 96kHz sample rate.

Jitter:



Slight difference between the 2 J-Test measurements. The noise floor seems a little bit higher on the whole with the Nexus 7 when I ran this test making the peaks such as the 16-bit modulation pattern less obvious. Also a little cleaner around the 24-bit 12kHz primary tone with the Nexus 7. The differences are down around the -120dB level and would not be audible IMO.

Conclusion:

Okay... So this set-up isn't exactly a portable device. But it does demonstrate that you can get excellent sound out of a small Android device with the USB DAC that is objectively equivalent to using a standard computer.

Truth be told, I don't need "audiophile quality" sound in a portable device. I can't remember the last time I listened to my smartphone or iPod somewhere quiet with expensive headphones. On-the-go, convenience trumps everything else IMO so there'd be no way I'd bother with full sized headphones. If I did bring full-sized cans around on a train/plane/subway, it certainly would not be the expensive high-resolution open design headphones!

Day to day, I have my Nexus 5 phone with me. There are a few albums saved on the phone as MP3 or FLAC when I want to listen. Otherwise there are countless apps to listen to Internet radio and music streaming services which is what I listen to most. The days of the isolated non-networked portable music player are long over for me and probably the vast majority of music lovers.

------------------

For fun I decided to jump on the Geek Out bandwagon for a spin and see how it goes.

Looks like we're starting to see some user reviews coming out now, the first more formal one I see being this one at Part-Time Audiophile. So far it looks encouraging. As expected there are a couple of comments about the heat production of the 1W model. No comment about how it handles DSD and it looks like those guys are Mac-centric, so not clear how it works out in the Windows world (ie. drivers). For what it is, I find purely subjective reviews interesting to read but I would prefer something a little more than the usual "drive by shooting" ;-).

After a bit of humming and hawing, I decided to go for a blue "Super Geek" ($250, 720mW, 3.4Vrms peak) model as the best compromise for my case. My rationale is simple... The level difference between 720mW and 1W is only 1.4dB (as compared to 2dB between 450mW and 720mW). I'm still concerned about heat production since this is a class A design which sucks full power all the time... Assuming the enclosure design for heat dissipation is the same, the lower power model should drop the running temperature a few degrees. I can see myself using this as a line-level DAC if it measures really well and only tapping the headphone amp feature on occasion (the Audioengine D3 appears lighter and smaller for travel). As a 'standard' USB DAC, this also means it'll likely be "on" all the time so I'd rather not have something too warm sitting on my table. Furthermore, with a laptop computer, an inefficient class A device means more power drainage. The $50 saved is trivial for this hobby and not really an issue.
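For those who want to check the dB figures, power ratios convert to decibels as 10 x log10(P1/P2). A quick sketch using the rated outputs of the three models (the function name is mine):

```python
import math


def power_ratio_db(p_high_mw, p_low_mw):
    """Level difference in dB between two power ratings (same load assumed)."""
    return 10 * math.log10(p_high_mw / p_low_mw)


step_720_to_1000 = power_ratio_db(1000, 720)  # about 1.4dB
step_450_to_720 = power_ratio_db(720, 450)    # about 2.0dB
```

Either step is small in loudness terms, which is why the middle model looked like the sensible compromise to me.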

Anyhow, once I get this, I'll let you know some results... It will be interesting to see how this compares to the TEAC UD-501 which has the same DAC chip and similar feature set (NOS, DXD, DSD, etc...). Hopefully it doesn't take too long to ship out - my understanding is that only the 1000mW "Super-Duper Geek" model is released so far.

Juergen mentioned having a look at the HpW-Works software package for jitter analysis. Might just do that although I remain unconvinced it makes any audible difference in 2014 especially with asynchronous USB. I have yet to see a good example of a decent modern piece of equipment where jitter can be shown as the culprit for impairing the sound quality.

Tonight's music:
Kodo - Mondo Head - Although I generally prefer the more traditional sound of Tsutsumi, this one not only sounds nice in stereo but fantastic in multichannel off the SACD!

Enjoy the tunes everyone!


Wednesday 2 April 2014

MUSINGS: On Experts, Experience, and Opinions...



So last night, instead of going to bed early as I was supposed to, I decided to have a look at this interview of Allen Sides from Ocean Way Recording on TWiT.TV. As usual, Scott Wilkinson does a fantastic job with the interview and takes questions from the audience.

Obviously, Mr. Sides is a man of many years of experience and can speak authoritatively on MANY topics related to audio hardware, studio production, and historical anecdotes based on those years.

But some things bothered me. Around 16:00 there was talk about DSD: "somehow between making the recording and that SACD it doesn't sound quite as good as it should". Really? By 17:00, there's discussion about CD copies, different stampers sounding different, then an anecdote on Mariah Carey and how the pressed CD sounded bad compared to the reference. Okay... Maybe... How about someone ripping the disks and comparing the data integrity and talking about that? Given that Mariah was married to Tommy Mottola until 1998, this anecdote is now at least 16 years old so any hope of forensic assessment is long gone - does it still apply as a generalization these days?

Things get really bizarre by 26:00 - "I have never been able to even make a copy of a CD that sounds as good as the CD I started with... It always sounds worse". As you can see, Mr. Wilkinson was perplexed and commented that "it's almost like generational loss in analogue" that Mr. Sides is referring to. "Multiple degenerations"? Please...

I wish Mr. Wilkinson had asked a follow-up question like - "What if you bit-perfectly ripped that original CD to a computer - does it sound the same then?" "How about if you copy to a different hard drive, does it sound different?"

It's also clear that there are some limits to Mr. Sides' knowledge/experience which most of us in the hobby world would have no difficulty discussing. (34:30) Q: "What's your idea of FLAC encoding?" A: "I'm really not that familiar with it." Fair enough, a person cannot know everything.

Although I'm not that old at this point in my life, I have learned some things in my "travels" both personally and professionally. One which I hold dear is that no matter how much we can respect and trust the "experts" for their lived experience and knowledge, they (like us) are all just human. And as humans we all have idiosyncrasies and biases. In this case, I don't think it's much of a stretch for any of us who have spent hours on our audio systems, used EAC or dBpoweramp for ripping to ensure bit-perfect copies, to stand up with good confidence and tell Mr. Sides that he's just plain wrong about not being able to make a copy of a CD that sounds identical. The fact is that in more than 30 years of the existence of the CD format, there has been no evidence of this when variables are controlled for (eg. ensuring that bit perfect copies were achieved, the rater was blinded, etc.). If indeed "generational losses" were possible with digital, this would already be a well known fact and there'd be no uncertainty whatsoever! Moreover, if this were fact, it would change significantly how we deal with accuracy of our digital data (hey... how can I be sure that those numbers in my bank account are accurate?!). Experts can provide educated opinions, but ultimately they are just opinions and not necessarily fact. The same goes for his strange comment about SACD not sounding as good as it should between studio and the physical disk. What is more likely, that the digital data somehow mysteriously changed over time or that his own psychological expectations changed as the memory of the live studio event consolidated? (Assuming of course that the data wasn't altered by some mastering engineer along the way.)
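Checking whether a copy is bit-identical is not hard to do yourself. Here is a minimal sketch that compares two files by SHA-256 hash. Note the assumptions: it compares whole files (headers and tags included, so two rips with different metadata would differ even if the audio matches), and it is not a description of how EAC or dBpoweramp verify rips. The function names are my own.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bit_identical(path_a, path_b):
    """True if both files contain exactly the same bytes (by hash)."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Run `is_bit_identical("original.wav", "copy.wav")` on a file and its copy (the file names here are placeholders). If the hashes match, the data going to the DAC is the same, generation after generation.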

In this world, there are many mysteries yet to be discovered and likely much we as a species will never know. But digital audio systems, which are inventions of the human mind based on mathematical constructs and technologically engineered devices (like the CD) that ultimately change the physical world (sound waves), were not produced by serendipity. I do not feel it's good enough that we should just shrug our shoulders and declare some experiences to have enduring truth like parts of this interview. That surrender of logic leads us into the realm of "anything is possible!" and ultimately the slippery slope on the path to "snake oil". This is especially significant in the impact on those already obsessive-compulsive and perfectionistic (ahem, like many audiophiles). Maintaining an objective approach hopefully allows a counterbalance to this. An opportunity to take a step back and question the things which seem to make no sense. An opportunity to explore reality with techniques and at times instruments of greater sensitivity than that which we are endowed with within this mortal shell. Although the human mind is the best "instrument" to perceive the beauty of music, accuracy of the reproduction chain is a different matter and can be detected by the use of objective techniques with obviously greater sensitivity.

It's fun listening to interviews like this and I certainly felt it was time well spent nonetheless.