Monday 20 July 2015

How Do DACs Work?

All digital audio, whether PCM or DSD, stores the analog audio signal as a stream of numbers, each one representing an instantaneous snapshot of its continuously evolving value.  With either format, the digital bit pattern is the best available representation of the analog signal value at each instant in time.  With PCM the bit pattern typically comprises 16-bit (or 24-bit) numbers, each representing the analog signal value to a precision of one part in 65,536 (or one part in 16,777,216).  With DSD the precision is 1 bit, which means that it encodes the instantaneous analog voltage as either maximum positive or maximum negative with nothing in between (and you may well wonder how that manages to represent anything, which is a different discussion entirely, but nevertheless it does).  In either case, though, the primary task of the DAC is to generate those output voltages in response to the incoming bitstream.  Let’s take a look at how that is done.
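To put numbers on that precision, here is a minimal Python sketch of how a signal value might be mapped to a PCM code.  The function names, and the convention that full scale runs from -1.0 to just under +1.0, are my own assumptions for illustration and are not taken from any particular format specification.

```python
def pcm_levels(bits: int) -> int:
    """Number of distinct values an N-bit PCM sample can take."""
    return 2 ** bits


def quantize(x: float, bits: int) -> int:
    """Map a signal value x (assumed to lie in [-1.0, 1.0)) to a signed PCM code.

    Values outside the representable range are clipped.
    """
    full_scale = 2 ** (bits - 1)          # e.g. 32768 for 16-bit
    code = round(x * full_scale)          # nearest available level
    return max(-full_scale, min(full_scale - 1, code))


if __name__ == "__main__":
    print(pcm_levels(16), pcm_levels(24))
    print(quantize(0.5, 16))
```

The DAC's job is the reverse of `quantize`: turning each integer code back into the voltage it stands for.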

For the purposes of this post I am going to focus exclusively on the core mechanisms involved in transforming a bit stream into an analog signal.  Aside from these core mechanisms there are further mission-critical issues such as clock timing and noise, but these are not the subject of this post.  At some point I will write another post on clocks, timing, and jitter.

The most conceptually simple way of converting digital to analog is to use something called an R-2R ladder.  This is a simple sequence of resistors of alternating values ‘R’ and ‘2R’, wired together in a ‘ladder’-like configuration.  There’s nothing more to it than that.  Each ‘2R’ resistor has exactly twice the resistance of each ‘R’ resistor, and all the ‘R’s and all the ‘2R’s are absolutely identical.  Beyond that, the actual value of the resistances is not crucial.  Each R-2R pair, if turned “on” by its corresponding PCM bit, contributes to the output the exact voltage which is encoded by that bit.  It is very simple to understand, and in principle is trivial to construct, but in practice it suffers from a very serious drawback.  You see, the resistors have to be accurate to a phenomenal degree.  For 16-bit PCM that means an accuracy of one part in 65 thousand, and for 24-bit PCM one part in 16 million.  If you want to make your own R-2R ladder-DAC you need to be able to go out and buy those resistors.
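The way each bit contributes its own share of the output can be sketched in a few lines of Python.  This models an idealised ladder with perfect resistors and an output running from 0 up to a reference voltage; the name `r2r_output` and the `v_ref` parameter are hypothetical, chosen only for this illustration.

```python
def r2r_output(bits_msb_first, v_ref=1.0):
    """Ideal R-2R ladder output for a list of bits, most significant bit first.

    Each bit that is 'on' adds half as much voltage as the bit before it:
    v_ref/2 for the first bit, v_ref/4 for the second, and so on.
    """
    v_out = 0.0
    for position, bit in enumerate(bits_msb_first):
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        v_out += bit * v_ref / 2 ** (position + 1)
    return v_out


if __name__ == "__main__":
    print(r2r_output([1, 0, 0, 0]))   # only the most significant bit on
    print(r2r_output([0, 0, 0, 1]))   # only the least significant bit on
```

In a 16-bit ladder the last bit contributes 32,768 times less than the first, which is why any error in the resistors setting the larger contributions quickly swamps the smaller ones.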

As best as I can tell, the most accurate resistors available out there on a regular commercial basis are accurate to ±0.005%, which is equivalent to one part in 20,000.  Heaven knows what they cost.  And that’s not the end of the story.  The resistance value is very sensitive to temperature, which means you have to mount the resistors in a carefully temperature-controlled environment.  And even if you do that, the act of passing the smallest current through a resistor will heat it sufficiently to change its resistance value.  [Note:  In fact this tends to be what limits the accuracy of available resistors - the act of measuring the resistance actually perturbs the resistance by more than the accuracy to which you’re trying to measure it!  Imagine what that means when you try to deploy the resistor in an actual circuit…]  The resistor’s inherent inductance (even straight wires have inductance) also affects the DAC ladder when such phenomenal levels of precision enter the equation.  And we’re still not done yet: unfortunately, the resistance values drift with time, so your precision-assembled, thermally cushioned and inductance-balanced R-2R network may leave the factory operating to spec, but may well be out of spec by the time it has broken in at the customer’s system.  These are the problems that a putative R-2R ladder DAC designer must be willing and able to face up to.  Which is why there are so few of them on the market.
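As a rough back-of-envelope check on those figures, the Python sketch below converts a resistor tolerance into the bit depth it can nominally support, using the same crude "one part in 2^N" rule as above.  The function names are hypothetical, and a real ladder design would call for a more careful error analysis than this.

```python
import math


def tolerance_to_bits(tolerance_fraction: float) -> float:
    """Bit depth whose step size equals the given fractional tolerance.

    A tolerance of one part in 2**N corresponds to N bits.
    """
    return math.log2(1.0 / tolerance_fraction)


def required_tolerance(bits: int) -> float:
    """Fractional tolerance of one part in 2**bits."""
    return 1.0 / 2 ** bits


if __name__ == "__main__":
    # ±0.005% is a fraction of 0.00005, i.e. one part in 20,000
    print(round(tolerance_to_bits(0.00005), 2), "bits")
    print(required_tolerance(16), required_tolerance(24))
```

By this measure a ±0.005% resistor is good for a little over 14 bits, short of what 16-bit PCM asks for and nowhere near 24-bit.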

Manufacturers of some R-2R ladder-DACs use the term ‘NOS’ (Non-Over-Sampling) to describe their architecture.  I don’t much like that terminology because it is a rather vague piece of jargon and can in principle be used to mean other things, but the blame lies at the feet of many modern DAC chipset manufacturers (and the DAC product manufacturers who use them) who describe their architectures as "Over-Sampling", hence the use of the term NOS as a distinction.

Before moving on, we’ll take an equally close look at how DSD gets converted to analog.  In principle, the incoming bit stream can be fed into its own 1-bit R-2R ladder, which, being 1-bit, is no longer a ladder and comprises only the first resistor R, whose precision no longer really matters.  And that’s all there is to it.  Easy, in comparison to PCM.  Something which has not gone unnoticed … and which we’ll come back to again later.

Aside from what I have just described, for both PCM and DSD three major things are left for the designer to deal with.  First is to make sure the output reference voltages are stable and with as little noise as possible.  Second is to ensure that the switching of the analog voltages in response to the incoming digital bit stream is done in a consistent manner and with sufficient timing accuracy.  Third is to remove any unwanted noise that might be present in the analog signal that has just been created.  These are the implementation areas in which a designer generally has the most freedom and opportunity to bring his own skills to bear.

The third of these is the most interesting in the sense that it differs dramatically between 1-bit (DSD) and multi-bit (PCM) converters.  Although in both cases the noise that needs to be removed lives at inaudible ultrasonic frequencies, with PCM there is not much of it at all, whereas with DSD there is so much of it that the noise power massively overwhelms the signal power.  With PCM, there are even some DACs which dispense with analog filtering entirely, working on the basis that the noise is both inaudible, and at too low a level to be able to upset the downstream electronics.  With DSD, though, removing this noise is a necessary and significant requirement.
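To give a feel for why filtering matters so much with a 1-bit stream, here is a small Python sketch.  It turns each bit into maximum positive or maximum negative voltage and then smooths the result with a moving average.  The moving average is only a digital stand-in for a real analog low-pass filter, and the function names, the `v_ref` parameter and the hand-made bit pattern are all assumptions made for illustration.

```python
def one_bit_dac(bits, v_ref=1.0):
    """Map each bit to +v_ref (bit set) or -v_ref (bit clear)."""
    return [v_ref if bit else -v_ref for bit in bits]


def moving_average(signal, window):
    """Very crude low-pass filter: the mean of the last `window` samples."""
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(signal):
        running_sum += value
        if i >= window:
            running_sum -= signal[i - window]   # drop the sample leaving the window
        if i >= window - 1:
            smoothed.append(running_sum / window)
    return smoothed


if __name__ == "__main__":
    # A repeating pattern that is 'high' three-quarters of the time
    bits = [1, 1, 0, 1] * 16
    raw = one_bit_dac(bits)
    print(raw[:8])                       # slams between the two extremes
    print(moving_average(raw, 8)[:4])    # settles at a steady value in between
```

Before filtering, the output does nothing but jump between its two extremes; only after the rapid switching is averaged away does the underlying signal level emerge.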

Regarding the analog filters, most designers are agreed that although different audio stream formats can be optimized such that each format has its own ideal analog filter, if a DAC is designed to support multiple stream formats it is impractical to provide multiple analog filters and switch them in and out of circuit according to the format currently being played.  Therefore most DACs will have a single analog output filter which is used for every incoming stream format.

The developers of the original SACD players noted that the type of analog filter that was required to perform this task was more or less the same as the reconstruction (anti-imaging) filters used at the output of the CD format, which they were trying to improve upon.  They recognized that those filters degraded the sound.  So instead, in the earliest players, they decided to upconvert the DSD from what we today call DSD64 to what we would now call DSD128.  With DSD128 the ultrasonic filter was found to be less of a problem and was considered not to affect the sound in the same way.  Bear in mind, though, that in doing the upconversion from DSD64 to DSD128 you still have to filter out the DSD64’s ultrasonic noise.  However, this can be done in the digital domain, and (long story short) digital filters almost always sound better than their analog counterparts.

As it happens, similar techniques had already been in use with PCM DACs for over a decade.  Because R-2R ladder DACs were so hard to work with, it was much easier to convert the incoming PCM to a DSD-like format and perform the physical D-to-A conversion step in a 1-bit format.  Although the conversion of PCM to DSD via a Sigma-Delta Modulator (SDM) is technically very complex and elaborate, it can be done entirely in the digital realm, which means that it can also be done remarkably inexpensively.

When I say "DSD-like" what do I mean?  DSD, strictly speaking, is a trademark developed by Sony and Philips (and currently owned by Sonic Studio, LLC).  It stands for Direct Stream Digital and refers specifically to a 1-bit format at a sample rate of 2.8224MHz.  But the term is now being widely used to refer to a broad class of formats which encode the audio signal using the output of a Sigma-Delta Modulator (SDM).  An SDM can be configured to operate at any sample rate you like and with any bit depth you like.  For example, the output of an SDM could even be a conventional PCM bitstream, and such an SDM can actually pass a PCM bitstream through unchanged.  A key limitation of SDMs is that they can be unstable when configured with a 1-bit output stream.  However, this instability can be practically eliminated by using a multi-bit output.  For this reason, most modern PCM DACs will upconvert (or ‘Over-Sample’) the incoming PCM before passing it through an SDM with an output bit depth of between 3 and 5 bits.  This means that the physical D-to-A conversion is done with a 3- to 5-stage resistor ladder, which can be easily implemented.
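For readers who want to see the idea in miniature, here is a sketch in Python of the simplest textbook form of an SDM: a first-order modulator with a 1-bit output.  This is emphatically not how DSD encoders or commercial DAC chips are built (those use far more elaborate, higher-order designs), and the function name and the convention that inputs lie between -1 and +1 are assumptions of mine.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator with a 1-bit (+1 / -1) output.

    `samples` are assumed to be already oversampled values in [-1.0, 1.0].
    The integrator accumulates the difference between the input and the
    previous output; the output is simply the sign of the integrator.
    """
    integrator = 0.0
    feedback = 0.0          # previous output, fed back and subtracted
    output = []
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        output.append(int(feedback))
    return output


if __name__ == "__main__":
    # A constant input of 0.5, held for many samples
    stream = sigma_delta_1bit([0.5] * 4000)
    print(stream[:12])
    print(sum(stream) / len(stream))   # the average tracks the input level
```

Each individual output is a wildly inaccurate picture of the input, but the feedback loop forces the errors to cancel over time, so that the average of the stream follows the signal.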

These SDM-based DACs are so effective that today there are hardly any R-2R ladder DACs in production, and those that are, such as the Light Harmonic Da Vinci, can be eye-wateringly expensive.  The intermediate conversion of an incoming signal to a DSD-like format means that, in principle, any digital format (including DSD) can be readily supported, as evidenced by the plethora of DSD-compatible DACs on the market today.  Because these internal conversions are performed entirely in the digital domain, manufacturers typically produce complete chip sets capable of performing all of the conversion functionality on-chip, driving the costs down considerably when compared to an R-2R ladder approach.  The majority of DACs on the market today utilize chip sets from one of five major suppliers: ESS, Wolfson, Burr-Brown (TI), AKM, and Philips, although there are others as well.

Interestingly, all of this is behind the recent emergence of DSD as a niche in-demand consumer format.  In a previous post I showed that almost all ADCs in use today use an SDM-based structure to create a ‘DSD-like’ intermediate format which is then digitally converted to PCM.  Today I showed the corollary in DAC architectures where incoming PCM is digitally converted to a ‘DSD-like’ intermediate format which is then converted to analog.  The idea behind DSD is that you get to ‘cut out the middlemen’ - in this case the digital conversions to and from the ‘DSD-like’ intermediate formats.  Back when SACD was invented the only way to handle and distribute music data which required 3-5GB of storage space was using optical disks.  Today, not only do we have hard disks that can hold the contents of hundreds upon hundreds of SACDs, but we have an internet infrastructure in place that allows people to download such files as a matter of convenience.  So if we liked the sound of SACD, but wanted to implement it in the more modern world of computer-based audio, the technological wherewithal now exists to support a file-based paradigm similar to what we have become used to with PCM.  This is what underpins the current interest in DSD.

To be sure, the weak link of the above argument is that DSD is not the same as ‘DSD-like’, and in practice you still have to convert digitally between ‘DSD-like’ and DSD in both the ADC and the DAC.  But a weak link is not the same thing as a fatal flaw, and DSD as a consumer format remains highly regarded in many discerning quarters.