One of the critical technical advances that enable modern high-speed serial data links, such as SuperSpeed USB 3.0, is a data encoding scheme called 8b/10b. 8b/10b encoding is a proven means of overcoming a technical issue that arises when designing systems that transfer data at high rates over long distances (depending on the data rate and physical transmission medium, “long” in this context can mean a few feet or many miles).
The issue that 8b/10b encoding addresses arises due to the nature of the differential receivers that make such high-speed data links possible. So to understand the use of 8b/10b encoding, we first need to understand some requirements of differential receivers.
Differential receivers were one of a series of key developments that enabled long-distance, high-speed serial data links.
Back in the early days of digital electronics, signal transmission was accomplished by having the receiver measure the DC voltage level of the incoming signal against a common ground shared by transmitter and receiver. In this simple world, using TTL logic as an example, a voltage of 0V represented a logical “0” and a voltage of +5V represented a logical “1”.
There were a number of problems in trying to extend this technology to higher data rates and longer distances. One key problem was the inability to provide a stable, common ground reference once systems became physically separated. One solution was the development of differential receivers, in which the signal level is measured between two conductors rather than against a common ground.
While the nominal “ground voltage level” might vary appreciably from one location to another, existing technology such as twisted pair conductors helped ensure that the DC offset was essentially the same for both wires, and therefore the differential voltage between the two remained constant and measurable by the receiver.
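The cancellation described above is simple arithmetic. The following minimal sketch uses made-up voltage values (the function name, the parameter names and the numbers are illustrative assumptions, not taken from any standard): the same ground offset is added to both conductors, and the difference seen by the receiver does not change.

```python
def differential_voltage(v_plus, v_minus, ground_offset=0.0):
    """Return the voltage a differential receiver measures between two conductors.

    ground_offset models a shift in local ground potential that is assumed
    to affect both conductors equally (an illustrative simplification).
    """
    return (v_plus + ground_offset) - (v_minus + ground_offset)


# Hypothetical signal levels in volts.
no_offset = differential_voltage(0.5, -0.5)
with_offset = differential_voltage(0.5, -0.5, ground_offset=2.0)

print(no_offset, with_offset)  # both differences are the same
```

A receiver that compared each wire against its own local ground would instead see the 2 V shift directly.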
Differential receivers were also able to extract the clock signal (to obtain the precise location of data bits) from the incoming signal by observing the transitions between signal levels. By extracting the clock from the data, the receiver is able to “lock on” to the incoming signal and correctly translate it back into the data bits originally transmitted by the remote device.
The Need for 8b/10b Encoding
However, there were some restrictions to this approach. One of those restrictions was the need to ensure that the incoming data provided variations in signal level at a sufficient rate that the receiver could continue to track the clock rate. For example, if the data being transmitted consisted of long strings of 0s (or of 1s), the receiver would see what appeared to be no signal on the line, and the “lock” would be lost.
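To make the problem concrete, the sketch below measures the longest run of identical bits in a bit string, which is the stretch during which the receiver sees no transitions at all. The helper name longest_run and the sample bit patterns are illustrative assumptions.

```python
from itertools import groupby


def longest_run(bits):
    """Return the length of the longest run of identical characters in a bit string."""
    return max((len(list(group)) for _, group in groupby(bits)), default=0)


raw = "00000000" * 4        # four zero bytes sent unencoded: no transitions
mixed = "0101100111000101"  # a pattern with frequent transitions

print(longest_run(raw))     # 32
print(longest_run(mixed))   # 3
```

Unencoded data places no upper bound on this value, which is the restriction an encoding scheme has to remove.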
A second problem was that, to function well, differential receivers required the incoming data to be effectively DC-neutral: over any period longer than a few bits, the number of 1s received should roughly match the number of 0s received.
Real data rarely satisfies either constraint, since it often contains long strings of 0s or 1s (for example, as filler at the end of a data file).
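One way to quantify DC balance is to count the 1s and 0s in a stretch of bits and take the difference. The sketch below does exactly that; the function name disparity and the sample patterns are illustrative assumptions. A result of zero means the stretch is balanced.

```python
def disparity(bits):
    """Return the number of 1s minus the number of 0s in a bit string."""
    ones = bits.count("1")
    return ones - (len(bits) - ones)


print(disparity("0000000011111111"))  # 0: balanced overall
print(disparity("11111111"))          # 8: all ones, strongly unbalanced
print(disparity("0101100111"))        # 2: slightly more 1s than 0s
```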
The solution was an encoding scheme that transforms the “real data” so that no more than five “0” values or five “1” values ever occur in a row, and so that over time the total number of 1s transmitted closely matches the total number of 0s.
In order to accomplish this, each 8b byte was encoded into a 10b symbol. By adding two extra bits to each byte, the potential range of data symbols was four times as large as the possible range of original 8b bytes. By careful selection of values from this much larger range of possible symbols, each 8b byte could be encoded using a 10b symbol chosen to ensure that no more than five “0” values or five “1” values occurred in a row.
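The size of the symbol space is easy to check by brute force. The sketch below enumerates all 10-bit values and keeps those that pass two simplified filters: a disparity of 0 or ±2, and no run longer than five identical bits within the symbol. These filters are an assumption made for illustration. The real code table applies further rules (for example, on runs that span symbol boundaries), so the result shows only that there is room to choose from, not which symbols are actually used.

```python
from itertools import groupby


def longest_run(bits):
    """Length of the longest run of identical characters in a bit string."""
    return max((len(list(group)) for _, group in groupby(bits)), default=0)


def disparity(bits):
    """Number of 1s minus number of 0s in a bit string."""
    ones = bits.count("1")
    return ones - (len(bits) - ones)


all_symbols = [format(value, "010b") for value in range(2 ** 10)]

# Simplified filter (illustrative only, not the real selection rules).
candidates = [
    s for s in all_symbols
    if disparity(s) in (-2, 0, 2) and longest_run(s) <= 5
]

print(len(all_symbols))        # 1024, four times the 256 possible byte values
print(len(candidates) >= 256)  # True: more than enough well-behaved symbols
```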
Furthermore, since a 1:1 encoding of bytes into symbols uses only a quarter of the available symbols, a second symbol could be assigned to every possible 8b byte, chosen to help compensate for any excess of 0 or 1 values in the previously transmitted symbol(s).
So in 8b/10b encoding, each data byte has two different symbols, one selected to have slightly more “1” values and the other to have slightly more “0” values. These two symbols are said to have positive and negative disparity, respectively. The transmitter keeps track of the disparity and selects the appropriate symbol for the next byte to compensate for any disparity introduced by the previous symbol.
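The selection logic can be sketched in a few lines. This is a simplified illustration, not a complete encoder: the table holds only two bytes, and its entries are intended to match the standard codes for 0x03 and 0xB5 but should be checked against the published code tables before being relied on. The names EXAMPLE_TABLE, symbol_disparity and encode are hypothetical. Note that some bytes encode to a symbol that is already balanced, in which case the table holds the same symbol twice and the running disparity is left unchanged.

```python
# Illustrative two-entry excerpt, NOT the full code table. Each byte maps to
# (symbol sent when running disparity is negative, symbol sent when positive).
EXAMPLE_TABLE = {
    0x03: ("1100011011", "1100010100"),  # six 1s vs. four 1s
    0xB5: ("1010101010", "1010101010"),  # balanced: same symbol either way
}


def symbol_disparity(symbol):
    """Number of 1s minus number of 0s in one symbol."""
    ones = symbol.count("1")
    return ones - (len(symbol) - ones)


def encode(data, table=EXAMPLE_TABLE, running_disparity=-1):
    """Pick one symbol per byte, flipping the running disparity as needed."""
    symbols = []
    for byte in data:
        when_negative, when_positive = table[byte]
        symbol = when_negative if running_disparity < 0 else when_positive
        symbols.append(symbol)
        # An unbalanced symbol flips the running disparity; a balanced one
        # leaves it unchanged.
        if symbol_disparity(symbol) != 0:
            running_disparity = -running_disparity
    return symbols, running_disparity


symbols, final_rd = encode(bytes([0x03, 0x03, 0xB5, 0x03]))
print(symbols)
print(final_rd)
```

In this example the first 0x03 is sent with extra 1s, the second with extra 0s, and the surplus in the whole stream never exceeds two bits in either direction.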
The 8b/10b encoding, including the management of this “running disparity”, is handled entirely by the physical layer and is completely transparent to higher levels of the firmware/software stack, which see only 8b data values. However, for test purposes it is critical to understand that the native data traffic on a USB 3.0 data link is always transmitted entirely as 10b symbols.