Channel equalization has become an essential mechanism that enables today’s high-speed serial links. There are many equalization schemes, such as transmitter emphasis, receiver CTLE (continuous-time linear equalizer), FFE (feed-forward equalizer), DFE (decision feedback equalizer), and FEC (forward error correction), that can be designed and utilized at different locations within a link. Though general theories exist for these EQ schemes, their actual usage, implementations, constraints, and, most importantly, effectiveness against various types of channel ailments are not well known or documented. Further, as data rates reach 112Gbps and beyond, advances in transceiver circuit design and semiconductor process nodes are changing, or will change, the complexity and cost of implementing these EQ schemes.
In this paper, we explain and investigate the theory, implementation, constraints, and cost of each EQ scheme mentioned above. Then we quantitatively analyze the effectiveness of each EQ scheme against channel ailments. The performance metrics, such as simulated eye diagram height/width and SNR (signal-to-noise ratio), are obtained through design experiments using realistic channels. The outline of this paper is as follows:
- Theory and principle of channel equalization schemes: Transmitter pre-/de-emphasis, receiver CTLE, FFE, DFE, and FEC.
- Implementation, constraints, and cost of each EQ scheme: We compare various types of EQ implementation techniques, such as analog and ADC-based designs, and their trade-offs.
- EQ effectiveness analysis: We will conduct design of experiments, in terms of channel characteristics, noise manipulations, and EQ configurations, to examine the performance of each EQ scheme.
Finally, we will conclude the study by looking at the findings and results at the link level and performing a trade-off analysis on the link design.
Overview of Channel Equalization Schemes
Wireline channel equalization started to become a typical feature of high-speed I/O (HSIO) links when PCI-SIG introduced the second generation of the PCI-Express standard, with a data rate of 5.0Gbps. What PCI-Express Gen. 2 links provisioned was transmitter emphasis. Since then, HSIO data rates have been doubling every 2 to 3 years [1]. Channel equalization techniques have become the key enabling technology for this speed growth [2][3]. Within today’s state-of-the-art 50~56Gbps serial link devices, we can find various types of equalization schemes deployed.
The main goal of channel equalization schemes is to improve the signal-to-noise-and-distortion ratio (SNDR) at the end of a link, so that the data slicers in the receiver can recover the information transmitted from the source. The transmitted waveform is distorted and degraded along the channel by ISI (inter-symbol interference), which is caused by the limited bandwidth of the communication medium; by jitter [4], which is caused by timing variations from reference sources and clock distribution networks; and by noise [4], which comes from crosstalk, power sources, and power distribution networks. The equalizer’s job is to minimize these non-idealities so that the receiver can detect and recover the information with higher confidence.
Transmitter (TX) Emphasis
TX emphasis is an analog waveform treatment that introduces controlled peaking at the transitions. Effectively, this scheme pre-conditions the TX output waveform with a high-pass filter characterized by the voltage difference between the peak and the shoulder and by the time duration of the peaking portion of the waveform (see Figure 1). Implementation-wise, when a peaking signal is added to the original TX output waveform, resulting in a greater peak-to-peak amplitude than the original non-emphasized waveform, it is called TX pre-emphasis. On the other hand, if the shoulder portion of the TX output waveform is suppressed while keeping the same peak-to-peak amplitude, it is called de-emphasis.
TX emphasis usually does not scale with link speed because the peaking duration is fixed by the TX driver circuit design, in which the peaking is triggered by each data transition. Therefore, it is effective only within a fixed data rate range, and its equalization performance degrades as a link’s data rate moves farther from the design target frequency. Further, compensation can be applied to the main cursor only. TX emphasis, due to its unique characteristics, can provide sub-cursor channel compensation where the peak frequency can be greater than that of the TX FIR method (see below).
Transmitter FIR
The full name of TX FIR is transmitter with FIR EQ (finite impulse response equalizer). TX FIR applies equalization using an FIR filter that is synchronized to the transmitter clock, as shown in Figure 2. TX FIR provides several advantages over TX emphasis schemes:
- TX equalization scales with link data rate because FIR is clock driven
- Multi-tap FIR can better compensate different channel characteristics
- TX FIR can compensate both pre-cursor and post-cursor ISI
Implementation-wise [5], TX FIR is more complex than TX emphasis and requires a clock source. TX FIR is usually subject to a peak power constraint, where the maximum output amplitude is limited to the non-equalized amplitude level. Because of the peak power constraint, TX FIR reduces the effective average output amplitude and, hence, the energy received at the end of the link.
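The effect of the peak power constraint can be illustrated with a minimal symbol-rate sketch. The 3-tap coefficient set and the test pattern below are hypothetical values chosen for illustration only, and the constraint is modeled simply as the sum of absolute tap weights being equal to one.

```python
def normalize_peak(taps):
    """Scale taps so that sum(|c_k|) == 1 (assumed model of the peak power constraint)."""
    total = sum(abs(c) for c in taps)
    return [c / total for c in taps]


def tx_fir(symbols, taps, n_pre=1):
    """Symbol-rate FIR: y[n] = sum_k taps[k] * x[n + n_pre - k].

    taps[:n_pre] weight future symbols (pre-cursor), taps[n_pre] is the main
    cursor, and the remaining taps weight past symbols (post-cursor).
    Symbols outside the sequence are treated as zero.
    """
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, c in enumerate(taps):
            idx = n + n_pre - k
            if 0 <= idx < len(symbols):
                acc += c * symbols[idx]
        out.append(acc)
    return out


# Hypothetical 3-tap setting: [pre-cursor, main, post-cursor].
taps = normalize_peak([-0.1, 0.7, -0.2])
symbols = [-1, -1, -1, -1, 1, 1, 1, 1, 1, -1, 1, -1]
y = tx_fir(symbols, taps)

peak_level = max(abs(v) for v in y)  # reached on alternating symbols
steady_level = abs(sum(taps))        # level after a long run of identical symbols
print(f"peak = {peak_level:.2f}, steady state = {steady_level:.2f}")
```

With these assumed taps, an alternating pattern reaches the full non-equalized swing, while a long run of identical symbols settles at 0.4 of that swing (about 8 dB lower). This is the amplitude reduction described above.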
RX CTLE
RX CTLE stands for receiver continuous-time linear equalizer. RX CTLE circuitry [6] has a frequency response that compensates, or reverses, a channel’s frequency response (see Figure 3). If designed properly, it results in a relatively flat overall frequency response and, hence, restores the received signal to its original form. RX CTLE can be active, where it boosts the amplitude of the output signal, or passive, where it attenuates the low-frequency content of the incoming signal. Active and passive CTLE designs each have advantages and disadvantages. For instance, active CTLE designs can normally improve SNR, but they can be subject to nonlinear behaviors such as DC gain compression. Passive CTLE designs are usually linear but result in even smaller output signal levels.
CTLE is capable of compensating both pre-cursor and post-cursor ISI and is usually power efficient. Circuit designers can produce a tunable CTLE design whose AC gain and DC gain can be adjusted to match the channel characteristics. Because of these properties, CTLE exists in almost all HSIO receiver designs.
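As a rough illustration of the AC and DC gain settings mentioned above, the sketch below evaluates a generic one-zero, two-pole transfer function, H(f) = A_dc (1 + jf/f_z) / ((1 + jf/f_p1)(1 + jf/f_p2)). This model and all of its parameter values are assumptions made for illustration; they do not describe any particular device.

```python
import math


def ctle_gain_db(f_hz, dc_gain=0.5, f_zero=2e9, f_pole1=14e9, f_pole2=20e9):
    """Magnitude in dB of an assumed one-zero, two-pole CTLE model at f_hz."""
    numerator = 1 + 1j * f_hz / f_zero
    denominator = (1 + 1j * f_hz / f_pole1) * (1 + 1j * f_hz / f_pole2)
    return 20 * math.log10(abs(dc_gain * numerator / denominator))


dc_db = ctle_gain_db(0.0)     # low-frequency (DC) gain
ac_db = ctle_gain_db(14e9)    # gain near an assumed 14 GHz Nyquist frequency
peaking_db = ac_db - dc_db    # high-frequency boost relative to DC
print(f"DC gain {dc_db:.1f} dB, gain at 14 GHz {ac_db:.1f} dB, peaking {peaking_db:.1f} dB")
```

Lowering `dc_gain` or moving `f_zero` changes the amount of peaking, which is how a tunable design of this kind could be matched to channels with different loss.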
RX FFE
RX FFE stands for receiver feed-forward equalizer. In theory, RX FFE is equivalent to TX FIR, as both are FIR-based and linear [6]. When an RX FFE is implemented in the analog domain, one needs a sequence of delay lines where the incoming signal is buffered and summed or subtracted according to the FFE coefficients (see Figure 4). Alternatively, FFE can be implemented at the bit or symbol level, where equalization is done via symbol-level convolution of the FFE coefficients and the sampled input data stream. RX FFE is usually paired with an adaptation scheme in which the FFE coefficients are derived from the channel characteristics. While RX CTLE and FFE are both linear equalizers, the adaptive nature of FFE makes it more versatile in dealing with a wide variety of channels.
RX FFE is more expensive to design, implement, and use in many ways. First, an FFE requires a clock to operate. This means that the receiver must either be able to recover clock timing from the incoming signal or know the link’s operating frequency. This is especially challenging for analog FFE designs. We’ll have further discussions in the following sections.
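The symbol-level convolution mentioned above can be sketched as follows. The sampled pulse response and the three FFE coefficients are hypothetical, hand-picked values rather than the output of an adaptation algorithm, and the worst-case (peak distortion) eye opening is used as a simple figure of merit.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def worst_case_eye(pulse):
    """Main cursor minus the sum of all other |cursors| (peak distortion)."""
    main = abs(max(pulse, key=abs))
    isi = sum(abs(v) for v in pulse) - main
    return main - isi


# Hypothetical symbol-spaced pulse response: [pre, main, post1, post2].
pulse = [0.05, 0.6, 0.25, 0.1]
# Hypothetical FFE taps: [pre, main, post].
ffe_taps = [-0.08, 1.0, -0.4]

equalized = convolve(pulse, ffe_taps)
eye_before = worst_case_eye(pulse)
eye_after = worst_case_eye(equalized)
print(f"worst-case eye: {eye_before:.3f} before FFE, {eye_after:.3f} after FFE")
```

In this example the FFE trades a small loss of main-cursor amplitude for a large reduction of the pre- and post-cursor terms. Being linear, the same filter would also scale any noise present at its input, which this sketch does not model.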
RX DFE
RX DFE stands for receiver decision feedback equalizer. DFE uses an IIR (infinite impulse response) structure, where the sum of past decisions (data symbols determined by the slicer block), weighted by the DFE coefficients, is used to minimize the errors at the target symbol levels (see Figure 5). DFE enjoys several distinct advantages over the above-mentioned EQ schemes. First, due to its IIR structure, DFE can correct a large amount of ISI with a relatively short tap length for certain channel characteristics. Second, because the decisions are free of noise, DFE is capable of compensating channel ISI without amplifying the noise from the link and devices.
Like RX FFE, DFE requires a clock to operate, so it is more expensive than CTLE in terms of implementation and operation. Further, because of its feedback scheme, it is subject to burst errors: once a wrong decision is made, it can result in consecutive errors at the output. Burst errors are shown to impact the performance of forward error correction (FEC). We will further discuss this topic in later sections.
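Both properties, ISI cancellation from past decisions and error propagation after a wrong decision, can be seen in a minimal one-tap sketch. The channel (unit main cursor with a single 0.6 post-cursor), the symbol pattern, and the injected noise spike are all hypothetical values chosen to make the behavior visible.

```python
def channel(symbols, post=0.6, noise=None):
    """Assumed channel: r[n] = x[n] + post * x[n-1] + noise[n]."""
    received = []
    prev = 0
    for n, x in enumerate(symbols):
        extra = noise[n] if noise else 0.0
        received.append(x + post * prev + extra)
        prev = x
    return received


def dfe_slicer(received, tap=0.6):
    """One-tap DFE: subtract tap * previous decision, then slice to +/-1."""
    decisions = []
    prev = 0
    for r in received:
        z = r - tap * prev
        d = 1 if z >= 0 else -1
        decisions.append(d)
        prev = d
    return decisions


def error_positions(sent, decided):
    return [n for n, (a, b) in enumerate(zip(sent, decided)) if a != b]


symbols = [1, 1, -1, 1, -1, 1, 1, -1]

# Noise-free case: the feedback tap cancels the post-cursor exactly.
clean_errors = error_positions(symbols, dfe_slicer(channel(symbols)))

# A single noise spike at symbol 2 flips one decision ...
spike = [0.0] * len(symbols)
spike[2] = 1.5
burst_errors = error_positions(symbols, dfe_slicer(channel(symbols, noise=spike)))
# ... and the wrong decision keeps feeding back while the data alternates.
print(clean_errors, burst_errors)
```

Here one noise event produces a run of four consecutive symbol errors, which ends only when two identical symbols arrive in a row. A real design with smaller tap weights would typically show shorter bursts.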
Forward Error Correction (FEC)
Forward error correction [7] has become an essential part of serial links as data rates reach 25Gbps and above. The reason is that it has become more challenging to achieve the desired BER of 10⁻¹² or 10⁻¹⁵ with the equalization schemes alone (especially given the ever-shrinking timing budget and the small received signal at the receiver). FEC can improve SNR or BER by 10⁴ to 10⁹ in the presence of random errors only, or by a smaller amount when burst errors exist in the link.
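To give a feel for the random-error case, the sketch below estimates how often a Reed-Solomon codeword would be uncorrectable when bit errors are independent. The choice of an RS(544,514) code with 10-bit symbols, which corrects up to 15 symbol errors per codeword, is an assumption made for illustration; the text above does not name a specific code. Burst errors violate the independence assumption and would make the result worse.

```python
import math


def symbol_error_prob(bit_error_rate, bits_per_symbol=10):
    """Probability that at least one bit of a symbol is wrong (independent bit errors)."""
    return 1.0 - (1.0 - bit_error_rate) ** bits_per_symbol


def uncorrectable_codeword_prob(bit_error_rate, n=544, t=15, bits_per_symbol=10):
    """Probability that more than t of the n symbols in a codeword are in error."""
    p = symbol_error_prob(bit_error_rate, bits_per_symbol)
    return sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(t + 1, n + 1)
    )


for pre_fec_ber in (1e-3, 1e-4):
    fail = uncorrectable_codeword_prob(pre_fec_ber)
    print(f"pre-FEC BER {pre_fec_ber:.0e}: uncorrectable codeword probability {fail:.2e}")
```

Under these assumptions, a tenfold improvement in pre-FEC BER changes the codeword failure probability by many orders of magnitude, which is why the equalizer only needs to reach a moderate raw BER when FEC is present.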
PLL and Clock and Data Recovery (CDR)
PLL and CDR are usually not considered parts of EQ schemes in a serial link. However, they play a crucial part in a serial link’s performance. By referring to the serial link jitter and noise classifications [4], PLL and CDR are the only mechanisms that can compensate sinusoidal jitter (SJ), bounded uncorrelated jitter (BUJ), and random jitter (RJ). If one further correlates to the HSIO challenges today, these jitter components are the main factors that close the link margins at low BER. We will not actively discuss and study the roles of CDR and PLL in EQ performance but, nevertheless, one should keep these two in mind when developing serial links.
Characteristics and Performance of Equalization Schemes
Common EQ Architecture with Data Rate < 56 Gbps
Though every HSIO device is designed differently, the available technologies, in terms of semiconductor materials, processes, circuit implementations, and micro-/macro-level control algorithms, as well as their levels of maturity, make the overall equalization architecture converge to certain similar schemes. To be more specific, a practical or reasonable design has to conform to the power, performance, and (chip) area (PPA) matrix so that the product can be manufactured and deployed effectively.
At HSIO link speeds up to 50~56Gbps, most transceivers feature the following equalization schemes: transmitter FIR, receiver CTLE, analog DFE, and analog CDR. So let’s check the PPA matrix of these EQ schemes and their performance. The EQ structure is illustrated in Figure 6.
TX FIR
Performance: TX FIR is effective in compensating ISI but with restrictions. It can compensate both pre-cursor and post-cursor ISI but only at bit or baud level. The drawbacks of TX FIR are peak power constraints, where heavy equalization will reduce the effective output amplitude, and short tap length, where TX FIR is also usually limited to 2 pre-cursor and 2 post-cursor taps. The main issue of using TX FIR is the determination of FIR coefficients. For serial links without the support of back channels and associated protocols, TX FIR coefficients are fixed and need to be determined before deployment.
Power: TX FIR is efficient in power usage because clock timing is usually readily available at the transmitter side. TX FIR can be implemented using analog circuits or using a DAC.
Area: Similar to the power consideration, there is little overhead in implementing TX FIR given the constraints mentioned above.
RX CTLE and VGA
Performance: CTLE plays an essential part in equalization for many reasons. CTLE is implemented using analog circuitry which, in theory, can match, or reverse, a channel’s loss characteristics in both the pre-cursor and post-cursor parts. It not only improves the signal-to-distortion ratio but can also restore the amplitude of the incoming signal. The weaknesses of CTLE include noise amplification, where it worsens high-frequency noise; characteristic variations caused by PVT (process, voltage, and temperature) variations; and inflexibility, where it usually cannot adapt or change its characteristics after the design is finalized.
Power: CTLE is power efficient as it mostly operates in the small-signal region.
Area: As with power, CTLE consumes a small chip area.
RX Analog DFE
Performance: DFE can be implemented in the analog domain, where the incoming waveform or signal is adjusted with past decisions and adapted DFE coefficients. DFE requires a CDR to function, as it needs to know both the clock timing and the recovered data symbols. The compensation/adjustment then needs to be computed and applied on the data path within a single symbol unit interval (UI). As explained in the previous section, DFE can provide strong and long-tailed corrections due to its IIR structure with a relatively short tap length. This makes it an optimal design choice for data rates below 56Gbps.
Power and area: The stringent timing requirement plus the need to extend tap length have made analog DFE designs more challenging and less power and area efficient above 56Gbps data rates.
Other EQ Schemes
One may have noticed that RX FFE is not part of commonly seen EQ features. Let’s perform a PPA analysis and see how it fares.
Performance: While DFE is only capable of compensating post-cursor ISI, FFE is capable of handling both pre- and post-cursor channel effects. By nature, FFE is more flexible and can adapt to more channel characteristics than CTLE does. Analog FFE can be deployed in the receiver and be used as the main linear EQ or supplemental to the CTLE where it can adapt and compensate CTLE and a device’s PVT variations.
Power and area: Implementing analog FFE is challenging as it involves the use of analog delay lines. It is known that an analog delay line consumes power and, like CTLE, is subject to PVT variations. For transceivers that support multiple protocols or data rates, one needs adjustable delay line designs, which makes the situation more difficult. This is the key reason that analog FFE has not become a common product feature.
Next Generation EQ Architecture
As data rates increase, HSIO device and component designers face either an ever-shrinking timing budget (assuming they keep the same coding/modulation scheme) or a degrading SNR (assuming they move to a higher-level coding scheme). Regardless of the approach, the demand for more capable EQ schemes persists.
The need to handle more complex channel characteristics and EQ schemes has pushed device designers toward ADC-based receiver designs. The advantages of ADC-based designs include the ability to use more elaborate equalization and adaptation schemes and the extension of FFE/DFE tap length. However, with the addition of ADC and DSP-based EQ designs, one should not neglect the impacts of additional area/power consumption, bandwidth limitations, and complexity that come with this design approach (see Figure 7).
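The two options can be put into numbers with simple arithmetic. The sketch below compares the unit interval at several data rates for two-level signaling and, as an assumed example of a higher-level scheme, four-level PAM (PAM4). The amplitude penalty assumes the same peak-to-peak swing is divided evenly among the eyes, and it ignores every other impairment.

```python
import math


def unit_interval_ps(data_rate_gbps, bits_per_symbol=1):
    """Symbol period in picoseconds for a given data rate and bits per symbol."""
    baud_rate_gbd = data_rate_gbps / bits_per_symbol
    return 1000.0 / baud_rate_gbd


def pam_eye_penalty_db(levels):
    """Eye amplitude loss versus 2-level signaling at the same peak-to-peak swing."""
    return 20 * math.log10(levels - 1)


for rate in (28, 56, 112):
    print(f"{rate:>3} Gbps: 2-level UI = {unit_interval_ps(rate):.2f} ps, "
          f"PAM4 UI = {unit_interval_ps(rate, 2):.2f} ps")
print(f"PAM4 eye amplitude penalty: {pam_eye_penalty_db(4):.2f} dB")
```

At 112Gbps, staying with two levels leaves a UI of under 9 ps, while moving to four levels restores the UI of a 56Gbps two-level link at the cost of roughly 9.5 dB of eye amplitude.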
TX FIR
Same as previous generation design. The PPA of TX FIR remains unchanged.
RX CTLE and ADC
The role of the CTLE stays the same, as it is designed to perform ISI compensation. Why not use the ADC and DSP-based EQ to replace the CTLE? To answer this question, we have to look at how an ADC-based receiver works. An ADC is a sampling system that samples and digitizes incoming signals at the symbol or sub-symbol rate with a pre-determined resolution, which means that it needs to know the clock timing. As most receivers have to recover the clock timing from the incoming signal, having a CTLE makes timing recovery quicker, easier, and more stable. Further, the CTLE improves the signal-to-distortion ratio (SDR) of the ADC input signal, so there is a relationship between the performance of the CTLE and the ADC’s resolution. It is known that the performance and efficiency of ADC-based systems depend heavily on the sampling rate and ADC resolution. Having a functional CTLE before the ADC therefore greatly improves the RX’s overall PPA score.
The same argument also applies to VGA where it provides amplitude control of the incoming signal. Where CTLE improves SDR of the ADC input signal, VGA will improve SNR by matching the input signal dynamic range with ADC’s optimal operating range, i.e. ADC’s sensing scale and linearity.
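The benefit of matching the input swing to the ADC range can be sketched with an ideal uniform quantizer. The 6-bit resolution, the sine-wave test signal, and the two input amplitudes below are hypothetical; real ADC front ends add thermal noise, nonlinearity, and clipping effects that are not modeled here.

```python
import math


def quantize(x, bits=6, full_scale=1.0):
    """Ideal mid-rise uniform quantizer over [-full_scale, +full_scale)."""
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    code = math.floor((x + full_scale) / step)
    code = min(max(code, 0), levels - 1)  # clip to the available codes
    return -full_scale + (code + 0.5) * step


def sqnr_db(amplitude, bits=6, n_samples=4096):
    """Signal-to-quantization-noise ratio for one period of a sine wave."""
    signal_power = 0.0
    error_power = 0.0
    for i in range(n_samples):
        x = amplitude * math.sin(2.0 * math.pi * i / n_samples)
        q = quantize(x, bits)
        signal_power += x * x
        error_power += (q - x) ** 2
    return 10.0 * math.log10(signal_power / error_power)


matched = sqnr_db(1.0)    # VGA sets the swing to the ADC full scale
underfed = sqnr_db(0.25)  # swing is only a quarter of the ADC full scale
print(f"SQNR: {matched:.1f} dB matched, {underfed:.1f} dB under-driven")
```

In this idealized model, driving the ADC at a quarter of its full scale wastes about two bits of resolution, on the order of 12 dB of quantization SNR, which is the loss a VGA in front of the ADC is meant to avoid.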