# A Predictive SAR ADC Architecture

Matias Jara, Member, IEEE, and Behzad Razavi, Fellow, IEEE

Abstract: A SAR architecture is proposed that employs a predictive technique to increase the conversion speed. In this new technique, the comparator operates in parallel with the logic and DAC, reducing the SAR timing budget per cycle to only one comparator decision time plus its clock generation. Moreover, a clock and input signal distribution method is presented that improves clock phase matching in a time-interleaved system. This is accomplished by delivering the primary clock to each channel and generating the interleaved phases locally. Realized in 28-nm CMOS technology, a 6-bit 10-GS/s 17.6-mW prototype achieves an SNDR of 31.2 dB at an input frequency of 4.96 GHz and a figure of merit equal to 59 fJ per conversion step.

Index Terms: Analog-to-digital converters (ADC), successive approximation (SAR), time interleaved, predictive, ring counter, clock distribution.

#### I. INTRODUCTION

The successive-approximation-register (SAR) analog-to-digital converter (ADC) has efficiently served a wide range of resolutions and speeds for the past two decades. In order to raise the conversion rate beyond the basic 1-bit-per-cycle limit, a number of methods have been proposed [1], [2], [3], [4], each entailing certain trade-offs. Of course, time interleaving also proves effective, but at the cost of greater input capacitance and area.

This work presents a new high-speed SAR architecture that increases the conversion rate by means of a "predictive" technique. Moreover, a method of clock and input signal distribution is introduced to suppress clock phase mismatches in an interleaved environment. Using these techniques, a 6-bit prototype fabricated in 28-nm technology runs at 10 GS/s while drawing 17.6 mW from a 0.8-V supply [5].

#### II. BACKGROUND

Asynchronous operation [6] and top-plate sampling [7] are two approaches to increasing the speed.
The former removes the timing margins of the comparator's decision time, while the latter eliminates the settling time of the digital-to-analog converter (DAC) for the most-significant-bit (MSB) cycle.

Received 25 September 2024; revised 9 December 2024; accepted 28 December 2024. Date of publication 8 January 2025; date of current version 29 August 2025. This work was supported in part by Realtek Semiconductor; in part by the Agencia Nacional de Investigación y Desarrollo (ANID) Scholarship, Chile; and in part by the Fulbright Scholarship. This article was recommended by Associate Editor P. Harpe. (Corresponding author: Matias Jara.) Matias Jara was with the Department of Electrical and Computer Engineering, University of California at Los Angeles, Los Angeles, CA 90095 USA. He is now with Broadcom Inc., Irvine, CA 92618 USA (e-mail: mgjara@ucla.edu). Behzad Razavi is with the Department of Electrical and Computer Engineering, University of California at Los Angeles, Los Angeles, CA 90095 USA (e-mail: razavi@ee.ucla.edu). Digital Object Identifier 10.1109/TCSI.2025.3525619

Fig. 1. Generic asynchronous SAR architecture.

Consider the generic asynchronous SAR shown in Fig. 1, where a StrongArm comparator along with a NOR gate generates a self-clocking trigger ("Ready") signal. When X and Y depart from each other sufficiently, Ready goes high and, after a delay of $\Delta T$, activates the comparator again. This delay time is chosen to be equal to the logic delay plus the worst-case settling time of the DAC: $\Delta T = T_{\text{logic}} + T_{\text{DAC,max}}$. For cycle $j$, therefore, the loop requires

$$T_{\text{SAR},j} = T_{\text{comp},j} + T_{\text{async}} + \Delta T, \tag{1}$$

where $T_{\text{comp},j}$ depends on the voltage difference sensed by the comparator, and $T_{\text{async}}$ represents the delay of the self-triggered clock.
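As a quick numerical reading of Eq. (1), the sketch below adds up the three terms for one cycle. The function names and the delay values (in picoseconds) are hypothetical and chosen only for illustration; they are not taken from the prototype.

```python
def delta_t(t_logic, t_dac_max):
    """Fixed loop delay: logic delay plus worst-case DAC settling time."""
    return t_logic + t_dac_max


def sar_cycle_time(t_comp_j, t_async, t_logic, t_dac_max):
    """Eq. (1): duration of asynchronous SAR cycle j."""
    return t_comp_j + t_async + delta_t(t_logic, t_dac_max)


# Hypothetical delays in picoseconds (illustrative assumptions only).
cycle = sar_cycle_time(t_comp_j=30.0, t_async=10.0, t_logic=20.0, t_dac_max=25.0)
print(cycle)  # 85.0
```

Because $\Delta T$ is fixed by the worst-case DAC settling, only the comparator term varies from cycle to cycle in this model.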
We call the circuit consisting of Comp<sub>1</sub>, the NOR gate, and the $\Delta T$ delay the "digital" loop, and that created by the logic and the DAC the "analog" loop. The key point here is that the former must *not* be faster than the latter. This is to ensure that Comp<sub>1</sub> is clocked *after* the DAC has settled. It should be noted that the "desired" delay, $\Delta T$, in Fig. 1 must be long enough to allow the DAC to settle.

Assuming both asynchronous operation and top-plate sampling are used, we write the conversion time of an *N*-bit ADC as

$$T_{\text{ADC}} = T_{\text{acq}} + \sum_{j=1}^{N-1} T_{\text{SAR},j} + T_{\text{async}} + T_{\text{comp},N}, \tag{2}$$

where $T_{\text{acq}}$ denotes the acquisition time in the sampling mode, and the summation represents the $N-1$ asynchronous SAR cycles.

A key limitation here is that no time borrowing or overlap can be created between $T_{\rm comp}$, $T_{\rm logic}$, and $T_{\rm DAC}$, dictating a "serial" operation. For resolutions around 6 bits, these three components are comparable and must be independently optimized.

In addition to asynchronous operation and interleaving, other methods have been investigated. For example, one can detect more than 1 bit per cycle [1], [3], [4], [8]. As another example, one can remove $T_{\rm logic}$ through the use of N comparators that are clocked in a domino fashion so as to obtain N bits [2], [9]. The principal drawback here is that the comparators' kickback noise accumulates on the capacitive DAC. Even if this kickback noise only introduces a common-mode component, it alters the comparator's offset from cycle to cycle, introducing errors in the SAR loop or misdirecting the offset calibration routine.

Table I summarizes examples of SAR prior art properties for a 6-bit implementation. We should note that asynchronous operation does not significantly improve the speed of 2-bit-per-cycle ADCs [4]. The last row shows the speed improvement afforded by our proposed technique, as explained in the subsequent sections.

TABLE I. SUMMARY OF SAR PRIOR ART

| Architecture | Comparisons per cycle | DAC settlings per cycle | Type | Timing budget per cycle | Number of cycles for $N = 6$ | Comparator load | Area overhead w.r.t. async. SAR |
|---|---|---|---|---|---|---|---|
| Asynchronous SAR [6] | 1 | 1 | Async. | $T_{\rm async} + T_{\rm comp} + \max(T_{\rm logic} + T_{\rm DAC}, T_{\rm comp,rst})$ | 6 | SAR logic + async. logic | 0 |
| 2-bit-per-cycle [1] | 3 | 3 | Sync. | $T_{\rm comp} + T_{\rm logic} + T_{\rm DAC}$ | 3 | SAR logic | 2 comparators, 2 DACs |
| Alternate Comparators [10] | 1 | 1 | Async. | $T_{\rm async} + T_{\rm comp} + T_{\rm logic} + T_{\rm DAC}$ | 6 | SAR logic + async. logic | 1 comparator |
| Loop Unrolling [2] | 1 | 1 | Async. | $T_{\rm async} + T_{\rm comp} + T_{\rm DAC}$ | 6 | CDAC + async. logic | 5 comparators |
| Predictive (this work) | 1 | 2 | Async. | $T_{\rm async} + T_{\rm comp}$ | 6 | SAR logic + async. logic | 3 comparators, 3 DACs |

$T_{\rm comp}$: Comparator decision time. $T_{\rm async}$: Asynchronous clock generation delay. $T_{\rm logic}$: SAR logic delay. $T_{\rm DAC}$: DAC settling time. $T_{\rm comp,rst}$: Comparator reset time.

Fig. 2. Simplified predictive concept.

#### III. PROPOSED ADC ARCHITECTURE

#### A. The Predictive Concept

A binary tree presents only two possible outputs on its nodes. Since the two are already defined, it is simply a matter of deciding which path to take based on a given operation. The same idea can be applied to 1-bit-per-cycle SAR ADCs: there are only two possible DAC output levels, and one is generated based on the comparator decision.
We then surmise that *both* of the candidate levels can be computed beforehand, and one is used in the next cycle according to the comparator output. Illustrated in Fig. 2 for a 4-bit system, this "predictive" concept produces $V_{\text{REF}}/4$ and $3V_{\text{REF}}/4$ simultaneously in the first cycle. At the same time, Comp<sub>1</sub> is clocked, and its decision selects one of these two levels for the next comparison. We expect that the predictive technique reduces the SAR cycle time. We formulate the result in Section III-D.

Fig. 3. Predictive SAR direct implementation.

Let us now implement the concept in a generic SAR loop. We begin with the structure depicted in Fig. 3, where DAC<sub>1</sub> produces the level for the present cycle, and DAC<sub>2</sub> and DAC<sub>3</sub> those for the next. The MUX directs the output of DAC<sub>1</sub> to Comp<sub>1</sub> in the present cycle. The resulting decision commands the MUX to apply the output of DAC<sub>2</sub> or DAC<sub>3</sub> to Comp<sub>1</sub>.

We illustrate the detailed operation with the aid of Fig. 4. In the first cycle, DAC<sub>1</sub> settles to $V_{\rm REF}/2$ and Comp<sub>1</sub> is clocked while the DAC<sub>2</sub> and DAC<sub>3</sub> outputs are traveling towards the candidate levels for the next cycle [Fig. 4(a)]. If $V_{\rm in} > V_{\rm REF}/2$, the MSB is set to 1, and the next cycle applies the output of DAC<sub>2</sub> to Comp<sub>1</sub> [Fig. 4(b)]. Conversely, if $V_{\rm in} < V_{\rm REF}/2$, we have MSB = 0 and DAC<sub>3</sub> drives Comp<sub>1</sub> [Fig. 4(c)]. The key point here is that the comparator decisions occur *in parallel* with DAC settling times.

Fig. 4. Predictive SAR operation. (a) First cycle. (b) Second cycle if MSB equals one. (c) Second cycle if MSB equals zero.

The foregoing realization of the predictive concept faces three drawbacks. First, Comp<sub>1</sub> (e.g., a StrongArm topology) cannot perform two evaluations without a reset phase in between, thus requiring an extra delay in the timing budget. Second, the multiplexing action depicted in Fig. 3 in fact corrupts the DAC output voltages due to the charge stored on the parasitic capacitance, $C_p$, in the previous cycle. This memory effect leads to nonlinearity even if $C_p$ is perfectly linear, requiring that $C_p$ be less than 2% of the DAC capacitance for 6-bit linearity. Third, the logic must remember which DAC has yielded a high or low level at the output of Comp<sub>1</sub>, thus demanding great complexity.

<sup>1</sup>It is assumed that the logic and the DACs in a multi-bit loop operate in parallel and hence have approximately the same delay as those in single-bit topologies.

#### B. Predictive SAR Conversion With Redundancy

The three issues identified in the previous section can be alleviated by adding one more DAC and three more comparators. As shown in Fig. 5(a), the proposed architecture directly attaches each comparator to a DAC, forming a "sub-channel". It also incorporates two clock selectors so as to retime the main clocks, CK and $\overline{CK}$, for driving each comparator according to the decisions of two others.

We now describe the system's operation for two consecutive bits. Suppose, as shown in Fig. 5(b), the predictive SAR loop begins with DAC<sub>1</sub> generating an output equal to $V_{\rm REF}/2$. We allow the input clock, CK, to reach $CK_1$ and activate Comp<sub>1</sub>, obtaining the MSB. At the same time, DAC<sub>3</sub> and DAC<sub>4</sub> deliver the candidate values $3V_{\rm REF}/4$ and $V_{\rm REF}/4$, respectively. In this phase, DAC<sub>2</sub> and Comp<sub>2</sub> are idle.

In the next half cycle, the circuit transitions to the state shown in Fig. 5(c). If $X_1 = 1$, $\overline{CK}$ travels to $CK_3$ while DAC<sub>1</sub> and DAC<sub>2</sub> respectively generate $7V_{\text{REF}}/8$ and $5V_{\text{REF}}/8$ in anticipation of the next cycle.
Similarly, if $X_1 = 0$, $\overline{CK}$ propagates to $CK_4$, and DAC<sub>1</sub> and DAC<sub>2</sub> yield $3V_{\text{REF}}/8$ and $V_{\text{REF}}/8$, respectively. Fig. 6 summarizes the ADC's conditional actions. As explained below, in the subsequent cycles, CK and $\overline{CK}$ become available internally due to asynchronous operation. The predictive loop continues until all the bits are obtained.

We should point out several attributes of the proposed architecture. First, the presence of four DACs greatly simplifies the logic because DAC<sub>1</sub> and DAC<sub>3</sub> always evaluate if the current output bit is a ONE, and so do DAC<sub>2</sub> and DAC<sub>4</sub> if the bit is a ZERO. Thus, no multiplexing is necessary, and the clock selector simply senses the polarity of one comparator's output to produce a clock.

#### C. Asynchronous Operation

As with conventional SAR loops, the proposed predictive architecture can benefit from asynchronous operation. Such an endeavor must accommodate the clock selectors shown in Fig. 5. As explained in Section II, a conventional loop employing a StrongArm comparator can generate a Ready signal. In a predictive SAR environment, the same principle can be used. Each comparator is reset by a Ready signal, which is delivered by a neighboring comparator. Depicted in Fig. 7(a) is the realization, wherein the Ready signals provided by Comp<sub>1</sub> and Comp<sub>3</sub> are ANDed and then applied to the comparators. Suppose Comp<sub>1</sub> is in the reset mode, $X_1 = Y_1 = 0$, and Ready<sub>1</sub> = 1. After Comp<sub>3</sub> makes a decision, $X_3$ rises [Fig. 7(b)] and travels to Comp<sub>1</sub>. Comp<sub>1</sub> then evaluates, forcing Ready<sub>1</sub> to zero and resetting itself.

We next implement the clock selector, which, from Fig. 5(a), must pass $\overline{CK}$ to $CK_3$ or $CK_4$ according to the logical values of $X_2$ and $X_1$.
Due to the asynchronous nature of the circuit, $\overline{CK}$ is not present anymore, requiring that the clock selector generate a pulse depending on the comparator's output level. The dynamic-logic structure shown in Fig. 8 accomplishes this task with minimal delay. The $R$ input is initially high while the comparators are reset. Thus, $P = Q = 1$. After the ADC sampling phase is finished, $R$ falls to zero, allowing either comparator to bring $P$ or $Q$ down and hence clock Comp<sub>3</sub> or Comp<sub>4</sub>.

To initialize the asynchronous operation, a Start signal is added to the clock selector of $CK_1$ (Fig. 9). During the sampling phase, $R = 1$, and a latch sets Start = 1. Once the sampling phase finishes, $R$ falls, and the Start signal forces $P$ to zero, triggering $CK_1$ and initializing the asynchronous operation. At the first rising edge of $CK_1$, the latch returns Start to zero.

#### D. Timing Considerations

As mentioned in Section II, the "digital" loop in an asynchronous SAR ADC must not be faster than the "analog" loop. This restriction applies to our architecture as well, thus demanding proper budgeting of circuit delays. In the realization of Fig. 10(a), we can express these two loop delays as follows:

$$T_{\text{digital}} = T_{\text{comp}} + T_{\text{AND}} \tag{3}$$

$$T_{\text{analog}} = T_{\text{logic}} + T_{\text{DAC}}. \tag{4}$$

Fig. 5. Predictive SAR implementation with redundancy. (a) Predictive SAR with four comparators and four DACs. (b) First half-clock cycle. (c) Second half-clock cycle. (Actual implementation is fully differential.)
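The conditional sequence of Figs. 5 and 6 can be mimicked by a purely behavioral model. The sketch below is an illustration, not the authors' implementation: it assumes an ideal, offset-free, single-ended comparator and ideal DACs, and the function names are hypothetical. It shows that preparing both candidate levels before the decision yields the same bits as a conventional loop that updates the DAC after each decision.

```python
def conventional_sar(vin, n_bits, vref=1.0):
    """Reference model: the DAC level is updated only after each decision."""
    threshold, step, bits = vref / 2, vref / 4, []
    for _ in range(n_bits):
        bit = 1 if vin > threshold else 0
        bits.append(bit)
        threshold += step if bit else -step
        step /= 2
    return bits


def predictive_sar(vin, n_bits, vref=1.0):
    """Predictive model: both next-cycle levels exist before the decision."""
    threshold, step = vref / 2, vref / 4
    bits, trace = [], []
    for _ in range(n_bits):
        # Candidate levels are prepared while the comparator is deciding.
        level_if_one = threshold + step
        level_if_zero = threshold - step
        trace.append((threshold, level_if_one, level_if_zero))
        bit = 1 if vin > threshold else 0
        bits.append(bit)
        # The decision merely selects one of the two precomputed levels.
        threshold = level_if_one if bit else level_if_zero
        step /= 2
    return bits, trace


bits, trace = predictive_sar(0.8, 3)
print(bits)      # [1, 1, 0]
print(trace[1])  # (0.75, 0.875, 0.625)
```

With $V_{\rm REF}$ normalized to 1, the second trace entry shows a comparison level of 3/4 with candidates 7/8 and 5/8, matching the $X_1 = 1$ column of Fig. 6.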
| Cycle 1 | Cycle 2, $X_1 = 1$ | Cycle 2, $X_1 = 0$ |
|---|---|---|
| $CK_1 = CK$ | $CK_3 = \overline{CK}$ | $CK_4 = \overline{CK}$ |
| $V_{\rm DAC1} = \frac{V_{\rm REF}}{2}$ | $V_{\rm DAC1} = \frac{7V_{\rm REF}}{8}$ | $V_{\rm DAC1} = \frac{3V_{\rm REF}}{8}$ |
| $V_{\rm DAC2} = \frac{V_{\rm REF}}{2}$ | $V_{\rm DAC2} = \frac{5V_{\rm REF}}{8}$ | $V_{\rm DAC2} = \frac{V_{\rm REF}}{8}$ |
| $V_{\rm DAC3} = \frac{3V_{\rm REF}}{4}$ | $V_{\rm DAC3} = \frac{3V_{\rm REF}}{4}$ | $V_{\rm DAC3} = \frac{3V_{\rm REF}}{4}$ |
| $V_{\rm DAC4} = \frac{V_{\rm REF}}{4}$ | $V_{\rm DAC4} = \frac{V_{\rm REF}}{4}$ | $V_{\rm DAC4} = \frac{V_{\rm REF}}{4}$ |

Fig. 6. Predictive SAR conditional actions.

We should remark that the logic typically consists of a "pointer" register (a shift register for keeping track of the bit under consideration in a given SAR cycle) and a DAC register (which holds the data for the DAC) [Fig. 10(a)]. We denote their delays by $T_P$ and $T_D$, respectively. That is,

$$T_{\text{analog}} = T_P + T_D + T_{\text{DAC}}. \tag{5}$$

Such a preliminary implementation incurs an excessively large value for $T_{\rm analog}$. Plotted in Fig. 10(c) are the simulated results in this case (including layout parasitics), revealing that $T_{\rm digital} < T_{\rm analog}$.

In order to meet the timing condition, we recognize that the pointer can be clocked in the *previous* cycle. Given the finite delay of the pointer, no race condition is observed, and we can simply bypass the pointer in the analog path [Fig. 11(a)]. We now have

$$T_{\text{analog}} = T_D + T_{\text{DAC}}, \tag{6}$$

obtaining the results shown in Fig. 11(c) and guaranteeing that $T_{\text{digital}} > T_{\text{analog}}$.
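The timing condition can be expressed as a simple check on Eqs. (3), (5), and (6). The sketch below uses hypothetical delay values in picoseconds, chosen only so that the pointer-in-path case violates the condition and the bypassed case meets it; they are not the simulated values of Figs. 10(c) and 11(c), and the function names are illustrative.

```python
def t_digital(t_comp, t_and):
    """Eq. (3): comparator decision time plus Ready (AND) delay."""
    return t_comp + t_and


def t_analog(t_d, t_dac, t_p=0.0):
    """Eq. (5) when t_p is given; Eq. (6) when the pointer is bypassed (t_p = 0)."""
    return t_p + t_d + t_dac


def timing_met(t_dig, t_ana):
    """The digital loop must not be faster than the analog loop."""
    return t_dig > t_ana


# Hypothetical delays in picoseconds (illustrative assumptions only).
T_COMP, T_AND, T_P, T_D, T_DAC = 35.0, 10.0, 15.0, 15.0, 25.0

digital = t_digital(T_COMP, T_AND)            # 45.0
with_pointer = t_analog(T_D, T_DAC, t_p=T_P)  # 55.0
bypassed = t_analog(T_D, T_DAC)               # 40.0

print(timing_met(digital, with_pointer))  # False
print(timing_met(digital, bypassed))      # True
```

Since $T_{\rm comp}$ depends on the comparator input, a practical check would presumably use the shortest expected decision time, because that is the case in which the digital loop is fastest.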
These observations yield a total cycle time equal to $T_{\text{comp}} + T_{\text{async}}$, where $T_{\text{async}}$ refers to the delay of the ready generator, which is equal to $T_{\rm AND}$ in our example. This result stands in contrast to the cycle times listed in Table I for prior architectures.

To provide a quantitative comparison, we have developed the following approach: From transistor-level simulations (including layout parasitics), we can compute the delay of the comparator as a function of its input difference, the delay of the logic, the delay of the ready-signal detector, and the settling time of the DAC as a function of its digital input. Using the expressions for the timing budget per cycle in Table I, an average conversion time and power consumption are calculated and shown in Fig. 12. Note that for the DAC time calculation, the worst case needs to be considered for each SAR cycle so as to guarantee correct settling within the fixed delay ($\Delta T$) in the asynchronous loop. With a 0.8-V supply, on average, our predictive SAR technique is 27% faster than loop unrolling and 40% faster than alternate comparators. Another key point here is that the predictive operation greatly relaxes the comparator reset time.

In finer technology nodes, the overall speed is expected to improve. In Eq. (6), both the DAC register delay, $T_{\rm D}$, and the DAC settling time, $T_{\rm DAC}$, decrease. Similarly, in Eq. (3), the comparator response, $T_{\rm comp}$, and $T_{\rm AND}$ fall. So long as the condition $T_{\rm digital} > T_{\rm analog}$ is met, the proposed architecture can achieve a higher speed.

Regarding the power penalty of using extra CDACs, the predictive technique increases the power consumption by 66% with respect to loop unrolling and by 28% with respect to alternate comparators.

#### E. DAC Design

The use of four DACs in the proposed architecture could potentially lead to significant input capacitance and area penalties.
This is even more critical when a large number of interleaved slices are required in modern communications links. The DAC unit capacitance can be as small as 30 aF for negligible kT/C noise, suggesting that the lower bound is dictated by matching. Based on prior work [11], we surmise that unit values below 1 fF still offer acceptable matching and select a 1-$\mu$m $\times$ 1-$\mu$m, 0.43-fF parallel-plate structure consisting of metal 5, metal 6, and metal 7 [Fig. 13(a)]. The "cage" thus created minimizes fringe capacitances to the surrounding geometries [7].

Fig. 7. Asynchronous operation. (a) Realization in a predictive SAR environment. (b) Waveforms.

Fig. 8. Circuit implementation of clock selector.

Fig. 9. Initialization of asynchronous operation.

Fig. 10. Timing considerations. (a) SAR logic's critical path. (b) Waveforms. (c) Simulated delays for analog and digital loops with timing violation.

Fig. 11. Pointer register's bypassing. (a) Analog and digital loops. (b) Waveforms. (c) Simulated delays for analog and digital loops with timing margin.

To minimize the area, the DAC switches can be placed below each capacitor [Fig. 13(b)]. At the same time, this approach guarantees that each unit capacitor has the same switch resistance seen by the reference. This scheme can be modeled as the equivalent RC network shown in Fig. 15, where $C_p$ represents the parasitic capacitance of the DAC to ground, $R$ the unit switch resistance, $C$ the unit capacitance, and $k$ the conversion cycle. The transfer function can be expressed as

$$H(s) = \frac{V_{\text{out}}}{V_{\text{in}}} = \frac{2^k C}{2^N C + C_p} \left(\frac{1}{s\tau + 1}\right), \tag{7}$$

where the time constant is given by

$$\tau = RC \left( \frac{C_p}{C_p + 2^N C} \right). \tag{8}$$

Note that if $C_p$ is zero, the DAC has a zero time constant, resulting in a sharp transition. Nevertheless, the actual time constant is rather small considering that it is lower than $RC$.
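To get a feel for Eqs. (7) and (8), the sketch below evaluates the time constant and the DC gain numerically. Only the 0.43-fF unit capacitance and $N = 6$ come from the text; the switch resistance and parasitic capacitance are hypothetical values picked for illustration, and the function names are assumptions.

```python
def dac_time_constant(r_unit, c_unit, c_par, n_bits):
    """Eq. (8): tau = R*C * Cp / (Cp + 2^N * C)."""
    return r_unit * c_unit * c_par / (c_par + (2 ** n_bits) * c_unit)


def dac_dc_gain(k, c_unit, c_par, n_bits):
    """DC term of Eq. (7): 2^k * C / (2^N * C + Cp)."""
    return (2 ** k) * c_unit / ((2 ** n_bits) * c_unit + c_par)


C_UNIT = 0.43e-15   # unit capacitance from the text (0.43 fF)
N_BITS = 6          # resolution from the text
R_UNIT = 2.0e3      # hypothetical unit switch resistance (ohms)
C_PAR = 3.0e-15     # hypothetical top-plate parasitic (farads)

tau = dac_time_constant(R_UNIT, C_UNIT, C_PAR, N_BITS)
print(tau < R_UNIT * C_UNIT)                             # True
print(dac_time_constant(R_UNIT, C_UNIT, 0.0, N_BITS))    # 0.0
print(dac_dc_gain(5, C_UNIT, 0.0, N_BITS))               # 0.5
```

For any positive $C_p$, the ratio $C_p/(C_p + 2^N C)$ is below one, so $\tau$ stays below the unit $RC$ product regardless of the assumed values.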
Therefore, placing a switch on each unit capacitor does not incur any settling penalty.

Each 6-bit binary DAC employs 64 units [Fig. 14(a)] (and surrounds them by dummy cells). The routing of the data and reference lines to the switches is illustrated in Fig. 14(b). The complete symmetry minimizes deterministic mismatches. Shown in Fig. 14(c) are the binary groupings of the units. The area occupied by all of the DACs is 8.7% of the overall ADC area. The DAC has an input capacitance of 32.3 fF and a footprint of 8 $\mu$m $\times$ 24 $\mu$m. This yields a total ADC input capacitance of 130 fF, where 85% corresponds to the CDAC and 15% to top-plate routing, gate capacitance of the comparators, and drain-source capacitance of the sampling switch.

The transient currents drawn by the DACs from the analog input and the reference merit attention. The former occurs when the four DACs are in the sampling mode. This effect is quantified by the total input capacitance, 130 fF (including routing), and the 50-$\Omega$ source impedance, yielding a time constant of 6.5 ps and suggesting a fast recovery. Alternatively, a source follower consuming 1 or 2 mA can serve as a buffer here. The currents drawn from the reference must be studied by noting that, in the predictive architecture, only two DACs are active in a given SAR cycle. According to simulations, they pull a peak current of 8.75 mA, which is provided by an on-chip capacitance tied to the reference.

Fig. 12. Comparison among SAR speed-enhancement techniques for 6-bit operation with a 0.8-V supply. (a) Estimated conversion time (effect of metastability at very small inputs is neglected). (b) Estimated power consumption.

Fig. 13. DAC design. (a) Unit cell. (b) Switches placement.

Fig. 14. DAC Array. (a) 6-bit binary DAC. (b) Unit capacitance's routing. (c) Binary grouping.

Fig. 15. CDAC equivalent RC network.

#### F. Comparator Design

As shown in Fig. 16, the dynamic comparator used in this design corresponds to a StrongArm latch circuit [12] where $M_{8-11}$ serve as capacitors for offset correction. The design methodology is based on [13], where the widths of $M_{1,2}$ are set to obtain a maximum input-referred offset that can be corrected by the offset cancellation scheme. The input capacitance of the comparator is thus minimized, reducing its contribution to the CDAC gain error and decreasing any kickback noise generated by $M_7$.

The predictive SAR uses four different comparators to quantize a signal. However, only *one* comparator is activated per cycle, suggesting that the noise contribution of the comparator to the overall SNR is identical to that of a typical implementation. The simulated noise of the comparator is 2.1 mVrms; for a 6-bit implementation, it degrades the SNR by 0.8 dB. The circuit consumes 340 $\mu$W at a clock rate of approximately 10 GHz.

Fig. 16. StrongArm Comparator.

#### G. Effect of Nonidealities

The four comparators and DACs employed in the proposed architecture exhibit offset mismatches and gain mismatches, respectively. In this section, we address these issues.

1) Offset Mismatch: Consider the simplified diagram shown in Fig. 17, where $V_{\rm os1}$ and $V_{\rm os3}$ denote the actual offsets of Comp<sub>1</sub> and Comp<sub>3</sub>, respectively. If the DACs are ideal, the net input sensed by the comparators in two consecutive cycles suffers from an inconsistency of $V_{\rm os1}-V_{\rm os3}$. This translates to a differential nonlinearity of the same amount, degrading the signal-to-noise ratio. The StrongArm design in this work displays an offset of $\pm 30~{\rm mV} \equiv \pm 1.875~{\rm LSB}$, requiring calibration.

We implement both fine and coarse offset calibration. The former is based on attaching programmable capacitors to the comparator's internal nodes [Fig. 18(a)] [14], providing a range of 9 mV in steps of 1.5 mV. A wider range is desirable, but it would substantially reduce the speed and increase the power consumption of the circuit.
We therefore introduce another mechanism for coarse tuning by recognizing that the DAC contains an "idle" unit capacitor. Illustrated in Fig. 18(b), the idea is to switch the bottom plate of this unit between $V_{\rm REF}^+$ and $V_{\rm REF}^-$ so as to create a step equal to $\pm 0.5$ LSB at X. This method entails no penalty and greatly relaxes the design of the comparator.

The coarse correction provides a step of $\pm 0.5$ LSB, yielding a total range of $\pm 1.3$ LSB $\equiv \pm 20$ mV. In view of the $3\sigma$ offset of $\pm 30$ mV, this range suffices for 95% of cases and trades with the comparator's decision time, given that, to create a greater offset, extra capacitors are needed on the comparator's internal nodes [Fig. 18(a)]. For product design, one should target a finer correction.

The offset calibration routine operates as follows: For each comparator, we begin with the most negative built-in offset while setting the differential input to zero. Then, the comparator is clocked and its output is observed: if it is zero, the built-in offset increments by one unit and the test is repeated; if it is one, the built-in offset value is held and the calibration is completed.

Fig. 17. Offset mismatch between comparators.

Fig. 18. Comparator offset cancellation. (a) Fine tuning. (b) Coarse tuning.

2) Gain Mismatch: The parasitic capacitances at the DACs' outputs introduce a gain error, but it is their mismatch that leads to DNL. As a worst-case scenario, suppose DAC<sub>3</sub> and DAC<sub>4</sub> in Fig. 5(a) experience a top-plate parasitic equal to $C_p$ while DAC<sub>1</sub> and DAC<sub>2</sub> incur none. Consequently, the codes between $+V_{\rm REF}/2$ and $V_{\rm REF}$ as well as those between $-V_{\text{REF}}/2$ and $-V_{\text{REF}}$ suffer from gain error [Fig. 19(a)]. The DNL at code number $n$ is expressed as

$$DNL_n = \frac{W_n}{LSB} \frac{1 - \alpha}{\alpha}, \tag{9}$$

where $W_n$ is the width between transitions $n$ and $n+1$ and $\alpha = 1 + C_p/C_{\rm DAC}$.
The maximum deviation occurs at the last transition, yielding

$$DNL_n = \left(2^{N-1} - 1\right) \left(\frac{1 - \alpha}{\alpha}\right), \tag{10}$$

where $N$ is the resolution. For example, a DNL of 0.5 LSB with $N = 6$ translates to $C_p/C_{\rm DAC}=1.64\%$. Viewing this value as the maximum tolerable mismatch and hence equal to $\pm 3\sigma$, we estimate $\sigma=0.27\%$.

For a more rigorous analysis, we must resort to Monte Carlo simulations. Unfortunately, foundry design kits typically do not provide mismatch statistics for *parasitic* capacitances. Fortunately, the parasitics mostly arise from fringe components, whose mismatch can be predicted from the statistics of the fringe capacitors in the design kit. Using this approach, we arrive at the tolerable top-plate parasitic plots shown in Fig. 19(b).

Fig. 19. Gain mismatch. (a) Gain error due to top-plate parasitic mismatch. (b) Tolerable top-plate parasitic mismatch predicted by Monte Carlo simulations.

The foregoing results reveal two points. First, the simple model described above indeed offers a reasonable prediction. Second, according to Fig. 19(b), a 6-bit ADC demands $\Delta C_p/C_{\rm DAC}=0.3\%$, which is feasible without calibration.

#### H. Comparison With 2-Bit-per-Cycle SARs

The predictive SAR technique offers speed enhancement while employing multiple comparators and DACs, similar to a 2-bit-per-cycle topology. The main difference is that the predictive SAR architecture requires only *one* comparator decision and *two* DAC settling times per conversion cycle. A 2-bit-per-cycle loop, on the other hand, demands *three* comparator decisions and *three* DAC settling times per conversion cycle (Table I).

These observations point to two advantages of our proposed architecture with respect to 2-bit-per-cycle SARs. First, since we switch two DACs in a given cycle, rather than three, we allow a higher output impedance for the DAC reference generator. Second, the use of only one comparator reduces the loading presented to the logic in the clock path, providing sharper edges.

To resolve 6 bits, a 2-bit-per-cycle SAR takes three synchronous cycles, whereas ours requires six asynchronous cycles. However, we recognize that the former triggers nine comparators and the latter only six. In terms of DAC power efficiency, we note that the former switches the DAC nine times and the latter ten times.

#### I. Comments on Prior Art

We wish to emphasize key differences between the technique proposed in [16] and [17] and ours. The approach, as described in [15], helps to reduce the time between the comparator's decision time and the DAC settling. This is achieved by directly using the output of the comparator to select the tap of a resistive ladder that acts as a DAC [Fig. 20]. This method appears to null the effective logic delay but does not address the DAC settling time (through $M_{0,1}$ and $C_X$). Our technique, on the other hand, reduces the overall loop delay to $T_{\rm comp} + T_{\rm AND}$ [Eq. (3)].

Fig. 20. Look-ahead logic in [15].

Fig. 21. Sampling disturbance for overlapping clocks.

#### IV. INTERLEAVED ADC IMPLEMENTATION

For a linear speed-power trade-off, the 6-bit predictive SAR architecture can operate up to a sampling rate of 1.25 GS/s. To maintain such a trade-off at higher speeds, we resort to interleaving, while bearing in mind the overhead power consumed by clock generation and distribution.

Fig. 22. Clock and signal distribution. (a) Local clock generation. (b) Ring counter in each channel. (c) Latch realizations with set $(L_{1-2})$ and reset $(L_{3-8})$ inputs.

Fig. 23. Simulated propagation delay of input and clock transmission lines.

This work incorporates eight interleaved channels. The decision between overlapping or non-overlapping clocks entails two issues. First, the former raises the overall ADC input capacitance by as much as a factor of 4.
Second, and more importantly, in the presence of overlap, when one channel begins to sample, it creates a large disturbance at the input that takes some time to subside and may corrupt the value sampled by another channel. As depicted in Fig. 21, the voltage sampled at $t = t_b$ by $CK_1$ is affected by the $CK_2$ activity at $t_a$. For these reasons, we opt for non-overlapping clocks with a duty cycle of 12.5%.

TABLE II. POWER BREAKDOWN

| Block | Power |
|---|---|
| Bootstrap Switches + Comparators | 3.5 mW |
| SAR Logic + CDAC Switching Scheme | 6.1 mW |
| Clock Generation + Clock Buffers + Clock Selector | 8.0 mW |
| Total | 17.6 mW |

Fig. 24. Die photograph. (a) Interleaved ADC. (b) Single-channel SAR.

Fig. 25. Measured DNL and INL after calibration.

However, the distribution of clocks with a pulsewidth of 100 ps presents daunting challenges. The pulsewidth shrinks considerably as the waveform travels on long interconnects. Moreover, the clock phases experience large mismatches in such a distribution network. For example, a conventional H-tree structure inevitably imposes an asymmetric environment for the eight phases, thereby introducing deterministic mismatches. According to simulations of the extracted layout, if eight channels are arranged in two rows, each 300 $\mu$m long, an H-tree carrying four phases to each row would incur a mismatch of 5 ps between the first phase and the third phase.

To address these issues, we propose a clock and signal distribution scheme that employs two horizontal transmission lines to match the propagation delay of the clock and the input signal across the interleaved channels. Illustrated in Fig. 22(a), the idea is to distribute only the differential phases of the 5-GHz clock across the eight channels and produce the necessary phase (with a 12.5% duty cycle) within each channel. Simultaneously, the differential input signal travels in the same direction as the differential clock, so both experience the same delay as they reach each ADC channel.

Fig. 26. ADC output spectrum of a 1-V<sub>pp,dif</sub> sinusoid sampled at 10 GS/s, with an input frequency of (a) 153 MHz before offset calibration, (b) 153 MHz after offset calibration, and (c) 4.96 GHz (data downsampled by a factor of 625). At 4.96 GHz, downsampled tone frequencies are equal to $f_1 = f_{\rm in} - 310f_{\rm s}/625$, $f_2 = 621f_{\rm s}/625 - 2f_{\rm in}$, $f_3 = 931f_{\rm s}/625 - 3f_{\rm in}$, $f_4 = 4f_{\rm in} - 1241f_{\rm s}/625$, and $f_5 = 5f_{\rm in} - 1551f_{\rm s}/625$.

To obtain a 12.5% duty cycle, one can use a 10-GHz clock, a chain of $\div 2$, and logic. We instead opt for a 5-GHz clock so as to relax the global distribution issues and reduce the power consumption. This task is realized by a ring counter. To ensure that the counters begin in proper order, an external synchronization command, $S_{\rm ext}$, is applied at power-up. This action does entail a few issues that are described below.

Figure 22(b) depicts the ring counter implementation, which delivers outputs $X_1$-$X_8$ with a 25% duty cycle. An AND gate senses $X_j$ and CK to convert the duty cycle to 12.5%. The retiming action also removes the counter's jitter contribution, relaxing the design of the latches. In this work, the latches are realized by clocked CMOS (C<sup>2</sup>MOS) logic [Fig. 22(c)] with nearly-minimum-size transistors, so that each counter occupies a footprint of 25 $\mu$m $\times$ 6 $\mu$m. The set and reset inputs force a ONE or a ZERO at the latch output, respectively, while disabling the clock path.

We should now make three remarks. First, the synchronization of the counter resets in Fig. 22(a) requires that all ring counters be released from reset in the same clock cycle. For this reason, two flipflops in this path lower the metastability rate. Specifically, $FF_G$ in Fig.
22(a) deals with the long transition time of the global reset, $S_{\rm ext}$ , and synchronizes it with the 5-GHz clock. The output of this flipflop still experiences distortion as it travels to the channels. Thus, $FF_L$ in Fig. 22(a) locally retimes this command. Second, the power consumed by the eight counters at 5 GHz is equal to 1.7 mW, which is, in fact, *lower* than the conventional case of distributing eight 1.25-GHz clocks. The buffers necessary for driving eight lines would draw 2.2 mW. Thus, the proposed scheme does not incur a power penalty. Third, even though $V_{\rm in}$ and $CK_{\rm in}$ travel in the same direction in Fig. 22(a), their phase difference varies from 0.4 ps for Channel<sub>1</sub> to 0.9 ps for Channel<sub>8</sub> (Fig. 23). When the sampling switches are open, the clock and signal delays are equal as they travel through the array. This is possible because the clock buffer and the sampling switch have similar input capacitances. However, when one sampling switch is closed, the line sees a much larger capacitance, exhibiting a longer delay. The fact that this extra load is not distributed through the line means it cannot be guaranteed that each ADC channel will experience the same delay. Thus, a deterministic difference arises. This 0.5-ps error translates to an interleaving phase mismatch. We deal with this effect and random mismatch by inserting a variable-delay line after each counter. This 7-bit delay line provides a delay range of $\pm 5$ ps with a resolution of 100 fs. The residual random or systematic skew is therefore 100 fs. As mentioned above, one SAR channel's power-speed tradeoff remains linear up to about 1.25 GS/s, naturally leading to an interleaving factor of 8. Nonetheless, one may argue that a slower architecture along with 32x interleaving can be as competitive because both architectures have approximately the same total number of comparators and CDACs and similar input capacitance. 
However, increasing the number of channels to 32 inevitably entails a much more complex clock, signal, and reference distribution and is prone to larger mismatches. Moreover, in order to have the same efficiency point, the power consumption of the comparator must be reduced by 4$\times$, inevitably doubling the comparator input noise voltage and degrading the SNR.

#### V. EXPERIMENTAL RESULTS

The eight-channel interleaved ADC has been fabricated in TSMC's 28-nm CMOS technology. Figures 24(a) and 24(b) show the overall die and a closeup of one channel. The DAC reference voltages are shared among all of the channels without any buffers. Measured with a 0.8-V supply, the circuit blocks draw the power numbers listed in Table II at a sampling rate of 10 GS/s.

The results presented here were obtained after correcting the comparator's offset and interchannel mismatches. The former is carried out using an on-chip loop. Timing mismatches are removed by tuning the on-chip delay lines through the serial bus. Interchannel offset and gain mismatches are calibrated in the digital domain (off-chip). The estimated power consumption and area for the digital calibration are 2.1 mW and 0.011 mm<sup>2</sup>, respectively. This calculation is based on [18], where in our implementation, a 1.25-GHz core with 12-bit coefficients is sufficient for gain and mismatch correction.

Fig. 27. Measured SNDR and SFDR as a function of input frequency sampled at 10 GS/s.

Fig. 28. ADC performance up to a sampling rate of 16 GS/s and input frequency at Nyquist.

TABLE III PERFORMANCE SUMMARY AND COMPARISON WITH PRIOR ART

|                         | JSSC'13<br>[10] | ISSCC'14<br>[19] | ISSCC'15<br>[20] | VLSI'16<br>[21] | CICC'19<br>[22] | This<br>Work |
|-------------------------|-----------------|------------------|------------------|-----------------|-----------------|--------------|
| Topology                | SAR             | TI-SAR           | TI-SAR           | TI-Flash        | TI-TDC          | TI-SAR       |
| Resolution (bit)        | 8               | 6                | 6                | 6               | 7.3             | 6            |
| Number of Channels      | 1               | 8                | 32               | 2               | 2               | 8            |
| SNDR (dB) @ Low Freq.   | –               | –                | –                | 30.7            | 40.7            | 34.5         |
| SNDR (dB) @ Nyquist     | 39.3            | 33.8             | 30.3             | 29.4            | 32.5            | 31.2         |
| Sampling Rate (GS/s)    | 1.2             | 10               | 10               | 10.3            | 10              | 10           |
| Power (mW)              | 3.06            | 32.0             | 79.0             | 95.0            | 29.7            | 17.6\*\*     |
| FoM (fJ/cs) @ Low Freq. | –               | –                | –                | 330             | 33              | 41           |
| FoM (fJ/cs) @ Nyquist   | 34              | 81               | 395.6            | 382             | 86              | 59           |
| Technology (nm)         | 32              | 28\*             | 65               | 28              | 65              | 28           |
| Supply (V)              | 1               | 1                | 1                | 1               | 1               | 0.8          |
| Area (mm<sup>2</sup>)   | 0.0015          | 0.009            | 0.81             | –               | 0.14            | 0.14         |

\*28-nm UTBB FDSOI technology. \*\*19.7 mW including inter-channel offset and gain digital calibration.

$\text{FOM} = \frac{\text{Power}}{2^{\text{ENOB}} f_s}, \qquad \text{ENOB} = \frac{\text{SNDR} - 1.76}{6.02}$

TABLE IV COMPARISON WITH SINGLE-CHANNEL SAR CONVERTERS

|                       | JSSC'13<br>[10] | ISSCC'24<br>[23] | ISSCC'17<br>[24] | ToMTT'23<br>[25] | This<br>Work |
|-----------------------|-----------------|------------------|------------------|------------------|--------------|
| Resolution (bit)      | 8               | 8                | 7                | 7                | 6            |
| Technology (nm)       | 32              | 28               | 28               | 22               | 28           |
| Input Frequency (GHz) | 0.48            | 0.50             | 0.55             | 0.85             | 0.51         |
| Sampling Rate (GS/s)  | 1.20            | 1.20\*           | 1.20\*           | 1.70             | 1.25\*       |
| SNDR (dB)             | 39.3            | 44.3\*\*         | 40.0\*\*         | 41.0             | 33.4         |
| Power (mW)            | 3.06            | 1.93\*           | 2.50\*           | 1.38             | 2.20\*       |
| FoM (fJ/cs)           | 34.00           | 12.00            | 25.30            | 8.85             | 46.57        |
| Max. DNL (LSB)        | +0.79           | -0.31/+0.29      | +0.49            | -0.51/+0.42      | -0.29/+0.39  |
| Max. INL (LSB)        | +0.91           | -0.55/+0.83      | +0.57            | -0.27/+0.29      | -0.46/+0.38  |

\*Metric per channel. \*\*Estimated from plot.

Fig. 29. Comparison with recent converters. (a) Jitter aperture. (b) Schreier's figure of merit versus speed.

Unless otherwise stated, the sampling rate is 10 GS/s. Plotted in Fig. 25 are the measured DNL and INL profiles.
These profiles reveal peak values of +0.39/-0.29 LSB and +0.38/-0.46 LSB, respectively. The peak DNL and INL decrease by 0.29 LSB and 0.64 LSB, respectively, after correction. The SNDR rises by 3.11 dB.

The output spectra for input frequencies equal to 153 MHz and 4.96 GHz are shown in Fig. 26. The ADC achieves an SNDR of 34.5 dB at low frequencies and 31.2 dB at Nyquist. For the latter, we estimate that the quantization noise, thermal noise, the harmonics, and the clock jitter contribute 21.8%, 25.7%, 5.7%, and 46.8%, respectively, to the noise-plus-distortion power, i.e., the denominator of the SNDR. The corresponding SFDR values are 46.1 dB and 44.5 dB. The figure of merit (FOM) ranges from 41 fJ/cs at low frequencies to 59 fJ/cs at Nyquist.

Figure 27 plots the measured SNDR and SFDR as a function of the input frequency. We note that the SNDR remains above 30 dB and the SFDR above 42 dB across the entire first Nyquist zone. The higher SFDR in the vicinity of 2.5 GHz is attributed to the resonance of input bond wires and/or the sharper selectivity of the off-chip filters used to remove the higher harmonics of the RF signal generator.

While targeting a speed of 10 GS/s, the design can, in fact, reach sampling rates as high as 16 GS/s for $V_{DD} = 0.85$ V, albeit at the cost of greater power. Depicted in Fig. 28 are the SNDR and power plots for Nyquist-rate operation.

Table III compares the performance of our prototype to that of the prior art in the resolution range of 6 to 7 bits and at a sampling rate of 10 GS/s. Compared to the state of the art, our implementation achieves the same sampling rate while operating from only a 0.8-V supply, demonstrating the effectiveness of the predictive architecture. In fact, the speed can be further increased if the supply voltage is raised, but the converter then becomes less efficient (Fig. 28). A substantial part of the power is consumed by the SAR logic so as to ensure that $T_{\text{analog}} < T_{\text{digital}}$ (Table II). This can be improved by using techniques that relax the DAC settling, such as redundancy [26].
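As a quick cross-check of the reported figures, the following Python sketch applies the FOM and ENOB definitions given with Table III to the measured SNDR, power, and sampling rate, and converts the estimated error budget into an equivalent SNR for each source. The function and variable names are illustrative and do not come from the original work. The jitter estimate additionally assumes the standard jitter-limited SNR expression $-20\log_{10}(2\pi f_{\rm in}\sigma_t)$ and lumps random jitter together with any residual skew, so it should be read as a rough figure only.

```python
import math


def enob(sndr_db: float) -> float:
    """Effective number of bits, ENOB = (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def fom_fj_per_step(power_w: float, sndr_db: float, fs_hz: float) -> float:
    """FOM = Power / (2^ENOB * fs), returned in fJ per conversion step."""
    return power_w / (2.0 ** enob(sndr_db) * fs_hz) * 1e15


def per_source_snr_db(sndr_db: float, fraction: float) -> float:
    """SNR that a single error source would yield on its own, given the
    fraction of the total noise-plus-distortion power attributed to it."""
    return sndr_db - 10.0 * math.log10(fraction)


# Measured values quoted in the text and in Table III.
POWER_W = 17.6e-3
FS_HZ = 10e9
SNDR_LOW_DB = 34.5
SNDR_NYQ_DB = 31.2
F_IN_NYQ_HZ = 4.96e9

fom_low = fom_fj_per_step(POWER_W, SNDR_LOW_DB, FS_HZ)
fom_nyq = fom_fj_per_step(POWER_W, SNDR_NYQ_DB, FS_HZ)

# Estimated contributions to the SNDR denominator at Nyquist (from the text).
budget = {
    "quantization": 0.218,
    "thermal": 0.257,
    "harmonics": 0.057,
    "jitter": 0.468,
}
snr_by_source = {name: per_source_snr_db(SNDR_NYQ_DB, frac)
                 for name, frac in budget.items()}

# Assumption (not from the paper): SNR_jitter = -20*log10(2*pi*f_in*sigma_t).
sigma_jitter_s = (10.0 ** (-snr_by_source["jitter"] / 20.0)
                  / (2.0 * math.pi * F_IN_NYQ_HZ))

if __name__ == "__main__":
    print(f"FOM at low frequency: {fom_low:.1f} fJ/cs")
    print(f"FOM at Nyquist:       {fom_nyq:.1f} fJ/cs")
    for name, snr in snr_by_source.items():
        print(f"{name:>12s}-only SNR: {snr:.1f} dB")
    print(f"Implied rms timing error: {sigma_jitter_s * 1e15:.0f} fs")
```

Under these assumptions, the script returns roughly 41 fJ/cs and 59 fJ/cs, in line with the values quoted above, and the quantization-only SNR lands close to the ideal 6-bit value of $6.02 \times 6 + 1.76 \approx 37.9$ dB.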
Table IV compares recent single-channel SARs with sampling rates near 1.25 GS/s. It should be noted that our single-channel design faces tighter trade-offs as it targets use in a high-speed interleaved system. Moreover, the measured SNDR of 33.4 dB for a single channel is affected by the kickback noise of other channels. We also attribute part of the FoM advantage of [25] to its 22-nm FD-SOI technology. A comparison with other converters reported in recent conferences [27] is shown in Fig. 29.

#### VI. CONCLUSION

This work introduces a predictive SAR architecture that offers a speed improvement. In addition, we propose a method of clock and input signal distribution and generation that alleviates pulse shrinkage and phase mismatch issues. These concepts are realized in a 6-bit 10-GS/s ADC, achieving a FOM of 59 fJ/cs at Nyquist.

# ACKNOWLEDGMENT

The authors would like to thank the TSMC University Shuttle Program for chip fabrication.

# REFERENCES

- [1] Z. Cao, S. Yan, and Y. Li, "A 32 mW 1.25 GS/s 6b 2b/Step SAR ADC in 0.13 μm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 3, pp. 862–873, Mar. 2009.
- [2] T. Jiang, W. Liu, F. Y. Zhong, C. Zhong, K. Hu, and P. Y. Chiang, "A single-channel, 1.25-GS/s, 6-bit, 6.08-mW asynchronous successive-approximation ADC with improved feedback delay in 40-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 10, pp. 2444–2453, Oct. 2012.
- [3] C.-H. Chan, Y. Zhu, S.-W. Sin, S.-P. Ben, and R. P. Martins, "A 6 b 5 GS/s 4 interleaved 3 b/cycle SAR ADC," *IEEE J. Solid-State Circuits*, vol. 51, no. 2, pp. 365–377, Feb. 2016.
- [4] C.-H. Chan, Y. Zhu, W.-H. Zhang, and R. P. Martins, "A two-way interleaved 7-b 2.4-GS/s 1-then-2 b/cycle SAR ADC with background offset calibration," *IEEE J. Solid-State Circuits*, vol. 53, no. 3, pp. 850–860, Mar. 2018.
- [5] M. Jara and B. Razavi, "A 6-bit 10-GS/s 17.6-mW CMOS ADC with 0.8-V supply," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr. 2023, pp. 1–2.
- [6] S.-W.-M. Chen and R. W. Brodersen, "A 6-bit 600-MS/s 5.3-mW asynchronous ADC in 0.13-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2669–2680, Dec. 2006.
- [7] C.-C. Liu, S.-J. Chang, G.-Y. Huang, and Y.-Z. Lin, "A 10-bit 50-MS/s SAR ADC with a monotonic capacitor switching procedure," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 731–740, Apr. 2010.
- [8] H. Wei et al., "An 8-b 400-MS/s 2-B-per-cycle SAR ADC with resistive DAC," *IEEE J. Solid-State Circuits*, vol. 47, no. 11, pp. 2763–2772, Nov. 2012.
- [9] L. Chen, K. Ragab, X. Tang, J. Song, A. Sanyal, and N. Sun, "A 0.95-mW 6-b 700-MS/s single-channel loop-unrolled SAR ADC in 40nm CMOS," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 64, no. 3, pp. 244–248, Mar. 2017.
- [10] L. Kull et al., "A 3.1 mW 8b 1.2 GS/s single-channel asynchronous SAR ADC with alternate comparators for enhanced speed in 32 nm digital SOI CMOS," *IEEE J. Solid-State Circuits*, vol. 48, no. 12, pp. 3049–3058, Dec. 2013.
- [11] A. Verma and B. Razavi, "Frequency-based measurement of mismatches between small capacitors," in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2006, pp. 481–484.
- [12] B. Razavi, "The StrongARM latch [a circuit for all seasons]," *IEEE Solid-State Circuits Mag.*, vol. 7, no. 2, pp. 12–17, Jun. 2015.
- [13] B. Razavi, "The design of a comparator [the analog mind]," *IEEE Solid-State Circuits Mag.*, vol. 12, no. 4, pp. 8–14, Nov. 2020.
- [14] M.-J. E. Lee, W. J. Dally, and P. Chiang, "Low-power area-efficient high-speed I/O circuit techniques," *IEEE J. Solid-State Circuits*, vol. 35, no. 11, pp. 1591–1599, Nov. 2000.
- [15] S. Louwsma, A. van Tuijl, and B. Nauta, *Time-interleaved Analog-to-Digital Converters*. Berlin, Germany: Springer, Oct. 2010, doi: 10.1007/978-90-481-9716-3.
- [16] S. M. Louwsma, A. J. M. van Tuijl, M. Vertregt, and B. Nauta, "A 1.35 GS/s, 10 b, 175 mW time-interleaved AD converter in 0.13 μm CMOS," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 778–786, Apr. 2008.
- [17] F. Kuttner, "A 1.2 V 10b 20MSample/s non-binary successive approximation ADC in 0.13 μm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, vol. 1, Feb. 2002, pp. 176–177.
- [18] L. Ricci et al., "A 2-GS/s time-interleaved ADC with embedded background calibrations and a novel reference buffer for reduced inter-channel crosstalk," *IEEE J. Solid-State Circuits*, early access, Aug. 12, 2024, doi: 10.1109/JSSC.2024.3437168.
- [19] S. Le Tual, P. Narayan Singh, C. Curis, and P. Dautriche, "A 20GHz-BW 6b 10GS/s 32mW time-interleaved SAR ADC with master T&H in 28 nm UTBB FDSOI technology," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 382–383.
- [20] A. Shafik, E. Z. Tabasy, S. Cai, K. Lee, S. Hoyos, and S. Palermo, "A 10Gb/s hybrid ADC-based receiver with embedded 3-tap analog FFE and dynamically-enabled digital equalization in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2015, pp. 1–3.
- [21] B. Raghavan et al., "A 125 mW 8.5–11.5 Gb/s serial link transceiver with a dual path 6-bit ADC/5-tap DFE receiver and a 4-tap FFE transmitter in 28 nm CMOS," in *Proc. IEEE Symp. VLSI Circuits (VLSI-Circuits)*, Jun. 2016, pp. 1–2.
- [22] M. Hassanpourghadi and M. S. Chen, "A 2-way 7.3-bit 10 GS/s time-based folding ADC with passive pulse-shrinking cells," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr. 2019, pp. 1–4.
- [23] Y. Tao, M. Gu, B. Chi, Y. Zhong, L. Jie, and N. Sun, "22.4 a 4.8GS/s 7-ENoB time-interleaved SAR ADC with dither-based background timing-skew calibration and bit-distribution-based background pingpong comparator offset calibration," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 67, Feb. 2024, pp. 394–396.
- [24] C.-H. Chan, Y. Zhu, I.-M. Ho, W.-H. Zhang, U. Seng-Pan, and R. P. Martins, "16.4 a 5 mW 7b 2.4GS/s 1-then-2b/cycle SAR ADC with background offset calibration," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 282–283.
- [25] S. Buhr, C. D. Matthus, M. M. Khafaji, and F. Ellinger, "A 1.38-mW 7-bit 1.7-GS/s single-channel loop-unrolled SAR ADC in 22-nm FD-SOI with 8.85 fJ/Conv.-step for GHz mobile communication and radar systems," *IEEE Trans. Microw. Theory Techn.*, vol. 71, no. 9, pp. 3841–3851, Sep. 2023.
- [26] T. Ogawa, H. Kobayashi, M. Hotta, Y. Takahashi, H. San, and N. Takai, "SAR ADC algorithm with redundancy," in *Proc. IEEE Asia–Pacific Conf. Circuits Syst. (APCCAS)*, Nov. 2008, pp. 268–271.
- [27] B. Murmann. ADC Performance Survey 1997–2023. Accessed: Apr. 2024. [Online]. Available: https://github.com/bmurmann/ADC-survey

**Matias Jara** (Member, IEEE) received the B.S. and M.S. degrees from the Pontificia Universidad Catolica de Chile, Santiago, Chile, in 2013 and 2016, respectively, and the Ph.D. degree from the University of California at Los Angeles (UCLA), Los Angeles, CA, USA, in 2022, all in electrical engineering. He has been with Broadcom Inc., Irvine, CA, since 2022. His research interests include data converters and analog and mixed-signal integrated circuits for wireline transceivers. He received the top prize of the 2022 SSCS Student Circuit Contest.

**Behzad Razavi** (Fellow, IEEE) received the B.S.E.E. degree from the Sharif University of Technology in 1985 and the M.S.E.E. and Ph.D.E.E. degrees from Stanford University, in 1988 and 1992, respectively. He was with AT&T Bell Laboratories and Hewlett-Packard Laboratories until 1996. Since 1996, he has been an Associate Professor and subsequently a Professor of electrical engineering with the University of California at Los Angeles. He was an Adjunct Professor with Princeton University (1992–1994) and Stanford University (1995).
He is the author of *Principles of Data Conversion System Design* (IEEE Press, 1995), *RF Microelectronics* (Prentice Hall, 1998, 2012) (translated to Chinese, Japanese, and Korean), *Design of Analog CMOS Integrated Circuits* (McGraw-Hill, 2001 and 2016) (translated to Chinese, Japanese, and Korean), *Design of Integrated Circuits for Optical Communications* (McGraw-Hill, 2003, and Wiley, 2012), *Design of CMOS Phase-Locked Loops* (Cambridge University Press, 2020), and *Fundamentals of Microelectronics* (Wiley, 2006, 2014, and 2021) (translated to Korean, Portuguese, and Turkish); and an Editor of *Monolithic Phase-Locked Loops and Clock Recovery Circuits* (IEEE Press, 1996) and *Phase-Locking in High-Performance Systems* (IEEE Press, 2003). His current research interests include wireless and wireline transceivers and data converters. Prof. Razavi is a member of the U.S. National Academy of Engineering and a fellow of the U.S. National Academy of Inventors. He received the Beatrice Winner Award for Editorial Excellence at the 1994 ISSCC, the Best Paper Award at the 1994 European Solid-State Circuits Conference, the Best Panel Award at the 1995 and 1997 ISSCC, the TRW Innovative Teaching Award in 1997, the Best Paper Award at the IEEE Custom Integrated Circuits Conference in 1998, the McGraw-Hill First Edition of the Year Award in 2001, the Lockheed Martin Excellence in Teaching Award in 2006, the UCLA Faculty Senate Teaching Award in 2007, the CICC Best Invited Paper Award in 2009 and 2012, the 2012 Donald Pederson Award in Solid-State Circuits, the American Society for Engineering Education PSW Teaching Award in 2014, and the 2017 IEEE CAS John Choma Education Award. He was a co-recipient of the Jack Kilby Outstanding Student Paper Award, the Beatrice Winner Award for Editorial Excellence at the 2001 ISSCC, the 2012 and the 2015 VLSI Circuits Symposium Best Student Paper Awards, and the 2013 CICC Best Paper Award. 
He was also recognized as one of the top ten authors in the 50-year history of ISSCC. He served on the Technical Program Committees of the International Solid-State Circuits Conference (ISSCC) (1993-2002) and VLSI Circuits Symposium (1998-2002). He has served as the Guest Editor and an Associate Editor for IEEE JOURNAL OF SOLID-STATE CIRCUITS, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, and International Journal of High Speed Electronics. He served as the Founding Editor-in-Chief for IEEE SOLID-STATE CIRCUITS LETTERS. He has served as an IEEE Distinguished Lecturer.