

THE ANALOG MIND

# The Design of a Comparator

Nyquist-rate and oversampling analog-to-digital converters (ADCs) incorporate comparators to perform quantization and possibly sampling. Comparators thus have a significant impact on the speed and precision of ADCs. This article presents the step-by-step design of a comparator and the discovery of its various trade-offs.

# **General Considerations**

A comparator senses a differential input and generates a logical output according to the polarity of the input difference. In an ADC environment, we are interested in the following comparator design parameters: input offset, speed, power consumption, metastability, kickback noise, and input-referred electronic noise. The design begins with the selection of target values for some of these parameters. Here, we aim for an input offset lower than 5 mV; a clock rate,  $f_{CK}$ , of 5 GHz; and a power consumption of 1 mW. After the design meets these requirements, we examine the remaining parameters and decide whether they are adequate.

For this article, we selected the StrongArm latch as the comparator core. Readers are referred to [1]–[4] for the properties and operation details of the circuit. Shown in Figure 1, this topology offers several desirable attributes: it requires a single clock phase; draws no static power; exhibits an input offset that arises primarily from the input pair, $M_1$ and $M_2$; and delivers rail-to-rail output swings [4].

Digital Object Identifier 10.1109/MSSC.2020.3021865 Date of current version: 18 November 2020

A brief overview of the StrongArm latch's operation proves helpful here. As explained in [4], the circuit of Figure 1 begins by precharging nodes *P*, *Q*, *X*, and *Y* to $V_{DD}$. We denote the capacitances at these nodes by $C_P$, $C_Q$, $C_X$, and $C_Y$, respectively, and assume that $C_P = C_Q$ and $C_X = C_Y$. When CK goes high, $M_1$ and $M_2$ act as a differential pair with capacitive loads, and $V_P$ and $V_Q$ fall from $V_{DD}$ while yielding a differential component proportional to $V_{in1} - V_{in2}$. This mode continues until $V_P$ and $V_Q$ drop to roughly $V_{DD} - V_{TH3,4}$, creating a voltage gain approximately equal to $2g_{m1,2}V_{TH3,4}/I_{SS}$, where $g_{m1,2}$ denotes the transconductance of $M_1$ and $M_2$, and $I_{SS}$ is the tail current [3]. At the end of this mode, $M_3$ and $M_4$ turn on, causing $V_X$ and $V_Y$ to fall until $M_5$ and $M_6$ are activated. One output is then pulled back to $V_{DD}$ by $M_5$ or $M_6$ while the other falls to zero. As examined in [4], the role of $M_3$ and $M_4$ is to cut the current path from $V_{DD}$ to ground after the comparator has made a decision. The circuit's power consumption in the signal path is given by $2f_{CK}C_PV_{DD}^2 + f_{CK}C_XV_{DD}^2$ [4]. Additionally, the clock path draws a power of $f_{CK}C_{CK}V_{DD}^2$, where $C_{CK}$ is the sum of the gate capacitances of $M_7$ and the four PMOS switches, $S_1$–$S_4$.
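The two power expressions above can be evaluated with a short script. This is a minimal sketch: the function name and all three capacitance values are illustrative assumptions, not figures from the article; only the 5-GHz clock rate and the 0.95-V worst-case supply are taken from the text.

```python
def strongarm_power(f_ck, c_p, c_x, c_ck, v_dd):
    """Return (signal_path_W, clock_path_W) using the expressions quoted in the text."""
    signal = 2 * f_ck * c_p * v_dd**2 + f_ck * c_x * v_dd**2
    clock = f_ck * c_ck * v_dd**2
    return signal, clock


F_CK = 5e9     # clock rate targeted in the article, Hz
V_DD = 0.95    # worst-case supply used in the article, V
C_P = 5e-15    # ASSUMED capacitance at P (and Q), F
C_X = 10e-15   # ASSUMED capacitance at X (and Y), F
C_CK = 5e-15   # ASSUMED total gate capacitance of M7 and S1-S4, F

signal_w, clock_w = strongarm_power(F_CK, C_P, C_X, C_CK, V_DD)
print(f"signal path: {signal_w * 1e6:.1f} uW")
print(f"clock path:  {clock_w * 1e6:.1f} uW")
```

Because both terms scale linearly with $f_{CK}$ and with the node capacitances, the sketch is mainly useful for seeing how much of the power budget a given change in device width (and hence capacitance) would consume.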

The precharge action in the StrongArm latch offers two benefits. First, it enables $V_P$ and $V_Q$ in Figure 1 to begin from $V_{DD}$, thus keeping $M_1$ and $M_2$ in saturation for some time. This allows the input transistors to provide gain. Second, after each comparison, the four internal nodes recover from the states developed on them and are "equalized." This ensures that the states in one clock cycle are not inherited by the next, suppressing "dynamic" offsets. As depicted in Figure 2, if, at the end of the precharge mode, $V_P$ and $V_Q$ do not become exactly equal and bear a difference of $\Delta V$, the subsequent amplification mode begins with such a difference stored on $C_P$ and $C_Q$, suffering from offset.



FIGURE 1: The StrongArm latch and its waveforms.

Most of our design effort is expended on selecting the transistor dimensions in Figure 1. We generally begin with near-minimum dimensions unless there is a compelling reason not to do so. Also, our simulations are performed under worst-case process, supply voltage, and temperature (PVT) conditions because the circuit must eventually operate satisfactorily in such a corner. In this spirit, we select the slow-slow corner, $V_{DD} = 1 \text{ V} - 5\% = 0.95 \text{ V}$, and $T = 75^{\circ}\text{C}$. We also assume for the clock a 50% duty cycle and 10-ps rise and fall times. The comparator is designed using 28-nm CMOS technology.

#### **Choice of Device Dimensions**

Comparator design begins with selecting the transistor dimensions so as to meet the offset requirement. In our case, the pairs $M_1$ and $M_2$, $M_3$ and $M_4$, and $M_5$ and $M_6$ in Figure 1 appear in the signal path and must be crafted first. Let us consider $M_1$ and $M_2$ and write their threshold voltage mismatch as

$$\Delta V_{\rm TH1,2} = \frac{A_{\rm VTH}}{\sqrt{(WL)_{1,2}}},\tag{1}$$

where $A_{\rm VTH}$ is a constant [5], roughly 2.2 mV·$\mu$m in 28-nm technology. If we choose $W_{1,2} = 10 \,\mu$m and an effective length of 25 nm, then $\Delta V_{\rm TH1,2} = 4.4$ mV. This appears to be a reasonable starting point provided that the other pairs' contributions do not raise the offset beyond the 5-mV target.

We should remark that (1) gives the standard deviation,  $\sigma$ , of the mismatch; i.e., approximately 68% of the differential pairs in a Gaussian distribution exhibit offsets less than this amount. In practice, we seek higher yields and must either enlarge the transistors or incorporate offset cancellation.
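The arithmetic behind (1) and the 68% remark can be checked numerically. The following is a minimal sketch that uses only the values quoted in the text ($A_{\rm VTH} = 2.2$ mV·$\mu$m, an effective length of 25 nm); the function names are illustrative, and the yield estimate assumes a zero-mean Gaussian offset arising from the input pair alone.

```python
import math

A_VTH_MV_UM = 2.2   # Pelgrom coefficient quoted in the text, mV*um
L_EFF_UM = 0.025    # effective channel length quoted in the text, um


def sigma_vth_mv(width_um, length_um=L_EFF_UM, a_vth=A_VTH_MV_UM):
    """Standard deviation of a pair's threshold mismatch in mV, from (1)."""
    return a_vth / math.sqrt(width_um * length_um)


def fraction_within(limit_mv, sigma_mv):
    """Fraction of a zero-mean Gaussian population with |offset| < limit_mv."""
    return math.erf(limit_mv / (sigma_mv * math.sqrt(2)))


sigma = sigma_vth_mv(10.0)
print(f"sigma for W = 10 um: {sigma:.2f} mV")
print(f"fraction within 1 sigma: {fraction_within(sigma, sigma):.3f}")
print(f"fraction within 5 mV:    {fraction_within(5.0, sigma):.3f}")
```

Calling `fraction_within` with a multiple of `sigma` shows how much the devices must grow, or how much offset cancellation is needed, to reach a given yield.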

The tail transistor  $M_7$  in Figure 1 must draw sufficient current with  $V_{\rm GS7} = V_{\rm DD}$  and  $V_{\rm DS7} = V_{\rm in,CM} - V_{\rm GS1,2}$ , where  $V_{\rm in,CM}$  denotes the input common-mode (CM) level. With  $V_{\rm in,CM} = 0.5$  V and  $V_{GS1,2} \approx 0.35$  V, we have  $V_{DS7} \approx 0.15$  V. The device thus operates in the deep triode region. Let us select  $W_7 = 2 \mu m$  for a current of roughly 0.5 mA.

Given that the circuit provides gain before $M_3$ and $M_4$ turn on, we expect that the offset of this pair is reduced when referred to the main input. The reduction factor is, in fact, greater than the value of $2g_{m1,2}V_{TH3,4}/I_{SS}$ mentioned previously. To understand why, suppose $M_3$ and $M_4$ are on (Figure 3) and neglect the capacitances at nodes *P* and *Q*. Thus, $I_{D1}$ and $I_{D2}$ entirely flow through $M_3$ and $M_4$, respectively, as if these transistors were absent. The offset contributed by this pair is therefore negligible unless the circuit's capacitances are taken into account. As discussed later in this section, the threshold mismatch between $M_3$ and $M_4$ is divided by a factor of 3–5 in typical designs. We select $W_{3,4} = 10 \,\mu\text{m}$ for now, expecting that this choice only slightly raises the input offset.

The PMOS cross-coupled pair in Figure 1 turns on after $V_X$ and $V_Y$ fall by one PMOS threshold. Before this time, the circuit provides a high voltage gain, thereby reducing this pair's offset contribution considerably. In this respect, we surmise that a width of a few microns suffices for $M_5$ and $M_6$, but we must bear in mind that these devices also amplify regeneratively and play a role in the comparator's speed. We return to this point when we optimize the design.

The reset switches $S_1$–$S_4$ in Figure 1 must pull their drain nodes to $V_{\text{DD}}$ in under 100 ps. We predict that a width of 0.5–1 $\mu$m can meet this constraint.

# **Basic Waveforms**

Based on our foregoing thoughts, we construct the comparator shown in Figure 4 and simulate it in the time domain. The output inverters act as buffers and employ relatively small transistors for now. Before optimizing the design, we familiarize ourselves with the circuit's waveforms.



FIGURE 2: An example of dynamic offset.



**FIGURE 3:** The effect of the mismatch between  $M_3$  and  $M_4$  in the absence of capacitances at *P* and *Q*.



FIGURE 4: The initial design of the comparator core.



FIGURE 5: The comparator's (a) voltage waveforms and (b) tail current waveform.

The speed of the comparator depends on the input voltage difference, ultimately requiring a metastability analysis (as explained later). It is common in ADC design to select this difference to be half of the least-significant bit, which, in view of our tolerable offset, would be 10–20 mV for this design. Here, however, we apply a difference of 1 mV so as to place the circuit in "slow motion" and examine its operation details. Plotted in Figure 5(a) are the voltages at nodes *P*, *Q*, *X*, and *Y*. The clock rises from zero to $V_{\text{DD}}$ between t = 300 and t = 310 ps. Note that $V_X$ and $V_Y$ experience a CM drop of approximately 400 mV before they begin to depart as a result of the regeneration provided by $M_5$ and $M_6$. We also observe from Figure 5(b) that the tail current reaches a peak of roughly 800 $\mu$A before $V_P$ and $V_Q$ drop enough to drive the input transistors into the triode region and cause the tail node voltage to collapse.

Figure 6 plots $V_P - V_Q$ for $V_{in1} - V_{in2} = 1 \text{ mV}$. Two observations prove important here. First, as shown in the inset, $V_P - V_Q$ reaches -3.15 mV at t = 315 ps, the greatest difference before $M_3$ and $M_4$ turn on. That is, the initial voltage gain is equal to 3.15. Second, $V_P - V_Q$ is less than 100 $\mu$V at t = 500 ps, i.e., before the next clock cycle. Thus, the precharge devices are strong enough, and the dynamic offset is negligible.

**FIGURE 6:** The difference between $V_P$ and $V_Q$ as a function of time.

For design optimization, we need a metric for the circuit's speed. For example, we can find the time it takes for  $|V_X - V_Y|$  to reach a certain amount, say, 200 mV. This time is measured with respect to when the clock's rising edge crosses  $V_{DD}/2$ , and is equal to 36 ps in Figure 5(a).
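The speed metric described above can be extracted from sampled waveforms as follows. This is a minimal sketch, not the article's measurement script: the function names are illustrative, the waveform samples are synthetic placeholders rather than simulation data, and crossings are located by linear interpolation between samples.

```python
def first_crossing(times, values, threshold):
    """Time at which `values` first rises through `threshold` (linear interpolation)."""
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        if v0 < threshold <= v1:
            frac = (threshold - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("threshold never crossed")


def decision_delay(times, v_ck, v_x, v_y, v_dd, v_target=0.2):
    """Delay from CK crossing V_DD/2 to |V_X - V_Y| reaching v_target."""
    t_clock = first_crossing(times, v_ck, v_dd / 2)
    v_diff = [abs(x - y) for x, y in zip(v_x, v_y)]
    return first_crossing(times, v_diff, v_target) - t_clock


# Synthetic samples (ps, V) for illustration only; not the data of Figure 5(a).
t_ps = [290, 300, 310, 320, 330, 340, 350]
ck = [0.0, 0.0, 0.95, 0.95, 0.95, 0.95, 0.95]
vx = [0.95, 0.95, 0.95, 0.90, 0.80, 0.70, 0.40]
vy = [0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95]

delay_ps = decision_delay(t_ps, ck, vx, vy, v_dd=0.95)
print(f"decision delay: {delay_ps:.1f} ps")
```

Referencing the delay to the clock's $V_{DD}/2$ crossing, as the text does, makes the metric independent of where the clock edge happens to sit in the simulation window.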

#### **Offset and Speed Optimization**

For the design in Figure 4, we must quantify the input offset contributed by both the $M_3$ and $M_4$ pair and the $M_5$ and $M_6$ pair. To this end, we place a voltage source equal to $\Delta V_{\text{TH3,4}} = 4.4 \text{ mV}$ in series with the gate of $M_3$ while the other pairs remain matched [Figure 7(a)]. We then adjust the input voltage difference so that the circuit is nearly balanced and $V_X - V_Y$ tends to stay near zero for a relatively long time. With some iteration, we find that $V_{\rm in1} - V_{\rm in2} \approx 1.15 \text{ mV}$ leads to such a behavior [see Figure 7(b)]. This suggests that the offset of $M_3$ and $M_4$ is divided by a factor of 4.4/1.15 = 3.8 when referred to the input. The offset standard deviation arising from both the $M_1$ and $M_2$ pair and the $M_3$ and $M_4$ pair is thus given by $\sqrt{(4.4 \text{ mV})^2 + (1.15 \text{ mV})^2} = 4.5 \text{ mV}$.

Given the small offset contribution of  $M_3$  and  $M_4$ , we ask whether their widths can be reduced so as to increase the speed. Indeed, if  $W_{3,4} = 5 \mu m$ , then  $|V_X - V_Y|$  reaches 200 mV in 28 ps. A test similar to that in Figure 7(a) with  $\Delta V_{\text{TH}3,4} = A_{\text{VTH}}/\sqrt{WL} = 6.2 \text{ mV}$  indicates an input contribution of 1.5 mV. That is, the offset rises from 4.5 mV to  $\sqrt{(4.4 \text{ mV})^2 + (1.5 \text{ mV})^2} = 4.6 \text{ mV}$ . The small increase in the offset makes  $W_{3,4} = 5 \ \mu \text{m}$  a more favorable choice.

In the next step of optimization, we turn to $M_5$ and $M_6$ in Figure 4 and quantify their offset contribution. With $W_{5,6} = 2.5 \,\mu\text{m}$, we have $\Delta V_{\text{TH5,6}} = 8.8 \text{ mV}$. Inserting this voltage in series with the gate of $M_5$ or $M_6$ and repeating the procedure of Figure 7(a), we arrive at an input contribution equal to 0.9 mV. The total input offset is $\sqrt{(4.4 \text{ mV})^2 + (1.5 \text{ mV})^2 + (0.9 \text{ mV})^2} = 4.7$ mV. If we double the widths of $M_5$ and $M_6$, their input contribution is still roughly 0.9 mV because their larger capacitances lower the voltage gain developed by the circuit before this pair turns on. We therefore retain $W_{5,6} = 2.5 \,\mu \text{m}$. The offset calculations can be verified through the use of Monte Carlo simulations that incorporate the foundry's mismatch models.
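The offset budget above is a root-sum-square of independent contributions. The sketch below reproduces the arithmetic from the input-referred values quoted in the text; the function names are illustrative, and the calculation assumes the three pairs' mismatches are uncorrelated.

```python
import math


def input_referred_offset_mv(contributions_mv):
    """Root-sum-square of independent, input-referred offset contributions (mV)."""
    return math.sqrt(sum(c * c for c in contributions_mv))


def referral_factor(delta_vth_mv, input_equivalent_mv):
    """Factor by which a pair's threshold mismatch is divided when referred to the input."""
    return delta_vth_mv / input_equivalent_mv


# Input-referred contributions quoted in the text, in mV.
M1_M2 = 4.4   # W_1,2 = 10 um
M3_M4 = 1.5   # W_3,4 = 5 um
M5_M6 = 0.9   # W_5,6 = 2.5 um

total = input_referred_offset_mv([M1_M2, M3_M4, M5_M6])
print(f"total input offset: {total:.1f} mV")
print(f"M3/M4 referral factor at W_3,4 = 10 um: {referral_factor(4.4, 1.15):.1f}")
print(f"M5/M6 referral factor at W_5,6 = 2.5 um: {referral_factor(8.8, 0.9):.1f}")
```

Because the terms add in quadrature, the input pair dominates: halving the $M_3$ and $M_4$ contribution would change the total by only about 0.2 mV.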

It is possible to increase the comparator's speed by raising the tail current, i.e., by widening $M_7$ in Figure 4. Plotted in Figure 8 are $V_X$ and $V_Y$ for $W_7 = 2\ \mu$m and 4 $\mu$m, respectively. The time necessary for $|V_X - V_Y|$ to reach 200 mV drops from 28 to 22 ps. Interestingly, the power consumed in the signal path, $2f_{CK}C_PV_{DD}^2 + f_{CK}C_XV_{DD}^2$, remains fairly constant, but the clock path draws greater power. As explained later, a wider $M_7$ translates to higher kickback noise.

#### Addition of the Reset-Set Latch

In the precharge mode, the StrongArm comparator's decision is erased, and the outputs do not represent a valid logical level, potentially confusing the following stages. To resolve this issue, we insert a reset-set (RS) latch in the output path. As illustrated in Figure 9, the RS latch can change its state only if $M_{11}$ or $M_{12}$ turns on, i.e., when $V_X$ or $V_Y$ falls to zero. This latch then retains the state as the StrongArm circuit enters the precharge mode.



**FIGURE 7:** (a) The inclusion of the mismatch between  $M_3$  and  $M_4$  and (b) the output waveforms when the circuit is nearly balanced.



**FIGURE 8:** The comparator output waveforms for $W_7 = 2\ \mu$m and 4 $\mu$m, respectively.







FIGURE 10: The RS latch outputs for the original and modified designs.



FIGURE 11: (a) Comparator metastability and (b) the behavior of PMOS transistors during this period.

The RS latch's delay proves critical in some ADC architectures and must be minimized. Since the latch operates by pulling one output low, we surmise that  $M_{11}$  and  $M_{12}$  must be relatively wide. Moreover, if we increase the width of the inverters' PMOS devices, their outputs rise with less delay. We then change  $W_{11,12}$  to 800 nm and also the inverters' PMOS widths to 800 nm. As shown in Figure 10, the total delay now drops by 6 ps. The overall comparator circuit draws approximately 0.2 mW at 5 GHz.

#### Metastability

In an ADC environment, a comparator senses a random signal at the moment it is clocked. Thus, its input difference can be arbitrarily small. For example, if the signal has a peak-to-peak swing of $2A_0$ with a uniform distribution, then the probability that the difference (positive or negative) presented to the comparator is less than $\Delta V$ in magnitude is equal to $2\Delta V/(2A_0) = \Delta V/A_0$.

Upon sensing a small difference, a comparator takes some time to generate a well-defined logical output. If it cannot do so in half of the clock period, we say the circuit is "metastable" [Figure 11(a)]. During metastability, the indefinite nature of the comparator outputs can propagate to the subsequent logic, introducing large errors. This issue becomes particularly serious in digital communication systems where the error rate must be extremely low, e.g., $<10^{-14}$.

**FIGURE 12:** A shift of regeneration response for different initial conditions.

In the StrongArm latch of Figure 4,  $M_5$  and  $M_6$  serve as the primary amplifying circuit during a metastable state. From the model shown in Figure 11(b), we can prove that the positive feedback around the loop is characterized by

$$V_{XY} = V_{XY0} \exp \frac{t}{\tau_{\rm reg}},\tag{2}$$

where $V_{XY0}$ denotes the initial value, and the regeneration time constant, $\tau_{\rm reg}$, is given by $C_X/g_{m5,6}$. Note that this expression is valid only after the initial fall of $V_X$ and $V_Y$. If $V_{XY}$ is not large enough after $T_{\rm CK}/2$ seconds to write a well-defined state onto the RS latch, an error can occur. Denoting the minimum acceptable value of $V_{XY}$ by $V_1$, we require that

$$V_{XY0} \ge V_1 \exp \frac{-T_{\rm CK}/2}{\tau_{\rm reg}}.\tag{3}$$

Input differences that yield a $V_{XY0}$ less than this value cause a metastability error. The error rate is therefore proportional to $\exp[-T_{\rm CK}/(2\tau_{\rm reg})]$, underscoring the high impact of $\tau_{\rm reg}$.

We can compute $\tau_{\rm reg}$ by measuring $g_{m5,6}$ and $C_X$ in Figure 4. Alternatively, we can obtain $\tau_{\rm reg}$ directly from simulations. Consider the two regeneration scenarios depicted in Figure 12, where $V_{XY0}$ is chosen equal to some amount, $\Delta V$, or a smaller amount, $\Delta V/\alpha$. We denote $V_X - V_Y$ for the two cases by $V_{XY1}$ and $V_{XY2}$, respectively, and write from (2)

$$V_{XY2} = \frac{\Delta V}{\alpha} \exp \frac{t}{\tau_{\rm reg}}\tag{4}$$

$$=\Delta V \exp\left(-\ln\alpha\right) \exp\frac{t}{\tau_{\rm reg}}\tag{5}$$

$$=\Delta V \exp \frac{t - \tau_{\rm reg} \ln \alpha}{\tau_{\rm reg}}.\tag{6}$$

Interestingly, $V_{XY2}$ is simply equal to $V_{XY1}$ but shifted in time by $\tau_{\rm reg} \ln \alpha$. For the design in Figure 4, we select input differences of 1 mV, 100 $\mu$V, and 10 $\mu$V, thus arriving at the waveforms shown in Figure 13. The time shift for each tenfold reduction is 5.7 ps and implies that $\tau_{\rm reg} = (5.7\ \text{ps})/\ln 10 \approx 2.5$ ps.
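The extraction of $\tau_{\rm reg}$ from the time shift can be sketched in a few lines. The function names here are illustrative; the 5.7-ps shift and the factor of 10 come from the text, and the final check simply confirms numerically that (4) and (6) describe the same waveform.

```python
import math


def regenerate(v_xy0, t, tau_reg):
    """Equation (2): exponential growth of V_XY from the initial value v_xy0."""
    return v_xy0 * math.exp(t / tau_reg)


def tau_reg_from_shift(time_shift, alpha):
    """Time constant implied by (6), where time_shift = tau_reg * ln(alpha)."""
    if alpha <= 1:
        raise ValueError("alpha must exceed 1")
    return time_shift / math.log(alpha)


tau_ps = tau_reg_from_shift(5.7, 10.0)
print(f"tau_reg = {tau_ps:.2f} ps")

# Numerical check that a smaller initial condition is a pure time shift.
dv, alpha, t_ps = 1e-3, 10.0, 20.0
scaled = regenerate(dv / alpha, t_ps, tau_ps)
shifted = regenerate(dv, t_ps - tau_ps * math.log(alpha), tau_ps)
print(math.isclose(scaled, shifted, rel_tol=1e-9))
```

The same function applies to any ratio of input differences, so a single pair of simulations with a known `alpha` is enough to estimate the time constant.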

Can we reduce $\tau_{\rm reg}$ by adjusting the widths of $M_5$ and $M_6$ in Figure 4? If $W_{5,6}$ is doubled, these two devices' capacitances double, but their transconductance rises by roughly a factor of $\sqrt{2}$. That is, $\tau_{\rm reg}$ decreases only if $M_5$ and $M_6$ do not dominate the capacitance at *X* and *Y*. In our design, changing $W_{5,6}$ from 2.5 to 5 $\mu$m increases $\tau_{\rm reg}$ slightly.
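The width trade-off can be illustrated with a simple scaling model. This is a sketch under stated assumptions, not data from the design: it takes $\tau_{\rm reg}$ as $C_X/g_{m5,6}$, lets the capacitance grow linearly with $W_{5,6}$ on top of a fixed part, and lets the transconductance grow as $\sqrt{W_{5,6}}$. All three constants are hypothetical and were picked only so that the model gives 2.5 ps at 2.5 $\mu$m.

```python
import math

# All constants below are HYPOTHETICAL, chosen only to illustrate the trend.
C_FIXED = 3e-15     # capacitance at X not contributed by M5/M6, F (assumed)
C_PER_UM = 1e-15    # capacitance added per um of W_5,6, F/um (assumed)
GM_REF = 2.2e-3     # g_m5,6 at the reference width, S (assumed)
W_REF_UM = 2.5      # reference width, um (the article's W_5,6)


def tau_reg(width_um):
    """tau_reg = C_X / g_m with C_X linear in W and g_m proportional to sqrt(W)."""
    c_x = C_FIXED + C_PER_UM * width_um
    g_m = GM_REF * math.sqrt(width_um / W_REF_UM)
    return c_x / g_m


for w in (1.25, 2.5, 5.0, 10.0):
    print(f"W = {w:5.2f} um -> tau_reg = {tau_reg(w) * 1e12:.2f} ps")
```

In this model the time constant is smallest where the width-dependent capacitance equals the fixed capacitance (here, 3 $\mu$m); widening beyond that point makes regeneration slower, consistent with the slight increase noted above.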

The foregoing observations prescribe a simple method for estimating the input difference that leads to an error, i.e., that produces a $V_{XY0}$ below the bound in (3). We first simulate the comparator with a moderate value for $V_{in1} - V_{in2}$, e.g., 1 mV, and find the delay, e.g., 22 ps. We also recognize that 1) reducing $V_{in1} - V_{in2}$ by a factor of $10^n$ shifts the response by $n\tau_{\rm reg} \ln 10$ and 2) if the shift exceeds $T_{\rm CK}/2 - 22$ ps, an error is likely to occur. In our design,

$$n\tau_{\rm reg} \ln 10 \approx \frac{T_{\rm CK}}{2} - 22 \,\mathrm{ps}\tag{7}$$

$$\approx 78 \,\mathrm{ps}\tag{8}$$

and, hence, $n \approx 13.5$. It follows that input differences of less than $1\ \text{mV}/10^{13.5}$ (roughly $3 \times 10^{-17}$ V) may generate errors.
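The estimate in (7) and (8) can be reproduced as follows. This is a minimal sketch: the function names are illustrative, the 22-ps delay, 2.5-ps time constant, and 5-GHz clock come from the text, and the half-swing $A_0 = 0.5$ V used for the error probability is an assumption made here purely for illustration.

```python
import math


def decades_of_margin(f_ck, delay_ref, tau_reg):
    """Solve (7) for n: n * tau_reg * ln(10) = T_CK/2 - delay_ref."""
    available = 0.5 / f_ck - delay_ref
    return available / (tau_reg * math.log(10))


def critical_input_difference(v_ref, n):
    """Input difference n decades below the reference difference v_ref."""
    return v_ref / 10**n


def error_probability(delta_v, a0):
    """P(|input difference| < delta_v) for a uniform signal of peak-to-peak swing 2*a0."""
    return delta_v / a0


n = decades_of_margin(f_ck=5e9, delay_ref=22e-12, tau_reg=2.5e-12)
dv_crit = critical_input_difference(1e-3, n)
A0 = 0.5  # ASSUMED half peak-to-peak signal swing, V (not from the article)

print(f"n = {n:.2f}")
print(f"critical input difference = {dv_crit:.2e} V")
print(f"metastability error probability = {error_probability(dv_crit, A0):.2e}")
```

Since `n` depends on $1/\tau_{\rm reg}$ and the error probability depends exponentially on `n`, a modest change in the time constant moves the error rate by orders of magnitude.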

We should remark that simulating a comparator with very small input differences, e.g., 1 fV, requires minimizing all the sources of asymmetry in the circuit and in the simulation tool. Specifically, the presence of the RS latch in Figure 9 does lead to a slight asymmetry in the StrongArm circuit. Suppose the stored state is  $V_A = 0$  and  $V_B = V_{DD}$ . As a result, the gate input capacitances of  $M_{11}$  and  $M_{12}$  are slightly different.



FIGURE 13: The output waveforms of the comparator for input differences equal to 1 mV, 100  $\mu$ V, and 10  $\mu$ V.

This means that the capacitances seen at the inverters' inputs are also slightly unequal (due to the Miller effect of their gate-drain parasitics). This deterministic imbalance causes the StrongArm latch to favor one logical output for very small input differences. We therefore disconnect the RS latch for such simulations. Alternatively, we can short *A* and *B* to $V_{\rm DD}$ so as to maintain the loading presented to the inverters.

Another metastability simulation issue relates to the simulator's accuracy settings. In Cadence, we set three parameters as follows: reltol = $10^{-6}$, vabstol = $10^{-6}$, and iabstol = $10^{-12}$.

# **Input-Referred Noise**

The standard method of computing the output noise and dividing it by the gain does not apply to comparators because they produce a digital output. As explained in [4], we perform a transient noise simulation so that the comparator's time-domain decision is randomly affected by the noise of its constituent devices. We first set $V_{in1} - V_{in2}$ to zero [Figure 14(a)] and ensure that the logical output assumes a value of zero or one with equal probabilities. Plotted in Figure 15(a) are $V_X$ and $V_Y$, in this case, for 100 clock cycles. We observe that $V_X$ goes to zero approximately 50 times.

Next, we select a small, constant value, $V_S$, for $|V_{in1} - V_{in2}|$ so as to skew the decisions [see Figure 14(b)]. We recall that the area under a Gaussian distribution from $-\sigma$ to $+\sigma$ is equal to 68% and hence that from $-\infty$ to $-\sigma$ is 100% – (34% + 50%) = 16%. Thus, if $V_S$ is chosen so as to reduce the probability of zeros to 16%, then $V_S = \sigma$, which is also the total root-mean-square (rms) noise referred to the input. After a few iterations, we observe the waveforms in Figure 15(b), where $V_X$ goes to zero roughly 16 times for $V_S = 0.31 \,\mathrm{mV}$. The comparator's input rms noise is approximately equal to this value. For greater precision, we can run the two simulations for a larger number of clock cycles.

FIGURE 14: (a) A perfectly balanced comparator generates ones and zeros with equal probabilities; (b) a finite input imbalance skews the decisions.

FIGURE 15: The comparator outputs (a) for perfect balance and (b) with an input difference equal to 0.31 mV.

FIGURE 16: The input kickback noise currents of a StrongArm comparator.
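The 16% rule generalizes to any observed fraction of minority decisions through the inverse Gaussian distribution. The sketch below is one way to turn a decision count into a noise estimate; the function names are illustrative, it assumes Gaussian input-referred noise and zero offset, and the binomial standard error is included only to show why more clock cycles improve the estimate.

```python
import math
from statistics import NormalDist


def input_noise_rms(v_s, minority_count, cycles):
    """Estimate the input-referred rms noise from a skewed decision count.

    v_s is the applied input difference; minority_count is how many of the
    `cycles` decisions went against the polarity of v_s.
    """
    p = minority_count / cycles
    if not 0 < p < 0.5:
        raise ValueError("minority fraction must lie strictly between 0 and 0.5")
    sigmas = -NormalDist().inv_cdf(p)   # how many sigmas v_s represents
    return v_s / sigmas


def fraction_std_error(minority_count, cycles):
    """Binomial standard error of the observed minority fraction."""
    p = minority_count / cycles
    return math.sqrt(p * (1 - p) / cycles)


sigma_mv = input_noise_rms(0.31, 16, 100)
print(f"input-referred rms noise = {sigma_mv:.3f} mV")
print(f"std. error of the 16% estimate over 100 cycles = {fraction_std_error(16, 100):.3f}")
```

With only 100 cycles, the observed fraction is uncertain by a few percentage points, so the inferred noise should be read as an approximation; increasing `cycles` tightens it as $1/\sqrt{N}$.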

### **Kickback Noise**

The StrongArm latch draws large transient currents from its inputs during switching. Called the *kickback noise*, this phenomenon proves undesirable if it affects the comparator's own decision or corrupts the input voltage while it is sensed by other circuits. For example, in a flash ADC, all of the comparators generate kickback noise while one must make a critical decision. Figure 16 plots the kickback noise currents of our design when  $V_{in1} - V_{in2} = 1 \text{ mV}$ . The clock begins to rise at t = 300 ps. We recognize that the two exhibit both CM and differential components. The former are objectionable if they flow through unequal source impedances, and the latter prove problematic generally. Kickback noise trades with the dimensions of the input transistors and hence with the offset voltage. But the timing of this noise determines whether it has an adverse effect on the performance. For example, we expect the noise around t = 300 ps in Figure 16 to be more serious, as it coincides with the comparator's decision time.

#### References

- [1] J. Montanaro et al., "A 160-MHz 32-b 0.5-W CMOS RISC microprocessor," *IEEE J. Solid-State Circuits*, vol. 31, no. 11, pp. 1703–1714, Nov. 1996. doi: 10.1109/JSSC.1996.542315.
- [2] T. Kobayashi, K. Nogami, T. Shirotori, Y. Fujimoto, and O. Watanabe, "A current-mode latch sense amplifier and a static power saving input buffer for low-power architecture," in *Proc. VLSI Circuits Symp. Dig. Tech. Papers*, June 1992, pp. 28–29. doi: 10.1109/VLSIC.1992.229252.
- [3] P. Nuzzo, F. De Bernardinis, P. Terreni, and G. Van der Plas, "Noise analysis of regenerative comparators for reconfigurable ADC architectures," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 55, no. 6, pp. 1441–1454, July 2008. doi: 10.1109/TCSI.2008.917991.
- [4] B. Razavi, "The StrongARM latch [A Circuit for All Seasons]," *IEEE Solid-State Circuits Mag.*, vol. 7, no. 2, pp. 12–17, Spring 2015. doi: 10.1109/MSSC.2015.2418155.
- [5] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid-State Circuits*, vol. 24, no. 5, pp. 1433–1440, Oct. 1989. doi: 10.1109/JSSC.1989.572629.