# A 20-GHz PLL With 20.9-fs Random Jitter

Yu Zhao, *Member, IEEE*, Mahdi Forghani, *Graduate Student Member, IEEE*, and Behzad Razavi, *Fellow, IEEE*

Abstract—This article describes an integer-N phase-locked loop (PLL) that incorporates a phase detector sampling both the rising and falling edges of the reference clock. The circuit also uses a new retiming method in the feedback divider. Optimized for the reference and oscillator phase noise and fabricated in the 28-nm CMOS technology, the experimental prototype achieves an rms jitter of 20.9 fs integrated from 10 kHz to 40 MHz with a spur level of -66 dBc while consuming 12 mW of power.

*Index Terms*— Crystal oscillator, double-sampling phase detector (DSPD), master–slave sampling, modular divider, phase noise, voltage-controlled oscillator (VCO).

## I. INTRODUCTION

COMMUNICATION and signal processing applications continue to pose challenging requirements on phase-locked loops (PLLs) in terms of speed, power consumption, and jitter. Observed in both wireless and wireline systems, this trend arises primarily from the need for higher data rates. For example, a 112-Gb/s PAM4 wireline receiver using a 7-bit 56-GHz analog-to-digital converter (ADC) incurs a 3-dB signal-to-noise ratio penalty at the Nyquist rate if the clock jitter exceeds 36 fs<sub>rms</sub>. While, in practice, the ADC is realized as a number of time-interleaved channels running at lower clock frequencies, this jitter bound still governs the generation of the clocks. Moreover, 12-bit ADCs designed for direct RF sampling [1] face similar jitter constraints as they approach a rate of 20 GHz.

Recent work has demonstrated jitter values below 100 fs<sub>rms</sub> at frequencies ranging from 7 to 31 GHz [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]. These examples have introduced a number of design paradigms. Some are based on subsampling [13], [14], [15], [16], [17] as it obviates the need for a feedback frequency divider, a phase/frequency detector (PFD), and a charge pump (CP). The jitter associated with these functions is thus eliminated. Moreover, by directly sampling the fast edges produced by the voltage-controlled oscillator (VCO), this method achieves a high phase detector (PD) gain and hence a low PD contribution to phase noise. Subsampling techniques have also been applied to digital PLLs to achieve 76 fs<sub>rms</sub> [11] and 47.3 fs<sub>rms</sub> [9].

Manuscript received 25 April 2022; revised 21 August 2022 and 8 October 2022; accepted 14 November 2022. Date of publication 13 December 2022; date of current version 26 May 2023. This article was approved by Associate Editor Yusuke Oike. This work was supported by Realtek Semiconductor. (*Corresponding author: Yu Zhao.*)

Yu Zhao was with the Department of Electrical and Computer Engineering, University of California at Los Angeles, Los Angeles, CA 90095 USA. He is now with HiSilicon, Shanghai 201206, China (e-mail: zhaoyu@ucla.edu).

Mahdi Forghani and Behzad Razavi are with the Department of Electrical and Computer Engineering, University of California at Los Angeles, Los Angeles, CA 90095 USA.

Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSSC.2022.3225105.

Digital Object Identifier 10.1109/JSSC.2022.3225105

Another low-jitter topology uses injection locking [18], [19], [20], [21]. A copy of the reference is injected into the VCO so as to suppress the latter's phase noise. However, the periodic disturbance of the VCO leads to relatively large reference spurs.

This article presents the design and implementation of a 20-GHz integer-*N* PLL that incorporates a number of new techniques to achieve a jitter of 20.9 fs<sub>rms</sub> [22]. Realized in the 28-nm CMOS technology, the prototype exhibits a reference spur level of -66 dBc. While PLLs in general must cover a fairly wide bandwidth, this work targets a design driving a 20-GHz ADC such as that in [1]. That is, the PLL synthesizes only one output frequency.

Section II provides the background for this work and presents the optimization of the loop in terms of the reference and oscillator phase noise. Section III describes the proposed PLL architecture and deals with the design of its building blocks. Section IV is concerned with the experimental results.

## II. GENERAL CONSIDERATIONS

As we seek jitter values in the range of a few tens of femtoseconds, the contribution of all the noise sources becomes significant. We first quantify these contributions and then decide which ones can be avoided. Given our target jitter of 20 fs<sub>rms</sub> and the numerous contributors in a typical design, we also explore the possibility of jitter values less than 5 fs for some of the functions. In other words, we wish to make the VCO and the reference dominate the overall phase noise.

## A. Reference and VCO Phase Noise

The phase noise of crystal oscillators, $S_{REF}$, has become increasingly critical as sub-100-fs jitter values have been targeted. In the ideal case, only the reference and the VCO contribute jitter. It can be shown that if flicker noise is neglected and the free-running VCO phase noise is expressed as $\alpha/f^2$, then the optimum loop bandwidth is given by

$$f_{0,\text{opt}} = \sqrt{\frac{4\alpha}{\pi N^2 S_{\text{REF}}}} \tag{1}$$

where N is the ratio of the PLL output frequency to the reference frequency. The minimum PLL output integrated phase noise is thus equal to

$$S_{\rm tot,min} = 4\sqrt{\alpha\pi N^2 S_{\rm REF}}. \tag{2}$$

0018-9200 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.



Fig. 1. SR<sub>out</sub>/SR<sub>in</sub> ratio versus reference frequency.

It can be proved that the optimum PLL bandwidth yields

$$\begin{aligned}
\sigma_{j} &= \frac{\sqrt{S_{\text{tot,min}}}}{2\pi} \frac{T_{\text{REF}}}{N} \\
&= \sqrt[4]{\frac{\alpha S_{\text{REF}}}{\pi^{3} N^{2}}} \frac{1}{f_{\text{REF}}}.
\end{aligned} \tag{3}$$

With  $f_{\text{REF}} = 250 \text{ MHz}$  and  $S_{\text{REF}} = -170 \text{ dBc/Hz}$ , we must have a loop bandwidth of 10 MHz. Note that these results apply to subsampling PLLs as well.
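As a rough numerical check of (1) to (3), the following Python sketch evaluates the three expressions for the reference parameters quoted above. The VCO coefficient $\alpha$ is not stated in the text, so the sketch infers it by inverting (1) for the 10-MHz bandwidth; that inferred value, the divide ratio $N = 80$ (20 GHz/250 MHz), and the reading of $-170$ dBc/Hz as a linear density in rad²/Hz are assumptions made only for illustration.

```python
import math


def optimum_bandwidth(alpha, n, s_ref):
    """Eq. (1): loop bandwidth (Hz) that minimizes the integrated phase noise."""
    return math.sqrt(4.0 * alpha / (math.pi * n**2 * s_ref))


def min_integrated_phase_noise(alpha, n, s_ref):
    """Eq. (2): minimum integrated phase noise (rad^2)."""
    return 4.0 * math.sqrt(alpha * math.pi * n**2 * s_ref)


def rms_jitter(alpha, n, s_ref, f_ref):
    """Eq. (3), first form: sqrt(S_tot,min) / (2*pi) * T_REF / N, in seconds."""
    s_tot = min_integrated_phase_noise(alpha, n, s_ref)
    return math.sqrt(s_tot) / (2.0 * math.pi) / (n * f_ref)


F_REF = 250e6                 # reference frequency from the text (Hz)
N = 80                        # assumption: 20 GHz / 250 MHz
S_REF = 10.0 ** (-170 / 10)   # assumption: -170 dBc/Hz taken as linear rad^2/Hz
F0_TARGET = 10e6              # loop bandwidth quoted in the text (Hz)

# Assumption: alpha is not stated, so it is inferred by inverting eq. (1).
ALPHA = F0_TARGET**2 * math.pi * N**2 * S_REF / 4.0

f0_opt = optimum_bandwidth(ALPHA, N, S_REF)
sigma_j = rms_jitter(ALPHA, N, S_REF, F_REF)
# Eq. (3), closed form, used as a cross-check of the first form.
sigma_j_closed = (ALPHA * S_REF / (math.pi**3 * N**2)) ** 0.25 / F_REF

print(f"inferred alpha    = {ALPHA:.3g}")
print(f"optimum bandwidth = {f0_opt / 1e6:.2f} MHz")
print(f"rms jitter        = {sigma_j * 1e15:.1f} fs")
```

Under these assumptions the two forms of (3) agree and give a jitter of roughly 16 fs, which is the floor set by the reference and the VCO alone in this simplified white-noise model.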

## B. Reference Buffer Phase Noise

Stand-alone low-noise crystal oscillators typically provide a nearly sinusoidal output. For example, Crystek's CRBSCS-01-250, used in our measurements, exhibits harmonics that are at least 20 dB below the fundamental. This waveform must be sharpened by an on-chip inverter before reaching the PLL, and the inverter introduces additional phase noise. This noise adds to that of the crystal oscillator and must be included in the bandwidth optimization described above. The principal issues here are that, owing to the slow input transitions, 1) the inverter transistors inject noise over a long time window and 2) both devices produce noise on each output edge.

For a sinusoidal input, the output slew rate ( $SR_{out}$ ) strongly depends on the input slew rate ( $SR_{in}$ ). As an approximation, we can say that the two differ by a factor equal to the inverter's small-signal voltage gain,  $A_v$ . At sufficiently high frequencies, however, the output slew rate is also limited by the output current and the load capacitance. We thus expect the general behavior depicted in Fig. 1. For the reference buffer (RBUF) design in our work, we note that  $SR_{out}/SR_{in} \approx 9$  at 250 MHz.<sup>1</sup>

The phase noise of an inverter due to the transistors' white noise is derived in [23] for an input with a period of  $T_{in}$  and expressed as

$$S_{\phi}(f) = \frac{\pi^2}{r_{\text{edge}}^2 C_L^2} \frac{\Delta T}{T_{\text{in}}} \left[ S_{I,N}(f) + S_{I,P}(f) \right] \tag{4}$$

where  $r_{edge}$  is the output slew rate (also denoted by SR<sub>out</sub> in this article),  $C_L$  the load capacitance,  $\Delta T$  the noise window shown in Fig. 2(a), and  $S_{I,N}(f)$  and  $S_{I,P}(f)$  are the noise current spectra of the NMOS and PMOS devices, respectively.<sup>2</sup> This result is derived for relatively fast input edges and assumes that only the NMOS device corrupts the falling edge and only the PMOS device, the rising edge.



Fig. 2. (a) CMOS inverter input–output waveforms during sharp transitions and (b) noise window of NMOS device in RBUF with a sinusoidal input.

Equation (4) can be extended to the case of a sinusoidal input as follows. We consider the input and output waveforms shown in Fig. 2(b), noting that the NMOS transistor enters saturation at  $t_1$ . We also assume  $t_1$  to be the starting point of the PMOS noise window because the noise injected onto  $C_L$  before  $t_1$  is discharged by the triode NMOS device. This point is verified by transient simulations in Cadence's Spectre. Another simplifying assumption is that the noise injected by the transistors after  $t_{mid}$  is unimportant to the output phase noise [23]. We conclude that for both the transistors, the noise window,  $\Delta T$ , is from  $t_1$  to  $t_{mid}$ , which is approximately half of the rise time. The overall output phase noise then emerges as

$$S_{\phi}(f) = \frac{2\pi^2}{r_{\text{edge}}^2 C_L^2} \frac{\Delta T}{T_{\text{in}}} \left[ S_{I,N}(f) + S_{I,P}(f) \right] \tag{5}$$

where we assume equal output rise and fall times and hence the same $\Delta T$ for the two edges. The factor of 2 accounts for the phase corruption on each edge due to both devices.

The dependence of the RBUF phase noise upon the input frequency is of interest but is made more complex by the behavior depicted in Fig. 1. In this particular design, the buffer's phase noise decreases by about 1.4 dB if $f_{\text{REF}}$ rises from 40 to 80 MHz. This is because SR<sub>out</sub> in Fig. 1 increases by only a factor of 1.4 and $\Delta T$ decreases by a factor of 1.4 in (5). With $T_{\text{in}}$ halved, the right-hand side of (5) drops by a factor of $(1.4)^3/2 \approx 1.37$, i.e., about 1.4 dB. Plotted in Fig. 3 are the simulated phase noise profiles of our buffer for $f_{\text{REF}} = 40$, 80, 160, and 250 MHz. The key point here is that the buffer's integrated jitter falls as $f_{\text{REF}}$ rises. In Fig. 3, the corresponding rms jitter values are equal to 79.2, 33.7, 14.3, and 8.5 fs. The phase noise below 100-kHz offset is dominated by the flicker noise and worsens with the increased reference frequency. We will explain this point in Section III-B.

<sup>&</sup>lt;sup>1</sup>The RBUF in our work has a low-frequency gain  $(A_v)$  of 40.

<sup>&</sup>lt;sup>2</sup>These spectra are measured with  $|V_{GS}| = V_{DD}$  and  $|V_{DS}| = V_{DD}/2$ .



Fig. 3. Phase noise of RBUF at different input frequencies.

Besides using higher reference frequencies, the noise-power trade-off of RBUF can also be exploited to reduce its jitter contribution. If the inverter's output capacitance is much greater than the input capacitance of the next stage, every doubling of the transistor widths lowers the phase noise by 3 dB. This can be seen from (5), where $S_{I,N}(f)$, $S_{I,P}(f)$, and $C_L$ are doubled while other quantities remain unchanged. In this work, the NMOS and PMOS aspect ratios are 1120 $\mu$m/400 nm and 1600 $\mu$m/400 nm, respectively, leading to a power consumption of 1.3 mW at 250 MHz and the phase noise profile shown in Fig. 3. Even with such large dimensions, the buffer still contributes significant jitter, underscoring the future challenges that we will face as we seek smaller jitter values.
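Both scaling arguments based on (5), namely the dependence on the reference frequency and the dependence on the transistor widths, reduce to simple ratios. The following sketch evaluates them; the function name and its keyword arguments are illustrative and simply mirror the quantities appearing in (5).

```python
import math


def phase_noise_change_db(slew=1.0, window=1.0, period=1.0,
                          noise_current=1.0, load_cap=1.0):
    """Change of eq. (5) in dB when each quantity is multiplied by its factor.

    Eq. (5) is proportional to (dT / T_in) * (S_IN + S_IP) / (r_edge^2 * C_L^2).
    """
    ratio = (window / period) * noise_current / (slew**2 * load_cap**2)
    return 10.0 * math.log10(ratio)


# f_REF from 40 to 80 MHz: SR_out rises by 1.4, dT falls by 1.4, T_in is halved.
freq_doubling_db = phase_noise_change_db(slew=1.4, window=1 / 1.4, period=0.5)

# Width doubling: noise current spectra and C_L both double, the rest is unchanged.
width_doubling_db = phase_noise_change_db(noise_current=2.0, load_cap=2.0)

print(f"40 -> 80 MHz:   {freq_doubling_db:+.2f} dB")
print(f"width doubling: {width_doubling_db:+.2f} dB")
```

The first case reproduces the drop of about 1.4 dB and the second the 3-dB improvement per doubling of the widths.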

The last issue related to RBUF is its supply sensitivity,  $K_{DD}$ . Typically fed from an on-chip low-dropout (LDO) regulator, RBUF converts the LDO noise into phase noise. For the inverter design described above,  $K_{DD} = 764 \text{ ps/V}$ . To maintain the supply-induced phase noise about 10 dB below the profile shown in Fig. 3, the LDO noise spectrum must be less than 0.5 nV/(Hz)<sup>1/2</sup>, an extremely stringent constraint. For example, an LDO op amp using a differential pair with ideal exponential transistors would require a tail current of at least 3.4 mA to achieve this noise level. As explained in Section III-B, our proposed phase detector relaxes this issue by orders of magnitude.
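The tail current quoted above can be reproduced with a short calculation. The sketch below assumes that each "ideal exponential" transistor exhibits only shot noise, $2qI_C$, so that its input-referred voltage noise density is $2kT/g_m$ with $g_m = I_C/V_T$, that load and base-resistance noise are neglected, and that $T = 300$ K. These modeling choices are assumptions made for this illustration.

```python
K_B = 1.380649e-23       # Boltzmann constant (J/K)
Q_E = 1.602176634e-19    # electron charge (C)


def tail_current_for_noise(vn_density, temperature=300.0):
    """Tail current (A) of a differential pair meeting a noise density in V/sqrt(Hz).

    Assumption: each device contributes 2kT/gm of input-referred noise,
    so the pair contributes 4kT/gm in total.
    """
    kt = K_B * temperature
    gm = 4.0 * kt / vn_density**2     # transconductance required per device (S)
    v_t = kt / Q_E                    # thermal voltage (V)
    i_c = gm * v_t                    # bias current per device (A)
    return 2.0 * i_c


i_tail = tail_current_for_noise(0.5e-9)
print(f"tail current = {i_tail * 1e3:.2f} mA")
```

With these assumptions the result is close to the 3.4 mA stated in the text.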

## C. Phase/Frequency Detector Phase Noise

The phase noise of PFDs has been analyzed in [23], with the conclusion that true single-phase clocking (TSPC) implementations are advantageous. Fig. 4(a) depicts an example optimized according to [23], and Fig. 4(b) shows plots of the circuit's simulated phase noise at 250 MHz. Consuming  $60 \mu$ W, the PFD generates an rms jitter of 9.4 fs. For this value to fall below, for example, 5 fs, one would need to multiply the transistor widths by a factor of 3.5.<sup>3</sup> The PFD therefore does not appear to be a serious bottleneck.
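The width multiplier follows from the jitter falling as the inverse square root of the transistor width. A minimal sketch of this arithmetic is given below; the assumption that the PFD power scales linearly with width is ours and is used only to indicate the order of magnitude.

```python
def width_factor(jitter_now_fs, jitter_target_fs):
    """Width multiplier needed when jitter scales as 1/sqrt(width)."""
    return (jitter_now_fs / jitter_target_fs) ** 2


factor = width_factor(9.4, 5.0)
# Assumption: power grows in proportion to the transistor widths.
scaled_power_uw = 60.0 * factor

print(f"width factor = {factor:.2f}")
print(f"scaled power = {scaled_power_uw:.0f} uW")
```

The resulting power remains in the range of a few hundred microwatts, consistent with the observation that the PFD is not a serious bottleneck.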

## D. Charge Pump Noise

<sup>3</sup>Every doubling of the transistor widths in the PFD reduces the jitter by a factor of $\sqrt{2}$.

Fig. 4. (a) Optimized TSPC PFD and (b) phase noise of TSPC PFD.

The thermal and flicker noise of the up and down current sources in a CP corrupt the current delivered to the loop filter, equivalently generating phase noise. It can be shown that the CP thermal noise referred to the PFD input leads to

$$S_{\rm CP}(f) = 8\pi^2 \frac{T_{\rm CP}}{T_{\rm REF}} \frac{\overline{I_n^2}}{I_P^2} \tag{6}$$

where  $T_{\rm CP}$  denotes the minimum PFD output pulsewidth,  $\overline{I_n^2}$  the thermal noise spectrum of each current source, and  $I_P$  the nominal CP current. Neglecting the CP flicker noise and considering typical values for the parameters in (6), we can readily appreciate the difficulties. Suppose we wish the CP contribution in a PLL bandwidth of 10 MHz to be less than 5 fs. It can be shown that the CP produces an rms jitter given by  $(\pi f_0 S_{\rm CP})^{1/2} T_{\rm REF}/2\pi$ , and hence

$$\frac{\sqrt{\pi f_0 S_{\rm CP}}}{2\pi} T_{\rm REF} < 5~\text{fs}. \tag{7}$$

It follows that $S_{\rm CP} = -177$ dBc/Hz if $T_{\rm REF} = 4$ ns. Returning to (6) and assuming 1) $\overline{I_n^2} = 2kT\gamma g_m = 2kT\gamma (2I_P)/|V_{GS} - V_{TH}|$; 2) $|V_{GS} - V_{TH}| = 200$ mV; and 3) $T_{\rm CP} = 50$ ps, we obtain $I_P = 110$ mA. Note that the CP output range is 600 mV.
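The bound on $S_{\rm CP}$ can be verified by solving (7) directly. The sketch below also inverts (6) to obtain the largest permissible ratio $\overline{I_n^2}/I_P^2$. It stops short of computing $I_P$ because that step depends on $\gamma$ and on the noise convention, which are not specified here.

```python
import math

T_REF = 4e-9        # reference period (s)
T_CP = 50e-12       # minimum PFD output pulsewidth (s)
F0 = 10e6           # PLL bandwidth (Hz)
SIGMA_MAX = 5e-15   # jitter budget for the CP (s)

# Eq. (7) solved for S_CP: sqrt(pi * f0 * S_CP) * T_REF / (2 * pi) < sigma_max.
s_cp_max = (2.0 * math.pi * SIGMA_MAX / T_REF) ** 2 / (math.pi * F0)
s_cp_max_db = 10.0 * math.log10(s_cp_max)

# Eq. (6) solved for the normalized current noise In^2 / I_P^2 (1/Hz).
noise_ratio_max = s_cp_max * T_REF / (8.0 * math.pi**2 * T_CP)

print(f"S_CP bound      = {s_cp_max_db:.1f} dB")
print(f"In^2/I_P^2 max  = {noise_ratio_max:.3g} 1/Hz")
```

The first result agrees with the $-177$ dBc/Hz stated above; the second shows how small the current noise must be relative to the CP current.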

The foregoing observations suggest that CPs prove ill-suited to low-jitter PLLs.

## III. PROPOSED PLL

The proposed PLL architecture is shown in Fig. 5. It consists of an RBUF, a double-sampling PD (DSPD), a transconductor, a loop filter, a VCO followed by a $\div 2$ stage, and a multimodulus "self-retimed" divider that controls the PD through a nonoverlap generator.

Fig. 5. Proposed PLL architecture.

Fig. 6. (a) Single-sampling PD and (b) its time-domain waveforms.

We wish to make negligible the jitter arising from the PD, the Gm stage, and the divider. If successful, such an endeavor allows us to apply the optimization described in Section II-A.

## A. Double-Sampling PD

The PD proposed here plays a central role in the PLL's performance. Before describing this topology, we consider the (single) master–slave sampling PD introduced in [24] and [25] as shown in Fig. 6(a). The circuit adjusts the PLL feedback signal, $\phi_1$, such that the sampled value of $V_R$ ($V_a$) becomes equal to the control voltage necessary for the VCO [see Fig. 6(b)]. Next, $\phi_2$ and $C_2$ resample this level, creating minimal perturbation on $V_{\text{cont}}$.<sup>4</sup>

<sup>4</sup>The PD can directly sample the reference sinusoid without a buffer [26] or with a buffer following the sampler [27], but the lower PD gain makes the kT/C noise more significant.

Owing to the high slew rate of $V_R$, the master–slave sampling PD exhibits a high gain, thereby minimizing the noise contributed by the switched capacitors and any other components preceding the VCO. If the slew rate of $V_R$ in Fig. 6(b) is denoted by $SR_R$, this PD's gain emerges as

$$K_{\rm PD} = \frac{{\rm SR}_R}{2\pi \cdot f_{\rm REF}}. \tag{8}$$

We now turn to the proposed DSPD shown in Fig. 7(a). Assuming for now that  $V_R$  has a 50% duty cycle, we note that  $C_1$  and  $C_2$  sample  $V_a$  and  $V_b$ , respectively, such that  $V_a - V_b$  translates to the necessary control voltage for the VCO. The double-sampling action not only provides higher gain than single-sampling but also offers new benefits. We elaborate on these points below.

Double-sampling increases the PD gain by a factor of 2. This is seen by noting that in Fig. 7(b), a phase displacement of  $\Delta t$  in  $\phi_1$  shifts both A and B to the right or to the left, changing  $V_a$  and  $V_b$  in *opposite* directions. Thus,

$$K_{\rm PD} = \frac{\mathrm{SR}_R}{\pi \cdot f_{\rm REF}}. \tag{9}$$

As a result, the kT/C noise components associated with the four switches in Fig. 7(a) are divided by another factor of 4 when referred to the PD input (see Section III-C). Since the differential output carries twice the kT/C noise power of the single-sampling PD [see (16)], the net result is a 3-dB reduction in the PD's phase noise. For $C_1 = C_2 = 100$ fF and $C_3 = C_4 = 40$ fF, simulations yield the phase noise profiles shown in Fig. 7(c) at 250 MHz. The integrated jitter drops from 2.9 to 2.1 fs.

## B. Reference Phase Noise Reduction

The most remarkable advantage of double-sampling arises from its ability to reduce the jitter contributed by the crystal oscillator and the RBUF. We present this property for three sources of phase noise, namely, thermal noise, supply noise, and flicker noise.

Illustrated in Fig. 8(a), this PD attribute can be understood by assuming that the rising edge of  $V_R$  is displaced by a random amount,  $\Delta t_1$ . Consequently, the sampled voltage inherited by  $V_3$  in Fig. 7(a) changes by

$$\Delta V_3 = \Delta t_1 \cdot \mathrm{SR}_R \tag{10}$$

if charge sharing between  $C_1$  and  $C_3$  is neglected.

Similarly, a displacement of $\Delta t_2$ in the falling edge translates to a change of

$$\Delta V_4 = \Delta t_2 \cdot \mathrm{SR}_R \tag{11}$$

in $V_4$. These random changes are combined by the differential-to-single-ended converter shown in Fig. 7(a). If $V_R$ carries white phase noise and hence $\Delta t_1$ and $\Delta t_2$ are uncorrelated, the differential output noise of the DSPD is given by

$$\overline{V_{n,\text{out}}^2} = \mathrm{SR}_R^2 \cdot \left(\sigma_{\Delta t_1}^2 + \sigma_{\Delta t_2}^2\right) \tag{12}$$

where  $\sigma_{\Delta t_1}$  and  $\sigma_{\Delta t_2}$  denote the rms jitter of  $V_R$  on the rising and falling transitions, respectively. Divided by  $K_{PD}^2$ , this noise is referred to the PD input as

$$\phi_{n,\text{in,rms}}^{2} = \frac{\pi^{2}}{T_{\text{REF}}^{2}} \left(\sigma_{\Delta t_{1}}^{2} + \sigma_{\Delta t_{2}}^{2}\right). \tag{13}$$



Fig. 7. (a) DSPD, (b) its time-domain waveforms, and (c) its simulated phase noise.



Fig. 8. DSPD detecting (a) rising edge of  $V_R$  or (b) falling edge of  $V_R$ .

To appreciate the significance of this result, we convert  $\phi_{n,in,rms}$  into jitter

$$\overline{\sigma_j^2} = \frac{\sigma_{\Delta t_1}^2 + \sigma_{\Delta t_2}^2}{4}. \tag{14}$$

That is, double-sampling in essence averages the jitter of the PD input rising and falling edges, providing a 3-dB reduction. This property applies to the jitter of both the crystal oscillator and the RBUF.
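The averaging property expressed by (14) can be illustrated with a small Monte Carlo experiment. The sketch below models the input-referred timing error of the DSPD as the average of two uncorrelated Gaussian edge displacements and compares the result with (14). The edge jitter of 10 fs, the sample count, and the seed are arbitrary values chosen for illustration.

```python
import math
import random


def simulated_dspd_jitter(sigma_rise, sigma_fall, n_samples=100_000, seed=1):
    """RMS of the averaged rising/falling edge displacement (same unit as inputs)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        dt1 = rng.gauss(0.0, sigma_rise)   # rising-edge displacement
        dt2 = rng.gauss(0.0, sigma_fall)   # falling-edge displacement, uncorrelated
        avg = 0.5 * (dt1 + dt2)            # double-sampling averages the two edges
        acc += avg * avg
    return math.sqrt(acc / n_samples)


def predicted_dspd_jitter(sigma_rise, sigma_fall):
    """Eq. (14): input-referred rms jitter of the DSPD."""
    return math.sqrt((sigma_rise**2 + sigma_fall**2) / 4.0)


SIGMA_EDGE_FS = 10.0   # hypothetical per-edge jitter (fs)
simulated = simulated_dspd_jitter(SIGMA_EDGE_FS, SIGMA_EDGE_FS)
predicted = predicted_dspd_jitter(SIGMA_EDGE_FS, SIGMA_EDGE_FS)
reduction_db = 20.0 * math.log10(predicted / SIGMA_EDGE_FS)

print(f"simulated = {simulated:.2f} fs, predicted = {predicted:.2f} fs")
print(f"reduction = {reduction_db:.2f} dB")
```

For equal jitter on the two edges, the prediction is a 3-dB reduction with respect to a single edge. This sketch covers only the uncorrelated (white noise) case; correlated disturbances are treated separately in the text.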

Plotted in Fig. 9 are the simulated phase noise profiles at the output of a noiseless PLL employing our RBUF design with a single-sampling PD and with the DSPD. The PLL bandwidth is about 10 MHz, and the feedback divide ratio is unity. We note that the phase noise of RBUF is lowered by 3 dB around 1-MHz offset in the latter case.

Fig. 9. Simulated phase noise of RBUF in a noiseless PLL.

Fig. 10. DSPD response to RBUF supply noise.

At low offsets, double-sampling reduces the phase noise by even greater factors, e.g., by 7 dB at 100 kHz; we explain this phenomenon below. The integrated jitter falls from 8.6 to  $5.8 \text{ fs}_{rms}$ .

The proposed PD also lowers the effect of RBUF supply noise dramatically. Unlike noise sources within an inverter, the supply noise modulates the output *duty cycle*, and the DSPD converts this effect into a common-mode perturbation. To illustrate this point, we begin with the RBUF output waveform, $V_R$, shown in Fig. 10 and recognize that a static supply change of $+\Delta V_{DD}$ raises the slew rates while keeping the transition times fairly constant. As a result, the duty cycle increases. We observe that the values sampled by $\phi_1$ on the rising and falling edges shift up together, introducing a common-mode change of $\Delta V_3 = \Delta V_4$ in $V_3$ and $V_4$. Most of this perturbation is rejected by the Gm stage. Verified experimentally (see Section IV), this property greatly eases the LDO output noise requirement.

If the supply noise frequency is high enough to cause substantial change from one  $V_R$  edge to the next, then the PD suppresses the result to a lesser extent. But such noise components can be filtered by means of moderately sized capacitors attached to the LDO output.

The common-mode effect described above also explains the large RBUF phase noise suppression observed at low offsets in Fig. 9. Recall from Section II-B that *both* transistors in the buffer inject noise on the output rising and falling edges. For example, the flicker noise current of $M_1$ in Fig. 11 injects excess positive charge on the rising transition of $V_R$, thus shifting it upward. Another packet of positive charge is also deposited on $C_L$ by $M_1$ on the falling edge, shifting this transition upward as well.<sup>5</sup> The falling transition is delayed by approximately the same amount because this noise changes negligibly in a time interval of $T_1 \approx T_{\text{REF}}/2$. That is, the noise components injected by $M_1$ on two consecutive edges are strongly correlated. As a result, in a manner similar to that in Fig. 10, the flicker noise of $M_1$ and $M_2$ translates to a CM error in $V_3$ and $V_4$ and is thus suppressed.

Fig. 11. Effect of RBUF flicker noise.

## C. PD Transfer Function and Phase Noise

The single-sampling circuit of Fig. 6(a) can be approximately modeled by the following transfer function [24]:

$$H_{\rm PD}(j\omega) = \frac{{\rm SR}_R}{2\pi \cdot f_{\rm REF}} \cdot \frac{1}{1 + \frac{C_2}{C_1 f_{\rm REF}} j\omega} \times e^{-j\omega T_{\rm REF}/2} \frac{\sin(\omega T_{\rm REF}/2)}{\omega T_{\rm REF}/2}. \tag{15}$$

For the double-sampling counterpart, the gain rises by a factor of 2 but the remaining terms are unchanged. With a gain of  $SR_R/(\pi f_{REF}) = 39.5$  V/rad,  $f_{REF} = 250$  MHz,  $C_1 = 100$  fF, and  $C_3 = 40$  fF, the PD magnitude and phase responses are relatively flat across the bandwidth of 10 MHz chosen in this design. That is, the PD behavior negligibly affects the PLL dynamics.
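The flatness of the PD response can be examined by evaluating (15) at the edge of the loop bandwidth. In the sketch below the response is normalized to its low-frequency gain, and the slave capacitor of the DSPD ($C_3 = 40$ fF) is assumed to take the role of $C_2$ in (15); the function and variable names are illustrative.

```python
import cmath
import math


def pd_response(f, f_ref, c_master, c_slave, dc_gain=1.0):
    """Evaluate eq. (15) at offset frequency f (Hz)."""
    w = 2.0 * math.pi * f
    x = w / (2.0 * f_ref)                                   # omega * T_REF / 2
    pole = 1.0 / (1.0 + 1j * w * c_slave / (c_master * f_ref))
    sinc = math.sin(x) / x if x != 0.0 else 1.0
    return dc_gain * pole * cmath.exp(-1j * x) * sinc


F_REF = 250e6
C_MASTER = 100e-15
C_SLAVE = 40e-15   # assumption: C3 plays the role of C2 in eq. (15)

h = pd_response(10e6, F_REF, C_MASTER, C_SLAVE)
droop_db = 20.0 * math.log10(abs(h))
phase_deg = math.degrees(cmath.phase(h))

print(f"magnitude at 10 MHz = {droop_db:.3f} dB (relative to dc)")
print(f"phase at 10 MHz     = {phase_deg:.1f} deg")
```

With these values the magnitude droops by less than 0.1 dB at 10 MHz, while the pole and the half-period delay together contribute a phase lag on the order of 13°.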

The PD phase noise,  $\phi_{n,PD}$ , arises primarily from the samplers' kT/C noise. If  $C_1 = C_2$  and  $C_3 = C_4$  in Fig. 7(a), the noise voltage deposited on  $C_1$  is equal to  $(kT/C_1)^{1/2}$ , corresponding to a charge amount of  $(kTC_1)^{1/2}$ . This charge is next shared with  $C_3$ , yielding a voltage of  $(kTC_1)^{1/2}/(C_1+C_3)$ . The square of this value is added to the kT/C noise associated with the slave sampler, and the final result is multiplied by 2 for the differential output

$$V_{n,\text{out,rms}}^2 = 2 \left[ \frac{kTC_1}{(C_1 + C_3)^2} + \frac{kT}{C_3} \right]. \tag{16}$$

We must now divide this quantity by the square of the PD gain to obtain the equivalent phase noise. This gain, $SR_R/(\pi f_{REF})$, can be approximated as follows. When the voltage on $C_1$ is around $V_{DD}/2$, the current available for charging it is given by $(V_{\rm DD} - V_{\rm DD}/2)/(R_{\rm BUF} + R_{\rm sw})$, where $R_{\rm BUF}$ and $R_{\rm sw}$ denote the buffer output resistance and the switch resistance, respectively. Thus,

$$\operatorname{SR}_{R} \approx \frac{V_{\mathrm{DD}}}{2(R_{\mathrm{BUF}} + R_{\mathrm{sw}})C_{1}}. \tag{17}$$

Fig. 12. V<sub>R</sub> DCE.

Fig. 13. V<sub>R</sub> DCC circuit.

From (9), (16), and (17), we compute the PD's jitter as

$$\phi_{\text{in,PD,rms}}^{2} = \frac{V_{n,\text{out,rms}}^{2} \cdot T_{\text{REF}}^{2}}{(2\pi)^{2} K_{\text{PD}}^{2}} = \frac{2kT}{V_{\text{DD}}^{2}} (R_{\text{BUF}} + R_{\text{sw}})^{2} C_{1}^{2} \left[ \frac{C_{1}}{(C_{1} + C_{3})^{2}} + \frac{1}{C_{3}} \right]. \tag{18}$$

Note, however, that this jitter "power" resides in a frequency range of  $-f_{\text{REF}}/2$  to  $+f_{\text{REF}}/2$ . We must therefore divide  $\phi_{\text{in},\text{PD,rms}}^2$  by  $f_{\text{REF}}$ , subject the spectrum to the PLL transfer function, and integrate the result.
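The procedure just described can be sketched numerically. The example below evaluates (16) for the capacitor values quoted earlier, refers the result to the input using the PD gain of 39.5 V/rad, and then estimates the in-band portion. The temperature of 300 K and the one-pole low-pass model of the PLL (whose two-sided noise bandwidth is $\pi f_0$, the same factor that appears in (7)) are assumptions of this sketch.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant (J/K)


def pd_output_noise_v2(c_master, c_slave, temperature=300.0):
    """Eq. (16): differential kT/C noise power at the DSPD output (V^2)."""
    kt = K_B * temperature
    return 2.0 * (kt * c_master / (c_master + c_slave) ** 2 + kt / c_slave)


def sampled_jitter(c_master, c_slave, k_pd, f_ref, temperature=300.0):
    """Input-referred rms jitter (s) spread over -f_REF/2 to +f_REF/2."""
    phi_rms = math.sqrt(pd_output_noise_v2(c_master, c_slave, temperature)) / k_pd
    return phi_rms / (2.0 * math.pi * f_ref)


def in_band_jitter(sigma_sampled, f_ref, f0):
    """Assumption: one-pole PLL response, two-sided noise bandwidth pi * f0."""
    return sigma_sampled * math.sqrt(math.pi * f0 / f_ref)


F_REF = 250e6
sigma_total = sampled_jitter(100e-15, 40e-15, k_pd=39.5, f_ref=F_REF)
sigma_pll = in_band_jitter(sigma_total, F_REF, f0=10e6)

print(f"jitter over the full sampled band = {sigma_total * 1e15:.2f} fs")
print(f"jitter after the PLL response     = {sigma_pll * 1e15:.2f} fs")
```

Under these simplifications the estimate lands in the range of a few femtoseconds, the same order as the simulated DSPD jitter reported in Section III-A; exact agreement is not expected from such a coarse model.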

## D. Effect of Duty Cycle Error

The PD operation described in Section III-A tacitly assumes a duty cycle of 50% for the reference. Crystal oscillators, on the other hand, can suffer from some duty cycle error (DCE). We wish to determine how DCE affects the performance.

Consider the RBUF waveforms shown in Fig. 12(a), where the solid plot represents a duty cycle of 50% and the dashed plot a greater value. We observe two phenomena. First, samples A and B assume a higher common-mode level as the duty cycle increases. That is, for a sufficiently large DCE, the CM level approaches $V_{DD}$ or zero, an issue resolved by designing the Gm stage in Fig. 5 so as to accommodate rail-to-rail inputs. Second, either A' or B' in Fig. 12 can land near $V_{DD}$, carrying little phase information and converting the circuit into a single-sampling PD. To avoid this difficulty, the input duty cycle can be adjusted such that the CM level of $V_3$ and $V_4$ in Fig. 7(a) remains near $V_{\text{DD}}/2$.

Fig. 14. Frequency-doubler circuit followed by a single-sampling PD.

## E. Duty Cycle Detection and Correction

The task of duty cycle correction (DCC) has been widely studied [8], [28], achieving errors less than 0.004% [8]. An important advantage of the proposed DSPD is the simplicity that it affords for duty cycle *detection*. As explained above, the optimum duty cycle ensures that the CM level of $V_3$ and $V_4$ in Fig. 12, i.e., $(V_3 + V_4)/2$, is around $V_{DD}/2$. Thus, $(V_3 + V_4)/2 - V_{DD}/2$ serves as a measure of the DCE.

Fig. 13 shows the DCC loop. On-chip unity-gain buffers sense  $V_3$  and  $V_4$ , and resistors  $R_1$  and  $R_2$  provide their CM level at node N. For test and characterization flexibility, an off-chip op-amp compares the result with  $V_{DD}/2$  and adjusts the bias input of the RBUF. An external input port allows a perturbation to be applied to the loop so that its response can be studied (see Section IV).

## F. Double-Sampling Versus Frequency Doubling

Instead of the proposed PD, one can consider a frequency doubler along with a single-sampling topology (see Fig. 14). This approach, however, suffers from three drawbacks. First, the supply noise incurred by the XOR gate is *not* removed, a point of contrast to how double-sampling removes the RBUF noise (see Section III-B). Thus, the estimates in Section II-B apply here. The XOR exhibits less supply sensitivity due to the sharper transitions that it receives. According to simulations, the supply noise of the XOR gate should be less than 4 nV/(Hz)<sup>1/2</sup> for negligible contribution to the overall jitter. Second, the single-ended output sensed by the Gm stage in Fig. 14 makes the circuit sensitive to common-mode noise, whereas the PD of Fig. 7(a) is mostly free from this issue. Third, as explained in Section III-E, double-sampling greatly eases the detection of the DCE.

## G. VCO and $\div 2$ Stage

As shown in Fig. 15(a), the VCO uses a complementary LC topology with inductive tail resonance at the second harmonic.<sup>6</sup> Due to the lack of ultra-thick metal layers, the 93-pH inductor is realized as two metal-8 and metal-9 octagons in parallel. Fig. 15(b) shows plots of the simulated and measured free-running phase noise of the VCO. The discrepancy at low offsets is attributed to inaccuracies in the flicker noise model of the transistors. Nonetheless, this discrepancy leads to an error of less than 2 fs for the jitter of the overall PLL. The phase noise simulation is carried out by Cadence's "pss" and "pnoise" tools and is believed to be accurate. Thus, the discrepancy is more likely due to the device models than to simulation.

Fig. 15. (a) VCO implementation and (b) its simulated and measured phase noise.

The  $\div 2$  stage following the VCO in Fig. 5 is realized using complementary CMOS (C<sup>2</sup>MOS) logic and shown in Fig. 16(a). Drawing 1.4 mW, the circuit exhibits the simulated output phase noise plotted in Fig. 16(b), which translates to a jitter of about 2.6 fs.

## H. Multimodulus Divider

Multimodulus dividers generally produce a great deal of phase noise because of the large number of asynchronous stages that they incorporate. It is possible to insert at the divider output a retiming flip-flop (FF) driven by the VCO so as to remove the divider's phase noise [29]. This method, however, is prone to failure with process, supply voltage, and temperature (PVT) variations.

<sup>6</sup>The resonance occurs with tail parasitics and is not tuned here. According to simulations, a 5% variability in the parasitics leads to a 0.9-dB rise in the VCO free-running phase noise at 1-MHz offset.

Fig. 16. (a) $C^2MOS \div 2$ circuit and (b) its simulated phase noise at an input frequency of 20 GHz.

To elaborate on this point, we begin with the "modular" divider shown in Fig. 17(a) [30], where $L_j$ denotes a latch. For ease of illustration, we draw a four-stage example as shown in Fig. 17(b), follow it with a $\div 2$ circuit (necessary for our PLL), and retime its output by means of FF<sub>0</sub>. We denote the delay of dual-modulus stage j by $\Delta t_j$. Constructing the circuit's waveforms as in Fig. 17(c), we observe that FF<sub>0</sub> avoids metastability if the total delay from CK<sub>in</sub> to CK<sub>5</sub> does not exceed one period of $CK_{in}$. More specifically, this path introduces the CK-to-Q delay of four $\div 2/3$ cells and one $\div 2$ stage. To this total, we must add the setup time of $FF_0$, arriving at the following bound:

$$\Delta t_1 + \Delta t_2 + \dots + \Delta t_5 + t_{\text{setup}, \text{FF}_0} < 100 \text{ ps.} \tag{19}$$

Otherwise, the falling edges of  $CK_{in}$  and  $CK_5$  can coincide and make  $FF_0$  metastable, a condition that prohibits the system from locking.

Unfortunately, the condition expressed by (19) is difficult to meet even in the typical-typical corner of the process. The simulations of the extracted layout suggest a total delay of about 110 ps in this corner. To alleviate this issue, we recognize that $CK_1$ in Fig. 17(b) is also available as a retiming command. We then interpose between the $\div 2$ stage and $FF_0$ another FF and drive it by $CK_1$ [see Fig. 17(d)]. Here, $FF_1$ avoids metastability if the total delay from $CK_1$ to $CK_5$ is less than one period of $CK_1$:

$$\Delta t_2 + \dots + \Delta t_5 + t_{\text{setup}, \text{FF}_1} < 200 \text{ ps.} \tag{20}$$

For  $FF_0$ , on the other hand, the delay from  $CK_{in}$  to  $CK_1$  to  $CK_6$  plus the setup time of  $FF_0$  must remain less than 100 ps

$$\Delta t_1 + \Delta t_{\text{FF}_1} + t_{\text{setup},\text{FF}_0} < 100 \,\text{ps.} \tag{21}$$

Of the two conditions prescribed by (20) and (21), the former proves more stringent as the extracted layout in the slow–slow high-temperature corner yields a value of 120 ps for its lefthand side. To improve the robustness of the circuit, we add one more FF as shown in Fig. 18(a) obtaining

$$CK_{in} \downarrow L_{2} \downarrow L_$$



Fig. 17. (a) Modular divider, (b) multimodulus divider with one FF as retimer, (c) timing diagram, and (d) multimodulus divider with two FFs as retimers.

(d)

The proposed divider in Fig. 18(a) merits two remarks. First, the output,  $\phi_1$ , carries only the phase noise of CK<sub>in</sub> and FF<sub>0</sub>. Second, this method guarantees that the excess delay around the critical loop is no more than the delay of one divider cell and one FF.

Plotted in Fig. 18(b) are the divider output phase noise profiles before and after retiming FFs are added, suggesting a 16-dB reduction.<sup>7</sup> The integrated jitter falls from 19 to 3 fs.<sup>8</sup> Drawing 1.8 mW at 10 GHz (mostly in the input clock buffer), the circuit provides a divide ratio from 32 to 62.
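The two figures quoted here can be cross-checked against each other. If the 16-dB reduction is assumed to apply uniformly across the integration band (an idealization, since the actual profiles need not shift by a constant), the rms jitter scales by the same factor in amplitude terms. The function names below are illustrative.

```python
import math


def jitter_after_reduction(jitter_fs, reduction_db):
    """Scale an rms jitter by a uniform phase-noise reduction expressed in dB."""
    return jitter_fs / (10 ** (reduction_db / 20))


def equivalent_reduction_db(jitter_before_fs, jitter_after_fs):
    """Uniform phase-noise reduction (dB) equivalent to a given jitter ratio."""
    return 20 * math.log10(jitter_before_fs / jitter_after_fs)


print(f"19 fs lowered by 16 dB: {jitter_after_reduction(19.0, 16.0):.2f} fs")
print(f"19 fs -> 3 fs corresponds to {equivalent_reduction_db(19.0, 3.0):.2f} dB")
```

Both directions agree closely with the stated values, which suggests that the 16-dB figure and the 19-to-3-fs jitter drop describe the same improvement.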

The multimodulus divider blocks are realized by TSPC and CMOS circuits. Specifically, the first two  $\div 2/3$  stages, FF<sub>0</sub> and FF<sub>1</sub>, use the former type and the slower blocks the latter.

For a divide ratio of 20 GHz/250 MHz = 80, we could replace the cascade of  $\div 2/3$  stages with one  $\div 4$  block and one  $\div 5$  block. The staggered retiming method introduced in this article can also be applied to such a chain so as to eliminate the divider jitter. Nonetheless, the feedback divider

$$\begin{aligned} \Delta t_3 + \Delta t_4 + \Delta t_5 + t_{\text{setup},\text{FF}_2} &< 400 \text{ ps} \\ \Delta t_2 + \Delta t_{\text{FF}_2} + t_{\text{setup},\text{FF}_1} &< 200 \text{ ps} \\ \Delta t_1 + \Delta t_{\text{FF}_1} + t_{\text{setup},\text{FF}_0} &< 100 \text{ ps}. \end{aligned}$$
(22)

<sup>7</sup>The flicker noise in the phase noise spectrum is dominated by $FF_0$.

<sup>8</sup>Simulations confirm our intuition that the phase noise is the same as in the case of using a single retiming FF.



Fig. 18. (a) Proposed multimodulus divider with three FFs as retimers and (b) its simulated phase noise spectrum.



Fig. 19. (a) Nonoverlapping clock generator and (b) nonoverlapping clock waveform.

would need to be redesigned completely if the PLL must target a different output frequency, e.g., 18 or 22 GHz. In this respect, the multimodulus topology offers greater flexibility with negligible power penalty.
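The flexibility argument can be illustrated by computing the divide ratios that other output frequencies would require. The sketch below assumes the arrangement described in this article, namely a fixed $\div 2$ stage after the VCO followed by the multimodulus divider with its quoted 32-to-62 range; whether every ratio inside that range is programmable is not verified here, and the function name is hypothetical.

```python
F_REF_HZ = 250_000_000     # reference frequency
PRESCALER = 2              # fixed ÷2 stage between the VCO and the multimodulus divider
MMD_MIN, MMD_MAX = 32, 62  # divide-ratio range quoted for the multimodulus divider


def divider_settings(f_out_hz):
    """Return (total ratio, multimodulus ratio, in_range) for an integer-N lock."""
    total, rem = divmod(f_out_hz, F_REF_HZ)
    if rem or total % PRESCALER:
        raise ValueError("output frequency is not reachable with an integer ratio")
    mmd = total // PRESCALER
    return total, mmd, MMD_MIN <= mmd <= MMD_MAX


for f_ghz in (18, 20, 22):
    total, mmd, ok = divider_settings(f_ghz * 1_000_000_000)
    print(f"{f_ghz} GHz: N = {total}, multimodulus ratio = {mmd}, in range: {ok}")
```

All three cases fall inside the quoted range, whereas a fixed $\div 4$/$\div 5$ cascade would serve only the ratio of 80.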

## I. Nonoverlapping Clock Generator

To minimize the ripple on the control voltage, the PD of Fig. 7(a) must avoid transparency between the master and slave



Fig. 20. Die photograph.



Fig. 21. Measured phase noise of the 250-MHz crystal oscillator.

samplers, requiring nonoverlapping clock phases. The challenge here is that conventional topologies, such as those based on cross-coupled gates, generate significant jitter. We must therefore avoid passing  $\phi_1$  through additional stages and yet generate  $\phi_2$ . This is accomplished as shown in Fig. 19(a), where latches  $L_1-L_3$  and delay stage  $\Delta T$  produce a signal  $\phi_0$  at 500 MHz, with a delay of  $\Delta T$  with respect to  $\phi_1$ . From the  $\phi_2$  and  $\overline{\phi_2}$  waveforms shown in Fig. 19(b), we observe a nonoverlap time of  $\Delta T$ , about 50 ps in this work.<sup>9</sup> We should note that  $\phi_0$  and  $\phi_2$  inherit the phase noise of the delay stage, but the master samplers in Fig. 7(a) rely on only  $\phi_1$  and  $\overline{\phi_1}$ . Since  $\phi_2$  and  $\overline{\phi_2}$  only transfer charge to the slave capacitors, their phase noise is not critical.

### IV. EXPERIMENTAL RESULTS

The proposed PLL has been fabricated in the 28-nm CMOS technology. Fig. 20 shows a photograph of the die, where the active area measures approximately $320\ \mu\text{m} \times 310\ \mu\text{m}$. The prototype consumes 12 mW: 7.2 mW in the VCO, 1.4 mW in the ÷2 stage, 1.8 mW in the multimodulus divider, and 1.3 mW in the RBUF.<sup>10</sup> The power supply voltage of RBUF is 1.2 V and the rest of the PLL is supplied at 1 V. The loop is locked with a divide ratio of 80 and an output frequency of 20 GHz. The VCO has a gain of 120 MHz/V and a total tuning range of 450 MHz, allowing synthesis of only 20 GHz

<sup>9</sup>We choose N = 32 as a simple example.

<sup>10</sup>The dc current from the RBUF supply is 1.08 mA.



Fig. 22. Measured PLL output spectrum.

with a 250-MHz reference.<sup>11</sup> This range somewhat relaxes the oscillator power-jitter trade-off and should be borne in mind in comparison to the prior art (see below). Measurements on ten chips reveal a frequency standard deviation of 140 MHz, which is well contained within the tuning range. Nonetheless, if a wider range is desirable to accommodate greater PVT variations, one can multiplex two VCOs [31] with no power penalty. The PD can be configured to operate as a single-sampling or a double-sampling circuit.
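The power and tuning-range figures quoted in this section can be reconciled with a few lines of arithmetic. The sketch below uses only numbers stated in the text and its footnotes; the variable names are illustrative.

```python
# Power breakdown quoted for the prototype, in mW.
power_mw = {
    "VCO": 7.2,
    "div-by-2 stage": 1.4,
    "multimodulus divider": 1.8,
    "RBUF": 1.3,
}
total_mw = sum(power_mw.values())

# RBUF power from its dc current (1.08 mA) and its 1.2-V supply.
rbuf_mw = 1.08 * 1.2

# Tuning range from the quoted VCO frequency limits, in MHz.
tuning_range_mhz = (20.17 - 19.72) * 1e3

print(f"sum of blocks: {total_mw:.1f} mW")
print(f"RBUF from I*V: {rbuf_mw:.2f} mW")
print(f"tuning range:  {tuning_range_mhz:.0f} MHz")
```

The block powers add up to slightly under the quoted 12 mW, the RBUF figure matches its current-voltage product after rounding, and the frequency limits reproduce the 450-MHz range.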

The 250-MHz reference frequency is provided by Crystek's CRBSCS-01-250 crystal oscillator. Its phase noise is plotted in Fig. 21, exhibiting a value of -171.5 dBc/Hz at 1-MHz offset.

For ease of measurement, the output of the  $\div 2$  circuit, Div<sub>a</sub>, in Fig. 5 is used for characterization. Fig. 22 shows the measured spectrum, indicating a reference spur level of -72 dBc, which translates to -66 dBc at the VCO output.
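The translation between the divider output and the VCO output follows the usual rule that an ideal frequency division by $N$ lowers phase-modulation sidebands by $20\log_{10}N$ dB. The sketch below applies this rule, assuming small (narrowband) phase modulation and a noiseless divider; the function name is illustrative.

```python
import math


def refer_to_input(level_db, divide_ratio):
    """Translate a spur or phase-noise level at a divider output to its input."""
    return level_db + 20 * math.log10(divide_ratio)


# Reference spur measured after the on-chip divide-by-2 stage.
spur_vco_dbc = refer_to_input(-72.0, 2)

# A phase-noise level measured after a total division of 4
# (on-chip divide-by-2 followed by an off-chip divide-by-2).
plateau_vco_dbc_hz = refer_to_input(-137.0, 4)

print(f"spur at VCO output:    {spur_vco_dbc:.1f} dBc")
print(f"plateau at VCO output: {plateau_vco_dbc_hz:.1f} dBc/Hz")
```

The same function accounts for both the 6-dB spur correction used here and the 12-dB correction that applies when the signal is divided by 4 for phase noise measurements.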

Fig. 23(a) shows plots of the measured phase noise at the output of the $\div 2$ circuit for single- and double-sampling. Due to our phase noise analyzer limitations, the $\div 2$ output is applied to an off-chip $\div 2$ circuit for phase noise measurements. The profile exhibits a plateau of about -137 dBc/Hz up to 10-MHz offset and falls to -152 dBc/Hz at 40-MHz offset; the phase noise at the VCO output is 12 dB higher. We observe that double-sampling lowers the profile by 2 dB from 100 kHz to 1 MHz and 1.5 dB from 1 to 3 MHz.<sup>12</sup> Since the VCO contribution remains the same,<sup>13</sup> the overall phase noise declines by less than 3 dB. The free-running VCO flicker noise corner is around 800 kHz, contributing negligible jitter after the loop is closed. As mentioned above, a PLL BW of 10 MHz is selected to minimize the sum of reference and VCO contributions. With this choice, the flicker noise component of the latter amounts to 6 fs<sub>rms</sub>. A BW of, e.g., 5 MHz would raise this to 10 fs<sub>rms</sub>. The phase noise analyzer used here, the Agilent E5052A, exhibits spurs in the spectrum that do not exist in the actual signal. As the



Fig. 23. Measured PLL phase noise (a) with the option "omit" and (b) with the option "spur."

measured spectrum shown in Fig. 22 indicates, our PLL output is free from spurs below an offset equal to  $f_{\text{REF}}$ . Nonetheless, the phase noise spectrum measured by Agilent E5052A and depicted in Fig. 23(b) suffers from spurs at 10 kHz, 20 kHz, 30 kHz, 60 kHz, 120 kHz, and 22 MHz. We should remark that some prior art includes the actual spurs, e.g., fractional spurs, in their jitter values. For example, the work in [32] includes fractional spurs in the jitter. Since our measurements prove that our PLL does not exhibit spurs, we enable "omit" on the equipment so that they are excluded. This practice has also been adopted by [2], [3], and [6].

The jitter integrated from 10 kHz to 40 MHz is equal to 20.86 fs. Equipment limitations do not allow measurement of the PLL phase noise above 100 MHz. With a phase noise of -152 dBc/Hz at 40-MHz offset, the worst-case additional jitter from 40 to 100 MHz amounts to 8.8 fs<sub>rms</sub>. According to simulations, the crystal oscillator contributes 10 fs, the RBUF 6.2 fs, and the VCO 15 fs.
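The worst-case estimate can be reproduced by assuming that the phase noise stays flat at its 40-MHz value all the way to 100 MHz, integrating both sidebands, and converting the resulting phase deviation to time. The sketch below makes that assumption explicit and evaluates the result at the 5-GHz carrier seen by the analyzer (the VCO output divided by 4); jitter expressed in seconds is unchanged by ideal division. It also forms the root-sum-square of the simulated contributions, assuming that they are uncorrelated. Function names are illustrative.

```python
import math


def jitter_from_flat_noise(l_dbc_hz, f_lo_hz, f_hi_hz, f_carrier_hz):
    """RMS jitter (s) from a flat SSB phase-noise level between two offsets."""
    # Factor of 2 accounts for both sidebands; result is in rad^2.
    phase_var = 2 * 10 ** (l_dbc_hz / 10) * (f_hi_hz - f_lo_hz)
    return math.sqrt(phase_var) / (2 * math.pi * f_carrier_hz)


def rss(*terms):
    """Root-sum-square of uncorrelated contributions."""
    return math.sqrt(sum(t * t for t in terms))


# ASSUMPTION: flat -152 dBc/Hz from 40 to 100 MHz, referred to the 5-GHz carrier.
extra_fs = jitter_from_flat_noise(-152.0, 40e6, 100e6, 5e9) * 1e15

# Simulated contributions: crystal oscillator, RBUF, and VCO (fs).
simulated_total_fs = rss(10.0, 6.2, 15.0)

# Measured 10-kHz-to-40-MHz jitter combined with the worst-case extension.
extended_total_fs = rss(20.86, extra_fs)

print(f"worst-case 40-100 MHz jitter: {extra_fs:.2f} fs")
print(f"RSS of simulated sources:     {simulated_total_fs:.2f} fs")
print(f"10 kHz-100 MHz upper bound:   {extended_total_fs:.2f} fs")
```

The flat-floor estimate lands close to the quoted 8.8 fs, and the simulated sources combine to roughly 19 fs, somewhat below the measured 20.86 fs.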

As explained in Section III-A, the RBUF supply rejection becomes critical unless the LDO feeding it provides an extremely low output noise voltage. With double-sampling, on the other hand, this issue is greatly relaxed. This point is verified as follows. The supply voltage of the buffer is

<sup>11</sup>The VCO frequency ranges from 19.72 to 20.17 GHz.

<sup>12</sup>At frequency offsets close to and higher than the PLL BW, the VCO dominates the phase noise, making the 3-dB improvement afforded by the proposed PD less pronounced.

<sup>13</sup>The value of the Gm following the PD is adjusted for single- and double-sampling so as to keep the loop bandwidth constant.



Fig. 24. Measured spur level due to RBUF supply disturbance.

modulated by a sinusoid having a peak amplitude of 140 mV and a variable frequency. The corresponding spurs at the PLL output are then studied for single- and double-sampling. Fig. 24 shows plots of the measured spur levels as a function of the sinusoid's frequency, revealing an improvement of at least 20 dB.

The duty cycle and its correction circuit have been characterized by several tests. Since direct, accurate measurement of the duty cycle is difficult, we proceed as follows. We wish to plot the measured spur levels and total jitter as a function of the DCE (before the correction loop is closed). We can vary the duty cycle by adjusting the input bias voltage of RBUF but we cannot measure the exact DCE. We therefore use simulations to create a "lookup table" relating the DCE to this voltage. Shown in Fig. 25 are the measured results, revealing that the DCE should be maintained below roughly 0.1%. As explained in Section III-E, this is readily feasible in the proposed PD by simply monitoring the CM level of  $V_3$  and  $V_4$  and stabilizing it around  $V_{DD}/2$ . We should remark that this level can incur an error of several tens of millivolts and yet negligibly affect the output jitter. We observe from Fig. 25(a) that the measured reference spurs generally fall as the DCE approaches  $\pm 0.6\%$ . This effect is attributed to the fact that the PD gain drops at these extremes, lowering the loop bandwidth. This point also explains why the jitter rises as the DCE increases. Next, we enable the correction loop and apply an external step as illustrated in Fig. 13. Shown in Fig. 26, the transient response of V<sub>PD,CM</sub> reveals that this voltage jumps by 250 mV but returns to 550 mV ( $\approx V_{DD}/2$ ).

To study the robustness of the PLL, we apply to the VCO supply voltage an external square wave having a peak-to-peak amplitude of 300 mV. The Agilent E5052A signal analyzer captures the frequency transient.<sup>14</sup> Plotted in Fig. 27 is the result, indicating that the loop relocks with such large supply noise.<sup>15</sup>

Table I presents the measured performance of our prototype and compares it with that of other PLLs that have achieved sub-60-fs jitter values. The jitter is reduced by more than a factor of 2, and the FoM is improved by 4.1 dB.



Fig. 25. (a) Measured spur levels and (b) PLL output jitter as a function of the reference DCE.



Fig. 26. Measured transient response of V<sub>PD,CM</sub>.

As explained in Section II, the reference phase noise and frequency play a significant role in the performance of PLLs. For this reason, the crystal oscillator power consumption also becomes problematic. According to our measurements,

<sup>14</sup>Due to this equipment's limitations, we precede it with an external $\div 2$ stage.

<sup>15</sup>The ripple in the relocking process is caused by the nonlinear behavior of the DSPD with the input phase error greater than the linear region.



Fig. 27. Measured PLL frequency transient response.

TABLE I

Performance Summary and Comparison to Prior Art

|                              | Zhang<br>ISSCC 2019 | Gong<br>RFIC 2020   | Mercandelli<br>ISSCC 2020 | Turker<br>ISSCC 2018  | This Work           |
|------------------------------|---------------------|---------------------|---------------------------|-----------------------|---------------------|
| Architecture                 | Sub-sampling PLL    | Charge-sampling PLL | Single-sampling PLL       | Charge-pump-based PLL | Double-sampling PLL |
| Ref. Freq. (MHz)             | 200                 | 100                 | 500                       | 500                   | 250                 |
| Freq. Range (GHz)            | 12 ~ 16             | 9.8 ~ 12.2          | 11.9 ~ 14.1               | 7.4 ~ 14              | 20                  |
| RMS Jitter (fs)              | 56.4                | 50.5                | 51.7<sup>2</sup>          | 53.6                  | 20.9                |
| Integ. Range (MHz)           | 0.001 ~ 100         | 0.001 ~ 100         | 0.001 ~ 100               | 0.01 ~ 10             | 0.01 ~ 40           |
| Ref. Spur (dBc)              | -64.6               | -65.7               | -73.5                     | -75.5                 | -66                 |
| Power (mW)                   | 7.2                 | 5                   | 18                        | 45                    | 12                  |
| Area (mm<sup>2</sup>)        | 0.234               | 0.13                | 0.16                      | 0.45                  | 0.06                |
| Tech. (nm)                   | 40                  | 40                  | 28                        | 16                    | 28                  |
| FoM<sup>1</sup> (dB)         | -256.4              | -258.9              | -253.2                    | -248.9                | -262.8              |
| Crystal Osc. Power (mW)      | N/A                 | 150<sup>3</sup>     | 175<sup>4</sup>           | N/A                   | 170                 |

1: FoM = $10\log_{10}\left[\left(\frac{\text{Jitter}}{1 \text{ s}}\right)^2\left(\frac{\text{Power}}{1 \text{ mW}}\right)\right]$

2: Integer-N jitter

3: From datasheet of Taitien VLCU-type series

4: From private communication with author and datasheet of Crystek CCSO-914X-500

Crystek's CRBSCS-01-250 draws about 170 mW. Shown in Table I are the crystal oscillator power consumptions.
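The FoM row of Table I follows directly from the jitter and power rows through the definition given in the table's first note. As a quick check, the sketch below recomputes each entry from the tabulated values; the dictionary layout and function name are illustrative.

```python
import math


def fom_db(jitter_fs, power_mw):
    """Jitter-power FoM: 10*log10[(jitter / 1 s)^2 * (power / 1 mW)]."""
    jitter_s = jitter_fs * 1e-15
    return 10 * math.log10(jitter_s ** 2 * power_mw)


# (rms jitter in fs, power in mW, FoM listed in Table I in dB)
TABLE_I = {
    "Zhang": (56.4, 7.2, -256.4),
    "Gong": (50.5, 5.0, -258.9),
    "Mercandelli": (51.7, 18.0, -253.2),
    "Turker": (53.6, 45.0, -248.9),
    "This work": (20.9, 12.0, -262.8),
}

for name, (jitter_fs, power_mw, listed_db) in TABLE_I.items():
    computed = fom_db(jitter_fs, power_mw)
    print(f"{name:12s} computed {computed:7.1f} dB, listed {listed_db:7.1f} dB")
```

Note that the crystal oscillator power is not included in this FoM, which is why Table I lists it separately.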

### V. CONCLUSION

As high-speed applications demand jitter values in the range of a few tens of femtoseconds, we face daunting challenges in PLL design. The crystal oscillator, the RBUF, and the VCO become the main contributors. This work introduces a new phase detector and a self-retimed frequency divider that ease the trade-offs in PLLs.

#### ACKNOWLEDGMENT

The authors gratefully acknowledge the TSMC University Shuttle Program for chip fabrication.

#### REFERENCES

- [1] A. M. A. Ali et al., "16.1 A 12b 18GS/s RF sampling ADC with an integrated wideband track-and-hold amplifier and background calibration," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2020, pp. 250–252.
- [2] J. Gong, F. Sebastiano, E. Charbon, and M. Babaie, "A 10-to-12 GHz 5 mW charge-sampling PLL achieving 50 fsec RMS jitter, -258.9 dB FOM and -65 dBc reference spur," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Aug. 2020, pp. 15–18.
- [3] M. Mercandelli et al., "17.5 A 12.5 GHz fractional-N type-I sampling PLL achieving 58fs integrated jitter," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2020, pp. 274–276.
- [4] D. Turker et al., "A 7.4-to-14 GHz PLL with 54fs<sub>rms</sub> jitter in 16 nm FinFET for integrated RF-data-converter SoCs," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2018, pp. 378–380.
- [5] Z. Zhang, G. Zhu, and C. P. Yue, "A 0.65V 12-to-16 GHz sub-sampling PLL with 56.4fs<sub>rms</sub> integrated jitter and -256.4dB FoM," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2019, pp. 488–490.

- [6] A. Santiccioli et al., "17.2 A 66fs<sub>rms</sub> Jitter 12.8-to-15.2 GHz fractional-N bang-bang PLL with digital frequency-error recovery for fast locking," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2020, pp. 268–270.
- [7] Z. Yang et al., "A 25.4-to-29.5 GHz 10.2 mW isolated sub-sampling PLL achieving -252.9dB jitter-power FoM and -63dBc reference spur," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2019, pp. 270-272.
- [8] W. Wu et al., "A 28-nm 75-fs<sub>rms</sub> analog fractional-N sampling PLL with a highly linear DTC incorporating background DTC gain calibration and reference clock duty cycle correction," *IEEE J. Solid-State Circuits*, vol. 54, no. 5, pp. 1254–1265, Mar. 2019.
- [9] E. Thaller et al., "32.6 A K-band 12.1-to-16.6 GHz subsampling ADPLL with 47.3fs<sub>rms</sub> jitter based on a stochastic flash TDC and coupled dual-core DCO in 16 nm FinFET CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2021, pp. 451–453.
- [10] Y. Hu et al., "A charge-sharing locking technique with a general phase noise theory of injection locking," *IEEE J. Solid-State Circuits*, vol. 57, no. 2, pp. 518–534, Feb. 2022.
- [11] J. Kim et al., "A 76fs<sub>rms</sub> jitter and -40dBc integrated-phase-noise 28-to-31 GHz frequency synthesizer based on digital sub-sampling PLL using optimally spaced voltage comparators and background loop-gain optimization," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Jun. 2019, pp. 258–260.
- [12] Y. Lim et al., "17.8 A 170 MHz-lock-in-range and -253dB-FoM<sub>jitter</sub> 12-to-14.5 GHz subsampling PLL with a 150μW frequency-disturbance-correcting loop using a low-power unevenly spaced edge generator," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2020, pp. 280–282.
- [13] X. Gao, E. Klumperink, G. Socci, M. Bohsali, and B. Nauta, "A 2.2 GHz sub-sampling PLL with 0.16ps<sub>rms</sub> jitter and -125 dBc/Hz in-band phase noise at 700μW loop-components power," in *Proc. Symp. VLSI Circuits Dig. Tech. Papers*, 2010, pp. 139–140.
- [14] K. Raczkowski, N. Markulic, B. Hershberg, and J. Craninckx, "A 9.2–12.7 GHz wideband fractional-N subsampling PLL in 28 nm CMOS with 280 fs RMS jitter," *IEEE J. Solid-State Circuits*, vol. 50, no. 5, pp. 1203–1213, May 2015.
- [15] A. Sharkia, S. Mirabbasi, and S. Shekhar, "A type-I sub-sampling PLL with a  $100 \times 100 \ \mu m^2$  footprint and -255-dB FOM," *IEEE J. Solid-State Circuits*, vol. 53, no. 12, pp. 3553–3564, Jul. 2018.
- [16] D.-G. Lee and P. P. Mercier, "A sub-mW 2.4-GHz active-mixer-adopted sub-sampling PLL achieving an FoM of -256 dB," *IEEE J. Solid-State Circuits*, vol. 55, no. 6, pp. 1542–1552, Jun. 2020.
- [17] X. Gao, E. A. M. Klumperink, M. Bohsali, and B. Nauta, "A low noise sub-sampling PLL in which divider noise is eliminated and PD/CP noise is not multiplied by N<sup>2</sup>," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3253–3263, Dec. 2009.
- [18] S. Ye, L. Jansson, and I. Galton, "A multiple-crystal interface PLL with VCO realignment to reduce phase noise," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1795–1803, Dec. 2002.
- [19] A. Elkholy, M. Talegaonkar, T. Anand, and P. K. Hanumolu, "Design and analysis of low-power high-frequency robust sub-harmonic injection-locked clock multipliers," *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 3160–3174, Dec. 2015.
- [20] A. Elkholy, D. Coombs, R. K. Nandwana, A. Elmallah, and P. K. Hanumolu, "A 2.5–5.75-GHz ring-based injection-locked clock multiplier with background-calibrated reference frequency doubler," *IEEE J. Solid-State Circuits*, vol. 54, no. 7, pp. 2049–2058, Jul. 2019.
- [21] M.-S. Choo et al., "A PVT variation-robust all-digital injection-locked clock multiplier with real-time offset tracking using time-division dual calibration," *IEEE J. Solid-State Circuits*, vol. 56, no. 8, pp. 2525–2538, Aug. 2021.
- [22] Y. Zhao and B. Razavi, "A 19-GHz PLL with 20.3-fs jitter," in Proc. Symp. VLSI Circuits, Jun. 2021, pp. 1–2.
- [23] A. Homayoun and B. Razavi, "Analysis of phase noise in phase/frequency detectors," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 60, no. 3, pp. 529–539, Mar. 2013.
- [24] L. Kong and B. Razavi, "A 2.4 GHz 4 mW integer-N inductorless RF synthesizer," *IEEE J. Solid-State Circuits*, vol. 51, no. 3, pp. 626–635, Mar. 2016.
- [25] L. Kong, Y. Chang, and B. Razavi, "An inductorless 20-Gb/s CDR with high jitter tolerance," *IEEE J. Solid-State Circuits*, vol. 54, no. 10, pp. 2857–2866, Oct. 2019.
- [26] J. Sharma and H. Krishnaswamy, "A 2.4-GHz reference-sampling phase-locked loop that simultaneously achieves low-noise and low-spur performance," *IEEE J. Solid-State Circuits*, vol. 54, no. 5, pp. 1407–1424, May 2019.

- [27] J. Du et al., "A 24–31 GHz reference oversampling ADPLL achieving FoM<sub>jitter-N</sub> of -269.3 dB," in *Proc. Symp. VLSI Circuits Dig. Tech. Papers*, 2021, pp. 1–2.
- [28] X. Gao et al., "A 28 nm CMOS digital fractional-N PLL with -245.5dB FOM and a frequency tripler for 802.11ABGN/Ac radio," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2015, pp. 1–3.
- [29] L. Romano et al., "Low jitter design of a 0.35 μm-CMOS frequency divider operating up to 3 GHz," in *Proc. 28th Eur. Solid-State Circuits Conf.*, 2002, pp. 611–614.
- [30] C. S. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli, and Z. Wang, "A family of low-power truly modular programmable dividers in standard 0.35-µm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 35, no. 7, pp. 1039–1045, Jul. 2000.
- [31] A. Kral, F. Behbahani, and A. A. Abidi, "RF-CMOS oscillators with switched tuning," in *Proc. IEEE Custom Integr. Circuits Conf.*, May 1998, pp. 555–558.
- [32] S. M. Dartizio et al., "A 68.6fs<sub>rms</sub>-total-integrated-jitter and 1.56µs-locking-time fractional-N bang-bang PLL based on type-II gear shifting and adaptive frequency switching," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 65, Dec. 2022, pp. 1–3.



Yu Zhao (Member, IEEE) received the B.S. degree from Shanghai Jiao Tong University, Shanghai, China, in 2013, and the M.S. and Ph.D. degrees in electrical engineering from the University of California at Los Angeles, Los Angeles, CA, USA, in 2015 and 2022, respectively.

He was with Ubilinx Technology, Inc., San Jose, CA, USA, from 2015 to 2018, working on frequency synthesizers for Bluetooth and WiFi 6. He is currently with the RFIC Design Team, HiSilicon, Shanghai. His research interests include low-jitter frequency synthesizers for wireless and wireline transceivers.



Mahdi Forghani (Graduate Student Member, IEEE) received the B.Sc. degree in electrical engineering from the Sharif University of Technology, Tehran, Iran, in 2015, and the M.Sc. degree in electrical and computer engineering from Rice University, Houston, TX, USA, in 2017. He is currently pursuing the Ph.D. degree with the University of California at Los Angeles (UCLA), Los Angeles, CA, USA.

He was an RFIC Design Intern with Apple Inc., Cupertino, CA, USA, in 2020. His research interests include analog, RF, and millimeter-wave integrated circuit design for high-speed wireless and wireline transceivers.

Mr. Forghani was a recipient of the Texas Instruments Distinguished Student Fellowship in 2015.



**Behzad Razavi** (Fellow, IEEE) received the B.S. degree from the Sharif University of Technology, Tehran, Iran, in 1985, and the M.S. and Ph.D. degrees from Stanford University, Stanford, CA, USA, in 1988 and 1992, respectively, all in electrical engineering.

He was an Adjunct Professor with Princeton University, Princeton, NJ, USA, from 1992 to 1994, and with Stanford University in 1995. He was with AT&T Bell Laboratories and Hewlett-Packard Laboratories until 1996. Since 1996, he has been an Associate Professor and subsequently a Professor of electrical engineering with the University of California at Los Angeles, Los Angeles, CA, USA. He has authored *Principles of Data Conversion System Design* (IEEE Press, 1995), *RF Microelectronics* (Prentice Hall, 1998, 2012) (translated to Chinese, Japanese, and Korean), *Design of Analog CMOS Integrated Circuits* (McGraw-Hill, 2001, 2016) (translated to Chinese, Japanese, and Korean), *Design of Integrated Circuits for Optical Communications* (McGraw-Hill, 2003, Wiley, 2012), *Design of CMOS Phase-Locked Loops* (Cambridge University Press, 2020), and *Fundamentals of Microelectronics* (Wiley, 2006, 2014, 2021) (translated to Korean, Portuguese, and Turkish), and is an Editor of *Monolithic Phase-Locked Loops and Clock Recovery Circuits* (IEEE Press, 1996) and *Phase-Locking in High-Performance Systems* (IEEE Press, 2003). His current research interests include wireless and wireline transceivers and data converters.

Dr. Razavi is a member of the U.S. National Academy of Engineering and a fellow of the U.S. National Academy of Inventors. He received the Beatrice Winner Award for Editorial Excellence at the 1994 International Solid-State Circuits Conference (ISSCC), the Best Paper Award at the 1994 European Solid-State Circuits Conference, the Best Panel Award at the 1995 and 1997 ISSCC, the TRW Innovative Teaching Award in 1997, the Best Paper Award at the IEEE Custom Integrated Circuits Conference in 1998, and the McGraw-Hill First Edition of the Year Award in 2001. He was a co-recipient of both the Jack Kilby Outstanding Student Paper Award and the Beatrice Winner Award for Editorial Excellence at the 2001 ISSCC. He received the Lockheed Martin Excellence in Teaching Award in 2006, the UCLA Faculty Senate Teaching Award in 2007, and the CICC Best Invited Paper Award in 2009 and 2012. He was a co-recipient of the 2012 and 2015 VLSI Circuits Symposium Best Student Paper Awards and the 2013 CICC Best Paper Award. He was also recognized as one of the top ten authors in the 50-year history of ISSCC. He received the 2012 Donald Pederson Award in Solid-State Circuits. He was also a recipient of the American Society for Engineering Education PSW Teaching Award in 2014 and the 2017 IEEE CAS John Choma Education Award. He served on the Technical Program Committees of the ISSCC from 1993 to 2002 and the VLSI Circuits Symposium from 1998 to 2002. He has also served as a Guest Editor and an Associate Editor for the IEEE JOURNAL OF SOLID-STATE CIRCUITS, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, and International Journal of High Speed Electronics. He served as the Founding Editor-in-Chief for the IEEE SOLID-STATE CIRCUITS LETTERS. He has served as an IEEE Distinguished Lecturer.