## ISSCC 2022 / SESSION 17 / ADVANCED WIRELINE LINKS AND TECHNIQUES / 17.4

### 17.4 A 56GHz 23mW Fractional-N PLL with 110fs Jitter

Yu Zhao, Onur Memioglu, Behzad Razavi

University of California, Los Angeles, CA

PAM-4 wireline transmitters operating at 224Gb/s can employ a 56GHz PLL for multiplexing. Such an environment poses several constraints on the design. First, the PLL rms jitter must be no more than a few percent of the symbol period, 8.93ps, dictating values around 100fs<sub>rms</sub>. Second, the PLL should preferably provide fractional-N operation so as to accommodate different crystal frequencies. Third, in a multi-lane system, it is desirable to avoid distributing a 56GHz clock over long interconnects, hence the need for a lower-power, compact PLL that can be used within each lane. Prior fractional-N designs in this frequency range have achieved rms jitters ranging from 200 to 500fs while consuming between 31 and 46mW and occupying areas from 0.38 to 0.55mm<sup>2</sup> [1-4]. This paper introduces a PLL with a jitter of 110fs that draws 23mW and occupies an area of 0.1mm<sup>2</sup> in 28nm CMOS technology.
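The jitter budget quoted above follows from simple arithmetic, sketched here as a quick check. The variable and function names are illustrative only; the numbers are those stated in the text.

```python
# Symbol period and jitter budget for a 224 Gb/s PAM-4 link.
BIT_RATE = 224e9         # b/s, from the text
BITS_PER_SYMBOL = 2      # PAM-4 carries 2 bits per symbol

symbol_rate = BIT_RATE / BITS_PER_SYMBOL    # 112 GBaud
symbol_period = 1.0 / symbol_rate           # seconds


def jitter_fraction(jitter_rms_s):
    """Return an rms jitter value as a fraction of the symbol period."""
    return jitter_rms_s / symbol_period


print(f"symbol period = {symbol_period * 1e12:.2f} ps")
print(f"110 fs = {100 * jitter_fraction(110e-15):.2f}% of the symbol period")
```

Under these numbers, the reported 110fs corresponds to a little over 1% of the symbol period.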

Fractional-N PLLs typically face two severe issues, namely, noise folding due to charge pump (CP) nonlinearity, and a tight trade-off between the VCO phase noise and the  $\Delta\Sigma$  quantization noise as dictated by the loop bandwidth. The former can be resolved through the use of sampling or double-sampling PDs [5-6], and the latter is alleviated if a digital-to-time converter (DTC) cancels the  $\Delta\Sigma$  noise or an FIR filter suppresses it [5]. To avoid DTC gain error and non-linearity, we opt for FIR filtering but note that the resistor-based topology in [5] suffers from nonlinearity in the phase domain and  $\Delta\Sigma$  noise folding. We therefore propose a switched-current FIR filter that achieves much higher linearity.

Figure 17.4.1 shows the PLL architecture. A switched-current FIR circuit acts as both a quantization noise filter and a phase detector, and is followed by a sampler, a $G_m$ stage, a loop filter, and a VCO. The $G_m$ stage provides a voltage gain of 30dB at DC, relaxing the voltage compliance at the FIR filter output. The feedback path consists of a low-power, compact ÷8 circuit and a multi-modulus divider (MMD) driven by a 1-1-1 MASH $\Delta\Sigma$ modulator. Despite the limited speed of the 28nm devices, the PLL employs only one inductor (in the VCO) so as to occupy a small footprint. The frequency planning of the feedback path is governed by the trade-off between noise and power consumption: a divider ratio greater than 8 would lead to less power drained by the MMD and the FIR circuit but at the cost of greater $\Delta\Sigma$ quantization noise. The loop bandwidth is about 4MHz and the output frequency resolution around 2kHz.
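As a behavioral illustration of the feedback path, the following is a minimal sketch of a textbook 1-1-1 MASH modulator together with the frequency plan implied by the text (÷8 prescaler, 250MHz reference, integer ratio of 28). The 20-bit accumulator width is an assumption chosen because it gives a resolution near the quoted 2kHz; the function names are hypothetical and the code does not represent the actual circuit implementation.

```python
def mash111(fcw, bits, n_samples):
    """Output sequence of a 1-1-1 MASH delta-sigma modulator.

    fcw  : integer frequency command word, 0 <= fcw < 2**bits
    bits : accumulator width (assumed; not given in the text)
    """
    modulus = 1 << bits
    acc1 = acc2 = acc3 = 0
    c2_d1 = 0            # stage-2 carry delayed by one cycle
    c3_d1 = c3_d2 = 0    # stage-3 carry delayed by one and two cycles
    out = []
    for _ in range(n_samples):
        c1, acc1 = divmod(acc1 + fcw, modulus)    # stage 1: accumulate FCW
        c2, acc2 = divmod(acc2 + acc1, modulus)   # stage 2: accumulate stage-1 residue
        c3, acc3 = divmod(acc3 + acc2, modulus)   # stage 3: accumulate stage-2 residue
        # Noise-cancelling network: c1 + (1 - z^-1) c2 + (1 - z^-1)^2 c3
        out.append(c1 + (c2 - c2_d1) + (c3 - 2 * c3_d1 + c3_d2))
        c2_d1 = c2
        c3_d2, c3_d1 = c3_d1, c3
    return out


def output_frequency(fcw, bits, n_int=28, f_ref=250e6, prescaler=8):
    """VCO frequency for a given FCW: prescaler * (N + alpha) * f_ref."""
    return prescaler * (n_int + fcw / 2 ** bits) * f_ref


BITS = 20                                   # assumed accumulator width
resolution_hz = 8 * 250e6 / 2 ** BITS       # frequency step per FCW LSB
fcw = round(0.06 * 2 ** BITS)               # alpha of about 0.06
seq = mash111(fcw, BITS, 4096)
mean_alpha = sum(seq) / len(seq)
```

The sequence averages to α while taking integer values between -3 and 4, which is the source of the phase jumps at the MMD output that the FIR filter must suppress.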

Before describing the complete FIR filter, we consider a two-tap example that generates from an input, x(t), an output of the form $\alpha_1 x(t) + \alpha_2 x(t-T_{REF})$, where $T_{REF}$ denotes the reference period, x(t) denotes the phase jumps at the MMD output due to the $\Delta\Sigma$ modulator, and the output should be a voltage or current quantity. This can be accomplished as shown in Fig. 17.4.2, where the feedback signal, $V_F$, and its delayed copy, $V_{F\Delta}$, control two current sources. Phase jump $\Delta t_a$ is delayed by $T_{REF}$ and "combined" with the next phase jump $\Delta t_b$. Specifically, if C<sub>1</sub> begins with a zero initial condition, we have $V_{out} = (I_2/C_1)(t-\Delta t_a) + (I_1/C_1)(t-\Delta t_b) = -(I_2/C_1)\Delta t_a - (I_1/C_1)\Delta t_b + [(I_1+I_2)/C_1]t$. Viewing $\Delta t_b$ as x(t) and $\Delta t_a$ as $x(t-T_{REF})$, we observe that $V_{out}$ provides a two-tap FIR response, with $\alpha_1=-I_1/C_1$ and $\alpha_2=-I_2/C_1$. In order to perform phase comparison with the reference, we sample $V_{out}$ by means of the reference, $V_{REF}$, at $t=t_s$. Thus, $V_s$ contains an integrated value from t=0 to $t_s$ representing the reference phase minus the two terms found above involving $\Delta t_a$ and $\Delta t_b$.
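The two-tap relation can be checked numerically with a small sketch. All component values below are hypothetical and chosen only for illustration; the code compares the direct charging expression for $V_{out}$ with the form that separates the ramp from the two FIR terms.

```python
import math


def vout_direct(t, dt_a, dt_b, i1, i2, c1):
    """Capacitor voltage at time t when I2 turns on at dt_a and I1 at dt_b.

    Assumes a zero initial voltage and t later than both turn-on instants.
    """
    assert t >= max(dt_a, dt_b)
    return (i2 / c1) * (t - dt_a) + (i1 / c1) * (t - dt_b)


def vout_fir_form(t, dt_a, dt_b, i1, i2, c1):
    """Same voltage written as a ramp plus a two-tap FIR of the phase jumps."""
    alpha1 = -i1 / c1     # coefficient applied to x(t) = dt_b
    alpha2 = -i2 / c1     # coefficient applied to x(t - T_REF) = dt_a
    return alpha1 * dt_b + alpha2 * dt_a + ((i1 + i2) / c1) * t


# Hypothetical values, for illustration only
I1, I2, C1 = 30e-6, 50e-6, 1e-12     # amperes, amperes, farads
DT_A, DT_B, T_S = 40e-12, 100e-12, 2e-9

v_direct = vout_direct(T_S, DT_A, DT_B, I1, I2, C1)
v_fir = vout_fir_form(T_S, DT_A, DT_B, I1, I2, C1)
print(f"direct = {v_direct * 1e3:.3f} mV, FIR form = {v_fir * 1e3:.3f} mV")
```

Because the phase jumps enter the sampled voltage only through the linear terms weighted by $I_1/C_1$ and $I_2/C_1$, a change in either current rescales a coefficient but leaves the mapping linear.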

The proposed phase-domain FIR filter and PD merit several remarks. First, the summation necessary for FIR action naturally occurs in the current domain, which, as explained below, dramatically relaxes the linearity requirement at the output node. Second, the departure of $I_1$ and $I_2$ from their ideal values alters the filter coefficients, slightly affecting the frequency response, but it does not introduce nonlinearity, a significant advantage over charge pumps. Third, the finite output resistance of the current sources in Fig. 17.4.3 leads to nonlinearity. However, a remarkable property of the proposed topology is that its total output current benefits from FIR spectral shaping before it reaches the output node and experiences the code-dependent resistance. With the $\Delta\Sigma$ noise suppressed by the FIR action, the output nonlinearity introduces negligible folding. In other words, the residual noise has much lower amplitude (equivalent to ±0.3T<sub>div</sub>) as it arrives at the FIR output nonlinearity. Moreover, it can be shown that the amplitude of the distortion components is inversely proportional to the output time constant, which is chosen to be about two orders of magnitude greater than the phase jumps at the MMD output. This point also stands in contrast to CP nonlinearity, which cannot be reduced by adjusting its output time constant.

The complete FIR filter/PD is depicted in Fig. 17.4.3. The core consists of 22 cascode current sources, with integer weighting factors k<sub>1</sub>-k<sub>22</sub> chosen so as to create a Chebyshev response having zeros at 11MHz and its harmonics. This response is advantageous as it requires a lower resolution for $k_1$-$k_{22}$ than other response types, such as Kaiser or Hamming. In this work, $k_{min}$=3 and $k_{max}$=10. The cascode switched-current cell employs a timing scheme that halves the power consumption and yet alleviates the memory effects. Initially, both M<sub>3</sub> and M<sub>4</sub> are off. Next, M<sub>3</sub> turns on, bringing V<sub>A</sub> down to its desired value, and finally, M<sub>4</sub> turns on and M<sub>3</sub> turns off, allowing C<sub>1</sub> to charge. The proposed topology in Fig. 17.4.3 incorporates 21 delay elements and 22 NAND gates to apply FIR filtering to the phase difference between the reference and the MMD output. To avoid phase noise accumulation, each delay stage is realized by a chain of 29 TSPC flip-flops (rather than asynchronous delay lines), producing discrete values equal to integer multiples of $T_{div}=1/(7\,\mathrm{GHz})$. This power-efficient method, however, leads to phase error accumulation because the reference period and the MMD input period bear a ratio of $N+\alpha$, where $\alpha$ is the $\Delta\Sigma$ modulator's frequency command word (FCW). As illustrated by the waveforms in Fig. 17.4.3 on the left, the accumulation can reach $21\alpha T_{div}$. To resolve this issue, the delay elements assume a binary value of either $T_1=28T_{div}$ or $T_2=29T_{div}$ so as to create a tight bound for this error. Programmed individually in conjunction with FCW, the delay of Stage j is set according to the following rules: if the accumulated error from Stage 1 to Stage j (predicted by $(j-1)\alpha T_{div}$) is less than $T_{div}$, then $T_1=28T_{div}$ is selected; otherwise, $T_2=29T_{div}$ is used. As shown by the waveforms on the right, the last FIR phase, $\Phi_{22}$, experiences a difference of only about $T_{div}$ with respect to the others.
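The delay-selection rule can be illustrated with a short behavioral sketch. It is one possible reading of the rule above, not the authors' implementation: a stage receives the extra $T_{div}$ whenever the running fractional error $j\alpha$ crosses an integer, which keeps the residual error below $T_{div}$. The function names are hypothetical; delays are expressed in units of $T_{div}$.

```python
import math

N_DELAYS = 21    # delay elements between the 22 FIR phases (from the text)
N_INT = 28       # nominal stage delay in units of T_div (from the text)


def select_delays(alpha, n_delays=N_DELAYS):
    """Choose 28 or 29 T_div per stage so the accumulated error stays below T_div."""
    delays = []
    for j in range(1, n_delays + 1):
        # Ideal cumulative delay after j stages is j * (N_INT + alpha).
        # Add one extra T_div when the accumulated fraction crosses an integer.
        extra = math.floor(j * alpha) - math.floor((j - 1) * alpha)
        delays.append(N_INT + extra)
    return delays


def residual_errors(alpha, delays):
    """Ideal minus actual cumulative delay after each stage, in units of T_div."""
    errors, total = [], 0
    for j, d in enumerate(delays, start=1):
        total += d
        errors.append(j * (N_INT + alpha) - total)
    return errors


ALPHA = 0.06                                  # upper end of the measured FCW range
delays = select_delays(ALPHA)
errors = residual_errors(ALPHA, delays)
uncorrected_max = N_DELAYS * ALPHA            # error if every stage used 28 T_div
print(delays)
print(f"max residual = {max(errors):.2f} T_div, uncorrected = {uncorrected_max:.2f} T_div")
```

For this value of α, only one of the 21 stages is switched to $29T_{div}$, and the residual error never reaches a full $T_{div}$.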

The efficacy of the proposed FIR architecture can be assessed by several metrics: (1) the  $\Delta\Sigma$  noise spectrum is reduced by 18dB at 10MHz; (2) the integrated  $\Delta\Sigma$  noise is suppressed by 12dB; and (3) the probability density function of the phase error is narrowed from ±2T<sub>div</sub> at the MMD output to (equivalently) ±0.3T<sub>div</sub> at the FIR output.
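The zero placement quoted for the 22-tap filter is consistent with taps spaced by one reference period: $f_{REF}/22 \approx 11.4$MHz. The sketch below evaluates an FIR magnitude response to show this. Since the actual weights $k_1$-$k_{22}$ are not listed, uniform taps are used purely as a stand-in, so the attenuation values it produces do not represent the Chebyshev design or the suppression figures reported above.

```python
import cmath
import math

F_REF = 250e6    # tap spacing is one reference period (from the text)
N_TAPS = 22      # number of FIR taps (from the text)


def fir_gain_db(taps, freq_hz, f_s=F_REF):
    """FIR magnitude response in dB, normalized to the DC gain."""
    h = sum(k * cmath.exp(-2j * math.pi * freq_hz * n / f_s)
            for n, k in enumerate(taps))
    mag = abs(h) / sum(taps)
    return 20 * math.log10(max(mag, 1e-15))   # clamp to avoid log10(0)


uniform_taps = [1] * N_TAPS        # stand-in; the actual k_1..k_22 are not given
first_null_hz = F_REF / N_TAPS     # about 11.36 MHz

print(f"first null at {first_null_hz / 1e6:.2f} MHz")
print(f"gain at first null: {fir_gain_db(uniform_taps, first_null_hz):.0f} dB")
```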

The proposed fractional-N PLL is fabricated in TSMC 28nm CMOS technology. Figure 17.4.7 shows the die photograph. The 250MHz reference is provided by Crystek's CRBSCS-01-250 crystal oscillator. For ease of measurements, the output spectrum of the ÷8 circuit in Fig. 17.4.1 is monitored. Figure 17.4.4 shows the power consumption breakdown and the divider output spectrum. The fractional spur at 2MHz offset has a level of -66dBc, which translates to -48dBc at the VCO output and hence about 16fs of rms deterministic jitter. Fortunately, with receive CDR bandwidths of tens of megahertz, such low-frequency spurs are rejected. Figure 17.4.5 plots the fractional spur levels as the FCW varies from 0.004 to 0.06 and the offset frequency varies from 1MHz to 15MHz.
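The spur figures above can be reproduced with the standard narrowband phase-modulation relations, sketched here. Referring the spur to the VCO output adds $20\log_{10}(8)$, and a sideband pair at a given level in dBc corresponds to a sinusoidal phase modulation whose rms value sets the deterministic jitter. The function names are illustrative.

```python
import math


def spur_at_vco_dbc(spur_div_dbc, div_ratio):
    """Refer a spur measured after a divider back to the VCO output."""
    return spur_div_dbc + 20 * math.log10(div_ratio)


def spur_rms_jitter(spur_dbc, f_carrier_hz):
    """rms jitter due to sinusoidal PM producing sidebands at spur_dbc each."""
    peak_phase = 2 * 10 ** (spur_dbc / 20)    # narrowband PM: sideband/carrier = peak/2
    rms_phase = peak_phase / math.sqrt(2)     # radians
    return rms_phase / (2 * math.pi * f_carrier_hz)


vco_spur = spur_at_vco_dbc(-66.0, 8)
jitter = spur_rms_jitter(vco_spur, 56e9)
print(f"spur at VCO = {vco_spur:.1f} dBc, deterministic jitter = {jitter * 1e15:.1f} fs rms")
```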

The measured phase noise of the  $\div$ 8 output is shown in Figure 17.4.5. Due to our phase noise analyzer limitations, the  $\div$ 8 output is applied to an off-chip divide-by-2 circuit for phase noise measurements. The integrated jitter is computed in two different bandwidths. First, for a fair comparison with the prior art, an offset range from 10kHz to 40MHz is used, yielding a total of 110fs<sub>rms</sub>. Second, the offset range from 40MHz to the Nyquist frequency of 3.5GHz is computed separately, introducing another 31fs. (Due to the equipment limitation, this measurement reads the phase noise values directly from the spectrum of the  $\div$ 8 output.) Thus, the total jitter from 10kHz to 3.5GHz is 114fs<sub>rms</sub>.
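Because the two integration bands do not overlap, their jitter contributions add in power, that is, as a root sum of squares. A minimal check of the total quoted above:

```python
import math


def rss(*values):
    """Root-sum-square combination of independent rms contributions."""
    return math.sqrt(sum(v * v for v in values))


in_band = 110e-15      # 10 kHz to 40 MHz, from the text
out_of_band = 31e-15   # 40 MHz to 3.5 GHz, from the text
total = rss(in_band, out_of_band)
print(f"total jitter = {total * 1e15:.0f} fs rms")
```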

The table in Figure 17.4.6 compares the measured performance of our prototype to that of other state-of-the-art PLLs operating around 60GHz. We observe a nearly twofold reduction in jitter, an 8.3dB improvement in the FoM, and a more than threefold reduction in area.

#### Acknowledgement:

Work supported by Realtek Semiconductor and TSMC University Shuttle Program.

#### References:

[1] W. Wu et al., "A 56.4-to-63.4GHz spurious-free all-digital fractional-N PLL in 65nm CMOS," *ISSCC*, pp. 352-353, Feb. 2013.

[2] A. Hussein et al., "A 50-to-66GHz 65nm CMOS all-digital fractional-N PLL with 220fs<sub>rms</sub> jitter," *ISSCC*, pp. 326-327, Feb. 2017.

[3] L. Grimaldi et al., "A 30GHz Digital Sub-Sampling Fractional-N PLL with 198fs<sub>rms</sub> Jitter in 65nm LP CMOS," *ISSCC*, pp. 268-270, Feb. 2019.

[4] Z. Zong et al., "A Low-Noise Fractional-N Digital Frequency Synthesizer With Implicit Frequency Tripling for mm-Wave Applications," *IEEE JSSC*, vol. 54, no. 3, pp. 755-767, Mar. 2019.

[5] L. Kong and B. Razavi, "A 2.4GHz RF fractional-N synthesizer with 0.25f<sub>REF</sub> BW," *ISSCC*, pp. 330-331, Feb. 2017.

[6] Y. Zhao and B. Razavi, "A 19-GHz PLL with 20.3-fs Jitter," *IEEE Symp. VLSI Circuits*, pp. 1-2, June 2021.

### ISSCC 2022 / February 23, 2022 / 9:00 AM



# **ISSCC 2022 PAPER CONTINUATIONS**

Figure 17.4.7: Die micrograph.