# A Low-Power 28-GHz Beamforming Receiver with On-Chip LO Synthesis

Pawan K. Khanna, Yu Zhao, Mahdi Forghani, and Behzad Razavi

Electrical and Computer Engineering Department, University of California, Los Angeles, CA 90095, USA

pawankk@ucla.edu

Abstract—This paper introduces a new beamforming technique that avoids the trade-off between loss, power consumption, and phase shift resolution. The eight-element receiver draws 156 mW, achieving a minimum noise figure of 3.7 dB, a phase resolution of  $11.7^{\circ}$ , and an LO jitter of 155 fs<sub>rms</sub>.

Keywords-5G RX, phase shift, beamforming

## I. INTRODUCTION

The use of millimeter-wave communications in 5G radios becomes viable if (1) extensive beamforming is employed to overcome the high path loss and (2) the power consumption is sufficiently low to afford frequent high-throughput connections for mobile devices. Recent beamforming receivers in the vicinity of 28 GHz draw, per element, 27.5 mW [1] to 50 mW [2]. Moreover, the reported receivers do not include complete on-chip LO synthesis.

A key observation in the design of beamforming receivers is that the phase shift network typically consumes high power whether it appears in the RF path or the LO path. This paper introduces a new, low-power beamforming technique and several other concepts that reduce the power with no noise figure penalty.

## II. PHASE SHIFTER ARCHITECTURE

### A. Basic Idea

Depicted in Fig. 1(a) is the essence of the proposed RF phase shift idea. A common-source transistor delivers the RF current to a properly terminated transmission line (T-line), and the output voltage tap can slide along this line. We note three key attributes of this approach. First, the T-line, in principle, provides an arbitrarily high phase resolution. Second, driving the T-line by a current source consumes less power than driving it by a voltage source, as the latter would require a stage having a sufficiently low output impedance. Third, our method leads to a true time delay scheme, accommodating wide channel bandwidths. It is important to distinguish between the proposed configuration and the T-line used in [3], which realizes phase-shift switching by digitally changing the structure of the T-line to change its delay. That method requires that the T-line loss be increased deliberately for shorter delays, demanding 103 mW of power per element.

Fig. 1: (a) Basic proposed phase shift method, (b) T-line structure.

The T-line in Fig. 1(a) must exhibit a high characteristic impedance, $Z_0$, so that the stage achieves a reasonable voltage gain, $g_{m1}R_L = g_{m1}Z_0$, with moderate input capacitance and power consumption. We thus seek a distributed LC approximation of the T-line as practiced in [4] and [5] but targeting a high $Z_0$. This thought suggests that the inductance per section of the T-line, $L_U$, must be raised to the point where the parasitic capacitance of the inductor, $C_U$, yields a maximum for $Z_0 = \sqrt{L_U / C_U}$ [Fig. 1(b)]. The design of the unit inductor and hence the T-line is governed by four parameters: unit inductance, characteristic impedance, loss, and total area; the last one is important because we shall use two T-lines for differential operation and $180^{\circ}$ phase reversal. Among various spiral inductor structures, the stacked geometry proves particularly useful in this regard as it affords an inductance increase proportional to the square of the number of layers [6]. Consequently, the phase shift per section, $2 \tan^{-1}(\omega \sqrt{L_U C_U} / 2)$, also increases.
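As a rough numerical illustration of the two expressions above, the following sketch evaluates $Z_0$ and the phase shift per section for a single lumped LC section. The component values are assumptions chosen only for illustration (they are not extracted from the fabricated design), and the function names are hypothetical.

```python
import math

def characteristic_impedance(l_u: float, c_u: float) -> float:
    """Z0 = sqrt(L_U / C_U) for one LC section, in ohms."""
    return math.sqrt(l_u / c_u)

def phase_per_section_deg(freq_hz: float, l_u: float, c_u: float) -> float:
    """Phase shift per section, 2*atan(omega*sqrt(L_U*C_U)/2), in degrees."""
    omega = 2.0 * math.pi * freq_hz
    return math.degrees(2.0 * math.atan(omega * math.sqrt(l_u * c_u) / 2.0))

# Assumed, illustrative values: 800 pH per section, 5 fF of parasitic capacitance.
L_U = 800e-12
C_U = 5e-15

z0 = characteristic_impedance(L_U, C_U)
phi = phase_per_section_deg(28e9, L_U, C_U)
print(f"Z0 = {z0:.0f} ohm, phase per section = {phi:.1f} deg")
```

Under this simple model, quadrupling $L_U$ at a fixed $C_U$ doubles $Z_0$, which is why a geometry that raises the inductance without a proportional rise in capacitance is attractive.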

### B. Optimization

To arrive at an optimal geometry, we consider single-layer (metal-9) or stacked inductors consisting of two or three layers. Since metal 9 and metal 8 have 6% of the sheet resistance of metal 7, we place metal-6 and metal-5 spirals in parallel with metal 7 for the case of three stacked inductors. To quantify the loss, we cascade a sufficient number of sections to obtain a 180° phase shift [Fig. 2(a)]. Simulated in Cadence's EMX at 28 GHz, these designs yield the results depicted in Fig. 2(b) and Fig. 2(c). The inductance plot indicates that a unit value of around 800 pH translates to lateral dimensions of 20 $\mu$m × 20 $\mu$m for three stacked spirals and a remarkably high Z<sub>0</sub> of 410 $\Omega$. However, Fig. 2(c) reveals a loss of 7 dB. With two stacked spirals, on the other hand, we achieve Z<sub>0</sub> = 390 $\Omega$ and a loss of 4.7 dB with a slightly larger footprint of 26 $\mu$m × 26 $\mu$m. We select this structure as a reasonable compromise.

To the best of our knowledge, this is the highest  $Z_0$  achieved for an on-chip T-line. This value affords a bias current as low as 3.2 mA for  $M_1$  in Fig. 1. Note the dramatic (twofold) area reduction compared to a single-layer design.
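To see why a high $Z_0$ relaxes the bias current, the sketch below scales the quoted 3.2 mA to lower line impedances while holding the stage gain $g_{m1}Z_0$ constant. It assumes that $g_m$ is proportional to the bias current (a constant $g_m/I_D$), which is a simplification introduced here and not a claim about the actual device; the function name is hypothetical.

```python
def bias_current_for_same_gain(i_ref_ma: float, z0_ref: float, z0_new: float) -> float:
    """Scale the bias current so that g_m * Z0 stays constant.

    Assumes g_m is proportional to the bias current (constant g_m/I_D),
    a simplification and not a statement about the actual transistor.
    """
    return i_ref_ma * z0_ref / z0_new

I_REF_MA = 3.2   # bias current of M1 quoted in the text
Z0_LINE = 390.0  # characteristic impedance of the two-layer design

for z0 in (50.0, 100.0, 200.0, 390.0):
    i_ma = bias_current_for_same_gain(I_REF_MA, Z0_LINE, z0)
    print(f"Z0 = {z0:5.0f} ohm -> about {i_ma:5.1f} mA for the same g_m*Z0")
```

Under this assumption, a conventional 50-$\Omega$ line would need roughly eight times the current for the same voltage gain.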



Fig. 2: (a) Inductors cascaded to form  $\lambda/2$  T-line, (b) L and Z<sub>0</sub> plots, (c) Loss and Area plots.

### C. Design Refinements

The performance of the proposed phase shifter is boosted by three new concepts. First, the phase uniformity along the T-line can be improved by tapping points *within* the inductors [Fig. 3(a)]. In fact, to correct for other non-uniformities that follow (see below), we "predistort" the tap positions as well. The exact positions are determined by EMX simulations. The second concept relates to the T-line Z<sub>0</sub> and its trade-off with the phase resolution. For example, if we tap the line with 21 common-source transistors to obtain a resolution of 10°, then $Z_0$ falls to 210 $\Omega$. This is resolved through the use of phase interpolation by a factor of 4 [Fig. 3(b)]. Here, $\alpha$ and $\beta$ assume values of zero, 0.5, 0.75, and 1. Thus, $Z_0$ drops to only 330 $\Omega$. While offering fine phase resolution, the interpolating G<sub>m</sub> stages in Fig. 3(b) do introduce capacitive feedthrough when they are off. The resulting phase non-uniformity is corrected by the foregoing predistortion method. The third concept deals with the loss of the T-line and its dependence on the tap positions. We correct for this effect in the stages preceding the phase shifter, as explained below.
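The interpolation step can be pictured as a weighted sum of two adjacent tap signals. The sketch below models each tap as a unit phasor and reports the phase and magnitude of the sum. The tap spacing of 40° and the particular weight pairs are assumptions made for illustration; they are not claimed to be the code table used on the chip, and the helper names are hypothetical.

```python
import cmath
import math

def interpolate(alpha: float, beta: float, phi1_deg: float, phi2_deg: float) -> complex:
    """Weighted sum of two tap signals, modeled as unit phasors at phi1 and phi2."""
    p1 = cmath.exp(1j * math.radians(phi1_deg))
    p2 = cmath.exp(1j * math.radians(phi2_deg))
    return alpha * p1 + beta * p2

def phase_deg(z: complex) -> float:
    """Phase of a phasor in degrees."""
    return math.degrees(cmath.phase(z))

# Hypothetical pair of adjacent taps, 40 degrees apart.
PHI1, PHI2 = 0.0, 40.0

# Illustrative weight pairs drawn from the values {0, 0.5, 0.75, 1}.
for alpha, beta in [(1.0, 0.0), (0.75, 0.5), (0.5, 0.5), (0.5, 0.75), (0.0, 1.0)]:
    out = interpolate(alpha, beta, PHI1, PHI2)
    print(f"alpha={alpha:4.2f} beta={beta:4.2f} -> "
          f"phase {phase_deg(out):5.1f} deg, magnitude {abs(out):.2f}")
```

In this idealized model, equal weights land exactly midway between the two taps, while unequal weights pull the phase toward the more heavily weighted tap. The magnitude of the sum also depends on the weights, so the model says nothing about the gain flatness of the actual circuit.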



Fig. 3: (a) Stacked inductor with phase shift taps, (b) T-line with interpolation.

## III. RECEIVER ARCHITECTURE

Figure 4 shows the details of one beamforming element's front end. A cascode low-noise amplifier (LNA) followed by a G<sub>m</sub> stage drives the differential T-lines. The delayed output currents, $I_{out+}$ and $I_{out-}$, flow through two sets of cascode devices that, under the command of $V_1$ and $V_2$, can swap the currents and hence impart another 180° of phase shift. The outputs of four elements are combined at the drains of these cascodes and subsequently travel to passive I/Q mixers.

The front end shown in Fig. 4 merits two remarks. First, to compensate for the loss of the T-lines, we introduce a negative resistance between nodes X and Y by means of a cross-coupled pair. The negative resistance boosts the gain by 5.7 dB at the cost of a 2.7 dB reduction in the output 1-dB compression point. Second, to account for the position dependence of the loss, we allow 3.2 dB of gain programmability within the LNA. That is, a receive element requiring a greater phase shift sets the LNA gain to a higher value.

Shown in Fig. 5 is the overall architecture. After combining in the RF domain, the outputs of the top four elements are downconverted to $BB_{IT}$ and $BB_{QT}$. Similarly, the bottom four elements produce $BB_{IB}$ and $BB_{QB}$. These final baseband signals are kept separate to allow flexibility in test and characterization.

The LO is generated by the 56-GHz synthesizer shown in Fig. 5. In contrast to prior direct-conversion examples at these frequencies, we generate the I and Q phases of the LO by means of a  $\div$ 2 circuit, thereby avoiding the loss of passive phase splitters and hence saving power. The divider draws only 4.7 mW.
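The way a $\div$2 stage produces quadrature phases can be illustrated with a small behavioral model. The sketch below is an idealized master-slave divider, not the circuit used on the chip: each loop step is one half-cycle of the input clock, so with a 56-GHz input one step corresponds to a quarter period at 28 GHz. The function name is hypothetical.

```python
def divide_by_two(num_half_cycles: int):
    """Behavioral model of a master-slave divide-by-two driven by a 2x clock.

    Each loop step is one half-cycle of the input clock. The I output toggles
    on every rising edge; the Q output copies I on every falling edge.
    This is an idealized sketch, not the circuit used on the chip.
    """
    i_out, q_out = 0, 0
    i_wave, q_wave = [], []
    for step in range(num_half_cycles):
        if step % 2 == 0:      # rising edge of the input clock
            i_out ^= 1
        else:                  # falling edge of the input clock
            q_out = i_out
        i_wave.append(i_out)
        q_wave.append(q_out)
    return i_wave, q_wave

i_wave, q_wave = divide_by_two(16)
print("I:", i_wave)
print("Q:", q_wave)
```

In this model both outputs repeat every four steps (two input periods), and Q is a copy of I delayed by one step, that is, by one quarter of the output period.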

## IV. PROBLEM OF INTER-ELEMENT COUPLING

A critical issue in highly-integrated beamforming receivers stems from unwanted coupling between adjacent elements, a phenomenon evidently not recognized in the prior art. This effect proves serious as the signals received by the adjacent elements can have a phase difference of anywhere between zero and $180^{\circ}$. As an example, consider the two elements shown in Fig. 6, which are separated by $236 \,\mu\text{m}$. We drive the two by RF inputs having a zero or $180^{\circ}$ phase difference.



Fig. 4: Front-end realization of one element.



Fig. 5: Overall architecture.

Without the guard rings around the inductors and around the elements, the gain changes by 2.1 dB, and the phase shift from the LNA input to its output changes by $17.7^{\circ}$. With the guard rings, on the other hand, the gain and the phase change by only 0.8 dB and 5°, respectively.
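The range of phase differences between adjacent elements follows from the array geometry. As a minimal sketch, the code below uses the standard relation for a uniform linear array, $\Delta\phi = 360^{\circ} \cdot (d/\lambda) \sin\theta$, and assumes an antenna spacing of half a wavelength. The spacing, the angles, and the function name are assumptions introduced here; the antenna array itself is not described in this paper.

```python
import math

def adjacent_phase_difference_deg(theta_deg: float, spacing_wavelengths: float = 0.5) -> float:
    """Phase difference between neighboring antennas of a uniform linear array.

    theta_deg is the angle of arrival measured from broadside, and
    spacing_wavelengths is the antenna pitch d divided by the wavelength.
    """
    return 360.0 * spacing_wavelengths * math.sin(math.radians(theta_deg))

for theta in (0, 15, 30, 60, 90):
    dphi = adjacent_phase_difference_deg(theta)
    print(f"angle of arrival {theta:2d} deg -> phase difference {dphi:6.1f} deg")
```

With half-wavelength spacing, the magnitude of the phase difference sweeps from zero at broadside to 180° at endfire, which is the range over which the coupling between neighboring on-chip elements must remain benign.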



Fig. 6: Guard rings around inductors, T-line, and elements to improve isolation.

## V. LO LEAKAGE CANCELLATION

Another concept proposed here relates to the carrier leakage to the RF path, a severe issue at 28 GHz leading to possible desensitization and/or large dc offsets in the baseband. We present in Fig. 7 an active LO cancellation method that reduces the dc offsets. The I and Q phases of the LO are scaled, interpolated, and injected into nodes A and B [Fig. 5] by means of differential amplifiers with programmable tail current sources. The optimum settings are determined by observing and minimizing the offsets in each of the baseband outputs. The noise added to the signal path by the LO injectors raises the overall noise figure by only 0.02 dB.
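The search for the optimum settings can be sketched with a simple phasor model. Here the leakage is a complex number, the injected I and Q components add a real and an imaginary term set by two signed digital codes, and the residual magnitude stands in for the observed baseband offsets. The leakage value, code range, step size, and function names are all hypothetical; the actual calibration procedure is not detailed in the text.

```python
import itertools

def residual_leakage(leak: complex, a: int, b: int, step: float) -> complex:
    """Leakage phasor after adding the injected I (real) and Q (imaginary) parts."""
    return leak + step * complex(a, b)

def best_codes(leak: complex, max_code: int, step: float):
    """Exhaustive search for the signed codes (a, b) that minimize the residual."""
    codes = range(-max_code, max_code + 1)
    return min(itertools.product(codes, codes),
               key=lambda ab: abs(residual_leakage(leak, ab[0], ab[1], step)))

# Hypothetical numbers: leakage phasor in arbitrary units, codes from -15 to 15.
LEAK = complex(0.37, -0.52)
STEP = 0.05

a, b = best_codes(LEAK, max_code=15, step=STEP)
residual = abs(residual_leakage(LEAK, a, b, STEP))
print(f"a = {a}, b = {b}, residual = {residual:.3f} (uncorrected {abs(LEAK):.3f})")
```

Because the I and Q injections act on orthogonal components in this model, the two codes could also be swept one at a time; the exhaustive search is used only to keep the sketch short.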



Fig. 7: LO leakage cancellation realization.

## VI. EXPERIMENTAL RESULTS

The eight-element RX has been fabricated in TSMC's 28-nm technology and tested with a 1-V supply. Fig. 8 shows the photograph of the die, which measures 1.38 mm $\times$ 1.71 mm. The RX draws a total of 156 mW in the low-NF mode and 109 mW in the low-power mode, of which 38 mW is drawn by the synthesizer and the LO distribution network and the rest by the eight receive paths.



Fig. 8: Die photograph

Figure 9 plots the measured NF for one element (with the other elements disabled) and also for four elements whose outputs are combined.



Fig. 9: Measured NF.

Figure 10 plots the measured gain,  $S_{11}$ , and  $P_{1dB}$ . Also shown are the measured baseband dc offsets vs. the digital control of the LO leakage cancellation in Fig. 7. As a or b are programmed in a given direction, the offset varies from -151 mV to 148 mV, proving the efficacy of the proposed approach. The baseband I/Q outputs display amplitude and phase mismatches of 0.7 dB and  $8.5^{\circ}$ , respectively.



Fig. 10: (a) Gain, (b) linearity, (c)  $S_{11}$ , and (d) dc offset vs. digital control of LO leakage cancellation network.

Figure 11 plots the phase shift characteristics at 28 GHz and also as a function of frequency. The average phase resolution is 11.7° and the total reaches 199° (excluding the 180° swap available by $V_1$ and $V_2$ in Fig. 4).

Figure 12 shows the measured phase noise of the synthesizer after frequency division to 7 GHz. The integrated jitter from 1 kHz to 40 MHz is 155 fs<sub>rms</sub>.

Table 1 summarizes the performance. The power is calculated for eight elements. For a fair comparison to [1], we operate our RX in the low-power mode (NF = 6.8 dB), drawing only 94 mW. Also, compared to [7], our design draws about 1.9× less power and achieves twice the phase shift resolution.
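The headline comparisons can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text and in Table 1; the variable names are hypothetical, and the conversion from jitter to phase assumes a 28-GHz carrier.

```python
N_ELEMENTS = 8

# Total power excluding the synthesizer, from Table 1 (mW, for eight elements).
power_excl_synth_mw = {"low NF": 141.0, "low power": 94.0}

for mode, p_mw in power_excl_synth_mw.items():
    print(f"{mode}: {p_mw / N_ELEMENTS:.1f} mW per element")

# Power ratio relative to [7] (270 mW, normalized to eight elements in Table 1).
ratio_vs_ref7 = 270.0 / power_excl_synth_mw["low NF"]
print(f"power ratio vs. [7]: {ratio_vs_ref7:.2f}x")

# Number of steps needed to span the measured range at the average resolution.
steps = 199.0 / 11.7
print(f"steps across the T-line range: {steps:.1f}")

# RMS phase error at 28 GHz that corresponds to 155 fs of rms jitter.
phase_rms_deg = 360.0 * 28e9 * 155e-15
print(f"rms LO phase error at 28 GHz: {phase_rms_deg:.2f} deg")
```

The per-element figures can be set against the 27.5 mW to 50 mW per element cited in the introduction for prior receivers, and the rms LO phase error is small compared with the 11.7° phase step.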

## REFERENCES

- [1] S. Mondal, R. Singh, A. I. Hussein, and J. Paramesh, "A 25–30 GHz Fully-Connected Hybrid Beamforming Receiver for MIMO Communication," *IEEE J. Solid-State Circuits*, vol. 53, no. 5, pp. 1275-1287, May 2018.
- [2] J. D. Dunworth et al., "A 28GHz Bulk-CMOS dual-polarization phased-array transceiver with 24 channels for 5G user and basestation equipment," *IEEE ISSCC Dig. Tech. Papers*, Feb. 2018, pp. 70-72.
- [3] B. Sadhu et al., "A 28-GHz 32-Element TRX Phased-Array IC With Concurrent Dual-Polarized Operation and Orthogonal Phase and Gain Control for 5G Communications," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3373-3391, Dec. 2017.
- [4] H.-T. Kim et al., "A 28-GHz CMOS Direct Conversion Transceiver With Packaged 2 × 4 Antenna Array for 5G Cellular System," *IEEE J. Solid-State Circuits*, vol. 53, no. 5, pp. 1245-1259, May 2018.
- [5] D.-W. Kang and S. Hong, "A 4-bit CMOS Phase Shifter Using Distributed Active Switches," *IEEE Trans. Microwave Theory and Techniques*, vol. 55, no. 7, pp. 1476-1483, July 2007.
- [6] A. Zolfaghari, A. Chan, and B. Razavi, "Stacked inductors and transformers in CMOS technology," *IEEE J. Solid-State Circuits*, vol. 36, no. 4, pp. 620-628, April 2001.
- [7] Y. Yoon et al., "A Highly Linear 28GHz 16-Element Phased-Array Receiver with Wide Gain Control for 5G NR Application," *Proc. IEEE RFIC Symposium*, June 2019, pp. 287-290.



Fig. 11: (a) Phase shift characteristics at 28 GHz, and (b) as a function of frequency.



Fig. 12: Synthesizer phase noise after  $\div 8$ .

|                                   | This Work (Low NF)  | This Work (Low Power) | [1]                                   | [4]                                       | [7]                 |
|-----------------------------------|---------------------|-----------------------|---------------------------------------|-------------------------------------------|---------------------|
| Input Frequency (GHz)             | 27.3 – 29           | 27.5 – 29.2           | 25 – 30                               | 25.8 – 28                                 | 26.5 – 29.5         |
| No. of RX elements                | 8                   | 8                     | 8                                     | 8                                         | 16                  |
| NF<sub>min</sub> (dB)             | 3.7                 | 6.8                   | 7.3                                   | 6.7                                       | 3.5                 |
| Gain (dB)                         | 31.3<br>(1 element) | 23.4<br>(1 element)   | 34<br>(1 element)                     | 69<br>(8 elements)                        | 60<br>(16 elements) |
| IP<sub>1dB</sub> (dBm)            | -39<br>(1 element)  | -31<br>(1 element)    | -29<sup>a</sup> / -21<sup>b</sup>     | -68.9<sup>a</sup> / -34.8<sup>b</sup>     | -                   |
| On-Chip LO Synth.                 | Yes                 | Yes                   | No                                    | No                                        | No                  |
| Total Power (excl. synth.) (mW)   | 141\*               | 94\*                  | 223\*                                 | 400\*                                     | 270\*               |
| Total Power (incl. synth.) (mW)   | 156\*               | 109\*                 | N.A.                                  | N.A.                                      | N.A.                |
| Phase Resolution                  | 11.7°               | 11.7°                 | N.A.                                  | 45°                                       | 22.5°               |
| Technology                        | 28-nm CMOS          | 28-nm CMOS            | 65-nm CMOS                            | 28-nm CMOS                                | 28-nm CMOS          |

\*For 8 elements. <sup>a</sup>High-gain mode. <sup>b</sup>Low-gain mode.

Table 1: Performance summary and comparison.