# High-Frequency Synthesis using Phase-Locked Loops for Wide Tuning-Range Applications and Sub-1 V Operation in Deep Submicron CMOS Processes

Omar Abdelfattah



Department of Electrical and Computer Engineering, McGill University, Montreal, Canada

December 2015

A thesis submitted to McGill University in partial fulfilment of the requirements for the degree of Doctor of Philosophy.

©2015 Omar Abdelfattah

# Abstract

Frequency synthesizers based on phase-locked loops (PLLs) are ubiquitous components in RF communication systems. Frequency synthesizer PLLs must comply with the stringent requirements of RF systems on noise, linearity, locking time, stability, and power consumption. The continuous shrinking of technology dimensions and supply voltages has exacerbated the situation and made the design more daunting, especially at high frequencies. Integrability and long battery life have become extremely important targets. The ability to incorporate multiple standards in one device has recently stimulated a great deal of interest and given rise to applications such as software-defined radio (SDR) and cognitive radio (CR). Such applications require frequency synthesizers with very wide tuning ranges to cover multiple standards. Covering this wide range with a single frequency synthesizer PLL is very desirable in terms of cost, area, and power.

In this thesis, we tackle high-frequency synthesis in light of the challenges imposed by modern CMOS technologies. More specifically, we tackle two design challenges. The first is the need for wide tuning-range frequency synthesizer PLLs. The second is the need for analog circuits, including frequency synthesizer PLLs, that can operate from supply voltages below 0.6 V, as predicted by semiconductor roadmaps for the next decade. In response to these technology demands, we provide three different IC implementations with measurement results to verify the theoretical findings. We demonstrate two frequency synthesizer PLLs in 65 nm CMOS technology. The first PLL focuses on wide tuning range for applications such as SDR and CR, while operating from a supply voltage as low as 1.2 V. A continuous frequency range from 156.25 MHz to 10 GHz is achieved using a single frequency synthesizer PLL. The second PLL focuses on sub-1 V operation to generate a low-noise output. This PLL operates from a 0.55 V power supply and consumes 3 mW of power. The designed PLLs show performance comparable to state-of-the-art PLLs in the literature in CMOS and other technologies. Furthermore, a third IC implementation, an ultra-low-voltage operational transconductance amplifier (OTA), is presented. The OTA combines different low-voltage techniques with a novel biasing technique that allows operation from a supply voltage as low as 0.35 V. The ultra-low-voltage OTA can be used as a building block for the design of other low-voltage biasing circuitry such as bandgap references and voltage regulators.
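
As a quick sanity check, the headline figures quoted in this abstract can be related by simple arithmetic. The symbols $f_{\min}$, $f_{\max}$, $P$, $V_{DD}$, and $I_{DD}$ are introduced here for illustration only, and the current estimate assumes that the full 3 mW is drawn from the 0.55 V supply, which the abstract does not state explicitly.

$$
\frac{f_{\max}}{f_{\min}} = \frac{10\ \text{GHz}}{156.25\ \text{MHz}} = 64 = 2^{6}
$$

$$
I_{DD} \approx \frac{P}{V_{DD}} = \frac{3\ \text{mW}}{0.55\ \text{V}} \approx 5.5\ \text{mA}
$$

Read this way, the first PLL spans six octaves with a single synthesizer, and the second draws roughly 5.5 mA under the stated assumption.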

# Résumé

Frequency synthesizers using a phase-locked loop (PLL) are ubiquitous components of radio-frequency (RF) communication systems. Frequency synthesizers must comply with the stringent requirements of RF systems on noise, linearity, locking time, stability, and power consumption. The continuous reduction of technology dimensions and supply voltages has exacerbated the design difficulties, especially at high frequencies. Integrability and long battery life have also become extremely important objectives in modern life. The ability to incorporate multiple standards in a single device has recently generated great interest in applications such as software-defined radio (SDR) and cognitive radio (CR). These applications require frequency synthesizers with a wide tuning range to cover multiple standards. The ability to cover this range with a single PLL is highly desirable in terms of cost, chip area, and power consumption.

In this thesis, we tackle two design challenges concerning high-frequency synthesis. The first is the design of PLLs with wide tuning ranges. The second relates to the need for analog circuits, including PLLs, that operate from supplies below 0.6 V, in line with the predictions of semiconductor roadmaps for the next decade. In response to these challenges, we provide three different implementations with experimental measurements verifying the theoretical results obtained. We describe two frequency synthesizer PLLs in 65 nm CMOS technology. The first PLL focuses on wide tuning range for applications such as SDR and CR, while operating from a voltage as low as 1.2 V. A continuous frequency range from 156.25 MHz to 10 GHz is achieved using a single PLL. The second PLL focuses on sub-1 V operation to generate a low-noise signal. This PLL operates from a 0.55 V supply and consumes 3 mW of power. The designed PLLs show performance comparable to PLLs in the most recent literature in CMOS and other technologies. In addition, a third implementation, an ultra-low-voltage operational transconductance amplifier (OTA), is presented. The OTA combines different low-voltage techniques with a new biasing technique that allows operation from a supply voltage as low as 0.35 V. The ultra-low-voltage OTA can be used as a building block for the design of other low-voltage circuits such as bandgap references and voltage regulators.

# Acknowledgements

First of all, I am grateful to God for giving me the health, the well-being, and the guidance throughout my life to make this work possible. Next, I would like to express my gratitude to all the people who contributed to this work in any way. Special thanks go to my supervisors Ishiang Shih and Gordon Roberts for providing me with continuous support throughout my PhD work. I would also like to thank my friends and colleagues in the Integrated Microsystem Laboratory (IML) for the great moments we spent together and the support they provided in moments of confusion and distress. I would like to give special thanks to my friend George Gal, who helped in the design and testing of some parts of this work and always gave me great insight into solving issues related to it. My thanks also go to my parents, as well as to my sister for her relentless follow-up of my life and whereabouts despite the distance and despite life's goings-on.

# Contents

| Section | Title |
|---------|-------|
| **1** | **Introduction** |
| 1.1 | Motivation |
| 1.1.1 | Wide Tuning-Range Frequency Synthesis |
| 1.1.2 | Sub-1 V Operation |
| 1.2 | Primary Contribution |
| 1.2.1 | Wide Tuning-Range Frequency Synthesizer PLLs |
| 1.2.2 | Ultra-Low-Voltage Frequency Synthesizer PLLs |
| 1.2.3 | Sub-1 V Peripheral Circuits |
| 1.2.4 | MOS Transistor Modeling |
| 1.3 | Thesis Overview |
| **2** | **PLL Design Using Top-Down Approach** |
| 2.1 | PLL from a System View |
| 2.1.1 | PLL Components |
| 2.1.2 | Frequency Response of the PLL |
| 2.2 | PLL Noise |
| 2.2.1 | Output Spectrum Noise Representation |
| 2.2.2 | Phase Noise and Timing Jitter |
| 2.2.3 | Noise Representation in PLL Linearized Model |
| 2.2.4 | Noise Optimization in PLL Design |
| 2.3 | System-Level Design |
| 2.3.1 | PLL Design Specifications |
| 2.3.2 | PLL Behavioral Modeling |
| 2.4 | Transistor-Level Design |
| 2.4.1 | Evolution of MOS Transistor Modeling |
| 2.4.2 | The MOS Transistor Normalized Parameters |
| 2.5 | Summary |
| **3** | **PLL Building Blocks: Circuit Design and Behavioral Models** |
| 3.1 | Voltage-Controlled Oscillator |
| 3.1.1 | VCO Types |
| 3.1.2 | Phase Noise |
| 3.1.3 | Behavioral Modeling in Verilog-A |
| 3.2 | Phase-Frequency Detector |
| 3.2.1 | Non-idealities and limitations |
| 3.3 | Charge Pump |
| 3.3.1 | Non-idealities and limitations |
| 3.3.2 | Circuit implementations |
| 3.3.3 | Behavioral Modeling in Verilog-A |
| 3.4 | Frequency Dividers |
| 3.4.1 | Prescalar |
| 3.4.2 | Programmable Divider |
| 3.4.3 | Behavioral Modeling in Verilog-A |
| 3.5 | Loop Filter |
| 3.5.1 | Passive Filter Design |
| 3.5.2 | Active Filter Design |
| 3.6 | Summary |
| **4** | **Top-Down Design Including Loop Variations in Wide-Range PLLs** |
| 4.1 | Design Methodology |
| 4.2 | PLL Building Blocks for Wide-Range Operation |
| 4.2.1 | Wide Tuning-Range VCO |
| 4.2.2 | Reference Oscillator |
| 4.2.3 | Input Buffer |
| 4.2.4 | Phase-Frequency Detector |
| 4.2.5 | Charge Pump |
| 4.2.6 | Frequency Dividers |
| 4.2.7 | Loop Filter |
| 4.2.8 | Frequency Calibration |
| 4.3 | Design … |
| 4.4 | Measurement Results |
| 4.5 | Summary |
| **5** | **Frequency Synthesizer PLL for Ultra-Low-Voltage Operation** |
| 5.1 | Circuit Design of PLL Components |
| 5.1.1 | Ultra-Low-Voltage VCO |
| 5.1.2 | PFD/CP |
| 5.1.3 | Frequency Dividers |
| 5.1.4 | Loop Filter |
| 5.2 | Noise Contribution |
| 5.3 | Experimental Measurements |
| 5.4 | Summary |
| **6** | **Peripheral Circuits for Sub-1 V Operation** |
| 6.1 | Ultra-low-Voltage Op-Amps |
| 6.1.1 | Design of Input-Stage |
| 6.1.2 | The Proposed Biasing Technique |
| 6.1.3 | Frequency Compensation |
| 6.1.4 | OTA Design and Analysis |
| 6.1.5 | Measurement Results |
| 6.2 | Ultra-low Voltage Bandgap Reference |
| 6.2.1 | BGR Fundamentals |
| 6.2.2 | The Proposed BGR |
| 6.2.3 | The Proposed BGR |
| 6.2.4 | Simulation Results |
| 6.3 | LDO Voltage Regulator |
| 6.3.1 | Fundamentals of LDO voltage regulators |
| 6.3.2 | Simulation Results |
| 6.4 | Summary |
| **7** | **Conclusion and Future Work** |
| 7.1 | Summary and Conclusion |
| 7.2 | Future Work |
| **A** | **LC-VCO Design Parameters** |
| **B** | **Calculation of Phase Noise from Time Periods** |
| **C** | **Extraction of MOS Transistors Subthreshold Parameters** |

# List of Tables

| Table | Caption |
|-------|---------|
| 3.1 | Reference spurs attenuation and locking time for different RC loop filters |
| 3.2 | Spurs attenuation and locking time for $3^{\mathrm{rd}}$-order Butterworth filter |
| 4.1 | Normalized parameters for $V_{GS} = V_{DS} = 0.5\ \text{V}$ |
| 4.2 | Comparison between wide tuning range VCO designs |
| 4.3 | Comparison between wide-range frequency synthesizer PLLs |
| 5.1 | Comparison between ultra-low voltage PLLs in the literature |
| 6.1 | Transistors dimensions and parameters for $V_{DD} = 0.5\ \text{V}$ |
| 6.2 | Comparison between simulated and measured performance of the op-amp |
| 6.3 | Comparison of OTA performance with reported ultra-low voltage OTAs |
| 6.4 | Transistors dimensions of the OTA used in the BGR design |
| 6.5 | Performance comparison between sub-1V BGR designs in the literature |

# List of Figures

| 1.1  | Transistor channel-length in CMOS processes over the past two decades .              | 1  |
|------|--------------------------------------------------------------------------------------|----|
| 1.2  | Predicted supply voltage $V_{DD}$ over the next decade $\ldots \ldots \ldots \ldots$ | 2  |
| 1.3  | Supply-voltage and threshold-voltage scaling over the past decade $\ldots$ .         | 3  |
| 1.4  | Solar cell structure                                                                 | 4  |
| 1.5  | General architecture of multi-band receiver                                          | 5  |
| 1.6  | Concept of a universal single-transceiver device                                     | 5  |
| 1.7  | Direct realization of software-defined radio                                         | 6  |
| 1.8  | Front-end IF receiver for SDR application                                            | 7  |
| 1.9  | An ultra-low voltage SoC with on-chip PLLs                                           | 8  |
| 2.1  | A block diagram of a conceptual frequency synthesizer                                | 15 |
| 2.2  | A general architecture of a heterodyne receiver                                      | 16 |
| 2.3  | A block diagram of a basic frequency synthesizer PLL                                 | 17 |
| 2.4  | A block diagram of the PFD/CP                                                        | 20 |
| 2.5  | Loop filter examples: (a) first-order (b) second-order                               | 21 |
| 2.6  | Linearized model of the frequency synthesizer PLL $\ldots$                           | 22 |
| 2.7  | The magnitude and phase of the loop-gain of the third-order PLL $\ . \ . \ .$        | 23 |
| 2.8  | A typical frequency spectrum of a PLL output signal                                  | 25 |
| 2.9  | Conversion of amplitude noise into timing jitter                                     | 27 |
| 2.10 | Different noise sources in a PLL linearized model                                    | 29 |
| 2.11 | Noise sources in a PLL linearized model combined into two main sources               | 30 |
| 2.12 | Contribution of noise components at the output of the PLL                            | 30 |
| 2.13 | Selection of optimum $f_c$ to minimize total PLL jitter                              | 32 |
| 2.14 | Effect of phase noise on the output in the presence of large interferers             | 34 |
| 2.15 | Examples of generated normalized parameters at $L = L_{min}$                         | 44 |
| 2.16 | Transistor model                                                                     | 44 |
| 2.17 | The MOS transistor characterization tool                                             | 45 |

| 3.1  | The concept of multivibrator oscillator                                           | 48 |
|------|-----------------------------------------------------------------------------------|----|
| 3.2  | A Schmitt trigger multivibrator oscillator                                        | 49 |
| 3.3  | A ring oscillator: (a) architecture (b) single-ended stage inverter               | 50 |
| 3.4  | The concept of LC oscillators                                                     | 51 |
| 3.5  | (a) Complementary CMOS LC-VCO and (b) equivalent small-signal model               | 52 |
| 3.6  | A typical phase noise profile of an oscillator                                    | 54 |
| 3.7  | LC oscillator: (a) excitation with a current impulse (b) impulse response         | 56 |
| 3.8  | Typical ISF of (a) LC oscillator (b) ring oscillator                              | 58 |
| 3.9  | Oscillator as a cascade of two systems                                            | 59 |
| 3.10 | Conversion of flicker and thermal noise into phase noise                          | 60 |
| 3.11 | Generation of VCO signal using the <i>idtmod</i> function                         | 61 |
| 3.12 | Output of the <i>idtmod</i> function                                              | 62 |
| 3.13 | A typical circuit implementation of a tri-state PFD followed with a CP $$ .       | 66 |
| 3.14 | State machine representation of the PFD                                           | 67 |
| 3.15 | PFD output in response to input phase difference                                  | 67 |
| 3.16 | Ideal phase characteristics of a PFD/CP                                           | 68 |
| 3.17 | PFD response to small phase difference (a) without delay (b) with delay           | 69 |
| 3.18 | Phase characteristics of a PFD/CP in the presence of a dead zone $\ . \ . \ .$    | 69 |
| 3.19 | Missing edge due to blind zone in PFD                                             | 70 |
| 3.20 | Phase characteristics of a PFD/CP in the presence of a blind zone                 | 71 |
| 3.21 | A general architecture of CP                                                      | 71 |
| 3.22 | Output current of a CP versus the output voltage                                  | 73 |
| 3.23 | CP topologies with switches at (a) gates (b) drains (c) sources                   | 75 |
| 3.24 | CP circuit implementation with current steering switches                          | 76 |
| 3.25 | A conventional circuit implementation of injection-locked divider (ILFD)          | 79 |
| 3.26 | The concept of operation of an RFD                                                | 80 |
| 3.27 | A circuit implementation of an RFD                                                | 81 |
| 3.28 | A circuit implementation of a CML-based divider                                   | 82 |
| 3.29 | A $2/3$ divider cell                                                              | 83 |
| 3.30 | A conventional architecture of dual-modulus programmable divider $\ldots$         | 83 |
| 3.31 | A dual-modulus programmable divider with extended division range                  | 84 |
| 3.32 | An architecture of a divide-by-N programmable divider                             | 84 |
| 3.33 | Circuit diagram of EOC detector (a) conventional (b) by Chang $et \ al$           | 85 |
| 3.34 | Strobed noise at the threshold-crossing points of signal $v_n(t)$                 | 86 |
| 3.35 | RC network loop filters: (a) $1^{st}$ order (b) $2^{nd}$ order (c) $3^{rd}$ order | 88 |

| 3.36 | Loop gain response of a third-order PLL                                                                                                                                     | 89  |
|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|
| 3.37 | Maximum phase margin vs. ratio $a_1 \ldots \ldots \ldots \ldots \ldots \ldots \ldots \ldots \ldots$                                                                         | 90  |
| 3.38 | The relationship between the capacitor ratios in a 3 <sup>rd</sup> -order RC loop filter                                                                                    | 92  |
| 3.39 | Structure and response of an $n^{th}$ -order LC-ladder loop filter $\ldots \ldots \ldots$                                                                                   | 94  |
| 3.40 | Effect of lowering the zero-frequency on loop-gain attenuation                                                                                                              | 96  |
| 3.41 | An $n^{th}$ order active loop filter $\ldots \ldots \ldots$ | 98  |
| 3.42 | Second-order differential active loop filter                                                                                                                                | 99  |
| 4.1  | Flow chart of design methodology of wide-range PLL                                                                                                                          | 104 |
| 4.2  | Schematic view of the PLL frequency synthesizer                                                                                                                             | 105 |
| 4.3  | (a) Complementary CMOS LC-VCO and (b) equivalent small-signal model                                                                                                         | 106 |
| 4.4  |                                                                                                                                                                             | 109 |
| 4.5  | $LQ_L$ and $g_L$ vs. inductor value                                                                                                                                         | 109 |
| 4.6  | A cross section of an accumulation-mode varactor and its equivalent model                                                                                                   | 110 |
| 4.7  | Minimum $Q_v$ and $C_{v,max}/C_{v,min}$ vs $L$                                                                                                                              | 111 |
| 4.8  | Equivalent input conductance of a cross-coupled pair                                                                                                                        | 111 |
| 4.9  | Design procedure to optimize the VCO tuning range                                                                                                                           | 113 |
| 4.10 | Tuning-range and power vs. inductor value for different VCOs                                                                                                                | 114 |
| 4.11 | Carrier frequency versus control voltage of the VCO                                                                                                                         | 115 |
| 4.12 | Micrograph of the fabricated wide tuning-range VCO                                                                                                                          | 115 |
| 4.13 | Schematic of the VCO bank: Two VCOs with selector buffers                                                                                                                   | 117 |
| 4.14 | Block diagram of the wide-range VCO modeling procedure                                                                                                                      | 119 |
| 4.15 | The frequency bands of the VCO with respect to $V_{CTRL}$                                                                                                                   | 120 |
| 4.16 | Variations in (a) VCO gain and (b) cycle jitter with respect to frequency                                                                                                   | 121 |
| 4.17 | Measured phase noise of the reference oscillator                                                                                                                            | 123 |
|      | 1 I                                                                                                                                                                         | 124 |
| 4.19 | Edge-to-edge jitter of the input-buffer                                                                                                                                     | 125 |
| 4.20 | Circuit diagram of a dynamic-logic PFD                                                                                                                                      | 127 |
| 4.21 | Circuit diagram of the modified dynamic-logic PFD                                                                                                                           | 127 |
|      |                                                                                                                                                                             | 128 |
| 4.23 | Output CP current vs. output voltage                                                                                                                                        | 128 |
| 4.24 | Output current noise of the CP versus frequency                                                                                                                             | 130 |
| 4.25 | Six-bit programmable counter schematic                                                                                                                                      | 132 |
| 4.26 | Control-logic circuit of the programmable counter                                                                                                                           | 132 |
| 4.27 | CML-based prescalar                                                                                                                                                         | 133 |

| 4.28 | Edge-to-edge jitter of the frequency dividers                                 | 134 |
|------|-------------------------------------------------------------------------------|-----|
| 4.29 | Loop filter noise                                                             | 136 |
| 4.30 | Frequency calibration circuit for (a) $V_H$ and (b) $V_L$                     | 137 |
| 4.31 | Schmitt trigger comparators                                                   | 138 |
| 4.32 | Simulated variations in PLL parameters                                        | 139 |
| 4.33 | Simulated noise contribution from different PLL components                    | 140 |
| 4.34 | A microphotograph of the fabricated frequency synthesizer PLL                 | 142 |
| 4.35 | Measurement test-bench for high-frequency probing                             | 143 |
| 4.36 | Simulated and measured phase noise for 8 GHz carrier frequency                | 143 |
| 4.37 | Simulated and measured phase noise vs. carrier frequency                      | 144 |
| 4.38 | Measured phase noise vs. carrier frequency                                    | 146 |
| 4.39 | Measured integrated rms phase error in seconds vs. carrier frequency          | 147 |
| 4.40 | Measured reference spurs attenuation in dBc vs. carrier frequency             | 147 |
| 5.1  | A general architecture of an integer-N PLL                                    | 151 |
| 5.2  | Low-voltage LC-VCO schematic                                                  | 152 |
| 5.3  | Simulated phase noise of the low-voltage VCO for 1 GHz carrier frequency      | 152 |
| 5.4  | PFD architecture for low-voltage PLL                                          | 153 |
| 5.5  | Low-voltage CP schematic                                                      | 154 |
| 5.6  | Mismatch between up and down currents of the CP in Fig. 5.5                   | 154 |
| 5.7  | Output current noise of the low-voltage PFD/CP                                | 155 |
| 5.8  | Prescaler architecture and schematic                                          | 156 |
| 5.9  | Structure of the used 6-bit programmable counter                              | 156 |
| 5.10 | Variance of the dividers noise amplitude $n_v(t)$ versus frequency            | 157 |
| 5.11 | Third-order loop filter                                                       | 157 |
| 5.12 | PLL noise components                                                          | 158 |
| 5.13 | Micrograph of the fabricated chip                                             | 159 |
| 5.14 | Measured output spectrum of the PLL at 1.2 GHz                                | 160 |
| 5.15 | Phase noise of the PLL at 1.2 GHz                                             | 160 |
| 6.1  | A conventional LDO voltage regulator                                          | 164 |
| 6.2  | Rail-to-rail input stage using complementary differential pairs               | 165 |
| 6.3  | Pseudo differential pair (a) without CMFF (b) with CMFF                       | 167 |
| 6.4  | Bulk-driven PMOS transistor: (a) circuit operation (b) cross section          | 167 |
| 6.5  | (a) Circuit schematic (b) Block diagram representation of proposed OTA        | 168 |
| 6.6  | ac representation of OTA with (a) separate biasing (b) proposed biasing       | 170 |

| 6.7  | Implementation of the proposed biasing technique on the three-stage OTA.                  | 171 |
|------|-------------------------------------------------------------------------------------------|-----|
| 6.8  | Equivalent circuit for common-mode (a) separate biasing (b) self-biasing                  | 172 |
| 6.9  | Improvement in CMRR due to self-biasing technique                                         | 174 |
| 6.10 | Equivalent circuit under supply noise (a) separate biasing (b) self-biasing               | 175 |
| 6.11 | Small-signal model of the circuit in Fig. 6.10(b)                                         | 176 |
| 6.12 | Improvement in PSRR due to self-biasing technique                                         | 177 |
| 6.13 | DC gain and output DC voltage $W_1$ for self-biasing and separate biasing                 | 179 |
| 6.14 | Output DC voltage and DC gain under process variations                                    | 179 |
| 6.15 | A block diagram of the proposed OTA with frequency compensation                           | 180 |
| 6.16 | Full schematic of the proposed OTA with self-biasing and compensation                     | 181 |
| 6.17 | Small-signal equivalent model of the OTA                                                  | 181 |
| 6.18 | Chip microphotograph of the fabricated OTA                                                | 184 |
| 6.19 | Measured open-loop frequency-response at $V_{DD} = 0.5$ V                                 | 185 |
| 6.20 | Measured DC gain in dB vs. output swing at $V_{DD} = 0.5$ V                               | 186 |
| 6.21 | Measured DC gain vs. supply-voltage $V_{DD}$                                              | 186 |
| 6.22 | Rail-to-rail input/output waveforms for (a) $V_{DD} = 0.5$ V (b) $V_{DD} = 0.35$ V        | 188 |
| 6.23 | Input/output for $V_{DD} = 0.5$ V with (a) $V_{CM} = 0.05$ V (b) $V_{CM} = 0.45$ V        | 189 |
| 6.24 | The concept of bandgap reference                                                          | 192 |
| 6.25 | Low voltage BGR proposed by Banba <i>et al</i>                                            | 194 |
| 6.26 | The Proposed BGR                                                                          | 194 |
| 6.27 | The low-voltage BGR proposed by Ytterdal                                                  | 196 |
| 6.28 | Simulated reference voltage versus temperature                                            | 198 |
| 6.29 | Simulated reference voltage versus supply voltage                                         | 199 |
| 6.30 | Simulated PSRR of the BGR                                                                 | 199 |
| 6.31 | Frequency response of LDO regulator for minimum and maximum $I_L$                         | 201 |
| 6.32 | Simulated $V_{OUT}$ vs. $V_{DD}$                                                          | 202 |
| 6.33 | Simulated $V_{OUT}$ vs. $I_L$                                                             | 204 |
| 6.34 | Simulated PSRR of the LDO voltage regulator                                               | 204 |
| 6.35 | Simulated transient response of the regulator                                             | 205 |
| A.1  | Simulation set-up to evaluate LC VCO design parameters                                    | 212 |
| C.1  | Simulation set-up to extract PMOS transistor subthreshold parameters                      | 216 |

# Chapter 1

# Introduction

RF communication today is a multibillion-dollar industry. The rapid advance in CMOS technology has had a great impact on modern RF systems. Whether through improvements in transceiver architecture, transistor-level circuit design, or energy saving, circuit designers continue to find innovative ways to maximize the benefits as technology advances. The most prominent scaling feature of CMOS technology is the transistor channel-length, which continues to shrink in newer technologies following the prediction of Moore's law. Fig. 1.1 shows the transistor channel-length over the past two decades in standard CMOS technologies.

The continuous downscaling of the transistor channel-length has many advantages, especially in digital design, as it allows more transistors to be integrated in the same area. This means more functions per area, which has led to reduced cost per product. Furthermore, the reduced channel-length has led to an increased transition frequency $f_T$ of the transistor [1]. The transition frequency of today's CMOS technologies is in the range of several hundred GHz, which has allowed the design of high-frequency circuits in the multi-GHz range. High-frequency circuits usually require smaller passive devices, i.e. inductors and capacitors. The reduced transistor dimensions along with the reduced size of passive devices have allowed new possibilities and larger scales of integration. While older RF receivers struggled to integrate many components on the same die, the nearly fully-integrated System-on-Chip (SoC) has become a trend in today's RF design [2].

Figure 1.1: Transistor channel-length in CMOS processes over the past two decades

Figure 1.2: Predicted supply voltage $V_{DD}$ over the next decade

Of course, this downscaling comes with many challenges for circuit designers, such as non-ideal transistor behavior, model complexity and inaccuracy, reduced predictability due to process variations, and an exacerbated effect of mismatch. Design techniques at both the system level and the transistor level must continue to evolve and adapt to the challenges posed by modern CMOS technologies.

Another trend in CMOS technology is the continuous reduction of the supply voltage. The trend toward smaller channel-length MOS transistors operated by a single low-voltage power supply has been stimulated by the growing demand for fast digital circuits as well as low-power portable devices. Fig. 1.2 shows the supply voltage for high-performance, high-$V_{DD}$ transistors over the next decade according to the predictions of the most recent semiconductor roadmap [3]. Supply voltage values have scaled down from a few volts in older technologies to sub-1 V values in modern deep submicron technologies.



Figure 1.3: Supply-voltage and threshold-voltage scaling over the past decade

The supply voltage is predicted to fall below 0.7 V in less than a decade from now. Reducing the supply voltage $V_{DD}$ reduces both the dynamic power ($\propto V_{DD}^2$) and the leakage power ($\propto V_{DD}$), which extends battery lifetime. Nevertheless, this reduction in supply voltage poses a great challenge in modern CMOS technologies since it affects many design aspects of analog and RF circuits, particularly the voltage headroom available at the input and output of the circuit. The problem is exacerbated by the fact that the threshold voltage of the transistor has not scaled down at the same rate as the supply voltage, since it is kept relatively high to limit leakage current in digital circuits. Fig. 1.3 shows both supply-voltage and threshold-voltage scaling over the past decade [4]. The resulting loss of headroom limits the number of transistors that can be stacked and exacerbates the effect of process variations.
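
As a rough illustration of the scaling argument above, the following Python sketch evaluates the two first-order proportionalities quoted in the text, dynamic power $\propto V_{DD}^2$ and leakage power $\propto V_{DD}$. The function name `relative_power` and the example supply values are illustrative choices only. Switching frequency, switched capacitance, and leakage current are assumed to stay fixed, which is a simplification.

```python
from typing import Tuple


def relative_power(vdd_new: float, vdd_old: float) -> Tuple[float, float]:
    """Return (dynamic, leakage) power ratios for a supply-voltage change.

    Assumes the first-order proportionalities quoted in the text:
    dynamic power ~ VDD^2 and leakage power ~ VDD, with switching
    frequency, switched capacitance, and leakage current held fixed.
    """
    if vdd_new <= 0 or vdd_old <= 0:
        raise ValueError("supply voltages must be positive")
    ratio = vdd_new / vdd_old
    return ratio ** 2, ratio


if __name__ == "__main__":
    # Sample supply values only: 1.2 V scaled down to 0.55 V.
    dyn, leak = relative_power(0.55, 1.2)
    print(f"dynamic power scales by {dyn:.3f}")
    print(f"leakage power scales by {leak:.3f}")
```

Under these assumptions, lowering the supply from 1.2 V to 0.55 V cuts the dynamic power to roughly 21% and the leakage power to roughly 46% of their original values.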

In addition to power saving in digital circuits, some applications require extremely low supply voltages, such as biomedical systems [5], hearing-aid devices [6], and solar-cell applications [7]. For instance, solar cells, shown in Fig. 1.4, produce a dc voltage of about 0.5-0.6 V and a dc current $I_{cell}$ per cell. More often than not, some cells (M) are connected in series to produce sufficient voltage $(M \times 0.6 \text{ V})$ to operate other devices, while the rest of the cells are connected in parallel to provide sufficient current driving capability $(N \times I_{cell})$, as shown in Fig. 1.4. The maximum solar cell current driving efficiency is achieved if all the solar cells are connected in parallel. This requires the devices connected to the solar cell to be able to operate at supply voltages as low as 0.6 V.

Figure 1.4: Solar cell structure
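
The series/parallel trade-off described above can be sketched numerically. The Python snippet below assumes an idealized array of identical cells arranged as $N$ parallel strings of $M$ series cells. This arrangement, the function name `array_output`, and the 1 mA cell current are assumptions made for illustration and are not taken from Fig. 1.4.

```python
def array_output(m_series, n_parallel, v_cell=0.6, i_cell=1e-3):
    """Ideal (voltage, current) of n_parallel strings of m_series cells.

    Assumes identical, ideal cells: series cells add their voltages and
    parallel strings add their currents. v_cell = 0.6 V follows the text;
    i_cell = 1 mA is a hypothetical placeholder.
    """
    if m_series < 1 or n_parallel < 1:
        raise ValueError("need at least one cell per string and one string")
    return m_series * v_cell, n_parallel * i_cell


if __name__ == "__main__":
    # The same six cells in two different arrangements.
    for m, n in [(3, 2), (1, 6)]:
        v, i = array_output(m, n)
        print(f"M={m}, N={n}: V = {v:.2f} V, I = {i * 1e3:.1f} mA")
```

With all six cells in parallel ($M = 1$), the array delivers the largest current but only the single-cell voltage of about 0.6 V, which is why the load must tolerate such a low supply.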

In addition to advances in technology, market demands play a critical role in today's design trends. Driven by a huge base of consumers, multibillion-dollar businesses strive to meet the needs of a wide range of users with innovative and continuously improved products. Transceivers and wireless communication are at the core of today's applications. They are omnipresent in applications such as cellular systems, Bluetooth, WiFi, the Global Positioning System (GPS), activity and health monitoring, tracking devices, and biomedical applications. Because of the ubiquity of wireless applications, many wireless devices today provide communication using more than one standard, e.g. a cellular phone with WiFi and Bluetooth. This multi-functionality is very common in today's life.

A direct solution to providing multiple standards in one device would be to integrate multiple transceivers on the same chip to reduce cost and area. However, since many transceivers are similar in architecture, reconfiguring a single transceiver to cover multiple bands seems very promising. A great deal of research has been carried out on multi-band transceivers [8]. A general architecture of a multi-band receiver is shown in Fig. 1.5. The receiver components are designed to meet different standards when tuned to operate in different bands.

The idea of a multi-band transceiver can be extended to a generic single transceiver that can cover a wide continuous range of frequencies and accommodate multiple standards. This allows the design of a universal single-transceiver device as conceptually depicted in Fig. 1.6. Using a single transceiver to cover all desired bands allows for greater integrability that can dramatically reduce the cost and the size of the device.

Figure 1.5: General architecture of multi-band receiver

Figure 1.6: Concept of a universal single-transceiver device

First envisioned by Mitola [9], a software-defined radio provides a universal radio platform that can be configured to receive any modulated signal at any band, channel width, resolution, etc. over a wide spectrum of frequencies within acceptable specifications. As shown in Fig. 1.7, Mitola envisioned that the RF signal received by the antenna could be directly converted into a digital stream that can be processed by means of software. Although conceptually desirable due to its programmability and adaptability, the direct conversion of the RF signal into digital form is impractical as it places a huge demand on the front-end analog-to-digital converter (ADC) [10]. The high speed and resolution requirements of the ADC result in high power dissipation. In addition, the broadband compatibility results in low receiver sensitivity and allows interference from other adjacent signals, both of which reduce the receiver dynamic range and deteriorate the performance. Therefore, an alternative architecture for an RF receiver with an intermediate frequency (IF), shown in Fig. 1.8, is often used. The alternative architecture relaxes the requirements on the ADC and increases the receiver sensitivity and interference tolerance. Consequently, it shifts the demand onto the analog blocks. The IF receiver requires smart antennae, tunable filters, and a local oscillator (LO). The latter is usually implemented using a wide-range frequency synthesizer PLL, which must meet the system specifications.

Figure 1.7: Direct realization of software-defined radio

# 1.1 Motivation

This work is primarily motivated by two main technology demands. The first is the need for wide tuning-range frequency synthesis as a solution that allows hardware reconfigurability through the integration of a multitude of communication standards in a single high-speed low-power chip. The second is the need for analog circuits, including frequency synthesizers, that can operate efficiently with the next-generation sub-1 V power-supply voltages expected in the near future.

Figure 1.8: Front-end IF receiver for SDR application

## 1.1.1 Wide Tuning-Range Frequency Synthesis

Wide-range frequency synthesizer PLLs are highly desirable, yet their design is greatly challenging. The demand for wide tuning-range is more pronounced in applications such as measurement instrumentation, software-defined radio (SDR) [9], ultra-wideband (UWB) receivers [11], and cognitive radio (CR) [12]. Covering a wide frequency range using the minimum number of components is of paramount importance in terms of cost, area, and power saving. Furthermore, the frequency synthesizer needs to adhere to the stringent requirements of noise, speed, and spur attenuation for all the standards that are covered by the receiver range, all without compromising area and power consumption.

In addition to the challenge posed by the design of PLL building blocks for wide tuning range, the high-level modeling and prediction of the performance become very arduous due to the large variations in the loop parameters. These variations should be modeled properly and accounted for from the onset of the design. Therefore, it is important to follow a top-down methodology that is tailored for wide-range frequency synthesizer PLL and is able to predict the performance of the loop.

PLL design using a top-down approach starting at system-level behavioral models has become a standard procedure today. Many narrow-band PLL designs have been demonstrated using this approach. With the advent of many applications that require a wide frequency range of operation, the top-down approach remains the most favorable. Many state-of-the-art wide-range frequency synthesizer PLLs have been presented in the literature. Nevertheless, accurate prediction of the performance of the fabricated chip by accounting for variations due to wide-range operation has not been clearly addressed. While the design of narrow-band PLLs using a top-down approach is widely used, and many wide-range PLL circuit designs have emerged recently in response to technology demand, we aim at closing the gap between the two trends by adapting the available top-down approach to wide-range PLL design. Possible variations in the parameters of the PLL building blocks due to wide-range operation are predicted and investigated, and the variations are incorporated in the behavioral models. Experimental measurements are carried out to evaluate the accuracy of these predictions and draw final conclusions.

Figure 1.9: An ultra-low voltage SoC with on-chip PLLs

## 1.1.2 Sub-1 V Operation

A system-on-chip (SoC) usually requires a clock generation source to drive multiple on-chip blocks. An ultra-low voltage SoC is depicted in Fig. 1.9. It is advantageous to generate a sufficient range of frequencies from a single frequency synthesizer PLL to save power and area. In general, PLLs tend to be among the most power-consuming components. Being able to reduce the operating supply voltage of the frequency synthesizer PLL can further reduce power consumption and help integrate the analog part with the digital core by operating from the same supply on a single chip.

Another challenge that arises from low-voltage operation, for both analog and digital circuits, is the need for biasing circuits that operate in the sub-1 V range, such as voltage regulators, bandgap references, and operational amplifiers (op-amps). Providing solutions for these peripheral circuits will facilitate the design of efficient and robust digital circuits in the sub-1 V range, and will allow further integrability of the analog circuits on the same integrated circuit (IC).

# 1.2 Primary Contribution

The thesis deals with the design of frequency synthesizer PLLs and tackles the challenges of high frequency operation, wide tuning-range requirement, and low-voltage low-power constraints. We also tackle many of the challenges encountered in the design of sub-1 V CMOS circuits. The major contributions of this thesis can be divided into the following categories:

## 1.2.1 Wide Tuning-Range Frequency Synthesizer PLLs

The need for wide-range frequency synthesizer PLLs has intensified in recent years, and so has the need for a design methodology tailored for wide-range operation. In this thesis, we investigate the large variations in the PLL components due to wide-range operation to ensure better predictability of the performance before tape-out without running closed-loop transistor-level simulations of the PLL. A top-down approach to designing wide-range PLLs is demonstrated. It includes high-level behavioral modeling of the main building blocks of the PLL, design and layout of the individual components at the transistor level, and physical implementation on an IC prototype. With the design of an integer-N PLL that covers a continuous frequency range from 156.25 MHz to 10 GHz in 65 nm CMOS technology, we demonstrate how including these variations at the system level improves the accuracy of the predicted performance and reduces the error with respect to measurements. This part of the thesis work is described in Chapter 4 and in the following paper:

• Abdelfattah O.; Gal G.; Roberts G.; Shih I.; Shih Y., "A Top-Down Design Methodology Encompassing Components Variations Due to Wide Range Operation in Frequency Synthesizer PLLs," accepted for publication in IEEE Transactions on Very Large Scale Integration Systems (TVLSI) in December 2015.

In addition, we provide a methodology for optimizing the performance of the LC-tank voltage-controlled oscillator (VCO) in terms of tuning range, phase noise, and power, to help the designer make the critical choices based on the available budget. The design of wide tuning-range VCOs as critical blocks in the PLL is tackled mainly in Section 4.2.1 and in this paper:

• Abdelfattah O.; Shih I.; Roberts G.; Shih Y., "Optimization of LC-VCO Tuning Range under Different Inductor/Varactor Losses Limitations," IEEE 27th Canadian Conference on Electrical and Computer Engineering (CCECE), pp.1-5, May 2014.

The lack of a detailed qualitative and quantitative analysis of loop filter options in the design of PLLs makes the optimization process difficult and demands many trial-and-error steps. In this thesis, we provide an explicit comparison between different loop filter topologies and their effect on design parameters such as locking time, reference spur attenuation, phase noise, and loop phase margin. A quantitative comparison is provided, whenever possible, to assist the designer in selecting the optimum design for the desired specifications. The design of loop filters in frequency synthesizer PLLs is discussed mainly in Section 3.5 and partly in the following paper:

• Abdelfattah O.; Shih I.; Roberts G.; Shih Y., "Analytical comparison between passive loop filter topologies for frequency synthesizer PLLs," IEEE 11th International NEW Circuits and Systems Conference (NEWCAS), pp.1-4, Jun. 2013.

## 1.2.2 Ultra-Low-Voltage Frequency Synthesizer PLLs

The constant downscaling of the power supply voltage will make it necessary for analog components to work in the sub-1 V range in coming technology generations. Different techniques need to be used to allow the PLL components to work effectively at high frequency with these supply voltage values. In this thesis, we make use of the top-down approach and deploy different low-voltage design techniques to achieve performance competitive with state-of-the-art ultra-low-voltage PLLs in the literature. The design and implementation of a low-noise frequency synthesizer PLL that operates from a 0.55 V supply and covers the frequency range from 860 MHz to 1.22 GHz is presented in Chapter 5 and summarized in this paper:

• Abdelfattah O.; Shih I.; Roberts G.; Shih Y., "A 0.55-V 1-GHz Frequency Synthesizer PLL for Ultra-low Voltage Ultra-low Power Applications," IEEE 6th Latin American Symposium on Circuits and Systems (LASCAS), Feb. 2015.

## 1.2.3 Sub-1 V Peripheral Circuits

Both analog and digital circuits operating in the sub-1 V range need robust peripheral circuits such as voltage regulators, bandgap references, and op-amps. The circuit design and implementation of an ultra-low-voltage op-amp, a bandgap reference, and a low-dropout (LDO) voltage regulator are presented in Chapter 6.

We present an operational-transconductance-amplifier (OTA) that operates from a sub-1 V power supply while achieving rail-to-rail input range. The proposed OTA combines two different ultra-low-voltage techniques to allow both minimum supply voltage operation and rail-to-rail input common-mode range. A novel biasing technique is also proposed to enhance the performance of the OTA. Using the proposed technique eliminates the need for extra biasing circuitry and ensures robustness against process variations under ultra-low-voltage conditions. Furthermore, the proposed technique substantially enhances the common-mode rejection and power-supply rejection of the OTA. The design of an ultra-low voltage op-amp that operates from a supply voltage as low as 0.35 V was demonstrated in the following papers:

- Abdelfattah, O.; Roberts G.; Shih I.; Shih Y., "An Ultra-Low-Voltage CMOS Process-Insensitive Self-Biased OTA with Rail-to-Rail Input Range," accepted for publication in IEEE Transactions on Circuits and Systems I (TCAS-I) in July 2015.
- Abdelfattah O.; Shih I.; Roberts G.; Shih Y., "A 0.35-V Bulk-Driven Self-Biased OTA with Rail-to-Rail Input Range in 65 nm CMOS," IEEE International Symposium on Circuits and Systems (ISCAS), pp. 257-260, May 2015.

An example of the design of an ultra-low-voltage bandgap reference was demonstrated in this paper:

• Abdelfattah O.; Shih I.; Roberts G.; Shih Y., "A 0.6 V-Supply Bandgap Reference in 65 nm CMOS," IEEE 13th International NEW Circuits and Systems Conference (NEWCAS), Jun. 2015.

## 1.2.4 MOS Transistor Modeling

The design of analog components at the transistor level can be tedious and daunting due to the complexity of the transistor models, which are becoming even more complex as CMOS technology advances. Therefore, characterizing the MOS transistor in an efficient and practical way can help reduce the design time and obviate the need to deal directly with the complex models. A simple tool to design analog CMOS circuits based on extracting some dimension-independent parameters of the transistor was built and used throughout the circuit design process in this thesis. This work is discussed in detail in Section 2.4 and in the following paper:

• Abdelfattah O.; Shih I.; Roberts G., "A simple analog CMOS design tool using transistor dimension-independent parameters," IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1067-1070, May 2013.

In summary, this thesis covers three main topics. The design of wide tuning-range frequency synthesizer PLLs is covered in Chapter 4 and aims at providing a more accurate prediction of the performance in the presence of component variations. The design of an ultra-low-voltage frequency synthesizer PLL is detailed in Chapter 5 using a similar top-down design approach. Finally, in Chapter 6 we propose a novel OTA design for sub-1 V applications and demonstrate its use as a building block for other ultra-low-voltage circuits.

# 1.3 Thesis Overview

The thesis is divided into seven chapters. Following the first introductory chapter, Chapter 2 covers several topics that are deemed necessary tools in our top-down approach to PLL design. First, an overview of the PLL from a system perspective is presented. The behavior and representation of the main building blocks of the PLL are discussed from a system point of view. Key parameters and definitions to quantify the performance of the PLL are also described. Noise sources, representation, and optimization in PLL design are discussed in further detail. Next, we briefly discuss the system-level specifications of the PLL and how these specifications can be mapped into circuit parameters.

A short introduction to the Verilog-A language is presented to enable behavioral modeling of the PLL building blocks in the chapters to come. We end the chapter by introducing a simple tool developed to characterize the MOS transistor using a set of normalized parameters. This tool will be used in the transistor-level design of the CMOS circuits used throughout the following chapters.

Chapter 3 covers the design of the individual building blocks of the PLL, namely the VCO, PFD/CP, frequency dividers, and loop filter. For each of the VCO, PFD/CP, and frequency dividers, we present the principle of operation, non-idealities, performance metrics, and behavioral modeling using Verilog-A. Examples of transistor-level implementation of each of these blocks are presented and discussed. Finally, we discuss the various choices available for the design of the loop filter, which directly impact the performance of the PLL. An analytical comparison between different loop filter topologies and orders is presented with detailed qualitative and quantitative analysis.

In Chapter 4, a top-down approach for the design of wide tuning-range frequency synthesizer PLLs is presented. The design methodology bridges the high-level behavioral modeling of the PLL building blocks and the transistor-level design of each block. The suggested behavioral models capture the variations in the performance of the PLL building blocks and their effect on the overall performance of the PLL. The methodology is capable of predicting the noise performance and loop dynamics of the PLL, while avoiding lengthy and impractical brute-force closed-loop simulations of the PLL at the transistor level. To verify the design approach, an integer-N frequency synthesizer PLL that covers a continuous frequency operating range from 156.25 MHz to 10 GHz is designed, modeled, and fabricated in a 65 nm general-purpose CMOS technology. The measurement results from the fabricated chip are compared with the simulated results from the high-level behavioral models of the PLL building blocks, and conclusions are drawn.

In Chapter 5, an ultra-low-voltage PLL that operates from a supply voltage of 0.55 V is presented. The design choices of the building blocks of the PLL are discussed. Ultra-low-voltage design techniques are utilized in the circuit implementation of the PLL components. To verify the validity of the design, the PLL was fabricated in a general-purpose 65 nm CMOS technology. The PLL covers a frequency range from 860 MHz to 1.22 GHz, and consumes 3 mW while operating from a 0.55 V supply. Measurement results and simulation results are compared and discussed.

In Chapter 6, the design of the peripheral circuits needed for ultra-low-voltage operation in the sub-1 V range is demonstrated. An ultra-low-voltage OTA is proposed to tackle the challenges of low-voltage operation in modern CMOS technologies. The proposed OTA simultaneously allows both minimum supply voltage operation and rail-to-rail input common-mode range. The proposed OTA deploys a novel self-biasing technique that significantly enhances the common-mode and power-supply rejection and also ensures robustness under expected levels of process variations. The proposed self-biasing technique eliminates the need for extra biasing circuitry, saving area and power consumption. To verify the theoretical findings, a three-stage OTA for low-voltage applications is designed and fabricated in a 65 nm CMOS technology. The proposed OTA provides a gain of 46 dB at a supply voltage of 0.5 V and a gain of 43 dB at a supply voltage of 0.35 V. Furthermore, a bandgap reference and a voltage regulator that operate from a sub-1 V power supply were designed utilizing the proposed ultra-low-voltage OTA. Simulation results are provided to verify the operation principles of the two circuits.

# Chapter 2

# PLL Design Using Top-Down Approach

In this chapter, we introduce the concepts and tools that will be used throughout the PLL design. The design process follows a top-down approach in transitioning from system level to transistor level. First, we introduce the PLL and its components from a system perspective with emphasis on the frequency synthesis application. We discuss the different representations of noise in PLLs. Next, we discuss the system-level specifications of the PLL and the use of behavioral modeling to describe the different PLL components. Finally, we introduce a MOS transistor characterization tool that will be used in the transistor-level design of the PLL components.

# 2.1 PLL from a System View

PLLs have a myriad of applications such as clock generation and distribution, de-skewing, jitter reduction, clock recovery, and frequency synthesis. The scope of this thesis is focused on frequency synthesis. A frequency synthesizer, conceptually depicted in Fig. 2.1, generates multiple frequencies from a stable, extremely low-noise reference signal. The output frequency of the synthesizer is usually controlled by a digital code to generate different output frequencies with a well-controlled step size. PLL-based frequency synthesizers are indispensable components in most wireless transceivers. A general architecture of a heterodyne receiver is shown in Fig. 2.2. The need for a clean local oscillator (LO) to up-convert the transmitted data in a transmitter or down-convert the received data in a receiver necessitates the use of a PLL-based frequency synthesizer to meet the stringent specifications of wireless standards.

Figure 2.1: A block diagram of a conceptual frequency synthesizer

Figure 2.2: A general architecture of a heterodyne receiver

A basic frequency synthesizer PLL block diagram is shown in Fig. 2.3. The main components of a typical implementation are: voltage-controlled oscillator (VCO), phase-frequency detector (PFD), charge pump (CP), loop filter (LF), and programmable divider. The division ratio N is sometimes distributed between a fixed prescaler division ratio P and a smaller programmable division ratio M, where $N = P \times M$. The addition of a prescaler relaxes the speed requirement of the programmable divider.

In brief, a PLL is a negative feedback system that forces the phase of a scaled output signal to follow the phase of a clean reference signal. It is imperative to bear in mind that the variable of interest here is the signal phase. The comparison process occurs in the PFD where voltage pulses proportional to the difference in phase between the two input signals are generated. These PFD output pulses control the CP switches, and force the CP to source or sink current at its output. The CP output current is converted to a stable dc voltage  $V_{CTRL}$  using a low-pass filter.  $V_{CTRL}$  tunes the output frequency of the VCO such that the phase of the divided-down VCO output signal is equal to the phase of the reference signal.

Based on whether the division ratio N is an integer or a rational number, frequency synthesizer PLLs can be classified into two types: integer-N PLLs and fractional-N PLLs. An integer-N PLL synthesizes frequencies that are integer multiples of the reference frequency $f_{ref}$. Therefore, the smallest step size that can be implemented using an integer-N PLL is limited by the reference frequency. If a fixed prescaler with a division ratio P is used, then the step size becomes $P \times f_{ref}$, which further limits the achievable frequency resolution.

On the other hand, a fractional-N PLL allows the use of a step size that is equal to a fraction of the reference frequency. Therefore, a fractional-N PLL can produce a smaller step size compared to an integer-N PLL for the same reference frequency. However, the fractional-N PLL produces spurious tones at offset frequencies that are fractions of the reference frequency. The proximity of these spurs to the carrier makes them hard to filter out by the loop. Delta-sigma ($\Delta\Sigma$) techniques are often used to shape the noise and lower the spur power at the expense of added complexity. In addition, for the same step size, a fractional-N PLL allows the use of a higher reference frequency compared to an integer-N PLL to produce the same output frequency. A higher reference frequency means that a smaller division ratio N can be used. This helps reduce the close-in noise at the output of the PLL, as will be explained later in this chapter. In cases where the required frequency step size is already large, using a fractional-N PLL to produce the same output frequency would call for a very high reference frequency. A reference oscillator with a very high frequency, if available, can be very expensive in practice. Therefore, the use of fractional-N PLLs is avoided in cases where the required step size is large. Only integer-N PLLs will be used throughout this thesis. However, most of the discussion applies equally to both integer-N and fractional-N PLLs.

Figure 2.3: A block diagram of a basic frequency synthesizer PLL
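The effect of a fixed prescaler on the step size of an integer-N PLL can be illustrated with a short sketch. The 1 MHz reference and the division ratios below are hypothetical values chosen only for illustration; the function name is likewise an assumption.

```python
def integer_n_plan(f_ref_hz, prescaler_p, m_values):
    """Output frequencies N * f_ref of an integer-N PLL, with N = P * M."""
    return [prescaler_p * m * f_ref_hz for m in m_values]


F_REF = 1e6  # hypothetical 1 MHz reference

# No prescaler (P = 1): the programmable divider sets N directly.
no_prescaler = integer_n_plan(F_REF, 1, range(900, 903))
# Fixed divide-by-4 prescaler: M steps by one, so N steps by P = 4.
with_prescaler = integer_n_plan(F_REF, 4, range(225, 228))

step_no_prescaler = no_prescaler[1] - no_prescaler[0]        # equals f_ref
step_with_prescaler = with_prescaler[1] - with_prescaler[0]  # equals P * f_ref

print(step_no_prescaler, step_with_prescaler)
```

Both plans start at 900 MHz, but the prescaled plan can only move in steps of $P \times f_{ref}$, which is the resolution penalty described in the text.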

## 2.1.1 PLL Components

A PLL can be treated as a linear system if its loop bandwidth is much smaller than its reference frequency. As shown in [13] and [14], a reference frequency that is at least 10 times the loop bandwidth is considered a safe margin to overlook the sampling effect of the PFD and approximate the PLL as a linear system. In order to analyze the PLL at the system level, one needs to understand the behavior of its main building blocks to model them properly. In this section, we introduce the basic behavior of each component in the PLL system. A more detailed analysis and the circuit implementation will be discussed in Chapter 3.

#### VCO

A VCO generates an output frequency  $\omega_{out} = 2\pi f_{out}$  that is dependent on an input control voltage  $V_{CTRL}$ . The derivative of the output frequency with respect to the control voltage is defined as the VCO gain  $K_{VCO}$ ; that is

$$K_{VCO} = \frac{d\omega_{out}}{dV_{CTRL}} . \tag{2.1}$$

If we assume, for simplicity, that the VCO gain is constant over some control voltage range, then we can obtain the linear relationship

$$\omega_{out} = K_{VCO} \times V_{CTRL} \ . \tag{2.2}$$

The phase is the variable of interest in the PLL system locking process. The signal phase, in general, is the integration of its frequency. Therefore, the output phase  $\phi_{out}(t)$  is related to the output frequency as

$$\phi_{out}(t) = \int \omega_{out}(t) \ dt = K_{VCO} \int_0^t V_{CTRL}(\tau) \ d\tau \ . \tag{2.3}$$

The equivalent Laplace representation of the relationship between the output phase and the control voltage is given as

$$\frac{\phi_{out}(s)}{V_{CTRL}(s)} = \frac{K_{VCO}}{s} . \tag{2.4}$$

It can be seen from Eq. (2.4) that due to its integrating nature the VCO introduces a pole at dc in the PLL system. In general, a PLL has at least one pole at dc. The PLL type is determined by the number of poles it has at dc. A PLL that has only one pole at dc is a type I PLL.

#### PFD/CP

A PFD compares the phases of the reference signal and the divided-down VCO signal and produces an output (usually voltage pulses) that is related to the phase difference  $\Delta \phi_P$  between the two inputs.

A common implementation of the PFD is the tri-state PFD, which produces two outputs (UP and DN) that control the switches of a subsequent circuit called the charge pump (CP). The PFD/CP block diagram implementation is shown in Fig. 2.4. The charge pump consists of two ideally matched current sources that pump charge into or out of the output node. If the reference signal phase is leading the feedback signal phase, the PFD generates pulses at its UP output instructing the CP to source current to the output node in order to accelerate the VCO signal and force it to be in phase with the reference signal. On the other hand, if the feedback signal phase is leading the reference signal, the PFD generates pulses at its DN output instructing the CP to sink current from the output node, forcing the VCO to slow down until the loop is locked in phase. The CP current $I_{CP}$ charges or discharges the output node for a time length $\Delta t_P$ that is proportional to the phase difference between the two inputs. The average output current $\langle i_P \rangle$ over the reference period $T_{ref}$ is

$$\langle i_P \rangle = \frac{\Delta t_P}{T_{ref}} I_{CP} = \frac{\Delta \phi_P}{2\pi} I_{CP} . \tag{2.5}$$

Thus, the PFD/CP can be represented as a fixed gain  $K_P$  that in this case is given as

$$K_P = \frac{\langle i_P \rangle}{\Delta \phi_P} = \frac{I_{CP}}{2\pi} . \tag{2.6}$$

It is important to note that the PFD/CP works as a discrete sampling component. The linearization model is valid only if the loop bandwidth is much smaller than the reference frequency.



Figure 2.4: A block diagram of the PFD/CP

#### Frequency divider

A frequency divider generates an output signal whose frequency is scaled down by some division ratio N compared to the frequency of the input signal. Since the phase is the integral of the frequency, the phase of the output signal will also be divided by N. Therefore, the output phase of the divider $\phi_{div}$ is related to the output phase of the VCO by

$$\frac{\phi_{div}(s)}{\phi_{out}(s)} = \frac{1}{N} \ . \tag{2.7}$$

#### Loop filter

The loop filter converts the CP current pulses into a voltage that controls the VCO output frequency. Thus, it has an impedance transfer function and will be denoted here as Z(s). The loop filter is a low-pass filter that smooths the non-linear pulses to produce a dc voltage that controls the VCO. The loop filter affects the loop dynamics as well as the frequency response of the PLL.

Fig. 2.5 shows two possible implementations of a simple loop filter. More details and comparison between different possible implementations will be discussed in Section 5 of Chapter 3. In the first-order filter shown in Fig. 2.5(a), the capacitor  $C_1$  integrates the CP current and converts it into voltage, while the resistor  $R_1$  introduces a zero that stabilizes the loop. The impedance of this loop filter is given by

$$Z(s) = \frac{1 + sC_1R_1}{sC_1} \ . \tag{2.8}$$

In order to further smooth the ripples of the control voltage, the second-order loop filter shown in Fig. 2.5(b) can be used. The impedance of this loop filter is given by

$$Z(s) = \frac{1 + sC_1R_1}{sC_T} \cdot \frac{1}{1 + a_1(sC_1R_1)} \tag{2.9}$$

where  $a_1 = C_2/C_T$  and  $C_T = C_1 + C_2$ .
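As a sanity check on Eqs. (2.8) and (2.9), the sketch below evaluates both impedances and compares Eq. (2.9) with a direct network calculation. It assumes the usual topology for Fig. 2.5(b), in which $C_2$ shunts the series $R_1$-$C_1$ branch; the component values and function names are hypothetical.

```python
import math


def z_first_order(s, r1, c1):
    """Eq. (2.8): series R1-C1 loop filter impedance."""
    return (1 + s * c1 * r1) / (s * c1)


def z_second_order(s, r1, c1, c2):
    """Eq. (2.9): second-order loop filter impedance."""
    c_t = c1 + c2
    a1 = c2 / c_t
    return (1 + s * c1 * r1) / (s * c_t) / (1 + a1 * s * c1 * r1)


def z_second_order_network(s, r1, c1, c2):
    """Assumed topology: C2 in parallel with the series R1-C1 branch."""
    z_branch = r1 + 1 / (s * c1)
    z_c2 = 1 / (s * c2)
    return z_branch * z_c2 / (z_branch + z_c2)


# Hypothetical component values and a 100 kHz evaluation frequency.
R1, C1, C2 = 10e3, 100e-12, 8e-12
s = 1j * 2 * math.pi * 100e3

z_formula = z_second_order(s, R1, C1, C2)
z_network = z_second_order_network(s, R1, C1, C2)
rel_error = abs(z_formula - z_network) / abs(z_network)
print(abs(z_first_order(s, R1, C1)), abs(z_formula), rel_error)
```

Under that topology assumption, the closed form and the network calculation agree to numerical precision.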

## 2.1.2 Frequency Response of the PLL

After modeling the individual components of the frequency synthesizer PLL, the linearized model of the PLL can be constructed as shown in Fig. 2.6. The loop-gain transfer-function of the PLL is given by

$$LG(s) = \frac{K_P K_{VCO} Z(s)}{N s} . \tag{2.10}$$

The loop-gain transfer-function of the PLL using the first-order loop filter in Fig. 2.5(a) is given by

$$LG(s) = \frac{K_P K_{VCO}}{N} \cdot \frac{1 + sC_1 R_1}{s^2 \cdot C_1} , \qquad (2.11)$$

and the loop-gain transfer-function of the PLL using the second-order loop filter in Fig. 2.5(b) is given by

$$LG(s) = \frac{K_P K_{VCO}}{N} \cdot \frac{1 + sC_1 R_1}{s^2 \cdot C_T} \cdot \frac{1}{1 + a_1(sC_1 R_1)} . \tag{2.12}$$

Note from both Eqs. (2.11) and (2.12) that the PLL has two poles at dc, which makes it a type II PLL where the resistor $R_1$ is used to introduce a zero to ensure the loop stability. Note also that the order of the transfer function in both equations, also known as the PLL order, is one degree higher than the order of the loop filter. Therefore, the PLL described by Eq. (2.11) is a second-order PLL, while the PLL described by Eq. (2.12) is a third-order PLL.

Figure 2.5: Loop filter examples: (a) first-order (b) second-order

Both PLLs in Eqs. (2.11) and (2.12) have two poles ( $\omega_{p_1}$  and  $\omega_{p_2}$ ) at dc and one zero ( $\omega_z$ ) where

$$\omega_z = \frac{1}{R_1 C_1} , \tag{2.13}$$

while the third-order PLL described in Eq. (2.12) has one extra pole ( $\omega_{p_3}$ ) where

$$\omega_{p_3} = \frac{1}{a_1 R_1 C_1} \ . \tag{2.14}$$

The magnitude and phase of the loop-gain of the third-order PLL are shown in Fig. 2.7. As illustrated in the figure, the two poles at dc cause the magnitude to have a slope of -40 dB per decade, and the phase shift to be -180° at low frequency. The zero at $\omega_z$ changes the slope to -20 dB per decade, and more importantly raises the phase shift above -180°. The amount by which the phase shift is above -180° at the cross-over point is defined as the **phase margin** PM, and the cross-over frequency at which the loop-gain magnitude is equal to unity is defined as the **loop bandwidth** ($\omega_c$). The loop bandwidth can be obtained by equating the magnitude of the loop-gain expression in Eq. (2.11) or (2.12) to unity.

The phase margin expression is given as

$$PM = \tan^{-1}\left(\frac{\omega_c}{\omega_z}\right) - \tan^{-1}\left(\frac{\omega_c}{\omega_{p_1}}\right) - \tan^{-1}\left(\frac{\omega_c}{\omega_{p_2}}\right) - \tan^{-1}\left(\frac{\omega_c}{\omega_{p_3}}\right) + 180^\circ \tag{2.15a}$$

$$PM = \tan^{-1} \left(\frac{\omega_c}{\omega_z}\right) - \tan^{-1} \left(\frac{\omega_c}{\omega_{p_3}}\right) . \tag{2.15b}$$



Figure 2.6: Linearized model of the frequency synthesizer PLL

In the case of a second-order PLL  $\omega_{p_3} = \infty$ , and the phase margin expression in Eq. (2.15b) is reduced to the term  $\tan^{-1}(\omega_c/\omega_z)$  only. Thus, the loop bandwidth  $\omega_c$  should be selected high enough to ensure enough phase margin in the loop.

In the case of a third-order PLL, it is advantageous to select  $\omega_c$  that maximizes the phase margin to ensure stability over a wide range of variations in the loop parameters. The maximum phase margin  $(PM_{max})$  is determined by differentiating Eq. (2.15b) with respect to  $\omega_c$  [15]. One can find that the phase margin is maximized when

$$\omega_c = \sqrt{\omega_z \, \omega_{p_3}} = \frac{1}{R_1 C_1 \sqrt{a_1}} , \tag{2.16}$$

which results in maximum phase margin given by

$$PM_{max} = \tan^{-1}\left(\frac{1}{\sqrt{a_1}}\right) - \tan^{-1}(\sqrt{a_1}) = \tan^{-1}\left(\frac{1-a_1}{2\sqrt{a_1}}\right) \tag{2.17}$$

where  $a_1 = C_2/(C_1 + C_2)$ . For example, to design a third-order PLL with a maximum phase margin of 60°,  $a_1$  should be chosen to be 0.072, which results in  $C_2 \approx 0.08C_1$ . More details on higher order PLL design will be provided in Section 5 of Chapter 3.
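The 60° example can be reproduced by inverting Eq. (2.17). The following is a minimal sketch; the function names are illustrative and not part of the original design flow.

```python
import math


def a1_for_phase_margin(pm_deg):
    """Invert Eq. (2.17): find a1 such that PM_max equals pm_deg.

    With x = sqrt(a1) and t = tan(PM_max), (1 - x^2) / (2x) = t gives
    x^2 + 2*t*x - 1 = 0, whose positive root is x = -t + sqrt(t^2 + 1).
    """
    t = math.tan(math.radians(pm_deg))
    x = -t + math.sqrt(t * t + 1)
    return x * x


def pm_max_deg(a1):
    """Eq. (2.17): maximum phase margin in degrees for a given a1."""
    return math.degrees(math.atan((1 - a1) / (2 * math.sqrt(a1))))


a1 = a1_for_phase_margin(60.0)
c2_over_c1 = a1 / (1 - a1)  # from a1 = C2 / (C1 + C2)
print(round(a1, 4), round(c2_over_c1, 4), round(pm_max_deg(a1), 2))
```

The result, $a_1 \approx 0.072$ and $C_2 \approx 0.077 C_1$, is consistent with the rounded value $C_2 \approx 0.08 C_1$ quoted above.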

The closed-loop transfer-function of a general order PLL is given by

$$\frac{\phi_{out}(s)}{\phi_{ref}(s)} = N \cdot \frac{\frac{K_P K_{VCO}}{N s} \cdot Z(s)}{1 + \frac{K_P K_{VCO}}{N s} \cdot Z(s)} . \tag{2.18}$$



Figure 2.7: The magnitude and phase of the loop-gain of the third-order PLL

For a first-order loop filter, the closed-loop transfer-function of the resulting type II second-order PLL is given by

$$\frac{\phi_{out}(s)}{\phi_{ref}(s)} = N \cdot \frac{\frac{K_P K_{VCO}}{N C_1} \cdot (R_1 C_1 s + 1)}{s^2 + \frac{K_P K_{VCO} R_1}{N} s + \frac{K_P K_{VCO}}{N C_1}} . \tag{2.19}$$

Another point of interest in the closed-loop frequency response of a PLL is the transfer-function of the phase error $\phi_e$, where $\phi_e = \phi_{ref} - \phi_{out}/N$. For the same type II second-order PLL, this transfer-function is given by

$$\frac{\phi_e(s)}{\phi_{ref}(s)} = \frac{1}{1 + LG(s)} = \frac{s^2}{s^2 + \frac{K_P K_{VCO} R_1}{N} s + \frac{K_P K_{VCO}}{N C_1}} . \tag{2.20}$$

If a frequency step is applied at the input, that is $f_{ref}(s) = 1/s$ in the Laplace domain, the input phase becomes a ramp, that is $\phi_{ref}(s) = 1/s^2$ in the Laplace domain. The steady-state phase error of a type II second-order PLL can be evaluated using the final value theorem as

$$\phi_e(t=\infty) = \lim_{s \to 0} s\,\phi_e(s) = \lim_{s \to 0} s \cdot \frac{1}{s^2} \cdot \frac{s^2}{s^2 + \frac{K_P K_{VCO} R_1}{N} s + \frac{K_P K_{VCO}}{N C_1}} = 0 .$$

Therefore, the phase error in this example is zero. In general, the absence of steady-state phase error in type II PLLs is due to the existence of two integrators at dc in the loop; one is caused by the phase-integrating nature of the VCO and the second is caused by the capacitor $C_1$ that integrates the CP current. Due to this feature, type II is the most common implementation of PLL in practical IC designs.
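The final-value argument can also be checked numerically by evaluating $s \cdot \phi_{ref}(s) \cdot \phi_e(s)/\phi_{ref}(s)$ for progressively smaller real values of $s$. The sketch below builds the error transfer function directly as $1/(1 + LG(s))$ with the loop gain of Eq. (2.11); all loop parameters are hypothetical.

```python
import math

# Hypothetical loop parameters, for illustration only.
I_CP = 100e-6                      # charge pump current, A
K_P = I_CP / (2 * math.pi)         # Eq. (2.6), A/rad
K_VCO = 2 * math.pi * 500e6        # 500 MHz/V expressed in rad/s/V
N = 100
R1, C1 = 10e3, 100e-12


def phase_error_tf(s):
    """phi_e / phi_ref = 1 / (1 + LG(s)), with LG(s) from Eq. (2.11)."""
    lg = (K_P * K_VCO / N) * (1 + s * C1 * R1) / (s ** 2 * C1)
    return 1 / (1 + lg)


def ramp_error_term(s):
    """s * phi_ref(s) * phi_e(s)/phi_ref(s) for a phase ramp phi_ref = 1/s^2."""
    return s * (1 / s ** 2) * phase_error_tf(s)


# The term shrinks toward zero as s approaches zero.
errors = [abs(ramp_error_term(s)) for s in (1e3, 1.0, 1e-3)]
print(errors)
```

The printed values decrease in proportion to $s$, which is the behavior expected from two integrators in the loop.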

# 2.2 PLL Noise

Ideally, a PLL should produce a pure single tone at the desired frequency the loop is tuned to. However, there are many non-idealities that affect the output signal in practice and result in deviation from the expected clean signal. A typical PLL output signal measurement using a spectrum analyzer shows a frequency spectrum similar to that shown in Fig. 2.8. It can be seen that in addition to the carrier signal the spectrum shows two sidebands that extend around the carrier signal as well as some discrete spurs at different frequencies in the vicinity of the carrier signal.



Figure 2.8: A typical frequency spectrum of a PLL output signal

## 2.2.1 Output Spectrum Noise Representation

If we assume that the PLL has a fixed, limited output amplitude, then the phase fluctuation $\phi_n(t)$ modulates the output signal $v_{out}(t)$ as follows

$$v_{out}(t) = A \cos[\omega_{out}t + \phi_n(t)] \tag{2.21}$$

where  $\omega_{out}$  is the carrier frequency, A is the carrier amplitude, and  $\phi_n(t)$  is the random phase fluctuations from the desired phase  $\omega_{out}t$ .

If we assume that the phase fluctuation is a single tone at a frequency  $\omega_m$ ; that is

$$\phi_n(t) = \phi_p \sin(\omega_m t) \tag{2.22}$$

where  $\phi_p$  is the amplitude of the phase fluctuation, then the output signal is given by substituting Eq. (2.22) in Eq. (2.21) which yields

$$v_{out}(t) = A \cos[\omega_{out}t + \phi_p \sin(\omega_m t)]. \tag{2.23}$$

By applying basic trigonometric properties and assuming a small phase fluctuation ($\phi_p \ll 1$ rad), Eq. (2.23) can be approximated as

$$v_{out}(t) \approx A \left\{ \cos(\omega_{out}t) + \frac{\phi_p}{2} [\cos(\omega_{out} + \omega_m)t - \cos(\omega_{out} - \omega_m)t] \right\} . \tag{2.24}$$

Therefore, a single tone phase fluctuation at frequency $\omega_m$ yields two sidebands at an offset frequency $\omega_m$ from the carrier. The same argument can be applied to the spurious tones where the phase fluctuation term $\phi_n(t)$ is considered a spurious tone at a frequency offset from the carrier. Spurs are usually generated in an integer-*N* PLL at multiples of the reference frequency due to mismatch in the CP or other forms of leakage. On the other hand, fractional-*N* PLLs, due to their inherent fractional division process, produce spurs at fractions of the reference frequency.

Oftentimes, the spectrum analyzer measures the power spectral density (PSD) of the output signal in a 1-Hz bandwidth and normalizes it to the carrier power. The measured value is called the single-sideband (SSB) phase noise or simply the **phase noise** $\mathcal{L} \{\omega_m\}$ [16], which can be expressed as

$$\mathcal{L}\left\{\omega_m\right\} = 10 \ \log\left(\frac{P_{noise}(\omega_{out} + \omega_m)}{P_{carrier}(\omega_{out})}\right) \tag{2.25}$$

where  $P_{carrier}(\omega_{out})$  is the carrier power and  $P_{noise}(\omega_{out} + \omega_m)$  is the noise power in a 1-Hz bandwidth at an offset  $\omega_m$  from the carrier.

The phase noise can be related to the phase fluctuation amplitude  $\phi_p$  by substituting into Eq. (2.25) the corresponding terms from Eq. (2.24), which yields

$$\mathcal{L}\{\omega_m\} = 10 \ \log\left[\frac{\frac{1}{2}(A\phi_p/2)^2}{\frac{1}{2}A^2}\right] = 10 \ \log\left(\frac{\phi_p^2}{4}\right) \ . \tag{2.26}$$

Since  $\phi_{rms}^2 = \frac{1}{2}\phi_p^2$ , we can write

$$\mathcal{L}\left\{\omega_{m}\right\} = 10 \, \log\left(\frac{\phi_{rms}^{2}}{2}\right) = 10 \, \log\left(\phi_{n,out}^{2}\right) \tag{2.27}$$

where $\phi_{rms}$ is the root mean square of the PSD in rad<sup>2</sup>/Hz, also known as the **rms jitter**, and $\phi_{n,out}^2 = \phi_{rms}^2/2$ is the magnitude of the SSB phase noise spectrum.
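The sideband level predicted by Eqs. (2.24) and (2.26) can be checked with a short numerical experiment. The sketch below phase-modulates a unit-amplitude carrier with a single tone and extracts the upper sideband with a direct Fourier sum over one analysis window. The cycle counts, sample count, and $\phi_p = 0.01$ rad are hypothetical choices that keep the small-angle assumption valid.

```python
import math


def upper_sideband_amplitude(phi_p, carrier_cycles=100, mod_cycles=10, n=4096):
    """Amplitude of the upper sideband of cos(w_out*t + phi_p*sin(w_m*t)), A = 1.

    The window holds an integer number of carrier and modulation cycles,
    so a single-bin Fourier sum isolates the component at w_out + w_m.
    """
    k = carrier_cycles + mod_cycles
    re = 0.0
    im = 0.0
    for i in range(n):
        t = i / n
        v = math.cos(2 * math.pi * carrier_cycles * t
                     + phi_p * math.sin(2 * math.pi * mod_cycles * t))
        re += v * math.cos(2 * math.pi * k * t)
        im -= v * math.sin(2 * math.pi * k * t)
    return 2 * math.hypot(re, im) / n


PHI_P = 0.01  # rad, small enough for the approximation in Eq. (2.24)
measured_dbc = 20 * math.log10(upper_sideband_amplitude(PHI_P))
predicted_dbc = 10 * math.log10(PHI_P ** 2 / 4)  # Eq. (2.26)
print(round(measured_dbc, 2), round(predicted_dbc, 2))
```

Both values come out close to -46 dBc, with a small residual difference that grows as $\phi_p$ is increased and the small-angle approximation weakens.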

The **integral rms phase error**  $\sigma_{\phi}$  in degrees over the frequency range from  $\omega_{m,min}$  to  $\omega_{m,max}$  is defined as

$$\sigma_{\phi} = \frac{180^{\circ}}{\pi} \sqrt{\int_{\omega_{m,min}}^{\omega_{m,max}} 2 \ \phi_{n,out}^2(\omega_m) \ d\omega_m} , \qquad (2.28)$$

which corresponds to integral rms phase jitter  $\sigma_t$  (in seconds) given by

$$\sigma_t = \frac{\sigma_\phi}{\omega_{out}} \times \frac{\pi}{180^\circ} \ . \tag{2.29}$$
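Eqs. (2.28) and (2.29) are typically evaluated numerically from a measured phase noise profile. The sketch below uses trapezoidal integration and assumes that the offsets are given in Hz so that they match a PSD quoted per Hz. The flat -100 dBc/Hz profile and the 5 GHz carrier are hypothetical.

```python
import math


def integrated_phase_error_deg(offsets_hz, ssb_dbc_per_hz):
    """Eq. (2.28): integral rms phase error in degrees (trapezoidal rule)."""
    linear = [10 ** (level / 10) for level in ssb_dbc_per_hz]
    area = 0.0
    for i in range(len(offsets_hz) - 1):
        width = offsets_hz[i + 1] - offsets_hz[i]
        # Integrand is 2 * phi_n_out^2, so 2 * (average of endpoints) * width.
        area += (linear[i] + linear[i + 1]) * width
    return math.degrees(math.sqrt(area))


def integrated_jitter_s(sigma_phi_deg, omega_out):
    """Eq. (2.29): integral rms jitter in seconds."""
    return math.radians(sigma_phi_deg) / omega_out


# Hypothetical profile: flat -100 dBc/Hz from 1 kHz to 1 MHz offset.
offsets = [1e3, 1e4, 1e5, 1e6]
profile = [-100.0, -100.0, -100.0, -100.0]

sigma_phi = integrated_phase_error_deg(offsets, profile)
sigma_t = integrated_jitter_s(sigma_phi, 2 * math.pi * 5e9)
print(round(sigma_phi, 3), sigma_t)
```

For this profile the integrated phase error is about 0.81° and the corresponding jitter at 5 GHz is about 0.45 ps.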



Figure 2.9: Conversion of amplitude noise into timing jitter

## 2.2.2 Phase Noise and Timing Jitter

Noise in a PLL originates from different sources; some of which are extrinsic to the PLL components (e.g. power supply and substrate noise, coupling and interference from adjacent circuitry), and some of which are intrinsic such as passive and active devices noise (e.g. thermal noise, flicker noise, shot noise). We will focus here on the intrinsic sources of noise, as the extrinsic sources can be easily isolated and accounted for. In general, noise appears at the output of the PLL as timing jitter in time domain or phase noise in the frequency domain. Based on the nature of the noisy component, the timing jitter can be classified into two types: synchronous (or phase-modulated) jitter, and accumulating (or frequency-modulated) jitter [17, 18].

Synchronous jitter represents the fluctuation in the delay of the transition at the output with respect to the input. It is used with components that are driven by an input such as phase-frequency-detector (PFD), charge pump (CP), frequency dividers, and voltage signal buffers.

To extract the synchronous jitter from driven components, the threshold-crossing event is observed at the output. Assume a noiseless periodic output signal v(t). Because of the noise added to the output, the threshold crossing events are displaced by an amount  $n_v(t)$ . Thus, the noisy output becomes  $v_n(t) = v(t) + n_v(t)$ . In order to convert the displacement in amplitude  $n_v(t)$  into displacement in time (i.e. timing jitter), the slew rate of the periodic signal at the time of the threshold-crossing event is required. This is depicted in Fig. 2.9.

Synchronous jitter is usually characterized by the edge-to-edge jitter  $\sigma_{ee}$  [18] that relates the variance of the noise amplitude  $n_v(t)$  to the slew rate of the periodic signal at the threshold crossing  $t_c$  as

$$\sigma_{ee} = \frac{\sqrt{var\left(n_v(t_c)\right)}}{dv(t_c)/dt} \ . \tag{2.30}$$
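Eq. (2.30) amounts to dividing the rms noise voltage by the slew rate at the threshold crossing. A minimal sketch follows; the 1 mV rms noise and 1 V/ns slew rate are hypothetical, and the absolute value is added here only so that falling edges give a positive jitter.

```python
import math


def edge_to_edge_jitter(noise_variance_v2, slew_rate_v_per_s):
    """Eq. (2.30): rms noise voltage divided by the slew rate magnitude."""
    return math.sqrt(noise_variance_v2) / abs(slew_rate_v_per_s)


# Hypothetical values: 1 mV rms noise on an edge slewing at 1 V/ns.
sigma_ee = edge_to_edge_jitter((1e-3) ** 2, 1e9)
print(sigma_ee)  # seconds
```

With these values the edge-to-edge jitter is about 1 ps, and doubling the slew rate halves it.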

On the other hand, accumulating jitter represents the uncertainty of the output transition with respect to a previous output transition. It is usually used with autonomous components that do not have a driving input, such as the reference oscillator and the voltage-controlled oscillator (VCO). Accumulating jitter is usually characterized by the cycle (or period) jitter $\sigma_c$ [19]; that is, the standard deviation of the time period $T_n$ about the average time period $T_{avg}$, given by

$$\sigma_c = \sqrt{\lim_{N \to \infty} \left[ \frac{1}{N} \cdot \sum_{n=1}^{N} (T_n - T_{avg})^2 \right]}. \tag{2.31}$$

Oscillator noise is usually viewed in the frequency domain as phase noise around the oscillation frequency. In the absence of flicker noise, the cycle jitter and the phase noise are related as

$$\sigma_c = \sqrt{\phi_n^2(\omega_m) \frac{\omega_m^2}{\omega_o^3}} \tag{2.32}$$

where  $\phi_n^2(\omega_m)$  is the phase noise magnitude in rad<sup>2</sup>/Hz at an offset frequency  $\omega_m$  from the oscillation frequency  $\omega_o$  [18,19].

## 2.2.3 Noise Representation in PLL Linearized Model

Each PLL component contributes to the total noise at the output in a different way. Using the PLL linearized model, the noise generated by each component can be added at the output of each component as shown in Fig. 2.10. More often than not, the noise generated by the PFD/CP, frequency divider, and the loop filter is referred back to the input as one combined noise source, while the VCO noise is referred to the output as shown in the linearized PLL model in Fig. 2.11. The transfer function  $H_{in}(s)$  from the input referred noise to the output of the PLL is given by

$$H_{in}(s) = \frac{\phi_{n,out}(s)}{\phi_{n,in}(s)} = N \frac{\frac{K_{VCO}K_P}{N}Z(s)}{s + \frac{K_{VCO}K_P}{N}Z(s)} \tag{2.33}$$



Figure 2.10: Different noise sources in a PLL linearized model

where $K_{VCO}$ is the VCO gain, $K_P$ is the charge pump gain, N is the frequency divider ratio, and Z(s) is the loop filter impedance. The input-referred noise is low-pass filtered as implied by Eq. (2.33), and appears amplified by the divider ratio at the output for frequencies below the loop bandwidth. Thus, it is called **in-band noise**.

Given the loop gain LG(s) expression in Eq. (2.10), we can rewrite

$$H_{in}(s) = N \frac{LG(s)}{1 + LG(s)}. \tag{2.34}$$

The transfer function  $H_{VCO}(s)$  from the VCO noise to the output of the PLL is given by

$$H_{VCO}(s) = \frac{\phi_{n,out}(s)}{\phi_{n,VCO}(s)} = \frac{s}{s + \frac{K_{VCO}K_P}{N}Z(s)}. \tag{2.35}$$

The VCO noise is high-pass filtered as implied by Eq. (2.35), and dominates the **out-of-band noise**; that is, the noise at frequencies higher than the loop bandwidth.

Rewriting the transfer function  $H_{VCO}(s)$  in terms of the loop gain LG(s), we can write

$$H_{VCO}(s) = \frac{1}{1 + LG(s)}. \tag{2.36}$$

It is important to note here that  $H_{in}(s)$  is a low-pass filter while  $H_{VCO}(s)$  is a high-pass filter; and both transfer functions have the same 3-dB bandwidth which is determined by LG(s).
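The low-pass and high-pass behavior of Eqs. (2.34) and (2.36) can be seen by evaluating both transfer functions well below and well above the loop bandwidth. The sketch below uses the third-order loop gain of Eq. (2.12) with hypothetical loop parameters; the dictionary and function names are assumptions made for illustration.

```python
import math

# Hypothetical loop parameters, chosen only for illustration.
LOOP = {
    "k_p": 100e-6 / (2 * math.pi),   # I_CP / (2*pi) with I_CP = 100 uA
    "k_vco": 2 * math.pi * 500e6,    # 500 MHz/V in rad/s/V
    "n": 100,
    "r1": 10e3,
    "c1": 100e-12,
    "c2": 8e-12,
}


def loop_gain(s, k_p, k_vco, n, r1, c1, c2):
    """Eq. (2.12): loop gain with the second-order loop filter."""
    c_t = c1 + c2
    a1 = c2 / c_t
    z = (1 + s * c1 * r1) / (s * c_t) / (1 + a1 * s * c1 * r1)
    return k_p * k_vco * z / (n * s)


def h_in(s, loop):
    """Eq. (2.34): input-referred noise to output."""
    lg = loop_gain(s, **loop)
    return loop["n"] * lg / (1 + lg)


def h_vco(s, loop):
    """Eq. (2.36): VCO noise to output."""
    return 1 / (1 + loop_gain(s, **loop))


s_low = 1j * 2 * math.pi * 1e2    # far below the loop bandwidth
s_high = 1j * 2 * math.pi * 1e9   # far above the loop bandwidth

h_in_low, h_in_high = abs(h_in(s_low, LOOP)), abs(h_in(s_high, LOOP))
h_vco_low, h_vco_high = abs(h_vco(s_low, LOOP)), abs(h_vco(s_high, LOOP))
print(h_in_low, h_in_high, h_vco_low, h_vco_high)
```

At low offsets the in-band noise gain approaches N while the VCO noise is strongly suppressed; at high offsets the roles are reversed.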



Figure 2.11: Noise sources in a PLL linearized model combined into two main sources



Figure 2.12: Contribution of noise components at the output of the PLL

## 2.2.4 Noise Optimization in PLL Design

The variance of the jitter at the output of the PLL is related to the area under the phase noise spectrum and is given (in  $rad^2$ ) as

$$\sigma_{\phi,PLL}^2 = \int_0^\infty 2 \ \phi_{n,out}^2(\omega_m) \ d\omega_m . \tag{2.37}$$

The total phase noise at the output of the PLL is a combination of the in-band noise scaled by the transfer function  $H_{in}(s)$  and the out-of-band noise (i.e. VCO noise) scaled by the transfer function  $H_{VCO}(s)$ , as shown in Fig. 2.12. Therefore, the variance of the jitter at the output of the PLL is equal to the summation of the variance of the jitter at the output of the PLL due to in-band noise and the variance of the jitter at the output of the PLL due to VCO noise [20], i.e.

$$\sigma_{\phi,PLL}^2 = \sigma_{\phi,in}^2 + \sigma_{\phi,VCO}^2 \ . \tag{2.38}$$

Assume a given open-loop gain  $LG_0(s)$  with an arbitrary 3-dB bandwidth  $\omega_{c,0}$  that results in a closed-loop in-band transfer function  $H_{in,0}(s)$  where

$$H_{in}(s) = H_{in,0}\left(s \cdot \frac{\omega_{c,0}}{\omega_c}\right) \tag{2.39}$$

and a closed-loop VCO noise transfer function  $H_{VCO,0}(s)$  where

$$H_{VCO}(s) = H_{VCO,0}\left(s \cdot \frac{\omega_{c,0}}{\omega_c}\right) . \tag{2.40}$$

The variance of the jitter at the output of the PLL due to in-band noise is given by

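$$\sigma_{\phi,in}^2 = \int_0^\infty 2 \ \phi_{n,in}^2(\omega_m) |H_{in}(j\omega_m)|^2 \ d\omega_m \tag{2.41}$$

$$= \int_0^\infty 2 \ \phi_{n,in}^2(\omega_m) \left|H_{in,0}\left(j\omega_m \cdot \frac{\omega_{c,0}}{\omega_c}\right)\right|^2 \ d\omega_m \tag{2.42}$$

$$= 2 \phi_{n,in}^2(\omega_m) \left(\frac{\omega_c}{\omega_{c,0}}\right) \int_0^\infty |H_{in,0}(j\omega_m)|^2 \ d\omega_m \tag{2.43}$$

where $s = j\omega_m$. The last step follows from a change of the integration variable and assumes that the input-referred noise $\phi_{n,in}^2$ is flat over the frequency range of interest, so that it can be taken outside the integral. Thus,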

$$\sigma_{\phi,in}^2 = 2N^2 \phi_{n,in}^2(\omega_m) \left(\frac{\omega_c}{\omega_{c,0}}\right) \int_0^\infty \left|\frac{LG_0(s)}{1 + LG_0(s)}\right|^2 \ d\omega_m . \tag{2.44}$$

Similarly, the variance of the jitter at the output of the PLL due to VCO noise is given by

$$\sigma_{\phi,VCO}^2 = \int_0^\infty 2 \,\phi_{n,VCO}^2(\omega_m) |H_{VCO}(j\omega_m)|^2 \, d\omega_m \,. \tag{2.45}$$

If the  $1/f^2$  region at the output spectrum is dominant, the VCO phase noise can be expressed as

$$\phi_{n,VCO}^2(\omega_m) = \phi_{n,VCO}^2(\omega_r) \cdot \left(\frac{\omega_r^2}{\omega_m^2}\right) \tag{2.46}$$



Figure 2.13: Selection of optimum  $f_c$  to minimize total PLL jitter

where  $\phi_{n,VCO}^2(\omega_r)$  is the measured VCO noise at a certain offset frequency  $\omega_r$ . Substituting Eqs. (2.40) and (2.46) into Eq. (2.45), we can write

$$\sigma_{\phi,VCO}^2 = \int_0^\infty 2 \ \phi_{n,VCO}^2(\omega_r)\omega_r^2 \left|H_{VCO,0}\left(j\omega_m \cdot \frac{\omega_{c,0}}{\omega_c}\right)\right|^2 \ \frac{d\omega_m}{\omega_m^2} \tag{2.47}$$

$$=2 \phi_{n,VCO}^2(\omega_r)\omega_r^2\left(\frac{\omega_{c,0}}{\omega_c}\right) \int_0^\infty |H_{VCO,0}(j\omega_m)|^2 \ \frac{d\omega_m}{\omega_m^2}. \tag{2.48}$$

Thus,

$$\sigma_{\phi,VCO}^2 = 2 \phi_{n,VCO}^2(\omega_r)\omega_r^2\left(\frac{\omega_{c,0}}{\omega_c}\right) \int_0^\infty \left|\frac{1}{s\left[1 + LG_0(s)\right]}\right|^2 d\omega_m. \tag{2.49}$$

The results in Eqs. (2.44) and (2.49) imply that increasing the 3-dB bandwidth increases the output jitter due to the in-band noise, whereas decreasing the 3-dB bandwidth increases the output jitter due to VCO noise. Since the first contribution is proportional to $\omega_c$ and the second is proportional to $1/\omega_c$, setting the derivative of their sum with respect to $\omega_c$ to zero shows that the optimum 3-dB bandwidth at which the total output jitter is minimized [20] is

$$\omega_{c,opt} = \omega_{c,0} \sqrt{\frac{\phi_{n,VCO}^2(\omega_r)\omega_r^2}{N^2\phi_{n,in}^2(\omega_m)} \cdot \frac{\int_0^\infty \left|\frac{1}{s[1+LG_0(s)]}\right|^2 d\omega_m}{\int_0^\infty \left|\frac{LG_0(s)}{1+LG_0(s)}\right|^2 d\omega_m}} . \tag{2.50}$$

Substituting Eq. (2.50) with typical values of  $LG_0(s)$  and  $f_{c,0}$  into Eq. (2.46), it can be shown [20] that

$$\phi_{n,VCO}^2(\omega_{c,opt}) \simeq N^2 \phi_{n,in}^2(\omega_m) \tag{2.51}$$

which implies that in order to minimize the jitter at the output of the PLL, the optimum 3-dB bandwidth  $\omega_{c,opt}$  must be selected approximately where the spectrum of the in-band noise (scaled by  $N^2$ ) and the spectrum of the VCO noise intersect, as shown in Fig. 2.13.

Furthermore, by substituting Eq. (2.50) into Eqs. (2.44) and (2.49), we obtain

$$\sigma_{\phi,in}^2 = \sigma_{\phi,VCO}^2 = \sigma_{\phi,PLL}^2 / 2 , \qquad (2.52)$$

which means that designing a PLL with optimum 3-dB bandwidth  $\omega_{c,opt}$  results in equal contribution to the output jitter from the in-band noise and the VCO noise.
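The trade-off can be sketched by lumping everything in Eq. (2.44) except $\omega_c$ into one coefficient and everything in Eq. (2.49) except $1/\omega_c$ into another. The names `a_in` and `b_vco` and their values are hypothetical and serve only to illustrate the shape of the optimization.

```python
import math


def jitter_variances(omega_c, a_in, b_vco):
    """In-band variance a_in * omega_c and VCO variance b_vco / omega_c.

    a_in and b_vco are lumped coefficients standing in for the
    bandwidth-independent factors of Eqs. (2.44) and (2.49).
    """
    return a_in * omega_c, b_vco / omega_c


def optimum_bandwidth(a_in, b_vco):
    """Minimizer of a_in * omega_c + b_vco / omega_c."""
    return math.sqrt(b_vco / a_in)


A_IN, B_VCO = 1e-10, 1e2  # hypothetical lumped coefficients

omega_opt = optimum_bandwidth(A_IN, B_VCO)
var_in, var_vco = jitter_variances(omega_opt, A_IN, B_VCO)
total_opt = var_in + var_vco
total_narrow = sum(jitter_variances(omega_opt / 2, A_IN, B_VCO))
total_wide = sum(jitter_variances(omega_opt * 2, A_IN, B_VCO))
print(omega_opt, var_in, var_vco, total_narrow, total_wide)
```

At the optimum the two contributions are equal, as stated in Eq. (2.52), and moving the bandwidth in either direction increases the total.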

# 2.3 System-Level Design

The PLL design is performed at two different levels: system level and transistor level. The PLL is often a part of a larger system. At the system level, the general behavior of the PLL and its components is decided and described using behavioral models. Next, if the performance is deemed satisfactory, each component will be designed at the transistor level.

## 2.3.1 PLL Design Specifications

Based on the end-application or the communication standards that need to be covered, a frequency synthesizer PLL must meet certain specifications such as frequency range, channel spacing, phase noise, jitter, output spurs, transient behavior, and power consumption. To obtain these specifications, the system-level requirements need to be mapped into circuit parameters.

#### Frequency range and channel spacing

A frequency synthesizer must cover the entire frequency range allocated for the communication standard of the application. It should also allow switching between the channels within that frequency range with a step size equal to the channel spacing in that frequency range. These specifications are obtained directly from the system-level frequency planning dictated by the standard. For example, a receiver for GSM-900 standard must cover the frequency range 935-960 MHz with channel spacing of 200 kHz [21].



Figure 2.14: Effect of phase noise on the output in the presence of large interferers

#### Phase noise and jitter

The frequency synthesizer PLL should provide a pure tone at the desired LO frequency to mix with the RF input signal and downconvert it to IF as depicted by Fig. 2.14. Because of the phase noise of the LO, if an interferer (or a blocker) that is $A_b$ dB above the desired signal exists in the vicinity of the input RF signal, the skirt of the LO phase noise will downconvert that interferer to IF as well. The system-level specifications dictate that the ratio of the power of the downconverted interferer due to the LO phase noise, i.e. $P_b$, to the power of the desired channel, i.e. $P_s$, must be less than a certain value. Assuming that the loop bandwidth is smaller than the channel spacing, which is often the case, the phase noise profile of the PLL is approximated by $C/f^2$ where C is a constant [22]. To obtain the value of C, a large interferer is applied at the adjacent channel center frequency; that is, at $f_{min}$ and $f_{max}$ frequency offsets from the lower and upper limits of the desired channel, respectively, as shown in Fig. 2.14. Thus, the ratio of the downconverted phase noise power to the desired channel power is given by

$$\frac{P_b}{P_s} = a \int_{f_{min}}^{f_{max}} \frac{C}{f^2} df \tag{2.53}$$

where  $a = 10^{A_b/10}$ .

Solving for the constant C and assuming the existence of n blockers, one can write the phase noise expression of the PLL in the  $1/f^2$  region as

$$\mathcal{L}\{f\} = 10 \log \left[ \frac{P_b/P_s}{\sum_{i=1}^n a_i \left(\frac{1}{f_{min_i}} - \frac{1}{f_{max_i}}\right)} \cdot \frac{1}{f^2} \right]. \tag{2.54}$$

For example, the IEEE 802.11a/g standard requires a dual-band receiver at 2.4 GHz and 5 GHz with channel spacing of 20 MHz [22]. The receiver is tested with two blockers; one is 16 dB above the desired channel at 20 MHz offset and the second is 32 dB above the desired channel at 40 MHz offset from the center of the desired channel. The constant C that multiplies $1/f^2$ in Eq. (2.54) can be calculated by substituting $a_1 = 40$ ($A_{b_1} = 16$ dB), $f_{min_1} = 10$ MHz and $f_{max_1} = 30$ MHz for the first blocker, and $a_2 = 1585$ ($A_{b_2} = 32$ dB), $f_{min_2} = 30$ MHz and $f_{max_2} = 50$ MHz for the second blocker. For $P_b/P_s$ of less than -20 dB, Eq. (2.54) is given by

$$\mathcal{L}\left\{f\right\} = 10\log\left(\frac{420}{f^2}\right). \tag{2.55}$$

Thus, we conclude that the PLL must have phase noise less than -94 dBc/Hz at 1 MHz offset from the carrier, assuming the corner frequency of the VCO is below 1 MHz.
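The arithmetic behind this example can be reproduced with a short script. The following is a minimal sketch of Eq. (2.54), assuming the blocker levels and offsets quoted above; the function and variable names are ours and are introduced only for illustration.

```python
import math

def phase_noise_constant(pb_over_ps_db, blockers):
    """Constant C of the C/f^2 phase noise profile, following Eq. (2.54).

    pb_over_ps_db : allowed P_b/P_s in dB
    blockers      : list of (A_b in dB, f_min in Hz, f_max in Hz)
    """
    ratio = 10.0 ** (pb_over_ps_db / 10.0)
    denominator = sum(
        10.0 ** (a_db / 10.0) * (1.0 / f_min - 1.0 / f_max)
        for a_db, f_min, f_max in blockers
    )
    return ratio / denominator

def phase_noise_dbc_hz(c, f_offset):
    """Phase noise in dBc/Hz at offset f for the profile C/f^2."""
    return 10.0 * math.log10(c / f_offset ** 2)

# Two blockers of the IEEE 802.11a/g example in the text.
blockers_80211 = [(16.0, 10e6, 30e6), (32.0, 30e6, 50e6)]
C = phase_noise_constant(-20.0, blockers_80211)
spec_1mhz = phase_noise_dbc_hz(C, 1e6)
print(f"C = {C:.0f}, L(1 MHz) = {spec_1mhz:.1f} dBc/Hz")
```

The script yields C close to 420 and a requirement of roughly -94 dBc/Hz at 1 MHz offset, in agreement with Eq. (2.55).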

In addition, to reduce the effect of noise on the signal constellation, the rms jitter must be less than a certain value. For example, the IEEE 802.11a/g standard requires that the integrated jitter remain less than 1°. Both phase noise and rms jitter requirements must be achieved simultaneously.

#### Output spurs

Another source of signal corruption in a frequency synthesizer PLL is the output spurs, which can downconvert undesirable signals in adjacent channels. For a blocker that is $A_b$ dB higher than the desired channel at an offset frequency $f$, to achieve a spur-power to desired-channel power ratio of $P_{spur}/P_s$, the synthesizer output spurs (in dB) at $f$ must satisfy

$$A_{spur} \le \left(P_{spur}/P_s\right)|_{dB} - A_b. \tag{2.56}$$
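As a quick numerical illustration of Eq. (2.56), the sketch below evaluates the spur limit for one blocker. The -20 dB target for $P_{spur}/P_s$ is an assumed value chosen for illustration and is not a figure from the standard.

```python
def max_spur_level_db(spur_to_signal_db, blocker_db):
    """Largest allowed synthesizer spur (dB relative to the carrier), Eq. (2.56)."""
    return spur_to_signal_db - blocker_db

# Assumed numbers: a blocker 32 dB above the desired channel and a
# hypothetical target of -20 dB for P_spur/P_s.
limit_db = max_spur_level_db(spur_to_signal_db=-20.0, blocker_db=32.0)
print(f"spur at the blocker offset must stay below {limit_db:.0f} dBc")
```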

#### Locking time

The speed of a frequency synthesizer is often characterized by the locking time. The locking time is defined as the time it takes the frequency synthesizer in the unlocked state to enter the locked state and settle at the desired carrier frequency. For an approximated second-order PLL [15], the locking time to within a fraction $\alpha$ of the final frequency value in response to a small change $\Delta n$ in the divider ratio N can be approximated as

$$T_L \simeq \frac{1}{\zeta \omega_n} \ln\left(\frac{\Delta n}{N|\alpha|\sqrt{1-\zeta^2}}\right) \tag{2.57}$$

where  $\zeta$  is the second order damping factor, and  $\omega_n$  is the loop natural frequency which is directly related to the loop bandwidth  $\omega_c$ . In general, the larger the loop bandwidth of a PLL, the shorter its locking time.
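The dependence of the locking time on the loop parameters can be explored numerically. The following is a minimal sketch of Eq. (2.57), assuming an underdamped loop ($0 < \zeta < 1$) so that the square root is real; the numerical values are illustrative and are not taken from a design in this thesis.

```python
import math

def locking_time(zeta, omega_n, delta_n, n_div, alpha):
    """Approximate locking time T_L of Eq. (2.57), in seconds.

    zeta    : damping factor, restricted here to 0 < zeta < 1
    omega_n : loop natural frequency (rad/s)
    delta_n : change in the divider ratio
    n_div   : divider ratio N
    alpha   : allowed fractional frequency error at lock
    """
    if not 0.0 < zeta < 1.0:
        raise ValueError("this sketch uses Eq. (2.57) only for 0 < zeta < 1")
    arg = delta_n / (n_div * abs(alpha) * math.sqrt(1.0 - zeta ** 2))
    return math.log(arg) / (zeta * omega_n)

# Illustrative values only.
omega_n = 2.0 * math.pi * 100e3
t_lock = locking_time(zeta=0.707, omega_n=omega_n, delta_n=1, n_div=100, alpha=1e-6)
t_lock_fast = locking_time(zeta=0.707, omega_n=2.0 * omega_n, delta_n=1, n_div=100, alpha=1e-6)
print(f"T_L = {t_lock * 1e6:.1f} us, with doubled omega_n: {t_lock_fast * 1e6:.1f} us")
```

Doubling $\omega_n$ halves the result, which is the inverse relationship between loop bandwidth and locking time stated above.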

#### Power consumption

While achieving the desired performance, a frequency synthesizer PLL is required to have minimum power consumption. This is particularly important for portable devices where battery lifetime is a great concern. Minimizing the power consumption in a PLL can be in conflict with other desirable performance metrics such as noise and frequency tuning range [23]. Careful optimization of the design is required to concurrently achieve all the targeted specifications.

#### 2.3.2 PLL Behavioral Modeling

When designing a large system with a complex set of parameters, as is the case in PLL design, transistor-level simulations of the entire system using SPICE are too slow and often impractical to verify some performance metrics. Therefore, a top-down design approach is often adopted instead. In this approach, the system is constructed of several building blocks with adjustable parameters that allow the overall system optimization using a system simulator such as Simulink. Next, the specifications of the individual building blocks are derived and the transistor-level circuits are designed and laid out. Finally, the constructed system is simulated, if possible, at the transistor level to verify functionality and/or performance.

It is very advantageous to verify the system functionality and predict its performance from the system level using behavioral models prior to transistor-level design of the individual blocks. This approach ensures that the system will perform as expected if the individual blocks are designed as modeled at the system level. This also helps circumvent the need to overdesign the individual blocks or, even worse, redesign the blocks if the overall system does not meet the targeted specifications. If it turns out that the specifications of a certain individual block are too stringent to achieve, one can return to the system level and modify the block model to ensure that the specifications are achievable. Therefore, the models should be amenable to updates and changes that come late in the design process. This allows studying and understanding the impact of such changes on the entire system, and can provide insight into possible remedies. The top-down design approach is not meant to dispose of transistor-level simulations entirely, but to reduce the overhead cost of lengthy simulations to verify each performance metric, thus increasing the effectiveness of the design process. The transistor-level simulations can still be used selectively in some cases such as verifying functionality, ensuring proper interface between some building blocks, and checking start-up conditions.

The ability to use high-level behavioral models and transistor-level circuits interchangeably as design blocks is an asset. Verilog-A is a widely used language that allows combining the two representations in analog circuit design [24]. Analog components can be described at a behavioral level in Verilog-A using a set of concise statements with adjustable parameters. Models of analog components in Verilog-A will be used frequently throughout this thesis. Therefore, an introduction to the basics of the language is given here to familiarize the reader with it. More details regarding each individual block will be given each time we present a new model.

An example of an analog component described in Verilog-A is shown in Listing 2.1.

```
`include "disciplines.vams"      // Line 1
module resistor (p, n);          // Line 2
parameter real r = 0;            // Line 3: resistance value in ohms
inout p, n;                      // Line 4
electrical p, n;                 // Line 5
analog                           // Line 6
    V(p,n) <+ r * I(p,n);        // Line 7
endmodule                        // Line 8
```

Listing 2.1: Linear resistor behavioral model in Verilog-A

This simple model represents a linear resistor r whose behavior is governed by the relationship between the voltage across its terminals V(p, n) and the current flowing through the resistor I(p, n), which is given by

$$V(p,n) = r \times I(p,n) . \tag{2.58}$$

In this model, **Line 1** instructs the compiler to include a file of disciplines. A discipline in the Verilog-A language is a collection of related signal types. An example is the *electrical* discipline, which contains voltages and currents. **Line 2** defines the basic building block, called a *module*, by listing its name, *resistor*, and its ports, p and n. Ports are the connection points of the component. In **Line 3**, a *parameter* r that represents the resistor value is given the type *real* and the default value of zero. Using the parameter statement allows the value to be specified when the module is instantiated. **Line 4** specifies the resistor port type as *inout*, that is, bidirectional, as opposed to the unidirectional types *input* and *output*. The definition of p and n as *electrical* in **Line 5** dictates that the signals at the ports are voltages or currents. The keyword *analog* in **Line 6** precedes an analog process that describes a continuous-time relationship in the statement to come. The module behavior is described in **Line 7**. The operator <+ signifies a *contribution statement* that forces the expression on the left side of the operator to be continuously equal to the expression on the right side. Finally, the *endmodule* keyword in **Line 8** terminates the module.

A more illustrative example of a Nyquist analog-to-digital converter (ADC) is shown in Listing 2.2, where the input of the ADC is a continuous-time analog signal and the converted output is a set of scaled quantized binary bits produced at each clock edge.

In this model, a bus [0:bits-1] out is used to represent the output bits. The *genvar* integer variable is a special variable used as an index in *for* loops. The model introduces two important functions, *cross* and *transition*, that will be used frequently in the models that will be built in the chapters to come. The *cross* function initiates the conversion process each time the first argument crosses zero in the direction specified in the second argument; that is, when the clock edge makes a transition and crosses the threshold in the positive direction specified by +1 in the second argument. Using -1 in the second argument creates the event when the first argument crosses zero in the negative direction, while using zero in the second argument creates an event when the first argument crosses zero in either direction. The conversion is carried out in a *for* loop that compares each sampled input with the midpoint of the full-scale value, and produces the corresponding binary output. The *transition* function takes the binary constant values and generates a smooth waveform at the output to avoid discontinuity due to the abrupt changes in the binary values. The constant values go in the first argument of the *transition* function, while the second argument represents the transition delay. The third argument represents the transition time; that is, the rise time and fall time. The fourth argument is optional and is used when the rise time is not equal to the fall time. If used, the fourth argument represents the fall time, and the third argument then represents the rise time only. It is important to notice that the *transition* function maintains a history of the arguments it uses; that is, the output at each event depends on the previous input to the function.

```
`include "disciplines.vams"

module adc (out, in, clk);
    parameter integer bits = 8 from [1:24];  // resolution in bits
    parameter real fullscale = 1.0;          // input range 0 to fullscale (V)
    parameter real td = 0;                   // delay from clock to output (s)
    parameter real tt = 0;                   // transition time of output (s)
    parameter real vdd = 5.0;                // voltage level of logic 1 (V)
    parameter real thresh = vdd/2;           // logic threshold level (V)
    parameter integer dir = 1;               // +1/-1 for rising/falling edge
    input in, clk;
    output [0:bits-1] out;
    voltage in, clk;
    voltage [0:bits-1] out;
    real sample, midpoint;
    integer result [0:bits-1];
    genvar i;

    analog begin
        @(cross(V(clk) - thresh, +1) or initial_step) begin
            sample = V(in);
            midpoint = fullscale / 2.0;
            for (i = bits - 1; i >= 0; i = i - 1) begin
                if (sample > midpoint) begin
                    result[i] = vdd;
                    sample = sample - midpoint;
                end else begin
                    result[i] = 0.0;
                end
                sample = 2.0 * sample;
            end
        end
        for (i = 0; i < bits; i = i + 1) begin
            V(out[i]) <+ transition(result[i], td, tt);
        end
    end
endmodule
```

Listing 2.2: A behavioral model of an analog-to-digital converter in Verilog-A
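The conversion loop of Listing 2.2 can be checked outside a circuit simulator. The sketch below mirrors the compare, subtract, and double steps of the *for* loop in plain Python, assuming that each bit is stored as 0 or 1 rather than as a voltage level; the helper names are ours and are not part of the Verilog-A model.

```python
def adc_convert(sample, bits=8, fullscale=1.0):
    """Successive comparison against the midpoint, as in the Listing 2.2 loop.

    Returns a list in which index bits-1 holds the most significant bit.
    """
    midpoint = fullscale / 2.0
    result = [0] * bits
    for i in range(bits - 1, -1, -1):
        if sample > midpoint:
            result[i] = 1            # Listing 2.2 stores vdd here
            sample -= midpoint
        else:
            result[i] = 0
        sample *= 2.0                # scale the residue for the next bit
    return result

def bits_to_code(result):
    """Integer output code, with result[0] as the least significant bit."""
    return sum(bit << i for i, bit in enumerate(result))

code = bits_to_code(adc_convert(0.3, bits=4))
print(code)
```

With a 4-bit resolution and a full scale of 1 V, an input of 0.3 V maps to code 4, that is, the input lies between 4/16 and 5/16 of the full scale.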

## 2.4 Transistor-Level Design

The PLL is a mixed-signal system that consists of a combination of analog and digital circuits. While the design of the digital part of the PLL can be automated, the analog counterpart is often the bottleneck of the design [25]. In digital design, the MOS transistor operates as a switch to render what is defined as "0" logic level or "1" logic level. The Boolean logic abstraction, determined by well-defined noise margins, allows the design of digital circuits using hardware description languages (HDLs). The complexity of the transistor level is circumvented and the design is carried out at higher abstraction levels (i.e., gates and registers). In contrast, analog circuits often operate between the voltage limits defined by the aforementioned logic levels. Analog circuit design is performed at the transistor level, where each transistor is biased and sized to realize a desired performance. This involves a lot of tweaking and redesign, which increases the time-to-market and consequently the cost of creating the final design.

In this section, we present an approach to fully characterize a transistor in any CMOS technology using a set of dimension-independent parameters. The proposed approach creates a layer of abstraction that circumvents the SPICE transistor model complexity and replaces it with a set of normalized parameters. The approach allows for developing a unilateral sequence of steps to design analog circuits without further iteration or adjustment. The proposed approach combined with the designer's knowledge of the circuit operation facilitates the realization of a procedural methodology to solve for near optimal, if not optimal, circuit behavior. The required small and large-signal behavior is achieved, while the dc biasing at each node in the circuit is defined at the same time. The proposed approach is utilized to characterize NMOS and PMOS transistors in a 65 nm CMOS technology which will be used throughout this thesis.

#### 2.4.1 Evolution of MOS Transistor Modeling

Traditionally, long-channel devices were characterized in their strong inversion saturation region by the well-known square-law model [26] as

$$I_D = \frac{1}{2}\mu C_{ox} \frac{W}{L} (V_{GS} - V_{TH})^2 (1 + \lambda V_{DS}) \tag{2.59}$$

where $\mu$ is the electron or hole mobility, $C_{ox}$ is the oxide capacitance per unit area, and $\lambda$ is the channel-length modulation factor. This equation gives good insight into the behavior of the MOS transistor. However, the square-law model is a very simplified form of the relationship between the current and the voltages at the MOS transistor terminals. Many second-order effects are neglected in order to arrive at this compact equation. As technology scaled down, the previously neglected non-idealities became more pronounced, which resulted in large deviations from the results predicted by the model. Attempts to include the effect of some non-idealities in the square-law model, such as velocity saturation and mobility degradation, have been reported [27], [28]. However, further shrinking of the technology aggravated the effect of these second-order non-idealities and rendered these attempts futile. In addition, the square-law model and its modified forms assume a sharp transition between the triode and saturation regions where strong inversion occurs. However, transistors often need to operate at the edge of saturation for optimal performance, where the transition is in reality smooth and continuous. An attempt to incorporate the continuous transition between the two regions along with the MOS transistor non-idealities was presented in [29, 30]. A single equation that describes the MOSFET behavior is given by

$$I_D = I_Z \left[ \ln^2 \left(1 + e^{\frac{V_P - V_{SB}}{2\phi_t}}\right) - \ln^2 \left(1 + e^{\frac{V_P - V_{DB}}{2\phi_t}}\right) \right], \tag{2.60}$$

where $I_Z$ and $n$ are given by

$$I_Z = 2\mu C_{ox} \frac{W}{L} n \phi_t^2 ,$$

$$n = \left[ 1 - \frac{\gamma}{2\sqrt{V_{GB} - V_{TH_0} + \left(\frac{\gamma}{2} + \sqrt{2\phi_F}\right)^2}} \right]^{-1} ,$$

where $\phi_t$ is the thermal voltage kT/q, $V_P$ is the pinch-off voltage given by $(V_{GB} - V_{TH_0})/n$, and all other symbols have their usual meanings.
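To see how the single-equation model relates to the square law, Eq. (2.60) can be evaluated numerically. The sketch below is a rough illustration under stated assumptions: the term $(\gamma/2 + \sqrt{2\phi_F})$ is taken to enter the slope factor squared, as dimensional consistency requires, and all process constants are hypothetical values chosen only to exercise the formula.

```python
import math

PHI_T = 0.02585  # thermal voltage kT/q near 300 K (V)

def slope_factor(v_gb, v_th0, gamma, phi_f):
    """Slope factor n, with the (gamma/2 + sqrt(2*phi_F)) term squared."""
    root = math.sqrt(v_gb - v_th0 + (gamma / 2.0 + math.sqrt(2.0 * phi_f)) ** 2)
    return 1.0 / (1.0 - gamma / (2.0 * root))

def drain_current(v_gb, v_sb, v_db, k_prime, w_over_l, v_th0, gamma, phi_f):
    """Drain current of Eq. (2.60); k_prime stands for mu*C_ox (A/V^2)."""
    n = slope_factor(v_gb, v_th0, gamma, phi_f)
    v_p = (v_gb - v_th0) / n                     # pinch-off voltage
    i_z = 2.0 * k_prime * w_over_l * n * PHI_T ** 2

    def branch(v):
        return math.log1p(math.exp((v_p - v) / (2.0 * PHI_T))) ** 2

    return i_z * (branch(v_sb) - branch(v_db))

# Hypothetical device constants, for illustration only.
params = dict(k_prime=200e-6, w_over_l=10.0, v_th0=0.4, gamma=0.4, phi_f=0.4)
i_sat = drain_current(v_gb=1.2, v_sb=0.0, v_db=1.2, **params)
n = slope_factor(1.2, 0.4, 0.4, 0.4)
i_square_law = params["k_prime"] / (2.0 * n) * params["w_over_l"] * (1.2 - 0.4) ** 2
print(f"Eq. (2.60): {i_sat * 1e6:.1f} uA, square law with 1/n: {i_square_law * 1e6:.1f} uA")
```

In strong inversion and saturation the two values agree closely, since the logarithmic terms reduce to a quadratic in $V_P - V_{SB}$, while for $V_{SB} = V_{DB}$ the current vanishes as expected from the symmetry of the expression.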

Obviously, the model becomes more complicated and the designer loses insight as more non-idealities are incorporated. Currently, MOS transistors are described in industry using software models that contain several hundred lines of code to include short-channel effects, e.g., the BSIM and EKV models [25, 31]. Therefore, between the absence of a simple compact model and the complexity of an accurate model, the transistor characterization conundrum needs to be tackled.

Due to the complexity of equation-based transistor models [26, 27], a new trend towards a graphical approach has emerged. This was initiated by the $g_m/I_D$ methodology presented by Silveira *et al.* [28]. The proposed method is valid in all regions of operation of the MOS transistor. The methodology is based on the relationship between the transconductance efficiency $g_m/I_D$ and the normalized current $I_D/(W/L)$. This relationship is unique for all transistors of the same type in a certain technology. Therefore, all process parameters (e.g., $\mu, C_{ox}, \lambda$) are implicitly incorporated to characterize the transistor, and the designer does not have to worry about these parameters during the design process. Although the $g_m/I_D$ methodology offers a convenient tool towards design automation, its application to short-channel devices suffers from the following main shortcomings:

- The dependency of the generated graphs on the drain-to-source voltage $V_{DS}$ is neglected. The $g_m/I_D$ curve is often generated at $V_{DS}$ around $V_{DD}/2$. However, even a small error in $V_{DS}$ can impair the accuracy of the dc biasing point. This, in turn, leads to discrepancies between the predicted and simulated results and affects the subsequent stages that may rely on the dc biasing of the preceding stage.
- The effect of the transistor parasitics is neglected, e.g., assuming that the load capacitance is much larger than the parasitic capacitance of the transistor. The intrinsic capacitance, if comparable to the load, may change the positions of the poles and zeros and affect the overall frequency response, especially in multi-pole systems.
- The voltage swing constraint at the input and the output of the transistor is not evaluated properly. In long-channel devices, the edge of saturation is determined by the overdrive voltage $(V_{GS} - V_{TH})$. However, this is very pessimistic for short-channel devices. Owing to the velocity saturation phenomenon that occurs at high lateral electric field, the saturation in short-channel devices occurs at a much lower value than the overdrive voltage, allowing for more voltage swing headroom at the output.
- Accurate prediction of the noise performance of the system is not provided. The noise requirements are very stringent in some analog and RF circuit applications and must be considered early in the design.

#### 2.4.2 The MOS Transistor Normalized Parameters

The premise of the  $g_m/I_D$  methodology is the relationship between  $g_m/I_D$  ratio and the normalized current  $I_D/W$ . This facilitates the prediction of the small signal parameters under the assumption that the current is linearly proportional to the transistor width W. This can be mathematically expressed as:

$$I_D = W I_{D_N} \tag{2.61}$$

where  $I_{D_N}$  is the normalized-to-width transistor drain-current, and it is a function of  $V_{GS}$ ,  $V_{DS}$ , and L. Similarly, it can be deduced that the small signal parameters  $g_m$  and  $g_{ds}$  are also linearly proportional to the transistor width W for a given biasing, i.e.,

$$g_m = \frac{\partial I_D}{\partial V_{GS}} = \frac{\partial (WI_{D_N})}{\partial V_{GS}} = W \frac{\partial I_{D_N}}{\partial V_{GS}} \tag{2.62}$$

$$g_{ds} = \frac{\partial I_D}{\partial V_{DS}} = \frac{\partial (WI_{D_N})}{\partial V_{DS}} = W \frac{\partial I_{D_N}}{\partial V_{DS}} \tag{2.63}$$

where $g_m$ and $g_{ds}$ are the transconductance and the output conductance of the transistor, respectively. Similarly, the main transistor capacitive parasitic components ($C_{gs}$, $C_{db}$ and $C_{gd}$) are also linearly proportional to the transistor width W. The proportionality is valid in all regions of operation in both long and short-channel devices for the aforementioned parameters. The correlation coefficient R of the linearity of each of these parameters and the width is verified through simulation to be for all intents and purposes equal to unity. Therefore, the normalized parameters ($g_m/W$, $g_{ds}/W$, $I_D/W$, $C_{gs}/W$, $C_{gd}/W$, $C_{db}/W$) can be used independently of the transistor width as circuit design tools. Furthermore, the resulting relationships are unique for all transistors of the same type from the same batch. Although the drain-to-bulk capacitance $C_{db}$ may show some variation with the number of fingers, usually this effect is small. In addition, as technology scales down, the oxide capacitance $C_{ox}$ increases, and so do $C_{gs}$ and $C_{gd}$, while the junction capacitance $C_{db}$ decreases. For example, a minimum-feature NMOS transistor with a width of 1 $\mu$m in 65 nm CMOS technology has $C_{gd}$ more than 1000 times larger than $C_{db}$. Therefore, the total capacitance at the transistor drain is dominated by $C_{gd}$ (i.e., $C_{drain} = C_{gd} + C_{db} \simeq C_{gd}$).

Figure 2.15: Examples of generated normalized parameters at $L = L_{min}$

Figure 2.16: Transistor model

Another point to contend with is the previously mentioned fact that the  $g_m/I_D$  ratio is a function of  $V_{GS}$  and  $V_{DS}$  of the transistor, as well as the transistor L. Clearly, neglecting  $V_{DS}$  will result in an arbitrary error that renders the drain voltage undetermined. Tracing back the transistor behavior in terms of node voltages and current, while making the transistor width W the dependent variable will ensure the drain-source voltage is set correctly.

The input-referred thermal noise voltage of the MOSFET channel is given from the surface-potential-based model [29, 30] by

$$S_{v_n}(f) = 4kT\gamma/g_{d_0} \tag{2.64}$$

where $g_{d_0}$ is the output conductance at $V_{DS}=0$, and $\gamma$ is often referred to as the white noise gamma factor. For long-channel devices, $g_{d_0}$ is equivalent to $g_m$ [30]. The theoretical long-channel value of $\gamma$ is 2/3. However, in modern sub-micron technologies this factor rises to 1 or 2 due to velocity saturation and channel-length modulation [30]. Therefore, $\gamma$ is dependent on channel length and region of operation (determined by dc biasing), but independent of the transistor width. On the other hand, $g_{d_0}$ is a special case of $g_{ds}$ when $V_{DS}=0$. Therefore, it is also linearly proportional to the transistor width and dependent on the dc biasing. The two terms $\gamma$ and $g_{d_0}$ can be lumped into one term $g_{d_{eq}}$ that is linearly proportional to the transistor width and dependent on the dc biasing, where

$$g_{d_{eq}} = g_{d_0} / \gamma \;. \tag{2.65}$$

Figure 2.17: The MOS transistor characterization tool
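Because $g_{d_{eq}}$ scales with the width, Eqs. (2.64) and (2.65) can be turned directly into a sizing rule for a noise target. The sketch below is a minimal illustration; the normalized value of 0.5 mS/$\mu$m is a hypothetical number rather than an extracted parameter, and the function names are ours.

```python
BOLTZMANN = 1.380649e-23  # J/K

def input_noise_psd(g_deq_per_um, width_um, temperature=300.0):
    """Input-referred thermal noise S_vn = 4kT/g_deq in V^2/Hz, Eqs. (2.64)-(2.65)."""
    g_deq = g_deq_per_um * width_um          # g_deq is proportional to W
    return 4.0 * BOLTZMANN * temperature / g_deq

def width_for_noise(target_psd, g_deq_per_um, temperature=300.0):
    """Width (um) that meets a target noise density at a fixed bias point."""
    return 4.0 * BOLTZMANN * temperature / (target_psd * g_deq_per_um)

# Hypothetical normalized value at some bias point: 0.5 mS per um of width.
psd_20um = input_noise_psd(0.5e-3, 20.0)
psd_40um = input_noise_psd(0.5e-3, 40.0)
width_back = width_for_noise(psd_20um, 0.5e-3)
print(f"noise density at W = 20 um: {psd_20um ** 0.5 * 1e9:.2f} nV/sqrt(Hz)")
```

Doubling the width at a fixed bias point halves the noise power density, at the cost of doubling the current and the parasitic capacitances.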

The normalized $g_{d_{eq}}$ will be evaluated at the different biasing conditions, and will also be used to incorporate the noise requirement in the design. To collect the normalized set of parameters for a specific type of transistor of a fixed length in a specific technology, $V_{GS}$ and $V_{DS}$ are swept from 0 to $V_{DD}$ with a reasonable step size. The transistor width W is set to an arbitrary value. The proportionality will hold as long as the width is large enough to avoid border effects [31]. A set of parameters is extracted and normalized by the transistor width. A 3-D illustration of the normalized transconductance and normalized $C_{gs}$ as functions of $V_{GS}$ and $V_{DS}$ is shown in Fig. 2.15. Analog circuit designers usually fix the channel length to two to three times the minimum channel length to avoid severe short-channel effects such as threshold variation, and to improve the output impedance of the transistor, which increases proportionally to $L^2$. However, another dimension can be added to incorporate different channel lengths. This does not add much complexity to the modeling, as practical constraints limit the allowable channel length to some discrete values rather than a continuous range as in the case of $V_{GS}$ and $V_{DS}$. Using a multiple of some small dimension is a reasonable approach [32], i.e., $L = L_{min}, L = 1.5L_{min}, L = 2L_{min}, \ldots, L = 3.5L_{min}$.

Effectively, a transistor $T_x$ of a particular type in a particular technology can be envisioned as a black box where the dc biasing and the channel length represent the input and the normalized parameters represent the output. The normalized-to-width parameters as well as $V_{D_{sat}}$ are shown in Fig. 2.16, where $V_{D_{sat}}$ is a function of $V_{GS}$ only. The extracted parameters are evaluated by the transistor model used by the simulator without the need to directly deal with the model complexity. A simple tool with the graphical user interface (GUI) shown in Fig. 2.17 was built. The tool is used throughout the thesis to characterize MOS transistors that will be used in designing different PLL components such as the VCO, CP, prescaler, and voltage buffers.
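The black-box view can be sketched in a few lines of code. The example below is not the characterization tool of Fig. 2.17; it is a minimal sketch in which a hypothetical square-law function `toy_device` stands in for the simulator call, so that the sweep, the normalization by width, and the use of W as the dependent variable can be followed end to end. All constants are illustrative assumptions.

```python
def toy_device(v_gs, v_ds, width_um):
    """Stand-in for a simulator run (assumption): square-law NMOS, no channel-length modulation."""
    k_per_um, v_th = 400e-6, 0.4             # A/V^2 per um of width, V (hypothetical)
    v_ov = max(v_gs - v_th, 0.0)
    if v_ds < v_ov:                          # triode region
        i_d = k_per_um * width_um * (v_ov - v_ds / 2.0) * v_ds
        g_m = k_per_um * width_um * v_ds
    else:                                    # saturation region
        i_d = 0.5 * k_per_um * width_um * v_ov ** 2
        g_m = k_per_um * width_um * v_ov
    return i_d, g_m

def build_table(v_dd=1.2, step=0.1, ref_width_um=10.0):
    """Sweep V_GS and V_DS from 0 to V_DD and store (I_D/W, g_m/W) per grid point."""
    steps = round(v_dd / step)
    table = {}
    for a in range(steps + 1):
        for b in range(steps + 1):
            i_d, g_m = toy_device(a * step, b * step, ref_width_um)
            table[(a, b)] = (i_d / ref_width_um, g_m / ref_width_um)
    return table, step

def size_transistor(table, step, v_gs, v_ds, i_d_target):
    """Given the bias voltages and the target current, solve for W, then g_m."""
    i_dn, g_mn = table[(round(v_gs / step), round(v_ds / step))]
    if i_dn <= 0.0:
        raise ValueError("device is off at this bias point")
    width_um = i_d_target / i_dn             # Eq. (2.61) solved for W
    return width_um, width_um * g_mn         # Eq. (2.62)

table, step = build_table()
width_um, g_m = size_transistor(table, step, v_gs=0.6, v_ds=0.6, i_d_target=100e-6)
print(f"W = {width_um:.1f} um, g_m = {g_m * 1e3:.2f} mS")
```

For the assumed constants, a 100 $\mu$A target at $V_{GS} = V_{DS} = 0.6$ V gives a width of about 12.5 $\mu$m. Since the drain voltage is an input of the lookup, it is fixed by construction rather than left as a by-product of the sizing.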

## 2.5 Summary

In this chapter, we covered several topics that were deemed necessary tools in our top-down approach to PLL design. First, an overview of the PLL from a system perspective was presented. The behavior and representation of the main building blocks of the PLL were discussed from a system point of view. Key parameters and definitions to quantify the performance of the PLL were described. Noise sources, representation, and optimization in PLL design were discussed in further detail. Next, we briefly discussed the system-level specifications of the PLL and how these specifications can be mapped into circuit parameters. A short introduction to the Verilog-A language was presented to enable behavioral modeling of the PLL building blocks in the chapters to come. To close the design cycle, we finally developed a simple tool to characterize the MOS transistor using a set of normalized parameters. This tool will be used in the transistor-level design of the CMOS circuits used throughout the following chapters.

## Chapter 3

# PLL Building Blocks: Circuit Design and Behavioral Models

In this chapter, we discuss the design of the main building blocks of the PLL. General behavioral models in Verilog-A will be demonstrated for each block for use in system-level simulations. Examples of transistor-level circuit implementations of each component are discussed.

## 3.1 Voltage-Controlled Oscillator

A voltage-controlled oscillator (VCO) is one of the most crucial components of a PLL system. Ideally, a VCO takes a control voltage  $V_{CTRL}$  as an input and generates an output frequency that is dependent on the input. Different circuit implementations of the VCO may result in different noise performance, power consumption, area size, and frequency range. Therefore, deep understanding of the trade-offs between the different VCO architectures and circuit implementations is very important to meet the design specifications.

## 3.1.1 VCO Types

Different VCO types are used in different PLL applications based on the system requirements. In this section, we briefly discuss some of these VCO types, their circuit implementation, and their main advantages and disadvantages.



Figure 3.1: The concept of multivibrator oscillator

#### Multivibrator (Relaxation) Oscillator

A multivibrator (or relaxation) oscillator in its simplest form is constructed by periodically charging and discharging a capacitor using a current source [33]. The concept is illustrated in Fig. 3.1 where an active circuit generates a current  $I_{CH}$  that changes its polarity periodically causing the voltage on the capacitor to switch between  $V_P^+$  and  $V_P^-$ . The frequency of oscillation  $f_o$  is given by

$$f_o = \frac{I_{CH}}{2CV_{PP}} \tag{3.1}$$

where  $V_{PP} = V_P^+ - V_P^-$ . Frequency tuning can be achieved by varying one of the parameters of the equation.
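Equation (3.1) follows from the constant-current charging of the capacitor. A short derivation, assuming an ideal current source and instantaneous switching at the two voltage limits, so that each half period T/2 is the time needed to ramp the capacitor voltage by $V_{PP}$:

$$I_{CH} = C \frac{dV}{dt} \;\Rightarrow\; \frac{T}{2} = \frac{C V_{PP}}{I_{CH}} \;\Rightarrow\; f_o = \frac{1}{T} = \frac{I_{CH}}{2 C V_{PP}}$$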

A circuit implementation example is shown in Fig. 3.2. A multivibrator oscillator based on a Schmitt trigger comparator uses both positive and negative feedback loops to cause periodic oscillation at the output. Assume that the voltage at both inputs (V+ and V-) and the output $V_{out}$ is initially zero. If any noise causes the input voltage V+ to go above zero, then due to the bistable nature of the circuit the output will saturate to $V_{DD}$, causing the voltage V+ to settle at $V_{DD}/2$. At the same time, the output voltage will charge the input voltage V- through the RC circuit, causing the voltage V- to rise above $V_{DD}/2$. This means that the input voltage V- is now greater than the input voltage V+, which causes the output voltage to flip and saturate at $-V_{DD}$; the input voltage V+ then settles at $-V_{DD}/2$, and the cycle continues. The frequency of oscillation is obtained by differential equation analysis of the circuit [34] and is given by

$$f_o = \frac{1}{2\ln(3)RC}. \tag{3.2}$$
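The factor ln(3) can be verified numerically. The sketch below integrates the RC charging of V- from $-V_{DD}/2$ towards $+V_{DD}$ until it reaches the upper threshold $+V_{DD}/2$, assuming an ideal comparator with output levels of $\pm V_{DD}$ and thresholds of $\pm V_{DD}/2$ as described above; the component values are arbitrary.

```python
import math

def schmitt_half_period(r, c, v_dd, dt_fraction=1e-4):
    """Forward-Euler integration of the RC node over one half period (s)."""
    tau = r * c
    dt = tau * dt_fraction
    v, t = -v_dd / 2.0, 0.0                  # start at the lower threshold
    while v < v_dd / 2.0:                    # stop at the upper threshold
        v += (v_dd - v) / tau * dt           # dV/dt = (V_out - V) / RC
        t += dt
    return t

def schmitt_frequency(r, c):
    """Oscillation frequency from Eq. (3.2)."""
    return 1.0 / (2.0 * math.log(3.0) * r * c)

# Arbitrary component values for illustration.
f_numeric = 1.0 / (2.0 * schmitt_half_period(10e3, 1e-9, 1.2))
f_formula = schmitt_frequency(10e3, 1e-9)
print(f"numeric: {f_numeric / 1e3:.2f} kHz, Eq. (3.2): {f_formula / 1e3:.2f} kHz")
```

The half period equals $RC \ln\left[(V_{DD} + V_{DD}/2)/(V_{DD} - V_{DD}/2)\right] = RC\ln 3$, which is independent of the supply voltage under these assumptions.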



Figure 3.2: A Schmitt trigger multivibrator oscillator

The main advantage of this type of VCO is its simplicity. It can be implemented using discrete or integrated components without occupying a large area. However, this type of VCO usually has inferior noise performance [35], and its maximum frequency is limited by the speed of the op-amp. When noise performance or high frequency operation is demanded, other VCO types must be considered.

#### **Ring Oscillator**

A ring oscillator is often built from a chain of an odd number of inverters, as shown in Fig. 3.3(a), where the output of the last inverter is fed back into the input of the first inverter. A single-ended inverter stage is shown in Fig. 3.3(b). A ring oscillator with differential stages can also be implemented to reduce the effect of supply and substrate noise at the output [36]. The use of differential stages allows the use of an even number of stages if the polarity of one stage is reversed. The frequency of oscillation depends on the number of stages N and the propagation delay $t_p$ of each stage, i.e., $f_o = 1/(2Nt_p)$. The propagation delay of a single stage driving a capacitive load can be estimated as

$$t_p = \frac{C_L V_{TH}}{I_D} \tag{3.3}$$

where  $V_{TH}$  is the threshold voltage which is usually defined at  $V_{DD}/2$ .

Thus, the oscillation frequency can be expressed as

$$f_o = \frac{I_D}{N \ C_L \ V_{DD}} \ . \tag{3.4}$$



Figure 3.3: A ring oscillator: (a) architecture (b) single-ended stage inverter

Tuning can be achieved by varying any of these parameters; namely the driving current, the load capacitance, the number of stages, or the supply voltage. Ring oscillators can operate at very high frequency, provide wide tuning range, occupy small area, and consume low power. These advantages make ring oscillators the popular choice in many applications, especially in digital and microprocessor applications. Nevertheless, despite their enhanced noise performance compared to multivibrator oscillators, the phase noise of ring oscillators does not meet the stringent requirements of most RF standards and applications.
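The step from Eq. (3.3) to Eq. (3.4) can be checked with a few lines of code. The sketch below is a minimal illustration, assuming the switching threshold sits at $V_{DD}/2$ as stated above; the numerical values are arbitrary and do not describe a design in this thesis.

```python
def ring_oscillator_frequency(i_d, c_load, v_dd, n_stages):
    """Oscillation frequency from the per-stage delay, Eqs. (3.3) and (3.4)."""
    v_th = v_dd / 2.0                        # switching threshold at mid-supply
    t_p = c_load * v_th / i_d                # Eq. (3.3)
    return 1.0 / (2.0 * n_stages * t_p)      # f_o = 1 / (2 N t_p)

# Arbitrary values: 100 uA drive current, 10 fF load, 1.2 V supply, 3 stages.
f_ring = ring_oscillator_frequency(100e-6, 10e-15, 1.2, 3)
f_closed_form = 100e-6 / (3 * 10e-15 * 1.2)  # Eq. (3.4)
print(f"f_o = {f_ring / 1e9:.2f} GHz")
```

Both expressions give the same frequency, and the result scales linearly with the drive current and inversely with the load capacitance, the supply voltage, and the number of stages.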

#### LC Tank

In an LC-tank oscillator, the oscillation frequency is determined by the resonance of the LC tank, i.e., $f_0 = 1/(2\pi\sqrt{LC})$. The concept of LC oscillators is depicted in Fig. 3.4. The tank has losses due to parasitics associated with non-ideal inductors and capacitors. These losses are modeled with a finite resistance in parallel with the tank. Active circuitry is used to outweigh the losses by providing an effectively negative resistance to sustain the oscillation. LC oscillators, in general, have superior noise performance compared to other oscillator types, which makes them widely preferred in applications that require low jitter such as RF transceivers. Tuning is usually achieved by varying the capacitance of the tank.
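Since the frequency depends on the square root of the capacitance, the capacitance ratio required for a given tuning range grows quadratically. The sketch below illustrates this with arbitrary component values; the function names are ours.

```python
import math

def lc_resonance(inductance, capacitance):
    """Resonance frequency f_0 = 1 / (2*pi*sqrt(L*C)) in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))

def tuning_limits(inductance, c_min, c_max):
    """Lowest and highest frequency for a tank capacitance tuned over [c_min, c_max]."""
    return lc_resonance(inductance, c_max), lc_resonance(inductance, c_min)

# Arbitrary values: 1 nH inductor, capacitance tuned from 0.8 pF to 1.2 pF.
f_center = lc_resonance(1e-9, 1e-12)
f_low, f_high = tuning_limits(1e-9, 0.8e-12, 1.2e-12)
print(f"f_0 = {f_center / 1e9:.2f} GHz, range {f_low / 1e9:.2f} to {f_high / 1e9:.2f} GHz")
```

A capacitance ratio of 1.5 yields a frequency ratio of only $\sqrt{1.5} \approx 1.22$, which is why covering a wide range with a single LC tank is challenging.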



Figure 3.4: The concept of LC oscillators

There are different circuit implementations of the LC oscillator shown in Fig. 3.4. A pair of cross-coupled NMOS or PMOS transistors is often used to generate a negative resistance equal to $-2/g_m$. A combination of NMOS and PMOS cross-coupled transistors can be used to create the complementary CMOS LC oscillator architecture shown in Fig. 3.5 with the equivalent small-signal model of the tank. This architecture is widely used due to its high transconductance efficiency and superior noise performance [37], [38]. A biasing tail current source is used to limit the current drawn from the supply, thus limiting the oscillation amplitude and the power consumption. The complementary CMOS LC oscillator generates larger negative resistance for the same biasing current compared to a single cross-pair LC oscillator. In addition, utilizing the architecture symmetry and using a PMOS tail current source allow further reduction of the noise in the $1/f^3$ region [37]. This architecture can be optimized for wide tuning range while maintaining the phase noise at a low level. The main disadvantage of LC VCOs is the use of inductors, which results in large area.

Since the circuit is symmetric around its vertical axis, and since the varactors  $C_v$  are usually connected in series to share the control voltage, one can write:

$$L_{tank} = 2L \tag{3.5}$$

$$2C_{tank} = C_v + C_L + C_{par,p} + C_{par,n} \tag{3.6}$$

where  $C_L$  is the load capacitance from the next buffer stage, and  $C_{par,n(p)} = C_{gs,n(p)} + C_{db,n(p)} + 4C_{gd,n(p)}$  where  $C_{gs}$ ,  $C_{db}$ , and  $C_{gd}$  are the transistor parasitic capacitances.



Figure 3.5: (a) Complementary CMOS LC-VCO and (b) equivalent small-signal model

The tank losses due to the varactor and inductor can be combined in one parallel conductance  $g_{tank}$ , where:

$$2g_{tank} = g_L + g_v \tag{3.7}$$

where  $g_L$  and  $g_v$  are the losses equivalent conductance of the inductor and varactor, respectively.

The negative conductance generated by the cross coupled active devices  $g_{act}$  must compensate for the tank losses to sustain oscillation. Usually, some safety oscillation factor ( $k_{osc} \approx 2-3$ ) is considered to account for variations, where:

$$g_{act} = k_{osc} \times g_{tank} \tag{3.8}$$

The tank differential voltage amplitude is given by the approximate relationship:

$$V_{out} \cong 2I_{tank}/g_{tank} \tag{3.9}$$
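As a rough numerical illustration of Eqs. (3.5)-(3.9), the sketch below evaluates the equivalent tank parameters for one set of component values. The function name `lc_tank_summary` and all component values are illustrative assumptions, not values taken from the designed VCO; the resonance frequency is obtained by applying  $f_0 = 1/(2\pi\sqrt{LC})$  to  $L_{tank}$  and  $C_{tank}$ .

```python
import math

def lc_tank_summary(L, C_v, C_L, C_par_p, C_par_n, g_L, g_v, I_tank, k_osc=2.5):
    """Evaluate Eqs. (3.5)-(3.9) for hypothetical component values (SI units)."""
    L_tank = 2.0 * L                                   # Eq. (3.5)
    C_tank = 0.5 * (C_v + C_L + C_par_p + C_par_n)     # Eq. (3.6)
    g_tank = 0.5 * (g_L + g_v)                         # Eq. (3.7)
    g_act = k_osc * g_tank                             # Eq. (3.8)
    V_out = 2.0 * I_tank / g_tank                      # Eq. (3.9)
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(L_tank * C_tank))
    return {"L_tank": L_tank, "C_tank": C_tank, "g_tank": g_tank,
            "g_act": g_act, "V_out": V_out, "f0": f0}

# Assumed example: 0.5 nH inductors and capacitances that sum to 500 fF.
example = lc_tank_summary(L=0.5e-9, C_v=300e-15, C_L=50e-15,
                          C_par_p=80e-15, C_par_n=70e-15,
                          g_L=1.0e-3, g_v=0.5e-3, I_tank=0.2e-3)
print("f0 = %.2f GHz, g_act = %.3f mS, V_out = %.3f V"
      % (example["f0"] / 1e9, example["g_act"] * 1e3, example["V_out"]))
```

Sweeping `C_v` in such a script gives a quick estimate of the tuning range before any circuit simulation is run.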

#### 3.1.2 Phase Noise

In this section, we discuss the sources of phase noise in an oscillator. The characterization of the phase noise profile of an oscillator is explained in the light of two widely used models; namely Leeson's model and Hajimiri's model.

#### Leeson's Model

An LC oscillator, in general, consists of a lossy LC-tank resonator and an active circuit that compensates for the resonator losses. In order to predict the noise at the output voltage of the tank, the general LC-oscillator architecture shown in Fig. 3.4 is considered. The magnitude of the LC-tank impedance at a small offset frequency  $\Delta \omega$  from the center frequency  $\omega_0$  can be approximated by

$$|Z(\omega_0 + \Delta\omega)| \simeq \frac{\omega_0 L}{2\frac{\Delta\omega}{\omega_0}} \tag{3.10}$$

and since the quality factor Q of the tank can be written as

$$Q = \frac{R_p}{\omega_0 L} , \qquad (3.11)$$

one can rewrite the tank impedance as

$$|Z(\omega_0 + \Delta\omega)| \simeq \frac{R_p}{2Q} \frac{\omega_0}{\Delta\omega} \tag{3.12}$$

where  $R_p = 1/g_{tank}$  is the equivalent parallel resistance of the tank due to the tank losses.

There are two main sources of noise in this model. The first one is the thermal noise of the tank parallel resistance  $R_p$ , which can be represented as a parallel current source with a mean-square spectral density of  $\overline{i_n^2}/\Delta f = 4kT/R_p$ . The second noise source is the active circuit that provides the negative resistance. The noise of the active circuit includes both thermal noise and flicker noise. It is customary in Leeson's model to combine the thermal noise of the tank resistance and the active circuit into one equivalent noise source expressed as

$$\frac{\bar{i_n}^2}{\Delta f} = \frac{4kTF}{R_p} \tag{3.13}$$

where F is called the excess noise factor due to active devices.

Leeson's model assumes that an oscillator is a linear time-invariant (LTI) system.

Thus, the mean-square noise voltage at the output of the oscillator can be obtained as

$$\frac{\overline{v_n^2}}{\Delta f} = \frac{\overline{i_n^2}}{\Delta f} \times |Z(\omega_0 + \Delta \omega)|^2 = 4kTFR_p \left(\frac{1}{2Q}\frac{\omega_0}{\Delta \omega}\right)^2 \tag{3.14}$$

This relationship is valid as long as the LTI system assumption holds, which allows multiplication in the frequency domain. The thermal noise expression obtained in Eq. (3.14) includes both amplitude and phase noise power, which are equal at equilibrium. Since most practical oscillators employ an amplitude-limiting mechanism, the effect of amplitude noise is usually negligible and the total noise power is dominated by the phase noise power, which is half of that obtained in Eq. (3.14). Taking this into consideration and normalizing the result to the carrier signal power  $P_{sig}$ , one can write the normalized single-sideband noise spectral density as

$$\mathcal{L}\left\{\Delta\omega\right\} = 10\log\left[\frac{2kTF}{P_{sig}}\left(\frac{1}{2Q}\frac{\omega_0}{\Delta\omega}\right)^2\right] \tag{3.15}$$

which is often referred to as *phase noise*, and has units of dBc/Hz. As evident from Eq. (3.15), one can decrease phase noise by improving the tank quality factor or by increasing the output voltage swing to increase the carrier signal power.

Figure 3.6: A typical phase noise profile of an oscillator

The expression in Eq. (3.15) represents the  $1/(\Delta \omega)^2$  region in the oscillator phase noise profile shown in Fig. 3.6. Measurements of practical oscillators show that the phase noise spectrum flattens out at large frequency offsets from the carrier, creating a noise floor that is due either to the reduced loop gain of the oscillator positive feedback at high frequency or to added noise from the output buffer or the measurement instrumentation. In addition, the expression in Eq. (3.15) neglects the flicker noise of the active circuit. When added to the total noise, the flicker noise causes the phase noise spectrum to increase at small frequency offsets, creating a  $1/(\Delta \omega)^3$  region. The frequency at which the  $1/(\Delta \omega)^2$  region and the  $1/(\Delta \omega)^3$  region intersect is referred to as the  $\Delta \omega_{1/f^3}$  corner frequency. When these factors are taken into consideration, the expression in Eq. (3.15) can be modified to

$$\mathcal{L}\left\{\Delta\omega\right\} = 10\log\left[\frac{2kTF}{P_{sig}}\left\{1 + \left(\frac{1}{2Q}\frac{\omega_0}{\Delta\omega}\right)^2\right\}\left(1 + \frac{\Delta\omega_{1/f^3}}{\Delta\omega}\right)\right] \tag{3.16}$$

where the unity factor in the term inside the curly braces accounts for the noise floor, and the term in the second parentheses accounts for the behavior in the  $1/(\Delta \omega)^3$  region.
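To get a feel for the three regions of Eq. (3.16), the expression can be evaluated numerically. The sketch below is only an illustration: the function name `leeson_dbc_hz`, the temperature of 300 K, and the example values of  $Q$ ,  $F$ , and  $P_{sig}$  are assumptions. Offsets are given in Hz, which is equivalent because only the ratios  $\omega_0/\Delta\omega$  and  $\Delta\omega_{1/f^3}/\Delta\omega$  appear in the expression.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def leeson_dbc_hz(delta_f, f0, Q, F, P_sig, f_corner=0.0, T=300.0):
    """Phase noise of Eq. (3.16) in dBc/Hz at an offset delta_f (Hz).

    f_corner is the 1/f^3 corner frequency in Hz; with f_corner = 0 and a
    small offset, the result approaches Eq. (3.15).
    """
    ratio = f0 / (2.0 * Q * delta_f)          # (1/2Q) * (w0 / dw)
    floor_and_slope = 1.0 + ratio ** 2        # noise floor + 1/(dw)^2 region
    flicker = 1.0 + f_corner / delta_f        # 1/(dw)^3 region
    return 10.0 * math.log10(2.0 * K_BOLTZMANN * T * F / P_sig
                             * floor_and_slope * flicker)

# Assumed example: 5 GHz carrier, Q = 10, F = 2, 1 mW of carrier power.
for offset in (1e4, 1e5, 1e6, 1e7, 1e9):
    value = leeson_dbc_hz(offset, f0=5e9, Q=10, F=2, P_sig=1e-3, f_corner=2e5)
    print("offset %8.0e Hz: %7.1f dBc/Hz" % (offset, value))
```

With the flicker term disabled, the result falls by about 20 dB per decade of offset as long as the offset stays well below  $\omega_0/2Q$ , which is the  $1/(\Delta \omega)^2$  behavior discussed above.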

#### Hajimiri's Model

Although Leeson's model provides powerful insight into the major sources of noise in an oscillator, the model fails to predict the measured oscillator noise spectrum accurately. This is due to the limitations imposed by the assumptions made in the derivation. First, the assumption of a time-invariant system is not justified. As verified in [38], oscillators are time-variant systems. Second, the excess noise factor F is difficult to predict and is usually treated as an empirical fitting parameter. Third, the corner frequency  $\Delta \omega_{1/f^3}$  is assumed to be equal to the 1/f corner frequency of the active device. However, this assumption has no theoretical basis and fails to provide a reasonable prediction when the active circuit of the oscillator contains more than one flicker noise source. Finally, the derivation is based on an LC oscillator model. Although the concept is usually extended to other oscillator types (e.g. ring oscillators), the analogy is not clear and a general theory to explain noise in oscillators is needed.

Hajimiri and Lee [38] provide a general theory of phase noise in electrical oscillators based on the assumption of a linear time-variant (LTV) system. The theory scrutinizes the major assumptions of both linearity and time-invariance adopted by Leeson's model. In reality, oscillators are amplitude-limiting components, which makes them inherently non-linear. Nevertheless, it can be shown by simulation and experiment that the relation between injected noise and *output phase* is, in fact, linear [38]. In other words, if a noise perturbation is injected into the oscillator and the amount of noise is much smaller in magnitude than the carrier, the linearity assumption between noise and output phase still holds.

Figure 3.7: LC oscillator: (a) excitation with a current impulse (b) impulse response

On the other hand, the assumption that oscillators are time-invariant systems is baseless. Fig. 3.7 shows how the output waveform of an LC oscillator responds to noise disturbance injected at two different times. If the current impulse occurs at the zero crossing, the output phase shift is maximized while the amplitude remains constant. However, if the current impulse is injected when the voltage amplitude is maximum the amplitude will increase but the output phase will not be affected. Therefore, the amount of phase shift at the output depends on the time the noise disturbance is injected. Thus, it is reasonable to conclude that an oscillator is a linear time-variant (LTV) system where the output phase depends on the time the noise is injected into the system.

An LTV system can still be characterized by its impulse response. Therefore, the relation between the output phase shift  $\phi(t)$  and an input impulse  $i(\tau)$  can be written as

$$\phi(t) = \int_{-\infty}^{t} h_{\phi}(t,\tau) \ i(\tau) d\tau , \qquad (3.17)$$

where  $h_{\phi}(t,\tau)$  is the impulse response function.

Knowing that an impulse input generates a step change at the output phase, one can rewrite the impulse response function as

$$h_{\phi}(t,\tau) = \frac{\Gamma(\omega_0 \tau)}{q_{max}} u(t-\tau) \tag{3.18}$$

where u(t) is the unit step function,  $q_{max}$  is the maximum change in the charge on the output capacitor, and  $\Gamma(x)$  is the impulse sensitivity function (ISF) and is a dimensionless frequency-and-amplitude-independent function periodic in  $2\pi$ . The output phase relation can be expressed as

$$\phi(t) = \frac{1}{q_{max}} \int_{-\infty}^{t} \Gamma(\omega_0 \tau) \ i(\tau) d\tau \ . \tag{3.19}$$

Since the ISF is periodic, it can be expressed as a Fourier series

$$\Gamma(\omega_0 \tau) = \frac{c_0}{2} + \sum_{n=1}^{\infty} c_n \cos(n\omega_0 \tau + \theta_n) \tag{3.20}$$

where  $c_n$  are the real coefficients of the series and  $\theta_n$  can be neglected if the noise sources are uncorrelated.

Substituting Eq. (3.20) into Eq. (3.19), we obtain

$$\phi(t) = \frac{1}{q_{max}} \left[ \frac{c_0}{2} \int_{-\infty}^t i(\tau) \ d\tau + \sum_{n=1}^\infty c_n \int_{-\infty}^t i(\tau) \cos(n\omega_0 \tau) \ d\tau \right] . \tag{3.21}$$

The ISF can be found by injecting a current impulse at different phases from 0 to  $2\pi$  and measuring the resulting output phase difference  $\Delta \phi$  at steady state. For a narrow impulse where the injected charge  $\Delta Q$  is equal to the area of the impulse, i.e.  $\Delta Q = i(\tau)d\tau$ , the ISF can be expressed as

$$\Gamma(\omega_0 \tau) = \frac{q_{max}}{\Delta Q} \ \Delta \phi \ . \tag{3.22}$$

The ISF depends on the oscillator topology and the shape of the output waveform. Fig. 3.8 shows typical output voltage waveforms and their corresponding ISF waveforms for an LC oscillator and a ring oscillator.

Once the ISF is evaluated and the output phase is calculated, the next step is to determine the effect of this phase shift on the output signal waveform, which is clearly a non-linear relation. A phase shift  $\phi(t)$  results in an output waveform expressed as

$$v_{out}(t) = A\cos\left[\omega_0 t + \phi(t)\right] , \tag{3.23}$$

where A is the amplitude of the output waveform.

Figure 3.8: Typical ISF of (a) LC oscillator (b) ring oscillator

Therefore, a complete system representation of an oscillator responding to noise injection consists of a cascade of two blocks as shown in Fig. 3.9. The first block represents a linear time-variant system that converts the input noise into phase shift, while the second block is a non-linear system that translates the phase shift into voltage representation at the output waveform.

Eq. (3.21) implies that injecting a sinusoidal current whose frequency is near any integer multiple k of the oscillation frequency  $\omega_0$ , i.e.

$$i(t) = I_k \cos\left[(k\omega_0 + \omega_m)t\right] \tag{3.24}$$

will result in two equal sidebands at an offset  $\omega_m$  even though the injection occurs near some integer multiple of the oscillation frequency. This can be verified by substituting Eq. (3.24) into Eq. (3.21). For  $\omega_m \ll \omega_0$  and k = n, one can write

$$\phi(t) = \frac{I_k c_k}{2q_{max}\omega_m} \sin(\omega_m t) . \qquad (3.25)$$
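The step from Eq. (3.24) to Eq. (3.25) can be checked numerically. The sketch below evaluates the integral of Eq. (3.19) with the trapezoidal rule, keeping only the k-th ISF harmonic (the other harmonics average out for  $\omega_m \ll \omega_0$ ), and compares the result at the peak of  $\sin(\omega_m t)$  with the amplitude predicted by Eq. (3.25). The function name `phase_from_tone` and the normalized values ( $\omega_0 = 2\pi$ ,  $q_{max} = 1$ , and so on) are assumptions chosen only to keep the integration short.

```python
import math

def phase_from_tone(k, w0, wm, I_k, c_k, q_max, t_end, steps):
    """Trapezoidal evaluation of Eq. (3.19) with Gamma(x) = c_k*cos(k*x)
    and the injected current of Eq. (3.24), starting from t = 0."""
    dt = t_end / steps
    prev = c_k * I_k                      # integrand at tau = 0
    total = 0.0
    for n in range(1, steps + 1):
        tau = n * dt
        cur = (c_k * math.cos(k * w0 * tau)
               * I_k * math.cos((k * w0 + wm) * tau))
        total += 0.5 * (prev + cur) * dt
        prev = cur
    return total / q_max

# Normalized, assumed values: 1 Hz carrier, offset at 1% of the carrier.
w0, wm = 2.0 * math.pi, 2.0 * math.pi * 0.01
I_k, c_k, q_max, k = 1e-3, 1.0, 1.0, 1
t_peak = (math.pi / 2.0) / wm             # sin(wm * t) = 1 at this instant
numeric = phase_from_tone(k, w0, wm, I_k, c_k, q_max, t_peak, steps=20000)
predicted = I_k * c_k / (2.0 * q_max * wm)    # amplitude in Eq. (3.25)
print("numeric = %.6e rad, Eq. (3.25) = %.6e rad" % (numeric, predicted))
```

The small residual difference comes from the term at  $2k\omega_0 + \omega_m$ , which Eq. (3.25) neglects and which shrinks as  $\omega_m/\omega_0$  decreases.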

If the output waveform is sinusoidal, the transformation from phase to voltage is described by the non-linear equation

$$v_{out} = \cos\left[\omega_0 t + \phi(t)\right] . \tag{3.26}$$



Figure 3.9: Oscillator as a cascade of two systems

Phase noise is defined as the single-sideband noise power with respect to the carrier which for a *single tone* injection is given by

$$\mathcal{L}\left\{\omega_{m}\right\} = 10\log\left(\frac{I_{k} c_{k}}{4q_{max}\omega_{m}}\right)^{2} . \qquad (3.27)$$

If a single tone injection is replaced with a thermal white noise source, the resulting phase noise becomes

$$\mathcal{L}\left\{\omega_{m}\right\} = 10\log\left(\frac{\frac{\overline{i_{k}^{2}}}{\Delta f}\sum_{k=0}^{\infty}c_{k}^{2}}{4q_{max}^{2}\omega_{m}^{2}}\right) . \tag{3.28}$$

Using Parseval's theorem, we can write

$$\sum_{k=0}^{\infty} c_k^2 = \frac{1}{\pi} \int_0^{2\pi} |\Gamma(x)|^2 \, dx = 2\Gamma_{rms}^2 \,, \tag{3.29}$$

which reduces the phase noise expression in the  $1/f^2$  region to

$$\mathcal{L}\left\{\omega_{m}\right\} = 10\log\left(\frac{\frac{\overline{i_{k}^{2}}}{\Delta f}\Gamma_{rms}^{2}}{2q_{max}^{2}\omega_{m}^{2}}\right) \tag{3.30}$$

where  $\Gamma_{rms}$  is the rms value of the ISF.

Figure 3.10: Conversion of flicker and thermal noise into phase noise

To include the flicker noise, assume that the noise source has the following 1/f noise shape

$$\overline{i_{k,1/f}^{2}} = \overline{i_{k}^{2}} \cdot \frac{\omega_{1/f}}{\omega_m} \tag{3.31}$$

where  $\omega_{1/f}$  is the 1/f corner frequency. Using Eq. (3.30), the phase noise in the  $1/f^3$  region can be expressed as

$$\mathcal{L}\left\{\omega_{m}\right\} = 10\log\left(\frac{\frac{\overline{i_{k}^{2}}}{\Delta f}c_{0}^{2}}{8q_{max}^{2}\omega_{m}^{2}}\cdot\frac{\omega_{1/f}}{\omega_{m}}\right). \tag{3.32}$$

The  $1/f^3$  corner frequency in the phase noise profile of the oscillator can be found by equating Eqs. (3.30) and (3.32) which results in

$$\omega_{1/f^3} = \omega_{1/f} \cdot \frac{c_0^2}{4\Gamma_{rms}^2} = \omega_{1/f} \cdot \left(\frac{\Gamma_{dc}}{\Gamma_{rms}}\right)^2 \tag{3.33}$$

where  $\Gamma_{dc}$  is the dc value of the ISF.

The interesting result of this equation is that the  $1/f^3$  corner frequency is not necessarily the same as the 1/f corner frequency, as was assumed in Leeson's model. More interestingly, the  $1/f^3$  corner frequency can be reduced by reducing  $\Gamma_{dc}$ , which can be achieved by controlling the rise- and fall-time symmetry.

Figure 3.11: Generation of VCO signal using the *idtmod* function

Fig. 3.10 summarizes the mechanism by which white noise and flicker noise are converted into phase noise: the phase noise around the carrier frequency  $\omega_0$  is the sum of not only the noise close to  $\omega_0$  but also all the noise components in the vicinity of  $m\omega_0$ , each weighted by  $c_m$ . The flicker noise is upconverted and weighted by  $c_0$  to form the  $1/f^3$  noise near the carrier. Hajimiri's model is considered the most accurate and most reliable in understanding the mechanism of phase noise in oscillators.
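The dependence of the  $1/f^3$  corner on waveform symmetry in Eq. (3.33) can be illustrated with a sampled ISF. In the sketch below, the function name `isf_corner`, the two sinusoidal test ISFs, and the 1 MHz device flicker corner are assumptions made only for illustration; in practice the ISF samples would come from the impulse-injection simulation described by Eq. (3.22).

```python
import math

def isf_corner(isf_samples, w_flicker):
    """Return (Gamma_dc, Gamma_rms, 1/f^3 corner) for an ISF sampled
    uniformly over one period, using Eq. (3.33)."""
    n = len(isf_samples)
    gamma_dc = sum(isf_samples) / n
    gamma_rms = math.sqrt(sum(g * g for g in isf_samples) / n)
    return gamma_dc, gamma_rms, w_flicker * (gamma_dc / gamma_rms) ** 2

N = 1000
phases = [2.0 * math.pi * i / N for i in range(N)]
w_flicker_example = 2.0 * math.pi * 1e6           # assumed 1/f corner

# Symmetric ISF (zero dc value) versus an ISF with a small dc offset.
sym_isf = [math.sin(x) for x in phases]
asym_isf = [math.sin(x) + 0.1 for x in phases]

for name, isf in (("symmetric", sym_isf), ("asymmetric", asym_isf)):
    g_dc, g_rms, w_corner = isf_corner(isf, w_flicker_example)
    print("%-10s Gamma_dc = %.3f, Gamma_rms = %.3f, corner = %.1f kHz"
          % (name, g_dc, g_rms, w_corner / (2.0 * math.pi * 1e3)))
```

Even a dc value that is only 10% of the ISF amplitude keeps the  $1/f^3$  corner at about 2% of the device 1/f corner in this example, while the symmetric ISF drives it toward zero.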

## 3.1.3 Behavioral Modeling in Verilog-A

In order to examine the VCO performance at the system level, a behavioral model that describes the VCO operation should be developed. In this section, we discuss the basic behavioral model of a VCO using the Verilog-A language. A VCO generates an output waveform whose frequency is a function of the input voltage, i.e.  $f_{out} = K_{VCO} \cdot V_{CTRL}$ , where  $K_{VCO}$  is the VCO gain in rad/s/V or Hz/V. In the case of a linear VCO,  $K_{VCO} = (f_{max} - f_{min})/(V_{max} - V_{min})$ , where  $f_{min}$  and  $f_{max}$  are the minimum and maximum frequency range limits, and  $V_{min}$  and  $V_{max}$  are the minimum and maximum control-voltage range limits.

A VCO can be modeled in continuous time by integrating the input control voltage over time to generate the phase of the output signal [24], i.e.

$$\phi(t) = 2\pi \int K_{VCO} V_{CTRL}(t) dt . \qquad (3.34)$$

The output voltage waveform can be generated directly from the phase using a sine function, i.e.

$$v_{out} = \sin\left[\phi(t)\right] , \qquad (3.35)$$

```
`include "constants.vams"
`include "disciplines.vams"

module vco (out, in);
    parameter real Vmin=0;
    parameter real Vmax=Vmin+1 from (Vmin:inf);
    parameter real Fmin=1 from (0:inf);
    parameter real Fmax=2*Fmin from (Fmin:inf);
    parameter real ampl=1; // output amplitude
    input in; output out;
    voltage out, in;
    real freq, phase;

    analog begin
        // compute the freq from the input voltage
        freq = (V(in) - Vmin) * (Fmax - Fmin) / (Vmax - Vmin) + Fmin;
        // phase is the integral of the freq modulo 2*pi
        phase = 2*`M_PI*idtmod(freq, 0.0, 1.0, -0.5);
        // generate the output
        V(out) <+ ampl*sin(phase);
    end
endmodule
```

Listing 3.1: Verilog-A model of a sinusoidal VCO



Figure 3.12: Output of the *idtmod* function

Fig. 3.11 depicts the steps used in the VCO model to generate the continuous-time output voltage waveform. A Verilog-A model that generates a sinusoidal output waveform is shown in Listing 3.1.

The code utilizes the *idtmod* function to convert the frequency into phase [24]. The *idtmod* function combines time integration and a modulus operation to ensure that the phase is limited to a range of  $2\pi$ . The first four arguments of the *idtmod* function are the integrand x(t), the initial condition *I.C.*, the modulus *m*, and the offset *O.S.*; the function generates an output given as

$$y(t) = \left[ \left( \int_0^t x(\tau) \ d\tau + I.C. - O.S. \right) \bmod m \right] + O.S. \tag{3.36}$$

where y(t) is bound between O.S. and O.S. + m. The operation of the *idtmod* function is depicted in Fig. 3.12.
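The wrapping behavior of Eq. (3.36) can be reproduced outside the simulator. The sketch below is a plain forward-Euler stand-in written in Python, not the simulator's implementation of *idtmod*; the function name `idtmod_sim`, the fixed time step, and the example frequency are assumptions.

```python
def idtmod_sim(x_of_t, t_end, dt, ic=0.0, modulus=1.0, offset=0.0):
    """Forward-Euler emulation of Eq. (3.36).

    Returns the wrapped integral sampled after every time step; each
    sample lies in [offset, offset + modulus).
    """
    integral = 0.0
    samples = []
    steps = int(round(t_end / dt))
    for n in range(steps):
        integral += x_of_t(n * dt) * dt
        samples.append(((integral + ic - offset) % modulus) + offset)
    return samples

# Assumed example: constant 2 Hz input, wrapped to [-0.5, 0.5) as in
# the listings, so the normalized phase wraps twice in one second.
y = idtmod_sim(lambda t: 2.0, t_end=1.0, dt=1e-3, offset=-0.5)
wraps = sum(1 for a, b in zip(y, y[1:]) if b < a)
print("samples: %d, min: %.3f, max: %.3f, wraps: %d"
      % (len(y), min(y), max(y), wraps))
```

Multiplying the wrapped value by  $2\pi$  gives a phase confined to  $[-\pi, \pi)$ , which is how the listings use the function.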

```
`include "constants.vams"
`include "disciplines.vams"

module vco (out, in);
    parameter real Vmin=0;
    parameter real Vmax=Vmin+1 from (Vmin:inf);
    parameter real Fmin=1 from (0:inf);
    parameter real Fmax=2*Fmin from (Fmin:inf);
    parameter real Vlo=-1, Vhi=1; // min & max amplitude
    parameter real tt=0.01/Fmax from (0:inf); // transition time
    parameter real ttol=1u/Fmax from (0:1/Fmax); // tolerance time
    input in; output out;
    voltage out, in;
    real freq, phase, Vout;

    analog begin
        // compute the freq from the input voltage
        freq = (V(in) - Vmin) * (Fmax - Fmin) / (Vmax - Vmin) + Fmin;
        // phase is the integral of the freq modulo 2*pi
        phase = 2*`M_PI*idtmod(freq, 0.0, 1.0, -0.5);
        // rising crossing of the -pi/2 threshold: output goes high
        @(cross(phase + `M_PI/2, +1, ttol)) begin
            Vout = Vhi;
        end
        // rising crossing of the +pi/2 threshold: output goes low
        @(cross(phase - `M_PI/2, +1, ttol)) begin
            Vout = Vlo;
        end
        // generate the output
        V(out) <+ transition(Vout, 0, tt);
    end
endmodule
```

Listing 3.2: Verilog-A model of a square-waveform VCO

A similar VCO model that generates a square output waveform is shown in Listing 3.2 [18]. The cross function is used to detect the times at which the phase crosses the  $+\pi/2$  and the  $-\pi/2$  thresholds. The transition function generates a 50% duty-cycle square waveform whose period is T, where  $T = 1/f_{out}$ .

One of the main advantages of the system-level behavioral modeling approach using the Verilog-A language is that it allows the jitter performance of the VCO to be incorporated in the model. As was shown in Chapter 2, oscillator noise can be viewed in the frequency domain as phase noise around the oscillation frequency. In the absence of flicker noise [18], [19], the cycle jitter and the phase noise are related as

$$\sigma_c = \sqrt{\phi_n^2(\omega_m) \frac{\omega_m^2}{\omega_o^3}} \tag{3.37}$$

where  $\phi_n^2(\omega_m)$  is the phase noise magnitude in rad<sup>2</sup>/Hz at an offset frequency  $\omega_m$  from the oscillation frequency  $\omega_o$ .

The VCO model shown in Listing 3.3 generates a square waveform with the effect of the jitter included in the waveform transitions [18].

When the expression inside the cross function is zero, the transition is made and the jitter is updated. Assume a square wave with a 50% duty cycle; the time interval of the high logic level is then equal to that of the low logic level, and both are equal to  $T/2$ . The jitter is updated every interval and added to the two transitions of the output square waveform, so that the cycle jitter is distributed over the two time intervals, i.e.  $T_i/2 = T/2 + \Delta T_i/2$ , where the variance of the time interval is related to the time period and the cycle jitter as

$$var\left(\frac{\Delta T}{2}\right) = \frac{var(T)}{2} = \frac{\sigma_c^2}{2} . \tag{3.38}$$

Therefore, we can write

$$\left(\frac{\Delta T}{2}\right)^2 = \frac{\sigma_c^2}{2} \times \delta_i^2 \tag{3.39}$$

where  $\delta_i$  is a zero-mean unit-variance Gaussian random process, and is generated at each *cross* statement to determine the exact time the phase crosses the threshold.

Thus, the variation in the time period is given by

$$\Delta T_i = \sqrt{2} \sigma_c \delta_i . \tag{3.40}$$

The output frequency after adding the jitter becomes  $f_{out_i} = 1/(T + \Delta T_i)$ . By substituting Eq. (3.40), the dithered output frequency is

$$f_{out_i} = \frac{f_{out}}{1 + \sqrt{2} \, \sigma_c \, \delta_i \, f_{out}} . \tag{3.41}$$

```
`include "constants.vams"
`include "disciplines.vams"

module vco (out, in);
    input in; output out; voltage out, in;
    parameter real Vmin=0;
    parameter real Vmax=Vmin+1 from (Vmin:inf);
    parameter real Fmin=1 from (0:inf);
    parameter real Fmax=2*Fmin from (Fmin:inf);
    parameter real ratio=1 from (0:inf);
    parameter real Vlo=-1, Vhi=1;
    parameter real tt=0.01/Fmax from (0:inf);
    parameter real jitter=0 from [0:0.25/Fmax); // VCO cycle jitter
    parameter real ttol=1u*ratio/Fmax from (0:ratio/Fmax);
    parameter real outStart=inf from (1/Fmin:inf);
    real freq, phase, dT, delta, prev, Vout;
    integer n, seed, fp;

    analog begin
        @(initial_step) begin
            seed = -561;
            delta = jitter * sqrt(2);
            fp = $fopen("periods.m");
            Vout = Vlo;
        end
        // compute the freq from the input voltage
        freq = (V(in) - Vmin) * (Fmax - Fmin) / (Vmax - Vmin) + Fmin;
        // add the jitter: dither the frequency as in Eq. (3.41)
        freq = freq / (1 + dT * freq);
        // phase is the integral of the freq modulo 1
        phase = idtmod(freq, 0.0, 1.0, -0.5);
        // update jitter twice per period
        @(cross(phase - 0.25, +1, ttol)) begin
            dT = delta * $rdist_normal(seed, 0, 1);
            Vout = Vhi;
        end
        @(cross(phase + 0.25, +1, ttol)) begin
            dT = delta * $rdist_normal(seed, 0, 1);
            Vout = Vlo;
            // record the period once the start time has passed
            if ($abstime >= outStart)
                $fstrobe(fp, "%0.10e", $abstime - prev);
            prev = $abstime;
        end
        V(out) <+ transition(Vout, 0, tt);
    end
endmodule
```

Listing 3.3: Verilog-A model of a square-waveform VCO including jitter

Figure 3.13: A typical circuit implementation of a tri-state PFD followed with a CP

A similar expression can be derived for sinusoidal-output VCOs, with the factor  $\sqrt{2}$  replaced by 1, since the sine function is a continuous function and the jitter can be injected once per period. The resulting dithered output frequency in this case is

$$f_{out_i} = \frac{f_{out}}{1 + \sigma_c \, \delta_i \, f_{out}} . \tag{3.42}$$

The periods of oscillation are saved in a MATLAB file for further processing, where the phase noise spectrum can be plotted using the relationship between cycle jitter and phase noise defined in Eq. (3.37).
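The role of the  $\sqrt{2}$  factor in Eqs. (3.40) and (3.41) can be checked with a short statistical experiment. The Python sketch below is independent of the Verilog-A model and of the MATLAB post-processing; the function name `dithered_periods`, the seed, and the 1 GHz / 1 ps example values are assumptions. It draws a new random number for each half period, as the square-wave model does, and then estimates the standard deviation of the full period.

```python
import math
import random

def dithered_periods(f_out, sigma_c, n_periods, seed=1):
    """Generate periods of a square wave whose two half periods are
    dithered independently according to Eqs. (3.40) and (3.41)."""
    rng = random.Random(seed)
    periods = []
    for _ in range(n_periods):
        period = 0.0
        for _half in range(2):
            delta_T = math.sqrt(2.0) * sigma_c * rng.gauss(0.0, 1.0)  # Eq. (3.40)
            f_i = f_out / (1.0 + delta_T * f_out)                     # Eq. (3.41)
            period += 0.5 / f_i        # duration of one half period
        periods.append(period)
    return periods

def mean_and_std(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

# Assumed example: 1 GHz output with 1 ps of cycle jitter.
periods = dithered_periods(f_out=1e9, sigma_c=1e-12, n_periods=20000)
mean_T, std_T = mean_and_std(periods)
print("mean period = %.4e s, period std = %.3e s" % (mean_T, std_T))
```

Because each half period carries a variance of  $\sigma_c^2/2$ , the standard deviation of the full period comes out close to  $\sigma_c$ , which is the intent of Eq. (3.38).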

# 3.2 Phase-Frequency Detector

A PFD provides both phase and frequency detection capabilities to allow the PLL to lock at the desired frequency. A common implementation is the tri-state PFD which produces two outputs (UP and DN) that control the switches of a subsequent circuit called the charge pump (CP). The PFD/CP block diagram implementation is shown in Fig. 3.13.

The tri-state PFD is a finite-state machine (FSM) that implements the state diagram shown in Fig. 3.14. The three available states are 00, 01, and 10. The fourth state, i.e. 11, is avoided by the AND gate that resets the D-FFs. The operation of the PFD is shown in Fig. 3.15. At the rising edge of the leading input signal, i.e. in this case the reference signal, the UP signal switches high and stays high until the arrival of the rising edge of the lagging signal, i.e. in this case the feedback signal, which causes the DN signal to switch high as well. When both UP and DN are high, the D-FFs reset, which causes both the UP and DN signals to go low again and wait for the next rising edge of an input signal.

Figure 3.14: State machine representation of the PFD

Figure 3.15: PFD output in response to input phase difference

When the UP (or DN) signal switches high, it causes the charge pump to turn on the UP (or DN) current source to charge (or discharge) the output node. The average output current of the charge pump  $\langle i_P \rangle$  versus the phase difference between the input signals is shown in Fig. 3.16. It can be noted that the linear range of the phase detection is  $(-2\pi, 2\pi)$ . Outside this range, the characteristic repeats with a period of  $2\pi$ , which provides the frequency detection capability. If one of the two input signals has a higher frequency than the other, the corresponding D-FF receives more rising edges over the same time interval. This causes the output of one of the D-FFs to remain high for a longer time, thereby causing the voltage at the output node to increase or decrease until frequency acquisition is achieved.
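The state machine described above can be captured in a few lines of event-driven code. The sketch below is an idealized model with zero reset delay and instantaneous transitions, so it shows neither the dead zone nor the blind zone; the function name `pfd_net_up_time` and the edge times are assumptions for illustration.

```python
def pfd_net_up_time(ref_edges, fb_edges):
    """Ideal tri-state PFD: return (time with UP high) - (time with DN high).

    ref_edges and fb_edges are the rising-edge times of the reference and
    feedback signals. The 11 state is reset immediately.
    """
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "fb") for t in fb_edges])
    up = dn = False
    last_t = 0.0
    net = 0.0
    for t, source in events:
        net += (t - last_t) * (int(up) - int(dn))   # accumulate since last event
        last_t = t
        if source == "ref":
            up = True
        else:
            dn = True
        if up and dn:                               # AND gate resets both D-FFs
            up = dn = False
    return net

# Phase detection: the reference leads the feedback by 2 time units per cycle.
print(pfd_net_up_time([0, 10, 20, 30], [2, 12, 22, 32]))
# Frequency detection: the reference runs at twice the feedback frequency.
print(pfd_net_up_time([0, 5, 10, 15, 20], [1, 11, 21]))
```

In the second case the UP output stays high across whole reference periods, which is the mechanism that drives the loop toward frequency acquisition.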



Figure 3.16: Ideal phase characteristics of a PFD/CP

## 3.2.1 Non-idealities and limitations

Despite the simplicity of the PFD shown in Fig. 3.13, this architecture may suffer from some practical limitations. The non-idealities of the PFD characteristics due to the dead zone and blind zone are discussed here with solutions to mitigate these limitations [39].

### Dead zone

If the phase difference between the reference signal and the feedback signal is small, we ideally expect to observe narrow pulses at the outputs of the D-FFs that turn on the CP current sources for a short time, as shown in Fig. 3.15. In reality, however, the UP and DN signals have finite transition times. Assume that the reference signal is leading the feedback signal as shown in Fig. 3.17. If the rising edge of the reference signal is followed closely by a rising edge of the feedback signal, the UP and DN signals may not have time to reach their maximum level. Once both the UP and DN signals are slightly above the threshold of the AND gate, the D-FFs reset as shown in Fig. 3.17(a) while the CP current sources remain off. This means that the PFD/CP does not respond to small phase differences, so the PLL effectively operates in open loop and in-band noise appears at the PLL output. The PFD/CP characteristic curve in the presence of the dead zone is shown in Fig. 3.18.

To mitigate this problem, both the UP and DN signals must be allowed to reach their maximum value (to fully turn on the CP current sources) before the AND gate is allowed to reset the D-FFs. This can be done by inserting a delay, e.g. a chain of inverters, after the AND gate such that the two D-FFs are not reset immediately at the AND gate threshold. As shown in Fig. 3.17(b), in this case the UP and DN signals reach their maximum values and turn on both current sources of the CP before both sources are turned off again at the reset of the D-FFs.

Figure 3.17: PFD response to small phase difference (a) without delay (b) with delay

Figure 3.18: Phase characteristics of a PFD/CP in the presence of a dead zone

### Blind zone

The blind zone in a PFD appears due to missing edges that are not detectable because the PFD is in the reset mode. Fig. 3.19 illustrates the effect of the blind zone. The reference signal is leading the feedback signal, which causes the UP signal to be high until the rising edge of the feedback signal arrives. At this time, the DN signal also goes high and, after a short delay, the two D-FFs reset. However, while the PFD is in the reset mode, it remains blind to any events at the input. If a rising edge arrives during the reset period as shown in Fig. 3.19, it will not be detected.

Figure 3.19: Missing edge due to blind zone in PFD

The blind zone usually occurs when the phase difference is close to  $\pm 2\pi$ . It affects the settling behavior of the PLL and increases the locking time. The PFD/CP characteristic curve in the presence of the blind zone is shown in Fig. 3.20, where the reversed polarity of the current near  $\pm 2\pi$  is due to the missing edges at the input during the reset mode. The blind zone effectively reduces the phase detection range from  $(-2\pi, +2\pi)$  to  $(-\phi_{in}, +\phi_{in})$  where

$$\phi_{in} = 2\pi (1 - t_{RST} f_{in,max}) \tag{3.43}$$

where  $t_{RST}$  is the reset duration time and  $f_{in,max}$  is the maximum input reference frequency [39].

Therefore, the use of a high reference frequency increases the likelihood of blind-zone events in the PFD. To mitigate this problem, the reset time of the PFD should be minimized and the maximum operating frequency must be examined carefully.
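Eq. (3.43) makes it easy to see how quickly the usable detection range shrinks with the reference frequency. In the sketch below, the function name `pfd_detection_range` and the 0.5 ns reset time are assumed values used only for illustration.

```python
import math

def pfd_detection_range(t_rst, f_in_max):
    """Upper limit phi_in of the phase detection range, Eq. (3.43), in rad."""
    return 2.0 * math.pi * (1.0 - t_rst * f_in_max)

# Assumed reset time of 0.5 ns, swept over a few reference frequencies.
for f_ref in (10e6, 100e6, 500e6, 1e9):
    phi_in = pfd_detection_range(0.5e-9, f_ref)
    print("f_ref = %6.0f MHz: phi_in = %.3f rad (%.1f%% of 2*pi)"
          % (f_ref / 1e6, phi_in, 100.0 * phi_in / (2.0 * math.pi)))
```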



Figure 3.20: Phase characteristics of a PFD/CP in the presence of a blind zone



Figure 3.21: A general architecture of CP

## 3.3 Charge Pump

A typical charge pump, shown in Fig. 3.21, consists of two ideally matched current sources, i.e.  $I_{UP} = I_{DN} = I_{CP}$ , that pump charge into or out of the output node. The CP current charges or discharges the output node for a time  $\Delta t_P$  that is proportional to the phase difference between the two inputs. The average output current  $\langle i_P \rangle$  over the reference period  $T_{ref}$  is

$$\langle i_P \rangle = \frac{\Delta t_P}{T_{ref}} I_{CP} = \frac{\Delta \phi_P}{2\pi} I_{CP} \tag{3.44}$$

Thus, the PFD/CP can be represented as a fixed gain that in this case is given as

$$K_P = \frac{\langle i_P \rangle}{\Delta \phi_P} = \frac{I_{CP}}{2\pi} \tag{3.45}$$
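As a small worked example of Eqs. (3.44) and (3.45), the sketch below computes the average CP current and the PFD/CP gain. The function names and the 100 µA charge-pump current are assumptions chosen for illustration.

```python
import math

def average_cp_current(delta_phi, i_cp):
    """Average CP output current for a phase difference delta_phi (rad), Eq. (3.44)."""
    return delta_phi / (2.0 * math.pi) * i_cp

def pfd_cp_gain(i_cp):
    """PFD/CP gain K_P in A/rad, Eq. (3.45)."""
    return i_cp / (2.0 * math.pi)

I_CP = 100e-6   # assumed charge-pump current
print("<i_P> at pi/2: %.1f uA" % (average_cp_current(math.pi / 2.0, I_CP) * 1e6))
print("K_P: %.2f uA/rad" % (pfd_cp_gain(I_CP) * 1e6))
```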

### **3.3.1** Non-idealities and limitations

There are many practical considerations and performance limitations that need to be considered in the operation of any CP. Here, we discuss some of the main sources of these limitations that result in non-ideal behavior.

### Output range

Each current source in Fig. 3.21 is usually implemented using at least one MOS transistor in the saturation region. A PMOS transistor is often used to implement  $I_{UP}$ , whereas an NMOS transistor is often used to implement  $I_{DN}$ . The output-voltage range in which both transistors are in saturation limits the valid operating region of the CP. As shown in Fig. 3.22, the maximum operating range of a typical CP is  $V_{DD} - V_{Dsat,N} - |V_{Dsat,P}|$ , where  $V_{Dsat}$  is the saturation voltage of the MOS transistor and is given by

$$|V_{Dsat}| = |V_{GS}| - |V_{TH}| = \sqrt{\frac{2I_{CP}}{\mu C_{ox}} \left(\frac{L}{W}\right)} \tag{3.46}$$

Therefore, for the same CP current, a large W/L ratio is usually needed to reduce  $|V_{Dsat}|$  and maximize the operating range. However, large transistors may slow down the CP and limit the operating frequency. Thus, the trade-off between a wide operating range and a high operating speed must be considered.
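The trade-off can be quantified with Eq. (3.46). The sketch below uses the square-law expression as written; the function names, the process parameters ( $\mu C_{ox}$  of 200 µA/V² for NMOS and 80 µA/V² for PMOS), and the 1.2 V supply are assumed values for illustration and are not taken from the 65 nm process used in this work.

```python
import math

def v_dsat(i_cp, mu_cox, w_over_l):
    """Saturation voltage magnitude from Eq. (3.46) (square-law model)."""
    return math.sqrt(2.0 * i_cp / (mu_cox * w_over_l))

def cp_output_range(vdd, i_cp, mu_cox_n, wl_n, mu_cox_p, wl_p):
    """Output-voltage range in which both current sources stay saturated."""
    return vdd - v_dsat(i_cp, mu_cox_n, wl_n) - v_dsat(i_cp, mu_cox_p, wl_p)

I_CP = 100e-6
for scale in (1, 4, 16):
    span = cp_output_range(1.2, I_CP, 200e-6, 10 * scale, 80e-6, 25 * scale)
    print("W/L scaled by %2d: usable output range = %.3f V" % (scale, span))
```

Since  $|V_{Dsat}|$  scales with  $\sqrt{L/W}$ , each fourfold increase in W/L only halves the saturation voltage, so the gain in range diminishes while the parasitic capacitance keeps growing.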

### Current mismatch

Even within the operating range of the CP, the UP and DN currents are not exactly the same due to the finite output impedance of the current-source transistors. Variations in the two currents, if considerable, can affect the PFD/CP gain, i.e.  $K_P = I_{CP}/2\pi$ , thereby changing the loop dynamics. In addition, mismatch between the two currents results in spurs in the output spectrum of the PLL at an offset frequency that is equal to the reference frequency.

Figure 3.22: Output current of a CP versus the output voltage

To estimate the reference spurs at the PLL output, assume that the UP and DN switches are closed at the same time. Due to the mismatch between the two currents, the current difference will flow into the loop filter thereby changing the control voltage of the VCO. This extra net charge should be corrected in the next reference cycle to change the control voltage in the opposite direction. The resulting ripples at the control voltage of the VCO will have the same frequency as the reference, and will appear at the PLL output as spurs at an offset frequency that is equal to the reference frequency.

The magnitude of the reference spurs in dBc with respect to the carrier in a second-order loop filter PLL [16] is given by

$$P_s/P_c = 20 \log \left[ \frac{\Delta t_{RST}^2 \,\Delta I \, K_{VCO}}{4\pi C_2} (1 + \frac{\Delta I}{I_{CP}}) \right]$$
(3.47)

where  $\Delta t_{RST}$  is the PFD reset time,  $I_{CP}$  is the CP current,  $\Delta I$  is the CP current mismatch, and  $C_2$  is the second-pole capacitor of the loop filter.

It is evident from Eq. (3.47) that reducing the loop bandwidth (increasing  $C_2$ ), the VCO gain, and the PFD reset time, and increasing the CP current, all help to reduce the reference spurs at the PLL output.
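The sensitivities can be checked numerically with a minimal sketch that evaluates the spur expression exactly as written in Eq. (3.47). The function name and the sample values are illustrative assumptions, and $K_{VCO}$ is assumed to be expressed in Hz/V.

```python
import math

def reference_spur_dbc(t_rst, delta_i, i_cp, k_vco, c2):
    """Evaluate Eq. (3.47) as written; k_vco is assumed to be in Hz/V."""
    arg = (t_rst**2 * delta_i * k_vco) / (4.0 * math.pi * c2)
    arg *= 1.0 + delta_i / i_cp
    return 20.0 * math.log10(arg)

# Assumed sample values: 200 ps reset time, 5 uA mismatch on 100 uA,
# 1 GHz/V VCO gain, 10 pF second-pole capacitor
base = reference_spur_dbc(200e-12, 5e-6, 100e-6, 1e9, 10e-12)
bigger_c2 = reference_spur_dbc(200e-12, 5e-6, 100e-6, 1e9, 20e-12)
shorter_reset = reference_spur_dbc(100e-12, 5e-6, 100e-6, 1e9, 10e-12)

print(f"baseline spur      : {base:.1f} dBc")
print(f"doubling C2        : {bigger_c2 - base:+.2f} dB")
print(f"halving reset time : {shorter_reset - base:+.2f} dB")
```

Because the reset time enters squared, halving it buys twice as many dB as doubling $C_2$.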

### Transistor speed

The switches and the current sources in the CP do not turn on and off instantaneously, but require finite time to fully turn on or off. Even if the PFD is designed to be dead-zone free, the CP should respond quickly enough to ensure high loop gain at small phase differences [16]. The switching speed can therefore limit the usable reference frequency.

In addition, the difference in switching speed between the UP and DN currents can exacerbate the mismatch between the two current sources, thereby generating spurs at the output of the PLL. Therefore, it is highly recommended to match the transconductance of both current sources to ensure equal switching speed and reduce reference spurs.

### Charge sharing

In Fig. 3.21, there is a parasitic capacitance associated with the drains of the current-source transistors. When the switches are open, these nodes are charged to the supply rails. When the switches close, a charge transfer occurs between these nodes and the loop filter capacitor. This charge sharing, if substantial, can cause spikes in the CP current and result in spurs at the output of the PLL.

To mitigate this problem, the voltage at these intermediate nodes should be kept close to the voltage at the CP output [40]. This can be done using single transistors or op-amps that operate in negative feedback. Unless the spurs from charge sharing are a major concern, extra circuitry should be avoided to reduce complexity.

### Noise contribution

The CP is a major contributor to the in-band noise of the PLL. The thermal and flicker noise of the current-source transistors is given by

$$\overline{i_n^2} = 4kT\gamma g_m + \frac{K}{f} \tag{3.48}$$

where  $\gamma$  is the white noise gamma factor and K is the 1/f flicker noise constant.

When the PLL is in the locked state, the two current sources are on only for a time interval  $t_{CP}$  where

$$t_{CP} = t_{RST} \left( 1 + \frac{\Delta I}{I_{CP}} \right) \tag{3.49}$$

where  $t_{RST}$  is the PFD reset time.

Thus, the total noise at the output of the CP is

$$\overline{i_{n,total}^{2}} = \left(\overline{i_{n,N}^{2}} + \overline{i_{n,P}^{2}}\right) \cdot \frac{t_{CP}}{T_{ref}} \tag{3.50}$$



Figure 3.23: CP topologies with switches at (a) gates (b) drains (c) sources

where  $\overline{i_{n,N}^{2}}$  and  $\overline{i_{n,P}^{2}}$  are the noise currents associated with the NMOS and PMOS current sources, respectively.

The input-referred noise of the CP is obtained by dividing the expression in Eq. (3.50) by the square of the PFD/CP gain  $K_P = I_{CP}/2\pi$  [16], which results in

$$|\phi_{n,CP}(f)|^2 = \left(\frac{2\pi}{I_{CP}}\right)^2 \left(\overline{i_{n,N}^2} + \overline{i_{n,P}^2}\right) t_{RST} \left(1 + \frac{\Delta I}{I_{CP}}\right) f_{ref} \tag{3.51}$$

Therefore, increasing the CP current is a key to reducing both reference spurs and in-band noise contribution at the expense of increased power consumption.
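The dependence on CP current can be illustrated with a small sketch of Eq. (3.51) that keeps only the thermal term of Eq. (3.48). The square-law relation $g_m = 2I_{CP}/V_{ov}$ at a fixed overdrive voltage, the value $\gamma = 2/3$, and all numerical values are assumptions made for this example.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def thermal_noise_current_psd(gm, gamma=2.0 / 3.0, temp=300.0):
    """White-noise term of Eq. (3.48): 4*k*T*gamma*gm, in A^2/Hz."""
    return 4.0 * K_BOLTZMANN * temp * gamma * gm

def cp_input_referred_noise(i_cp, gm_n, gm_p, t_rst, f_ref, delta_i=0.0):
    """Eq. (3.51) with thermal noise only; result in rad^2/Hz."""
    i_n2 = thermal_noise_current_psd(gm_n) + thermal_noise_current_psd(gm_p)
    on_fraction = t_rst * (1.0 + delta_i / i_cp) * f_ref  # t_CP / T_ref
    return (2.0 * math.pi / i_cp) ** 2 * i_n2 * on_fraction

V_OV = 0.2     # overdrive voltage, V (assumed)
T_RST = 200e-12  # PFD reset time, s (assumed)
F_REF = 50e6   # reference frequency, Hz (assumed)

for i_cp in (100e-6, 200e-6):
    gm = 2.0 * i_cp / V_OV  # square-law assumption
    noise = cp_input_referred_noise(i_cp, gm, gm, T_RST, F_REF)
    print(f"I_CP = {i_cp * 1e6:.0f} uA: {10.0 * math.log10(noise):.1f} dB(rad^2/Hz)")
```

Under these assumptions the noise current grows with $g_m \propto I_{CP}$ while the gain term falls as $1/I_{CP}^2$, so doubling the current lowers the input-referred noise by about 3 dB.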

## 3.3.2 Circuit implementations

The three main circuit topologies of a CP are shown in Fig. 3.23 [41]. All three use simple current mirrors to generate the UP and DN currents of the CP, but they differ in the position of the CP switches. The switches can be at the gates of the current-source transistors (Fig. 3.23(a)), in series with the drains of the current-source transistors (Fig. 3.23(b)), or in series with the sources of the current-source transistors (Fig. 3.23(c)).

In the gate-switched topology shown in Fig. 3.23(a), a maximum stack-up of two transistors is used; which makes this topology very useful in low-voltage applications. However, the switching time can be a limiting factor in case of high CP currents because of the relatively large gate capacitance which requires long time to charge or discharge.

To alleviate this problem, the switches can be located at the drains as shown in Fig. 3.23(b). The drain and source capacitances are often much smaller than the gate capacitance, which enhances the switching speed. In this topology, when the switches are off, the drains of the current-source transistors are pulled to the supply rails. When the switches turn on, the voltage at the drains of the current-source transistors makes a fast transition from near the supply rail voltage to the control-line voltage, forcing the transistors from the linear region to the saturation region. This results in a high peak current that is dependent on the control-line voltage. This excess current is difficult to control and may result in significant spurs at the PLL output.

Figure 3.24: CP circuit implementation with current steering switches

In the topology shown in Fig. 3.23(c), the switches are located at the sources which guarantees fast switching. When the switches are off, the sources of the current-source transistors are charged to the control-line voltage rather than the supply rails. When the switches turn on, the sources of the current-source transistors start charging or discharging slowly while both  $|V_{GS}|$  and  $|V_{DS}|$  increase together causing the CP current to change smoothly. This topology is often preferred over drain-switched topology when fast switching and low spurs are desired.

The switching speed can be further improved by using the current-steering topology shown in Fig. 3.24. However, this comes at the expense of increased power consumption, since a current always flows in the branches of the CP to enhance speed. In addition, the complexity is increased because of the need for complementary UP and DN inputs, which should be considered during the design of the PFD.

## 3.3.3 Behavioral Modeling in Verilog-A

In order to simulate the PFD/CP at the system level, a behavioral model that describes the operation of the combined PFD/CP is developed. The model should also include the effect of the PFD/CP noise on the PLL jitter. In most mathematical analyses, the CP output noise is described in terms of the output noise current. However, this can be difficult for simulators to incorporate because of the tight tolerance and small step size that may be required [18]. It is more convenient to convert the current noise into timing jitter and refer it to the input of the PFD/CP as synchronous jitter. The CP behavioral model in Listing 3.4 implements a finite-state machine with three output levels, i.e.  $-I_{out}$ , 0, and  $I_{out}$ . The output is incremented or decremented depending on which input makes a transition. The output current is assumed constant over the entire output voltage range. The timing of the output transitions is displaced by a random synchronous jitter at the threshold crossing.

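```
`include "disciplines.vams"

module pfd_cp (out, ref, fb);
input ref, fb;   // inputs: reference & feedback signals
output out;      // charge-pump output current
electrical ref, fb, out;
parameter real Iout=100u;   // magnitude of the CP output current
// dir=1 for positive edge trigger, dir=-1 for negative edge trigger
parameter integer dir=1 from [-1:1] exclude 0;
parameter real tt=1n from (0:inf);   // output transition time
// nominal output delay; set td > 0 when instantiating,
// since the default of 0 lies outside the declared range
parameter real td=0 from (0:inf);
parameter real jitter=0 from [0:td/5); // edge-to-edge jitter
parameter real ttol=1p from (0:td/5); // ttol << jitter
integer state, seed;
real dt;

analog begin
    @(initial_step) seed = 716;
    // a reference edge steps the state down, saturating at -1
    @(cross(V(ref), dir, ttol)) begin
        if (state > -1) state = state - 1;
        dt = jitter * $rdist_normal(seed, 0, 1);
    end
    // a feedback edge steps the state up, saturating at +1
    @(cross(V(fb), dir, ttol)) begin
        if (state < 1) state = state + 1;
        dt = jitter * $rdist_normal(seed, 0, 1);
    end
    // output current follows the state, delayed by td plus random jitter
    I(out) <+ transition(Iout*state, td + dt, tt);
end
endmodule
```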

Listing 3.4: Verilog-A model of a PFD/CP including jitter

To extract the input-referred jitter of the PFD/CP at the transistor level, the PFD/CP block should be driven with two periodic inputs with some offset phase to produce a representative periodic output. The output current noise spectrum is integrated over the bandwidth of the PLL. To achieve that, the noise spectrum is multiplied by the in-band noise transfer function of the PLL. The integrated noise represents the variance of the current noise at the output of the CP. This variance must be divided by two to account for the two transition events in each time period. The current noise in A is converted to input-referred noise in seconds by dividing the current noise by the PFD/CP effective gain in A/s. The effective gain is calculated by dividing the PFD/CP gain  $K_P$  (in A/cycle) by the reference period  $T_{ref}$ .

Derived from Eq. (4.24) in Chapter 2, the edge-to-edge jitter of the PFD/CP [18] is then given by

$$J_{ee,PFD/CP} = \frac{T_{ref}}{K_P} \sqrt{\frac{\mathrm{var}(i_{out})}{2}} , \tag{3.52}$$

which is the value used in Listing 3.4.
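As a quick check of the unit conversion, Eq. (3.52) can be evaluated with a few lines of code. This is only a sketch: the function name and the sample values are assumed, and $K_P$ is supplied in A/cycle as described in the extraction procedure.

```python
import math

def pfd_cp_edge_jitter(var_i_out, k_p, t_ref):
    """Eq. (3.52): edge-to-edge jitter in seconds.

    var_i_out : integrated output current noise variance, A^2
    k_p       : PFD/CP gain, A/cycle
    t_ref     : reference period, s
    """
    return (t_ref / k_p) * math.sqrt(var_i_out / 2.0)

# Assumed sample values: 2e-18 A^2 variance, 100 uA/cycle gain, 10 ns period
jitter = pfd_cp_edge_jitter(2e-18, 100e-6, 10e-9)
print(f"edge-to-edge jitter = {jitter * 1e15:.1f} fs")
```

The result would be passed as the `jitter` parameter of the module in Listing 3.4.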

## 3.4 Frequency Dividers

Frequency dividers are very critical components in the design of frequency synthesizer PLLs. Along with the VCO, frequency dividers limit the maximum operating frequency of the PLL. In addition, the range of operating frequencies that a frequency synthesizer PLL can generate is limited by the range of the division-ratio of the frequency divider. Here, we discuss the main divider architectures used in the design of frequency synthesizer PLLs.

## 3.4.1 Prescaler

A prescaler is a frequency divider with a constant division ratio P, where P is often between two and four. A prescaler is usually a simple circuit compared with programmable dividers and can operate at very high frequency. Therefore, it is usually used to divide down the VCO output to relax the speed requirement of the subsequent programmable divider. The most common architectures used as high-frequency prescalers are the CML-based prescaler, the regenerative frequency divider (RFD), and the injection-locked frequency divider (ILFD). The decision to use a particular prescaler architecture is often based on the trade-off between maximum operating frequency and desired frequency range.

Figure 3.25: A conventional circuit implementation of injection-locked divider (ILFD)

### Injection-locked frequency divider (ILFD)

Injection-locked frequency dividers (ILFD) feature the highest operating frequency and lowest power consumption among high frequency dividers implemented in CMOS technologies [42]. The schematic of a conventional ILFD is shown in Fig. 3.25. The operation principle utilizes the inherent frequency doubling at the common node of the cross-coupled transistors. This is done by injecting the input signal and forcing the resonant frequency to lock at half the input frequency by properly selecting the resonant frequency of the resonator. The main drawback of this topology is the narrow locking range. According to the analysis shown in [43], the locking range of an ILFD is

$$\Delta\omega_L \simeq \frac{\omega_0}{Q} \cdot \frac{2}{\pi} \cdot \frac{I_{inj}}{I_{osc}}$$
(3.53)

where  $\omega_0$  is the resonant frequency, Q is the quality factor of the resonator,  $I_{inj}$  is the injected current, and  $I_{osc}$  is the oscillation current.



Figure 3.26: The concept of operation of an RFD

### Regenerative frequency divider (RFD)

An alternative to ILFD that can cover a wider frequency range is the regenerative frequency divider (RFD). The concept of an RFD is depicted in Fig. 3.26 [44] where a mixer in a feedback loop with two inputs  $\omega_{in}$  and  $\omega_{out}$  generates an output with the components  $\omega_{in} - \omega_{out}$  and  $\omega_{in} + \omega_{out}$ . By properly selecting the cut-off frequency of the filter to eliminate  $\omega_{in} + \omega_{out}$ , the loop locks to satisfy the relation  $\omega_{out} = \omega_{in} - \omega_{out}$ ; i.e.  $\omega_{out} = \omega_{in}/2$ .

Fig. 3.27 shows a possible implementation of an RFD using MOS transistors [44]. The locking range of this RFD is

$$\Delta\omega_L \simeq \frac{\omega_0}{Q} \left(\frac{2}{\pi} g_m R\right)^2 \tag{3.54}$$

where  $g_m$  is the transconductance of the bottom differential pair of the mixer and R is the equivalent parallel resistance of the resonator tank.

To compare the locking range of ILFD and RFD topologies, we equate  $\Delta \omega_L$  in Eqs. (3.53) and (3.54), which yields

$$\frac{1}{\pi}g_m^2 R^2 = \frac{I_{inj}}{I_{osc}} \tag{3.55}$$

From Eq. (3.55), we notice that even if we assume that the injection current of an ILFD is equal to the oscillation current i.e.  $I_{inj} = I_{osc}$ , we need a  $g_m R$  of only 1.8 to produce the same locking range. Therefore, we conclude that an RFD provides wider frequency operating range compared to an ILFD.



Figure 3.27: A circuit implementation of an RFD

### CML-based divider

The maximum operating frequency of a CML-based divider is often lower than that of an ILFD or RFD. However, a CML-based divider provides a wider operating range than both. The schematic of a conventional CML-based divider is shown in Fig. 3.28 [45]. The circuit is simply a D flip-flop realized using CML logic where the differential output feeds back into the differential input. In order to improve the maximum operating frequency, inductors can be connected in series with the load resistors to reduce the rise and fall times. However, this comes at the expense of reduced operating range and increased area [45].

## 3.4.2 Programmable Divider

A programmable divider is a necessary component in frequency synthesizer PLLs. The output frequency of the VCO is varied by changing the division ratio of the programmable divider. Due to the complexity of their architectures, programmable dividers usually have a lower maximum operating frequency than prescalers or fixed-ratio dividers. Most programmable dividers are based on one of two topologies: the dual-modulus programmable divider [46] and the divide-by-N programmable divider [47].



Figure 3.28: A circuit implementation of a CML-based divider

### Dual-modulus programmable divider

The basic cell that is used in building a dual-modulus programmable divider is shown in Fig. 3.29. The circuit is a 2/3 divider. If  $mod_{in}=0$ , the bottom path of the circuit is interrupted and the circuit divides by 2. If  $mod_{in}=1$ , the output is  $f_{in}/2$  if p=0 and  $f_{in}/3$  if p=1. The signal  $mod_{out}$  has the same frequency as  $f_{out}$  but is delayed by half an input cycle.

A programmable divider can be constructed by cascading multiple 2/3 divider cells as shown in Fig. 3.30 [48]. The operation of the programmable divider in Fig. 3.30 is as follows. The  $mod_n$  signal propagates backward in the chain at each input clock cycle. If the mod signal is active, the associated cell divides by 3 if p=1 and by 2 if p=0. The output frequency is controlled by the control word  $p_0 p_1 \ldots p_{n-1}$ , and the divider ratio ranges from  $2^n$  to  $2^{n+1}-1$ .
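The stated range can be reproduced with a short sketch. It assumes the relation commonly quoted for a chain of $n$ cascaded 2/3 cells, $N = 2^n + \sum_{i=0}^{n-1} p_i 2^i$, which is consistent with the range given above but is not derived in this section. The function name is illustrative.

```python
from itertools import product

def division_ratio(p_bits):
    """Division ratio of a chain of 2/3 cells.

    p_bits[i] is the control bit p_i of cell i, with cell 0 at the input.
    Assumes N = 2^n + sum(p_i * 2^i) for an n-cell chain.
    """
    n = len(p_bits)
    return 2**n + sum(bit << i for i, bit in enumerate(p_bits))

# Enumerate every control word of a 3-cell chain
ratios = sorted(division_ratio(bits) for bits in product((0, 1), repeat=3))
print(ratios)
```

Every integer between $2^n$ and $2^{n+1}-1$ is reachable, which also makes the less-than-two ratio between the largest and smallest division ratios explicit.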

The structure of a dual-modulus programmable divider is very modular. The same 2/3 divider cell can be used to implement the multiple stages of the divider, which can lead to a compact layout and reduce design time. Another advantage of the dual-modulus programmable divider is that the control input has small loading, which allows the divider to operate at high frequency. The main disadvantage, however, is that the ratio between the maximum and minimum division ratios is less than two.

A modified circuit shown in Fig. 3.31 extends the division range by simply inserting a few OR gates at the higher significant bits of the control word [46]. The dual-modulus programmable divider shown in Fig. 3.31 sets the *mod* inputs of some 2/3 divider cells to 1, which shortens the effective length of the divider to m when the control input bits of these cells are high. By independently selecting m and n, the divider ratio is extended to cover the range between  $2^m$  and  $2^{n+1}-1$ .

Figure 3.29: A 2/3 divider cell

Figure 3.30: A conventional architecture of dual-modulus programmable divider

### Divide-by-N programmable divider

As an alternative to the dual-modulus programmable divider, a divide-by-N divider extends the division ratio to cover the range from two to  $2^n - 1$ , where n is the number of stages of the divider. A general architecture of a divide-by-N programmable divider is shown in Fig. 3.32 [47]. The outputs of the stages are set to the value of the division ratio by the input control circuitry. The outputs are also fed to the end-of-count (EOC) detector. The EOC detector shown in Fig. 3.33(a) detects when the outputs of all stages are zero, which indicates the end of the counting process.

Figure 3.31: A dual-modulus programmable divider with extended division range

Figure 3.32: An architecture of a divide-by-N programmable divider

The counter starts counting down until the outputs of all the stages become zero. At this point, the EOC detector instructs the control circuit to reload the division ratio to reinitialize the counting and repeat the process. The frequency of the output signal is equal to  $f_{in}/N$  where N is in the range of 2 to  $2^n - 1$ .

In general, divide-by-N programmable dividers are slower than dual-modulus programmable dividers due to the complexity of their control circuitry. Since the speed of the divide-by-N divider is limited by the reloading process, a modification of the EOC detector circuit was proposed by Chang *et al.* [47]. In the proposed EOC detector shown in Fig. 3.33(b), the RELOAD signal is activated when the counter value reaches  $000001_2$  instead of  $000000_2$ , which gives the reloading process two clock periods instead of one to execute. Thus, the maximum operating frequency is enhanced.

Figure 3.33: Circuit diagram of EOC detector (a) conventional (b) by Chang et al.

## 3.4.3 Behavioral Modeling in Verilog-A

A behavioral model that describes the frequency divider in Verilog-A is shown in Listing 3.5 [18]. The module counts the input transitions using the *cross* function to detect threshold crossings. At each threshold crossing, the count is incremented, and when the count reaches the final value *ratio*, it is reset to zero. If *count* is at or above the midpoint, n is set high; otherwise n is set low. A random jitter is added at every transition with an rms value equal to the edge-to-edge jitter extracted from the transistor-level divider block.

To extract the edge-to-edge jitter of a frequency divider, the divider block is driven with a representative input. At the threshold crossing, both the threshold-crossing amplitudes and the slew rate are evaluated using the simulator periodic-steady-state (PSS) strobe analysis as shown in Fig. 3.34. The power spectral density  $S_{n_v}$  of the strobed noise is integrated to compute the total noise at the sample points i.e.

$$var(n_v(t_c)) = \int_0^{f_0/2} S_{n_v}(f, t_c) \, df \,. \tag{3.56}$$

The edge-to-edge jitter is then computed from Eq. (4.24) as

$$\sigma_{ee} = \frac{\sqrt{var\left(n_v(t_c)\right)}}{dv(t_c)/dt} \ . \tag{3.57}$$
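The post-processing in Eqs. (3.56) and (3.57) can be sketched as follows, assuming the strobed noise PSD has been exported from the simulator as a list of frequency and PSD samples. The trapezoidal integration, the use of the slew-rate magnitude (so that falling edges are handled as well), and the sample data are assumptions of this example.

```python
import math

def integrate_trapezoid(x, y):
    """Trapezoidal integral of the samples y over the grid x."""
    total = 0.0
    for k in range(1, len(x)):
        total += 0.5 * (y[k] + y[k - 1]) * (x[k] - x[k - 1])
    return total

def edge_to_edge_jitter(freqs, psd, slew_rate):
    """Eqs. (3.56)-(3.57): sqrt of the integrated strobed-noise PSD / slew rate.

    freqs     : frequency samples from 0 to f0/2, Hz
    psd       : strobed noise PSD at the threshold crossing, V^2/Hz
    slew_rate : dv/dt at the threshold crossing, V/s
    """
    variance = integrate_trapezoid(freqs, psd)
    return math.sqrt(variance) / abs(slew_rate)

# Assumed sample data: flat 1e-12 V^2/Hz PSD up to 1 MHz, 1 V/ns slew rate
freqs = [k * 1e4 for k in range(101)]
psd = [1e-12] * len(freqs)
print(f"sigma_ee = {edge_to_edge_jitter(freqs, psd, 1e9) * 1e12:.2f} ps")
```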

```
`include "disciplines.vams"

module divider (out, in);
input in; output out; electrical in, out;
parameter real Vlo=-1, Vhi=1;   // output logic levels
parameter integer ratio=2 from [2:inf);   // division ratio
// dir=1 for positive edge trigger, dir=-1 for negative edge trigger
parameter integer dir=1 from [-1:1] exclude 0;
parameter real tt=1n from (0:inf);   // output transition time
// nominal output delay; set td > 0 when instantiating,
// since the default of 0 lies outside the declared range
parameter real td=0 from (0:inf);
parameter real jitter=0 from [0:td/5);   // edge-to-edge jitter
parameter real ttol=1p from (0:td/5);
integer count, n, seed;
real dt;

analog begin
    @(initial_step) seed = -311;
    @(cross(V(in) - (Vhi + Vlo)/2, dir, ttol)) begin
        // count input transitions and wrap at the division ratio
        count = count + 1;
        if (count >= ratio)
            count = 0;
        // output is high for the upper half of the count
        n = (2 * count >= ratio);
        // add jitter
        dt = jitter * $rdist_normal(seed, 0, 1);
    end
    V(out) <+ transition(n ? Vhi : Vlo, td + dt, tt);
end
endmodule
```

Listing 3.5: Verilog-A model of a frequency divider including jitter
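To make the counting logic of Listing 3.5 easy to inspect outside a circuit simulator, the sketch below mirrors only its count and compare steps. Jitter, delay, and transition times are deliberately omitted, and the function name is illustrative.

```python
def divider_output(num_edges, ratio):
    """Logic level of n after each input edge, following Listing 3.5."""
    count = 0
    levels = []
    for _ in range(num_edges):
        count += 1
        if count >= ratio:
            count = 0
        levels.append(1 if 2 * count >= ratio else 0)
    return levels

def rising_edges(levels):
    """Number of low-to-high transitions in the level sequence."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a == 0 and b == 1)

for ratio in (3, 4):
    print(ratio, divider_output(2 * ratio, ratio))
```

For an odd ratio the output duty cycle is not 50%, because the midpoint comparison can only switch on whole input edges.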



Figure 3.34: Strobed noise at the threshold-crossing points of signal  $v_n(t)$ 

## 3.5 Loop Filter

Frequency synthesizers are designed for a given set of specifications such as phase noise, locking time, frequency range, step size, and loop phase margin. The loop filter plays a crucial role in determining the performance of the PLL. The loop filter affects both the in-band and out-of-band phase noise contribution to the overall phase noise, as well as the amount of attenuation imposed on the reference frequency spurs. In addition, the loop filter affects loop dynamics (e.g. locking time, overshooting and peak time) and loop stability (usually ensured by sufficient phase margin). In general, the loop filter is the most flexible PLL block under the control of the designer.

Techniques to design various passive and active loop filters are widely discussed in the literature [16], [49]. However, the lack of a detailed qualitative and quantitative analysis makes loop filter selection daunting and demands many trial-and-error steps. In this section, we provide an explicit comparison between different passive loop filter topologies and their effect on design parameters such as locking time, reference spurs attenuation, phase noise, and loop phase margin. A quantitative comparison is provided (whenever possible) to assist the designer in selecting the optimum design for the desired specifications.

As shown in Chapter 2, the closed-loop transfer-function of a PLL with respect to the input-referred in-band noise is given by

$$H_{in}(s) = \frac{\phi_{out}(s)}{\phi_{ref}(s)} = \frac{F_{out}(s)}{F_{ref}(s)} = N \cdot \frac{\frac{K_P K_{VCO}}{N \cdot s} \cdot Z(s)}{1 + \frac{K_P K_{VCO}}{N \cdot s} \cdot Z(s)}$$
(3.58)

The output response to a small change  $\Delta n$  in the divider ratio N (i.e.  $\Delta n < 0.04N$ ) can be approximated [15] as

$$F_{out}(s) \simeq N \cdot \frac{\frac{K_P K_{VCO}}{N \cdot s} \cdot Z(s)}{1 + \frac{K_P K_{VCO}}{N \cdot s} \cdot Z(s)} \cdot \left(1 + \frac{\Delta n}{N}\right) F_{ref}(s). \tag{3.59}$$

The locking (settling) time to within a fraction  $\alpha$  of the final value, for a PLL approximated as a second-order system, is given by

$$T_L \simeq \frac{1}{\zeta \omega_n} \ln\left(\frac{\Delta n}{N |\alpha| \sqrt{1 - \zeta^2}}\right)$$
(3.60)

where  $\zeta$  is the second-order damping factor,  $\alpha$  is the fraction of the final value, and  $\omega_n$  is the loop natural frequency. The natural frequency is directly related to the loop bandwidth  $\omega_c$ , which is the frequency at which the magnitude of the loop gain is unity.
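A minimal numerical sketch of Eq. (3.60) is given below. The damping factor, natural frequency, and settling tolerance are assumed values chosen only to show the orders of magnitude involved.

```python
import math

def locking_time(zeta, omega_n, delta_n_over_n, alpha):
    """Eq. (3.60): settling time of an underdamped second-order loop, in s."""
    if not 0.0 < zeta < 1.0:
        raise ValueError("Eq. (3.60) requires 0 < zeta < 1")
    arg = delta_n_over_n / (abs(alpha) * math.sqrt(1.0 - zeta**2))
    return math.log(arg) / (zeta * omega_n)

# Assumed values: zeta = 0.707, f_n = 100 kHz, 3% ratio step
omega_n = 2.0 * math.pi * 100e3
for alpha in (1e-4, 1e-5, 1e-6):
    t_lock = locking_time(0.707, omega_n, 0.03, alpha)
    print(f"alpha = {alpha:.0e}: T_L = {t_lock * 1e6:.1f} us")
```

Each tenfold tightening of $\alpha$ adds the same increment $\ln(10)/(\zeta\omega_n)$ to the locking time, so the settling accuracy is a much weaker lever than the loop bandwidth.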

The reference spurs attenuation can be found by evaluating the magnitude of the transfer function  $\frac{\phi_{out}(s)}{\phi_{ref}(s)}$  at  $s = j\omega_{ref}$ , where  $\omega_{ref}$  is the reference frequency. The amount of attenuation depends on the ratio  $\omega_{ref}/\omega_c$  rather than the absolute values of  $\omega_{ref}$  and  $\omega_c$ . The attenuation deteriorates by  $20\log N$ , where N is the divider ratio. The loop stability can be maintained by ensuring sufficient phase margin. The phase margin affects not only the stability but also the loop dynamics, such as locking time and overshoot. Extensive simulations show that a PLL with a phase margin around 50° results in the minimum locking time [15], [50].

## 3.5.1 Passive Filter Design

Passive filters are preferred over active filters when noise performance is very critical. The two main types of passive filters, namely RC network and LC ladder, are analyzed.

### **RC** Filters

The simplest type II PLL RC loop filter is shown in Fig. 3.35(a). It consists of a capacitor  $C_1$  in series with a resistor  $R_1$ . The series combination results in a pole at dc and a zero at  $1/R_1C_1$ . The loop filter is first order and the resulting PLL is second order. The PLL design in this case is very simple, since second order systems are very well understood.

If an additional capacitor  $C_2$  is added in parallel for further spurs attenuation as shown in Fig. 3.35(b), then the loop filter becomes second order, and the resulting PLL is third order.



Figure 3.35: RC network loop filters: (a) 1<sup>st</sup> order (b) 2<sup>nd</sup> order (c) 3<sup>rd</sup> order

In this case, the filter impedance is given by:

$$Z(s) = \frac{1 + \tau_1 s}{sC_T} \cdot \frac{1}{1 + a_1(\tau_1 s)}$$
(3.61)

where  $\tau_1 = R_1 C_1$ ,  $a_1 = C_2/C_T$ , and  $C_T = C_1 + C_2$ .

The resulting loop gain of the third order PLL is expressed as

$$LG(s) = \frac{K_{VCO}K_P}{N} \cdot \frac{1 + \tau_1 s}{s^2 C_T} \cdot \frac{1}{1 + a_1(\tau_1 s)}.$$
(3.62)

The loop gain magnitude and phase response of a third-order PLL are shown in Fig. 3.36. The loop phase margin PM is the amount by which the loop-gain phase exceeds  $-180^{\circ}$  at the frequency  $\omega_c$  where the loop-gain magnitude is unity, and can be expressed in this case as

$$PM = \tan^{-1}(\tau_1 \omega_c) - \tan^{-1}(a_1(\tau_1 \omega_c)).$$
(3.63)

Placing  $\omega_c$  at the frequency of maximum phase margin keeps the loop stable over a wide range and reduces its sensitivity to variations in the other loop parameters. The maximum phase margin is found by differentiating Eq. (3.63) with respect to  $\tau_1 \omega_c$  and setting the derivative to zero, which shows that the phase margin is maximized when  $\tau_1 \omega_c = 1/\sqrt{a_1}$ . Thus,

$$PM_{max} = \tan^{-1}(1/\sqrt{a_1}) - \tan^{-1}(\sqrt{a_1}).$$
(3.64)
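For reference, the differentiation step behind Eq. (3.64) can be written out explicitly. The shorthand $x = \tau_1\omega_c$ is introduced here only for brevity, and $0 < a_1 < 1$ is assumed, which holds since $a_1 = C_2/C_T$.

$$\frac{d\,PM}{dx} = \frac{1}{1 + x^2} - \frac{a_1}{1 + a_1^2 x^2} = 0 \;\;\Rightarrow\;\; 1 + a_1^2 x^2 = a_1 + a_1 x^2 \;\;\Rightarrow\;\; x^2 = \frac{1 - a_1}{a_1 (1 - a_1)} = \frac{1}{a_1}$$

Substituting $x = 1/\sqrt{a_1}$ into Eq. (3.63) gives $a_1 x = \sqrt{a_1}$ in the second term, which is the form of Eq. (3.64).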



Figure 3.36: Loop gain response of a third-order PLL

The relationship between  $a_1$  and  $PM_{max}$  is shown in Fig. 3.37. For instance, in order to obtain  $PM_{max} = 50^{\circ}$  to minimize locking time,  $a_1$  is chosen to be 0.1325.
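The value of $a_1$ for a target phase margin can also be obtained in closed form instead of being read from the plot. Since $\tan^{-1}(1/u) = 90^{\circ} - \tan^{-1}(u)$ for $u > 0$, Eq. (3.64) reduces to $PM_{max} = 90^{\circ} - 2\tan^{-1}(\sqrt{a_1})$, which the following sketch inverts. The function names are illustrative.

```python
import math

def pm_max_deg(a1):
    """Eq. (3.64): maximum phase margin in degrees for a given a1."""
    root = math.sqrt(a1)
    return math.degrees(math.atan(1.0 / root) - math.atan(root))

def a1_for_phase_margin(pm_deg):
    """Invert Eq. (3.64): a1 = tan^2((90 deg - PM_max) / 2)."""
    half_angle = math.radians(90.0 - pm_deg) / 2.0
    return math.tan(half_angle) ** 2

a1 = a1_for_phase_margin(50.0)
print(f"a1 = {a1:.4f}, tau1*omega_c = {1.0 / math.sqrt(a1):.3f}")
print(f"check: PM_max = {pm_max_deg(a1):.2f} deg")
```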

Since the loop gain magnitude is unity at the loop bandwidth  $\omega_c$ ,  $C_T$  can be evaluated from Eq. (3.62) as

$$C_T = \frac{K_{VCO}K_P/N}{\omega_c^2} \cdot \sqrt{\frac{1 + (\tau_1\omega_c)^2}{1 + [a_1(\tau_1\omega_c)]^2}}.$$
(3.65)

Once  $C_T$  is calculated, one can directly calculate  $C_1$  and  $C_2$  from  $a_1$  and  $C_T$ , i.e.  $C_2 = a_1 C_T, C_1 = C_T - C_2.$ 
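The calculation for the second-order filter of Fig. 3.35(b) can be collected into a short design sketch. It assumes that $K_{VCO}$ is given in rad/s/V, that $K_P = I_{CP}/2\pi$, and that $\omega_c$ is placed at the maximum-phase-margin point. All numerical values are illustrative and are not the design values used in this thesis.

```python
import cmath
import math

def design_second_order_filter(k_vco, i_cp, n_div, omega_c, a1):
    """Return (R1, C1, C2) using Eq. (3.65) with tau1*omega_c = 1/sqrt(a1)."""
    k_p = i_cp / (2.0 * math.pi)   # PFD/CP gain, A/rad
    x = 1.0 / math.sqrt(a1)        # tau1 * omega_c at PM_max
    tau1 = x / omega_c
    c_t = (k_vco * k_p / n_div) / omega_c**2 \
        * math.sqrt((1.0 + x**2) / (1.0 + (a1 * x) ** 2))
    c2 = a1 * c_t
    c1 = c_t - c2
    r1 = tau1 / c1
    return r1, c1, c2

def loop_gain(omega, k_vco, i_cp, n_div, r1, c1, c2):
    """Eq. (3.62) evaluated at s = j*omega."""
    s = 1j * omega
    c_t = c1 + c2
    tau1 = r1 * c1
    a1 = c2 / c_t
    k_p = i_cp / (2.0 * math.pi)
    return (k_vco * k_p / n_div) * (1.0 + tau1 * s) / (s**2 * c_t) / (1.0 + a1 * tau1 * s)

# Assumed loop parameters
K_VCO = 2.0 * math.pi * 500e6    # rad/s/V
I_CP = 100e-6                    # A
N_DIV = 64
OMEGA_C = 2.0 * math.pi * 100e3  # rad/s
A1 = 0.1325                      # corresponds to PM_max of about 50 degrees

r1, c1, c2 = design_second_order_filter(K_VCO, I_CP, N_DIV, OMEGA_C, A1)
lg = loop_gain(OMEGA_C, K_VCO, I_CP, N_DIV, r1, c1, c2)
pm = 180.0 + math.degrees(cmath.phase(lg))
print(f"R1 = {r1:.1f} ohm, C1 = {c1 * 1e9:.3f} nF, C2 = {c2 * 1e12:.1f} pF")
print(f"|LG(j*omega_c)| = {abs(lg):.4f}, PM = {pm:.1f} deg")
```

Evaluating the loop gain of the resulting component values is a useful sanity check that the crossover lands at $\omega_c$ with the intended phase margin.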

The impedance of a general  $n^{th}$  order RC filter can be expressed as

$$Z(s) = \frac{1 + \tau_1 s}{sC_T} \cdot \frac{1}{1 + a_1(\tau_1 s) + a_2(\tau_1 s)^2 + \dots + a_n(\tau_1 s)^n}$$
(3.66)

where

$$a_1 = \frac{C_T - C_1}{C_T} + \sum_{i=3}^n \frac{\tau_i}{\tau_1} \sum_{j=1}^{i-1} \frac{C_j}{C_T} + \frac{\sum_{i=4}^{n-1} C_i \sum_{j=3}^{i-1} R_j}{\tau_1} \sum_{k=1}^{j-1} \frac{C_k}{C_T}$$
(3.67)

and  $C_T = \sum_{i=1}^n C_i$ .


The phase margin of a PLL with  $n^{th}$  order RC filter is

$$PM = \tan^{-1}(\tau_1\omega_c) - \tan^{-1}\left(\frac{a_1(\tau_1\omega_c) - a_3(\tau_1\omega_c)^3 + \dots}{1 - a_2(\tau_1\omega_c)^2 + a_4(\tau_1\omega_c)^4 + \dots}\right).$$
 (3.68)



Figure 3.37: Maximum phase margin vs. ratio  $a_1$ 

Alternatively, the RC filter impedance in Eq. (3.66) can be rewritten [15] as

$$Z(s) = \frac{1 + \tau_1 s}{sC_T} \cdot \frac{1}{\prod_{i=2}^n (1 + \tau_i s)}$$
(3.69)

where  $\tau_1 = R_1 C_1$ ,  $\tau_2 \simeq \tau_1 C_2 / C_T$ , and  $\tau_i = R_i C_i$  for  $i = 3, 4, \ldots, n$ . The approximations are valid provided that  $C_i \ll C_1$  and  $\frac{C_i}{C_{i+1}} + \frac{R_{i+1}}{R_i} \gg 1$ . This results in  $a_i \ll a_1$  and  $\tau_i \ll \tau_1$ for  $i = 2, 3, 4, \ldots, n$ , and Eq. (3.68) can be approximated as

$$PM \simeq \tan^{-1}(\tau_1 \omega_c) - \tan^{-1}(a_1(\tau_1 \omega_c)) ,$$
 (3.70)

and  $PM_{max}$  can be approximated as a function of  $a_1$  expressed as

$$PM_{max} \simeq \tan^{-1}(1/\sqrt{a_1}) - \tan^{-1}(\sqrt{a_1}).$$
 (3.71)

Therefore, a higher-order filter can be designed to yield the  $a_1$  that results in the desired maximum phase margin. For example,  $a_1$  for a third-order filter can be written using Eq. (3.67), and given that  $C_2 \ll C_1$ , as

$$a_1 = 1 - \frac{C_1}{C_T} + \frac{\tau_3}{\tau_2} \cdot \frac{C_2}{C_T} \cdot \frac{C_1}{C_T}.$$
(3.72)

Therefore, the capacitors relationships can be expressed as

$$\frac{C_2}{C_T} \simeq \frac{\frac{C_1}{C_T} + a_1 - 1}{\frac{\tau_3}{\tau_2} \cdot \frac{C_1}{C_T}}$$
(3.73)

and

$$\frac{C_3}{C_T} = 1 - \frac{C_1}{C_T} - \frac{C_2}{C_T}.$$
(3.74)
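Equations (3.73) and (3.74) can be evaluated directly once $a_1$, the pole ratio $\tau_3/\tau_2$, and a trial value of $C_1/C_T$ have been chosen. The sketch below does this and rejects choices that lead to non-positive capacitors. The function name and the trial values are assumptions, and the approximation conditions stated above still have to be checked separately.

```python
def third_order_cap_ratios(a1, pole_ratio, c1_ratio):
    """Return (C2/C_T, C3/C_T) from Eqs. (3.73) and (3.74).

    a1         : target coefficient, set by the desired PM_max
    pole_ratio : tau3 / tau2
    c1_ratio   : trial value of C1 / C_T
    """
    c2_ratio = (c1_ratio + a1 - 1.0) / (pole_ratio * c1_ratio)
    c3_ratio = 1.0 - c1_ratio - c2_ratio
    if c2_ratio <= 0.0 or c3_ratio <= 0.0:
        raise ValueError("this C1/C_T choice gives a non-positive capacitor")
    return c2_ratio, c3_ratio

# Assumed trial point: a1 for PM_max = 50 deg, tau3/tau2 = 0.6, C1/C_T = 0.9
c2_ratio, c3_ratio = third_order_cap_ratios(0.1325, 0.6, 0.9)
print(f"C2/C_T = {c2_ratio:.4f}, C3/C_T = {c3_ratio:.4f}")

# Substituting back into Eq. (3.72) should return the target a1
a1_check = 1.0 - 0.9 + 0.6 * c2_ratio * 0.9
print(f"a1 from Eq. (3.72) = {a1_check:.4f}")
```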

Similarly,  $a_1$  for a fourth order filter can be written as

$$a_{1} = 1 - \frac{C_{1}}{C_{T}} + \frac{\tau_{3}}{\tau_{2}} \frac{C_{2}}{C_{T}} \left( \frac{C_{1}}{C_{T}} + \frac{C_{2}}{C_{T}} \right) + \frac{\tau_{4}}{\tau_{3}} \frac{\tau_{3}}{\tau_{2}} \frac{C_{2}}{C_{T}} \left( \frac{C_{1}}{C_{T}} + \frac{C_{2}}{C_{T}} + \frac{C_{3}}{C_{T}} \right) + \frac{\tau_{3}}{\tau_{2}} \frac{C_{2}}{C_{T}} \frac{C_{4}}{C_{3}} \left( \frac{C_{1}}{C_{T}} + \frac{C_{2}}{C_{T}} \right)$$
(3.75)

and

$$\frac{C_4}{C_T} = 1 - \frac{C_1}{C_T} - \frac{C_2}{C_T} - \frac{C_3}{C_T}.$$
(3.76)

The capacitor values should be chosen to satisfy these equations while keeping the aforementioned approximation valid.

The choice of the pole ratios is critical in RC filters. Since RC networks can have only simple real poles, it can be proved that, for the capacitor values to be positive, the pole ratio  $\tau_{i+1}/\tau_i$  must be less than 1 [50]. In addition, extensive simulations in [50] show that for third- and fourth-order RC filters, choosing  $\tau_3/\tau_2$  and  $\tau_4/\tau_3$  between 0.5 and 0.6 yields the maximum benefit of the extra poles while maximizing  $C_n$  to avoid being loaded by the oscillator capacitance.

Once the pole ratios are chosen, the capacitance ratios can be solved numerically or graphically to choose the proper capacitance values. Fig. 3.38 shows the capacitor ratios for third-order RC filters for pole ratio  $\tau_3/\tau_2 = 0.6$ .

The condition on the loop gain at the loop bandwidth can be utilized approximately to calculate  $C_T$  as

$$C_T = \frac{K_{VCO}K_P/N}{\omega_c^2} \cdot \sqrt{\frac{1 + (\tau_1\omega_c)^2}{\left[a_1(\tau_1\omega_c) - a_3(\tau_1\omega_c)^3 + \dots\right]^2 + \left[1 - a_2(\tau_1\omega_c)^2 + a_4(\tau_1\omega_c)^4 + \dots\right]^2}}$$
(3.77)

If  $a_i \ll a_1$ , then  $C_T$  can be reduced to

$$C_T = \frac{K_{VCO}K_P/N}{\omega_c^2} \cdot \sqrt{\frac{1 + (\tau_1\omega_c)^2}{1 + [a_1(\tau_1\omega_c)]^2}} .$$
(3.78)

In order to compare the extra attenuation added by the increase of the order of the loop filter, simulations were executed for  $PM_{max} = 50^{\circ}$  using different orders of RC filters. The locking time  $T_L$  is calculated for a frequency step of  $\Delta n = 0.03N$ , and is normalized to the reference cycle  $T_r$ . The reference spurs attenuation  $A_s$  in dB is calculated for a divider ratio of unity. For a PLL with a divider ratio of N, an extra  $20 \log N$  should be added. Therefore, the simulation results can be used as a reference to any PLL design using the general RC filter shown in Fig. 3.35(c). Simulation results for locking time and spurs attenuation for different orders of RC filters are shown in Table 3.1 for  $PM_{max} = 50^{\circ}$ .

Figure 3.38: The relationship between the capacitor ratios in a 3<sup>rd</sup>-order RC loop filter

The general design procedure for  $(n+1)^{th}$  order PLL using an  $n^{th}$  order RC filter is as follows:

1. Select  $PM_{max} = 50^{\circ}$  and find the corresponding  $a_1$  and  $\tau_1 \omega_c$ .
2. Select the loop bandwidth  $\omega_c$  that achieves the desired locking time and oscillator noise filtering, and find  $\tau_1$ .
3. Select the order of the RC filter that achieves the desired spurs attenuation.
4. Given  $K_{VCO}K_P/N$ , calculate  $C_T$ .
5. Using a numerical or graphical approach, solve for the capacitor ratios, and select  $C_1, C_2, \ldots, C_n$ .
6. Calculate  $\tau_2, \tau_3, \ldots, \tau_n$  from the chosen pole ratios.
7. Calculate the resistor values:  $R_i = \tau_i/C_i$ .
8. If the noise contribution from  $R_i$  is large, return to step 5 and adjust the capacitor values.

| n | $\tau_1\omega_c$ | $\frac{\omega_{ref}}{\omega_c} = 10$ |                   | $\frac{\omega_{ref}}{\omega_c} = 20$ |                   | $\frac{\omega_{ref}}{\omega_c} = 50$ |                   | $\frac{\omega_{ref}}{\omega_c} = 100$ |                   | $\frac{\omega_{ref}}{\omega_c} = 200$ |                   |
|---|------------------|--------------------------------------|-------------------|--------------------------------------|-------------------|--------------------------------------|-------------------|---------------------------------------|-------------------|---------------------------------------|-------------------|
|   |                  | $A_s$ (dB)                           | $\frac{T_L}{T_r}$ | $A_s$ (dB)                           | $\frac{T_L}{T_r}$ | $A_s$ (dB)                           | $\frac{T_L}{T_r}$ | $A_s$ (dB)                            | $\frac{T_L}{T_r}$ | $A_s$ (dB)                            | $\frac{T_L}{T_r}$ |
| 2 | 2.75             | -31                                  | 12                | -43                                  | 23                | -59                                  | 58                | -71                                   | 116               | -83                                   | 230               |
| 3 | 2.75             | -31                                  | 12                | -46                                  | 23                | -68                                  | 58                | -85                                   | 116               | -103                                  | 230               |
| 4 | 2.75             | -31                                  | 12                | -59                                  | 23                | -92                                  | 58                | -116                                  | 116               | -140                                  | 230               |

Table 3.1: Reference spurs attenuation and locking time for different RC loop filters
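As a usage sketch for Table 3.1, the helper below scales a tabulated entry to a specific design: it adds $20 \log N$ to $A_s$ and converts $T_L/T_r$ to seconds using the reference period. The dictionary transcribes the table; the function name and the example reference frequency and divider ratio are assumptions made for illustration.

```python
import math

# (A_s in dB for a divider ratio of unity, T_L / T_r),
# keyed by RC filter order n and by omega_ref / omega_c.
TABLE_3_1 = {
    2: {10: (-31, 12), 20: (-43, 23), 50: (-59, 58), 100: (-71, 116), 200: (-83, 230)},
    3: {10: (-31, 12), 20: (-46, 23), 50: (-68, 58), 100: (-85, 116), 200: (-103, 230)},
    4: {10: (-31, 12), 20: (-59, 23), 50: (-92, 58), 100: (-116, 116), 200: (-140, 230)},
}

def scale_table_entry(order, ratio, n_div, f_ref):
    """Return (A_s in dB for divider ratio n_div, locking time in seconds)."""
    a_s_unity, tl_over_tr = TABLE_3_1[order][ratio]
    a_s = a_s_unity + 20 * math.log10(n_div)   # extra 20 log N for a divide-by-N loop
    t_lock = tl_over_tr / f_ref                # T_r = 1 / f_ref
    return a_s, t_lock

# Hypothetical example: 3rd-order filter, omega_ref/omega_c = 50, N = 100, 10 MHz reference.
a_s, t_lock = scale_table_entry(order=3, ratio=50, n_div=100, f_ref=10e6)
print(f"A_s = {a_s:.0f} dB, T_L = {t_lock * 1e6:.1f} us")
```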

### **LC-Ladder Filters**

Since the reference frequency is often low (in the range of kHz or a few MHz), the resulting filter components are usually large. Therefore, it is not uncommon to use off-chip loop filters. Off-chip filters provide extra flexibility to account for process variations in the fabricated chips. If an off-chip filter is implemented, area is no longer a major concern, and LC ladder filters can be used.

LC ladder filters (such as Butterworth and Chebychev) provide faster roll-off at the cut-off frequency since the poles can be positioned in the complex plane [51]. In addition, LC components are ideally noiseless, which makes them appealing for building very low-noise PLLs. However, there are many practical challenges that can outweigh the advantages of LC filters.

In order to use LC ladders in higher order type II PLLs, a first order RC filter (needed to form a pole at dc) is cascaded with the LC ladder. The LC ladder will form the extra poles that result in the extra roll-off at the frequency  $1/\tau_2$ , and will achieve the desired attenuation at the reference frequency. This is conceptually illustrated in Fig. 3.39.



(a) LG(s) of a second order PLL (b) Transfer Function of LC ladder (c) LG(s) of the resulting PLL

Figure 3.39: Structure and response of an  $n^{th}$ -order LC-ladder loop filter

The filter impedance of the  $n^{th}$ -order loop filter in Fig. 3.39 is given by

$$Z(s) = \frac{1 + \tau_1 s}{s} \cdot \frac{1}{a_0 + a_1 \left(\frac{\tau_2}{\tau_1} \tau_1 s\right) + \ldots + a_{n-1} \left(\frac{\tau_2}{\tau_1} \tau_1 s\right)^{n-1}} , \tag{3.79}$$

and the phase margin is

$$PM = \tan^{-1}(\tau_1\omega_c) - \tan^{-1}\left(\frac{a_1\left(\frac{\tau_2}{\tau_1}\right)(\tau_1\omega_c) - a_3\left(\frac{\tau_2}{\tau_1}\right)^3(\tau_1\omega_c)^3 + \dots}{a_0 - a_2\left(\frac{\tau_2}{\tau_1}\right)^2(\tau_1\omega_c)^2 + a_4\left(\frac{\tau_2}{\tau_1}\right)^4(\tau_1\omega_c)^4 - \dots}\right)$$
(3.80)

where  $a_0, a_1, \ldots, a_n$  are the normalized Butterworth/Chebychev coefficients, and the pole ratio ensures converting the cut-off frequency to the desired value.

The condition of unity loop gain at the loop bandwidth results in

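$$\frac{K_{VCO}K_P/N}{\omega_c^2} \cdot \sqrt{\frac{1 + (\tau_1\omega_c)^2}{\left[a_1\left(\frac{\tau_2}{\tau_1}\right)(\tau_1\omega_c) - a_3\left(\frac{\tau_2}{\tau_1}\right)^3(\tau_1\omega_c)^3 + \dots\right]^2 + \left[a_0 - a_2\left(\frac{\tau_2}{\tau_1}\right)^2(\tau_1\omega_c)^2 + a_4\left(\frac{\tau_2}{\tau_1}\right)^4(\tau_1\omega_c)^4 - \dots\right]^2}} = 1.$$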
(3.81)

To find  $\tau_1 \omega_c$  that results in maximum phase margin  $PM_{max}$ , Eq. (3.80) must be differentiated with respect to  $\tau_1 \omega_c$  for different  $\tau_2/\tau_1$  ratios. For a desired  $PM_{max}$ , certain  $\tau_1 \omega_c$  and  $\tau_2/\tau_1$  should be selected.

In order to illustrate the procedure, a second order Butterworth section is considered. When cascaded with a preceding capacitor-resistor stage, this will result in a third order loop filter. The coefficients of a normalized second order Butterworth are:  $a_0 = a_2 = 1$ and  $a_1 = \sqrt{2}$ . Therefore, Eq. (3.80) becomes

$$PM = \tan^{-1}(\tau_1 \omega_c) - \tan^{-1} \left( \frac{\sqrt{2} \left( \frac{\tau_2}{\tau_1} \right) (\tau_1 \omega_c)}{1 - \left( \frac{\tau_2}{\tau_1} \right)^2 (\tau_1 \omega_c)^2} \right) .$$
(3.82)

This equation is differentiated with respect to  $\tau_1 \omega_c$  for different values of  $\tau_2/\tau_1$ . The roots of the derivative equation will always result in one  $\tau_1 \omega_c$  that is real and positive, and that will yield maximum phase margin. To satisfy the condition of unity gain at loop bandwidth, the value of  $(K_{VCO}K_P/N)/\omega_c^2$  should meet Eq. (3.81) where  $\tau_1 \omega_c$  and  $\tau_2/\tau_1$  are the values chosen to yield the desired  $PM_{max}$ . This imposes some


Figure 3.40: Effect of lowering the zero-frequency on loop-gain attenuation

constraints on choosing the values of  $K_{VCO}K_P/N$  and the loop bandwidth  $\omega_c$  in the LC ladder filter design. The absence of the term  $C_T$  that appears in the RC network filters reduces the flexibility of the design. Solving the second-order LC ladder equation for  $PM_{max} = 50^{\circ}$  results in  $\tau_2/\tau_1 = 0.09$  and  $(K_{VCO}K_P/N)/\omega_c^2 = 0.37$ . In a typical PLL design,  $(K_{VCO}K_P/N)/\omega_c^2$  ranges between  $10^{-14}$  and  $10^{-6}$ . Therefore, achieving a large value of  $(K_{VCO}K_P/N)/\omega_c^2$  requires dramatic reduction in the loop bandwidth  $\omega_c$  which offsets the benefit of the low noise of LC filters.
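The figures quoted above can be checked with a short numerical sketch. It evaluates Eq. (3.82) over a grid of $\tau_1\omega_c$ for $\tau_2/\tau_1 = 0.09$ instead of differentiating analytically, and then evaluates the unity-gain condition with the Butterworth polynomial taken at $\tau_2\omega_c = (\tau_2/\tau_1)(\tau_1\omega_c)$, as in Eq. (3.80). The function names and the grid resolution are assumptions made for illustration.

```python
import math

def phase_margin_deg(x, r):
    """Eq. (3.82): x = tau_1*omega_c, r = tau_2/tau_1 (2nd-order Butterworth section)."""
    lag = math.atan2(math.sqrt(2.0) * r * x, 1.0 - (r * x) ** 2)
    return math.degrees(math.atan(x) - lag)

def unity_gain_constant(x, r):
    """Value of (K_VCO*K_P/N)/omega_c^2 that gives unity loop gain at omega_c."""
    rx = r * x
    den = (math.sqrt(2.0) * rx) ** 2 + (1.0 - rx**2) ** 2
    return math.sqrt(den / (1.0 + x**2))

r = 0.09
# Coarse grid search over tau_1*omega_c from 0.01 to 20.
grid = [0.01 * k for k in range(1, 2001)]
x_best = max(grid, key=lambda x: phase_margin_deg(x, r))
pm_max = phase_margin_deg(x_best, r)
k_norm = unity_gain_constant(x_best, r)
print(f"tau1*wc = {x_best:.2f}, PM_max = {pm_max:.1f} deg, K/wc^2 = {k_norm:.2f}")
```

The search lands close to the values quoted in the text: a maximum phase margin of roughly $50^{\circ}$ and a required $(K_{VCO}K_P/N)/\omega_c^2$ of roughly 0.37.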

The reason for the added complexity in the design of LC filters is that the extra roll-off at  $1/\tau_2$  caused by the poles in the complex plane deteriorates the phase margin and makes it difficult to stabilize the loop. To ensure stability, we end up pushing the loop bandwidth  $\omega_c$  to a very low frequency to guarantee that the added phase shift at  $1/\tau_2$  will not cause the loop to become unstable. If designed to yield maximum phase

| n | $\tau_1\omega_c$           | $\frac{\tau_2}{\tau_1}$          | $\frac{\omega_{ref}}{\omega_c} = 10$ |                   | $\frac{\omega_{ref}}{\omega_c} = 20$ |                   | $\frac{\omega_{ref}}{\omega_c} = 50$ |                   | $\frac{\omega_{ref}}{\omega_c} = 100$ |                   | $\frac{\omega_{ref}}{\omega_c} = 200$ |                   |
|---|----------------------------|----------------------------------|--------------------------------------|-------------------|--------------------------------------|-------------------|--------------------------------------|-------------------|---------------------------------------|-------------------|---------------------------------------|-------------------|
|   |                            |                                  | $A_s$ (dB)                           | $\frac{T_L}{T_r}$ | $A_s$ (dB)                           | $\frac{T_L}{T_r}$ | $A_s$ (dB)                           | $\frac{T_L}{T_r}$ | $A_s$ (dB)                            | $\frac{T_L}{T_r}$ | $A_s$ (dB)                            | $\frac{T_L}{T_r}$ |
| 3 | $10^{-\log(K/\omega_c^2)}$ | $10^{\log(K/\omega_c^2) - 0.34}$ | -46                                  | 9                 | -64                                  | 18                | -88                                  | 46                | -106                                  | 91                | -124                                  | 182               |

Table 3.2: Spurs attenuation and locking time for 3<sup>rd</sup> order Butterworth filter

margin at loop bandwidth,  $\omega_c$  will dramatically be reduced to cope with the added phase.

Alternatively, in order to achieve a larger loop bandwidth, the zero can be pushed to a very low frequency (instead of  $1/\tau_1$ ), and the filter is designed to yield the desired phase margin at  $\omega_c$  (but not  $PM_{max}$ ). Mathematically, this is done by plugging in the value of  $(K_{VCO}K_P/N)/\omega_c^2$  for the design and solving Eqs. (3.80) and (3.81) for  $\tau_1\omega_c$ and  $\tau_2/\tau_1$ . Pushing the zero towards low frequency results in less attenuation of the oscillator phase noise at low frequency, that is, in the  $1/f^3$  region. Fig. 3.40 illustrates the effect of lowering the zero frequency on the loop gain response and on the oscillator noise filtering. One can find general formulae for  $\tau_1\omega_c$  and  $\tau_2/\tau_1$  for a third order LC filter using an arbitrary  $(K_{VCO}K_P/N)/\omega_c^2$ . Table 3.2 shows the case of a third order Butterworth where  $K = (K_{VCO}K_P/N)/\omega_c^2$ . It is worth noting that the inductors and capacitors are assumed to be ideal in the simulations. Accounting for the finite quality factor of these elements may require slight modification, as noise can be introduced by the resistive parasitics. In general, LC loop filters remain impractical, mainly due to the loop stability concerns they introduce.
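The general formulae in Table 3.2 can be sanity-checked numerically. The sketch below reads the table entries as $\tau_1\omega_c = 1/K$ and $\tau_2/\tau_1 = 10^{-0.34}K$ with $K = (K_{VCO}K_P/N)/\omega_c^2$, which is an interpretation of the table rather than something stated explicitly, and evaluates the phase margin of Eq. (3.82) and the loop-gain magnitude at $\omega_c$ for a few small values of $K$. The function name and the chosen values of $K$ are assumptions.

```python
import math

def butterworth2_loop(k_norm):
    """Phase margin (deg) and loop-gain magnitude at omega_c for the Table 3.2 entries.

    k_norm is (K_VCO*K_P/N)/omega_c^2. Reading the table as
    tau_1*omega_c = 1/k_norm and tau_2/tau_1 = 10**(-0.34) * k_norm is an assumption.
    """
    x = 1.0 / k_norm
    r = 10 ** (-0.34) * k_norm
    rx = r * x
    lag = math.atan2(math.sqrt(2.0) * rx, 1.0 - rx**2)
    pm = math.degrees(math.atan(x) - lag)
    den = (math.sqrt(2.0) * rx) ** 2 + (1.0 - rx**2) ** 2
    gain = k_norm * math.sqrt((1.0 + x**2) / den)
    return pm, gain

for k_norm in (1e-6, 1e-4, 1e-2):
    pm, gain = butterworth2_loop(k_norm)
    print(f"K = {k_norm:.0e}: PM = {pm:.1f} deg, |LG(j wc)| = {gain:.3f}")
```

Under this reading, the phase margin stays near $50^{\circ}$ and the loop gain at $\omega_c$ stays within a few percent of unity across the range of $K$, which is consistent with the table giving a fixed-phase-margin design for arbitrary $K$.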

#### 3.5.2 Active Filter Design

Although active devices contribute additional noise to the in-band noise of the PLL, their use is sometimes inevitable. A typical example is when the oscillator varactor has a larger voltage range than the supply rails of the charge pump, or when the output voltage of the charge pump is reduced due to the series resistor in the RC network.

Active filters can be derived from their corresponding passive filters. The resulting active filter has properties similar to those of the passive filter from which it is derived. Fig. 3.41 shows gain incorporation in RC networks, where the gain is added to the first stage. Another advantage of using active loop filters is that the charge pump can be omitted by connecting the differential active device directly to the PFD [16]. A second-order differential active loop filter is shown in Fig. 3.42. This topology can be adopted if the charge pump non-linearity and noise are the main obstacles to meeting the in-band specifications [16]. The gain of the active filter provides an extra degree of freedom and the absence of inductors allows implementing higher order filters on-chip. Incorporating active components allows locating the poles and zeros anywhere in the complex plane. The additional noise depends on the active devices used in the design. In addition, op-amp non-idealities can deteriorate the performance of the overall filter. Therefore, more care should be given to the design of the active devices.

As opposed to the classical design methodology in which the open-loop gain and the pole/zero locations are set to meet a certain phase margin to ensure stability, the PLL closed-loop transfer function can be chosen directly from the outset by the designer. Therefore, the PLL can be designed to yield a certain transfer function and response, and the loop filter impedance can then be derived and implemented. This can be done by solving Eq. (3.58) for Z(s), i.e.

$$Z(s) = \frac{s}{K_{VCO}K_P/N} \cdot \frac{H_{fb}(s)}{1 - H_{fb}(s)} .$$
(3.83)

where the feedback transfer function  $H_{fb}(s) = H_{in}(s)/N$ .

Assume the desired PLL transfer function is

$$H_{fb}(s) = \frac{a_0}{b_n s^n + b_{n-1} s^{n-1} + \dots + b_1 s + b_0} , \qquad (3.84)$$

then substituting Eq.(3.84) into Eq.(3.83), the loop filter impedance is given by

$$Z(s) = \frac{N s}{K_{VCO}K_P} \cdot \frac{a_0}{b_n s^n + b_{n-1}s^{n-1} + \dots + b_1s + (b_0 - a_0)} .$$
(3.85)

The resulting filter impedance has a zero at dc rather than a pole. However, to obtain a type II PLL, the loop filter must have a pole at dc so that the total number of integrators in the loop equals two.

Two methods of modifying the PLL transfer function  $H_{fb}$  so that the resulting filter impedance has a pole at dc are discussed here.



Figure 3.41: An  $n^{th}$  order active loop filter



Figure 3.42: Second-order differential active loop filter

#### Lead-Lag Filter

Rather than implementing the transfer function in Eq.(3.84), the transfer function is modified by multiplying it by a lead-lag filter as proposed in [52]. Thus, the new transfer function is

$$H_{fb}(s) = \frac{a_0}{b_n s^n + b_{n-1} s^{n-1} + \dots + b_1 s + b_0} \cdot \frac{1 + s/\omega_z}{1 + s/\omega_p} .$$
(3.86)

where practical values of the zero-to-pole ratio  $\omega_z/\omega_p$  lie between 1/10 and 1/3 [52].

By expanding Eq. (3.86) and substituting it into Eq. (3.83), we obtain

$$Z(s) = \frac{N s}{K_{VCO}K_P} \times \frac{a_0 \left(1 + \frac{s}{\omega_z}\right)}{\left(\frac{b_n}{\omega_p}\right)s^{n+1} + \left(\frac{b_{n-1}}{\omega_p} + b_n\right)s^n + \left(\frac{b_{n-2}}{\omega_p} + b_{n-1}\right)s^{n-1} + \dots + \left(\frac{b_1}{\omega_p} + b_2\right)s^2 + \left(\frac{b_0}{\omega_p} + b_1 - \frac{a_0}{\omega_z}\right)s + (b_0 - a_0)}.$$
(3.87)

If we set  $a_0 = b_0$  and  $\frac{b_0}{\omega_p} + b_1 - \frac{a_0}{\omega_z} = 0$ , then

$$\omega_z = \frac{b_0 \omega_p}{b_0 + b_1 \omega_p} , \qquad (3.88)$$

and the filter impedance becomes

$$Z(s) = \frac{N}{K_{VCO}K_P} \cdot \frac{1}{s} \times \frac{a_0 \left(1 + \frac{s}{\omega_z}\right)}{\left(\frac{b_n}{\omega_p}\right) s^{n-1} + \left(\frac{b_{n-1}}{\omega_p} + b_n\right) s^{n-2} + \dots + \left(\frac{b_2}{\omega_p} + b_3\right) s + \left(\frac{b_1}{\omega_p} + b_2\right)} \quad (3.89)$$

As is evident from Eq. (3.89), the filter impedance now has a pole at dc, which allows a type II PLL to be implemented.
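The lead-lag construction can be verified numerically. The sketch below builds $H_{fb}(s)$ from Eq. (3.86), obtains $\omega_z$ directly by requiring the coefficient of $s$ in the expanded denominator of $Z(s)$ to vanish, and compares the impedance from Eq. (3.83) with the reduced form of Eq. (3.89) at an arbitrary test point. The Butterworth coefficients, the pole frequency $\omega_p$, the normalization $K_{VCO}K_P/N = 1$, and all function names are assumptions chosen for illustration.

```python
def poly(coeffs, s):
    """Evaluate sum(coeffs[k] * s**k); coefficients are in ascending powers of s."""
    return sum(c * s**k for k, c in enumerate(coeffs))

b = [1.0, 2.0, 2.0, 1.0]   # b_0..b_3: normalized 3rd-order Butterworth (example choice)
a0 = b[0]                  # first condition: a_0 = b_0
omega_p = 3.0              # hypothetical lead-lag pole (normalized rad/s)
k_loop = 1.0               # K_VCO*K_P/N, normalized to 1 for this check

# Second condition: the s coefficient of D(s)(1 + s/omega_p) - a_0(1 + s/omega_z)
# must vanish, i.e. b_1 + b_0/omega_p - a_0/omega_z = 0.
omega_z = a0 / (b[1] + b[0] / omega_p)

def z_from_closed_loop(s):
    """Eq. (3.83) applied to the lead-lag-modified H_fb(s) of Eq. (3.86)."""
    h = a0 / poly(b, s) * (1 + s / omega_z) / (1 + s / omega_p)
    return s / k_loop * h / (1 - h)

def z_reduced(s):
    """Eq. (3.89): the same impedance with the pole at dc made explicit."""
    n = len(b) - 1
    # Ascending coefficients: (b_k/omega_p + b_{k+1}) for k = 1..n-1, then b_n/omega_p.
    den = [b[k] / omega_p + b[k + 1] for k in range(1, n)] + [b[n] / omega_p]
    return (1 / k_loop) * (1 / s) * a0 * (1 + s / omega_z) / poly(den, s)

s_test = 0.3 + 0.7j
mismatch = abs(z_from_closed_loop(s_test) - z_reduced(s_test))
print(f"omega_z/omega_p = {omega_z / omega_p:.3f}, mismatch = {mismatch:.2e}")
```

For this example the resulting $\omega_z/\omega_p$ falls inside the practical range of 1/10 to 1/3 mentioned above, and the two expressions for $Z(s)$ agree to numerical precision.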

#### Zero Positioning

Another method to introduce a pole at dc in the filter impedance is to position the zeros of the PLL transfer function  $H_{in}(s)$  appropriately, as suggested in [53]. By introducing a zero in Eq. (3.84), we can write

$$H_{fb}(s) = \frac{a_1 s + a_0}{b_n s^n + b_{n-1} s^{n-1} + \dots + b_1 s + b_0} , \qquad (3.90)$$

By setting  $a_0 = b_0$  and  $a_1 = b_1$  in Eq. (3.90), we can write

$$H_{fb}(s) = \frac{b_1 s + b_0}{b_n s^n + b_{n-1} s^{n-1} + \dots + b_1 s + b_0},$$
(3.91)

and the filter impedance becomes

$$Z(s) = \frac{N}{K_{VCO}K_P} \cdot \frac{1}{s} \cdot \frac{b_1 s + b_0}{b_n s^{n-2} + b_{n-1} s^{n-3} + \dots + b_3 s + b_2} .$$
(3.92)

Similar to the expression in Eq. (3.89), the loop filter resulting from this method has a pole at dc and allows type II PLL implementation. The main advantage of this method is that it results in a loop filter that is one order lower than that obtained by introducing a lead-lag filter.
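The zero-positioning result can be checked in the same numerical way: the sketch below compares the impedance obtained by inserting Eq. (3.91) into Eq. (3.83) with the closed form of Eq. (3.92). The Butterworth coefficients, the normalization $K_{VCO}K_P/N = 1$, and the function names are assumptions chosen for illustration.

```python
def poly(coeffs, s):
    """Evaluate sum(coeffs[k] * s**k); coefficients are in ascending powers of s."""
    return sum(c * s**k for k, c in enumerate(coeffs))

b = [1.0, 2.0, 2.0, 1.0]   # b_0..b_3: normalized 3rd-order Butterworth (example choice)
k_loop = 1.0               # K_VCO*K_P/N, normalized to 1 for this check

def z_from_closed_loop(s):
    """Eq. (3.83) with H_fb(s) from Eq. (3.91), i.e. a_0 = b_0 and a_1 = b_1."""
    h = (b[1] * s + b[0]) / poly(b, s)
    return s / k_loop * h / (1 - h)

def z_zero_positioning(s):
    """Eq. (3.92): pole at dc, zero at -b_0/b_1, remaining poles from b_2..b_n."""
    return (1 / k_loop) * (1 / s) * (b[1] * s + b[0]) / poly(b[2:], s)

s_test = 0.3 + 0.7j
mismatch = abs(z_from_closed_loop(s_test) - z_zero_positioning(s_test))

# Filter order = dc pole + remaining denominator order.
order_zero_positioning = 1 + (len(b[2:]) - 1)
print(f"mismatch = {mismatch:.2e}, filter order = {order_zero_positioning}")
```

For an $n^{th}$-order $H_{fb}(s)$ the filter order is $n-1$, one lower than the order $n$ obtained with the lead-lag method.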

## 3.6 Summary

In this chapter, we discussed the design of the individual building blocks of the PLL, namely the VCO, PFD/CP, frequency dividers, and loop filter. For each of the VCO, PFD/CP, and frequency dividers, we discussed the principle of operation, non-idealities, performance metrics, and behavioral modeling using Verilog-A. Examples of transistor-level implementation of each of these blocks were presented and discussed. Finally, we discussed the various choices available for the design of the loop filter, which directly impact the performance of the PLL. An analytical comparison between different loop filter topologies and orders was presented with detailed qualitative and quantitative analysis.

## Chapter 4

# Top-Down Design Including Loop Variations in Wide-Range PLLs

In this chapter, we demonstrate a complete methodology to model, design, and implement wide tuning-range frequency synthesizer PLLs using a top-down approach. Mathematical equations that illustrate the contribution of the different sources of noise in the PLL are presented. Behavioral models that encompass the non-idealities of the PLL components are described using the Verilog-A language. The PLL components are designed and the noise performance of each component is evaluated using transistor-level simulations. The jitter extracted from the individual blocks is used to find the overall system noise. The proposed methodology takes into account the variations in the loop dynamics due to changes in the VCO gain and noise, frequency divider ratio, and charge pump current. While optimizing the PLL for maximum tuning range, the methodology also considers the trade-off between noise, speed, and reference spurs attenuation. The design and implementation of an integer-N frequency synthesizer PLL that covers a continuous frequency range from 156.25 MHz to 10 GHz using a 65 nm CMOS technology is demonstrated in this chapter. Measurement results that verify the accuracy of the models and validate the predictions made by simulations are provided.

## 4.1 Design Methodology

To improve the yield of a silicon implementation in circuit design, designers need to be able to predict and verify the performance of the circuit before the chip tape-out in a fast and effective way. In general, design verification is an extremely important step in circuit design. The availability of an accurate and fast verification method is critical in speeding up the design process and reducing the total turn-around time. In PLL design, this is even more complex. Due to the mixed-signal nature of the PLL, where the signal type (i.e. analog or digital) and the operating frequency vary from one block to another, transistor-level closed-loop simulations become very burdensome. Predicting the transient behavior or the phase noise of a PLL through brute-force closed-loop simulations at transistor-level is impractical, mainly due to the long settling time of the PLL, especially for large frequency divider ratios, which also requires a large amount of memory [54]. The simulations may also suffer from convergence problems due to the different operating frequencies involved in the circuit. The design of wide-range PLLs poses a greater challenge due to the vast changes in the loop parameters. In a wide-range PLL, large variations in the VCO gain and jitter, charge pump current, and frequency divider ratio are expected. These variations directly affect the PLL transient behavior and noise performance.

Therefore, we need a methodology to precisely predict the performance of the PLL without brute-force closed-loop simulations at transistor-level. The methodology should also incorporate the variations in the performance of each block and their effect on the entire PLL to accommodate wide frequency-range operation. Open-loop simulations of the individual blocks at the transistor-level are feasible and can be completed in a short time with affordable memory. By incorporating the data collected from the transistor-level open-loop simulations into behavioral models of the PLL building blocks, one can predict the closed-loop performance of the PLL at system-level.

The approach pursued in the design of the wide-range frequency synthesizer PLL presented in this chapter follows the flow chart shown in Fig. 4.1. First, the specifications of the frequency synthesizer PLL are decided upon by the designer. Second, an initial set of behavioral models of the PLL components is built. The initial behavioral models may not encompass all the variations in the loop due to wide range operation such as changes in the VCO gain and jitter, non-constant charge pump current, etc. However, the specifications of each component can be set at this step, and the closed-loop performance is evaluated at system-level where only behavioral models of the PLL components are used. Next, if the specifications are not met, the designer needs to go back to the initial models and place more stringent specifications on the components. Once the specifications are met, we can proceed with the circuit design of the individual blocks of the PLL using the specifications set in the initial behavioral models as a design budget. Transistor-level simulations of each component



Figure 4.1: Flow chart of design methodology of wide-range PLL



Figure 4.2: Schematic view of the PLL frequency synthesizer

of the PLL are performed at open-loop, and the performance data is extracted and incorporated in the behavioral models of the PLL components. The augmented models must incorporate the effect of wide range operation on the performance of the individual building blocks. Next, the closed-loop performance of the wide range PLL is evaluated at the system-level using the new behavioral models. If some specifications are not met, the designer can introduce modifications on one or more components, e.g. loop filter, at the system-level and then apply the same modification at the transistor-level of the component design. Once all the specifications are met, one can finally proceed to the silicon prototype implementation.

## 4.2 PLL Building Blocks for Wide-Range Operation

The overall architecture of the proposed integer-N PLL to cover the frequency range of 156.25 MHz to 10 GHz is shown in Fig. 4.2. Two VCOs with a buffer selector form the VCO bank, which is used to generate an octave of frequency (5-10 GHz). This octave is further divided by two using a high-frequency prescaler to generate the frequency range (2.5-5 GHz). A chain of dividers is used with a multiplexer to generate the lower frequencies (156.25 MHz - 2.5 GHz). The VCO bank generates the output frequency range over 16 different bands using the binary control digits  $B_0$  to  $B_3$ . In order to be able to predict the performance of the PLL, each component needs to be designed at the transistor-level where parameters such as jitter and operating range are extracted. Behavioral models that describe the individual components and take into account the parameters extracted from the transistor-level simulations are developed. The components are then assembled



Figure 4.3: (a) Complementary CMOS LC-VCO and (b) equivalent small-signal model

and the overall performance of the PLL is evaluated.
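The frequency plan described above can be illustrated with a short sketch. It assumes that the prescaler and the divider chain together provide power-of-two division ratios from 1 to 32, which is consistent with the quoted limits (5 GHz / 32 = 156.25 MHz) but is not stated explicitly in the text; the names used are hypothetical.

```python
VCO_MIN_HZ, VCO_MAX_HZ = 5e9, 10e9   # octave generated by the VCO bank

def output_bands(max_div_exp):
    """Output range for each divide-by-2**k setting, k = 0..max_div_exp."""
    return [(VCO_MIN_HZ / 2**k, VCO_MAX_HZ / 2**k) for k in range(max_div_exp + 1)]

bands = output_bands(5)   # assumed division ratios 1, 2, 4, 8, 16, 32
for k, (lo, hi) in enumerate(bands):
    print(f"divide by {2**k:2d}: {lo / 1e6:9.2f} MHz to {hi / 1e6:9.2f} MHz")

# Adjacent bands meet exactly because the VCO bank spans a full octave,
# so the overall coverage is continuous.
contiguous = all(bands[k][0] == bands[k + 1][1] for k in range(len(bands) - 1))
print("continuous coverage:", contiguous)
```

Because each divided band starts where the next lower one ends, a single octave from the VCO bank is sufficient for continuous coverage of the whole range.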

#### 4.2.1 Wide Tuning-Range VCO

The VCO of a frequency synthesizer PLL determines the maximum operating frequency and the frequency range of operation. Ensuring wide frequency-range operation of the VCO can minimize the number of VCOs required in the design of the PLL. LC tank VCOs can operate at high oscillation frequencies while providing superior noise performance compared to other topologies. Optimizing LC-VCOs for wide tuning range therefore provides the desired combination of low noise and wide-range performance. A complementary CMOS LC-VCO circuit implementation is shown in Fig. 4.3 together with its equivalent small-signal model.

#### • VCO Tuning-Range Optimization

The tuning range metric (TR) is defined as:

$$TR = 2 \times \frac{f_{max} - f_{min}}{f_{max} + f_{min}} \%$$
(4.1)

where  $f_{max}$  and  $f_{min}$  are the maximum and minimum oscillation frequency, respectively.

By lumping the load capacitance  $C_L$  and the parasitic capacitance  $C_{par}$  in Fig. 4.3 into one capacitance  $C_{fix}$ , which represents the fixed capacitance, we can rewrite Eq. (3.6) as

$$2C_{tank} = C_v + C_{fix} . \tag{4.2}$$

Thus, the tuning range can be written as:

$$TR = 2 \times \frac{\sqrt{\frac{C_{v,max}}{C_{v,min}} + \frac{C_{fix}}{C_{v,min}}} - \sqrt{1 + \frac{C_{fix}}{C_{v,min}}}}{\sqrt{\frac{C_{v,max}}{C_{v,min}} + \frac{C_{fix}}{C_{v,min}}} + \sqrt{1 + \frac{C_{fix}}{C_{v,min}}}} \% \tag{4.3}$$

Therefore, the tuning range can be maximized by increasing the varactor ratio  $\frac{C_{v,max}}{C_{v,min}}$ and minimizing the fixed capacitance  $C_{fix}$ .
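The dependence of the tuning range on the two capacitance ratios can be illustrated as follows. Since the oscillation frequency is proportional to $1/\sqrt{C_v + C_{fix}}$ for a fixed inductor, Eq. (4.1) can be rewritten in terms of $C_{v,max}/C_{v,min}$ and $C_{fix}/C_{v,min}$ only. The function names and the example ratios are assumptions chosen for illustration.

```python
import math

def tuning_range_from_freq(f_max, f_min):
    """Eq. (4.1), returned in percent."""
    return 200.0 * (f_max - f_min) / (f_max + f_min)

def tuning_range_from_caps(cv_ratio, cfix_over_cvmin):
    """Tuning range in percent from cv_ratio = C_v,max/C_v,min and C_fix/C_v,min."""
    upper = math.sqrt(cv_ratio + cfix_over_cvmin)   # proportional to 1/f_min
    lower = math.sqrt(1.0 + cfix_over_cvmin)        # proportional to 1/f_max
    return 200.0 * (upper - lower) / (upper + lower)

# Hypothetical varactor ratio of 3 with increasing fixed capacitance.
for cfix in (0.0, 0.5, 1.0, 2.0):
    print(f"C_fix/C_v,min = {cfix:.1f}: TR = {tuning_range_from_caps(3.0, cfix):.1f} %")

# The 5-10 GHz octave of the VCO bank expressed with the same metric.
octave_tr = tuning_range_from_freq(10e9, 5e9)
print(f"5-10 GHz octave: TR = {octave_tr:.1f} %")
```

The loop shows the tuning range shrinking monotonically as the fixed capacitance grows, which is the motivation for minimizing $C_{fix}$.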

The LC-VCO phase noise in the  $1/f^2$  region at an offset frequency  $f_m$  can be expressed using Hajimiri's model as

$$\mathcal{L}\{f_m\} = \frac{1}{8\pi^2 f_m^2 C_{tank}^2 V_{max}^2} \sum_n \left(\frac{i_n^2}{f_m} \Gamma_{rms,n}^2\right) \,. \tag{4.4}$$

Minimizing the tank losses, by maximizing the quality factor Q of the inductor and varactor, reduces the active conductance  $g_{act}$  required to compensate for the tank losses. This reduces the noise terms  $i_n^2/f_m$  and hence improves the phase noise. In addition, less power is consumed since the required transistor transconductance is reduced. This results in smaller transistor widths and hence smaller fixed capacitance  $C_{fix}$ , which consequently increases the tuning range.

#### Passive components

Designing high-Q inductors and varactors is of paramount importance as it results in lower phase noise, less power consumption, and wider tuning range. The quality factor  $Q_L$  of an inductor L [55] can be expressed as

$$Q_L = \frac{L/C_P (1/\omega C_P - \omega L) - (R_s + R_e)^2 / \omega C_P}{[(R_s + R_e)^2 / \omega C_P]^2}$$
(4.5)

where the term  $C_P$  represents the parasitic capacitance associated with the dimensions of the inductor,  $R_s$  and  $R_e$  are the parasitic resistances associated with metal loss and eddy loss, respectively.

If the inductor losses are lumped into one equivalent parallel conductance  $g_L$ , then we can write:

$$g_L \cong \frac{1}{Q_L \omega L} \tag{4.6}$$

In general, the quality factor decreases as the value of the inductor increases. This was verified by optimizing different inductor values for maximum quality factor. Fig. 4.4 shows the maximum quality factor obtained versus the inductor value for different frequencies.

Nonetheless, Eq. (4.5) indicates that the quality factor of the inductor is a function of the inductor value as well as some other parasitic dimension-independent parameters. Eq. (4.6) indicates that the term  $LQ_L$  determines the value of  $g_L$  of an inductor at a given frequency. Both  $LQ_L$  and  $g_L$  are plotted versus the inductor value for different frequencies, as shown in Fig. 4.5. Therefore, although the quality factor degrades when larger inductor values are realized, the term  $LQ_L$  increases causing the equivalent parallel conductance to decrease which results in less inductor losses for larger inductor values.
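A small numerical sketch of Eq. (4.6) makes this point concrete. The two inductor design points below are hypothetical and are not read from Fig. 4.4 or Fig. 4.5; they only illustrate that a larger inductor with a lower quality factor can still present a smaller parallel conductance when its $LQ_L$ product is larger.

```python
import math

def parallel_conductance(q_l, inductance, freq_hz):
    """Eq. (4.6): g_L ~ 1 / (Q_L * omega * L), in siemens."""
    return 1.0 / (q_l * 2.0 * math.pi * freq_hz * inductance)

f0 = 10e9  # evaluation frequency in Hz (hypothetical)

# Hypothetical design points: (Q_L, L in henries).
small_l = parallel_conductance(q_l=20.0, inductance=0.2e-9, freq_hz=f0)
large_l = parallel_conductance(q_l=15.0, inductance=0.5e-9, freq_hz=f0)

print(f"L = 0.2 nH, Q = 20: g_L = {small_l * 1e3:.2f} mS")
print(f"L = 0.5 nH, Q = 15: g_L = {large_l * 1e3:.2f} mS")
```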

On the other hand, there are various ways to implement a varactor, such as pn junction, inversion-mode MOS, and accumulation-mode MOS. The accumulation-mode varactor exhibits the best phase noise performance among the aforementioned types [56]. The accumulation-mode varactor can be modeled [57] as shown in Fig. 4.6.

 $R_p$  is the polysilicon gate resistance, while  $R_{ch}$  is the channel resistance.  $L_p$  is the gate inductance. The varactor is usually partitioned into fingers  $(N_f)$  and segments  $(N_s)$  to reduce  $L_p$  and  $R_p$  so that they are negligible compared to  $R_{ch}$ . Therefore, the equivalent series resistance  $R_s$  can be given by:

$$R_s \cong \frac{\alpha}{V_g - V_{g_0}} \frac{L}{W N_s N_f} \tag{4.7}$$

where  $\alpha$  and  $V_{g_0}$  are proportionality constants.

The combination of the series capacitance  $C_s$  and the parasitic fringing capacitance




Figure 4.4: Maximum obtained quality factor vs. inductor value



Figure 4.5:  $LQ_L$  and  $g_L$  vs. inductor value

 $C_f$  form the total capacitance of the varactor  $C_v$ . The series capacitance  $C_s$  is responsible for the varactor tuning range, and it is given by:

$$C_s = \frac{C_{ox}C_{dep}}{C_{ox} + C_{dep}} LWC_{ox}N_sN_f$$
(4.8)



Figure 4.6: A cross section of an accumulation-mode varactor and its equivalent model

On the other hand,  $C_f$  is the parasitic fringing capacitance that limits the varactor range, and is given by:

$$C_f = \beta W N_s N_f \tag{4.9}$$

where  $\beta$  is a proportionality constant.

The quality factor  $Q_v$  of a varactor can be approximated by:

$$Q_v = \frac{\omega L_g - 1/\omega C_v}{R_s} \cong \frac{1}{\omega (C_s + C_f)R_s}$$
(4.10)

The equivalent conductance loss can be given by:

$$g_v = \frac{\omega C_v}{Q_v} \tag{4.11}$$

It can be seen from Eqs. (4.8) and (4.9) that increasing W,  $N_s$ , or  $N_f$  increases both  $C_s$  and  $C_f$ , and consequently results in minimal change in the tuning range. On the other hand, increasing L increases  $C_s$  without affecting  $C_f$ . Nevertheless, increasing L increases both  $R_s$  and  $C_s$ , which severely degrades  $Q_v$  as can be seen in Eq. (4.10). Therefore, the varactor finger length sets the main trade-off between the  $C_{v,max}/C_{v,min}$  ratio and the quality factor when optimizing the varactor. Fig. 4.7 shows the tuning range and the minimum quality factor of a varactor ( $W = 1 \ \mu m$ ,  $N_s = N_f = 4$ ) in a 65 nm CMOS process measured at 10 GHz. It should also be noted that the minimum  $Q_v$  will occur when  $V_{CTRL}$  is minimum. In this case, although the series resistance  $R_s$  is minimized, the varactor operates in the strong accumulation region, where the high capacitance reduces the quality factor as can be seen from Eq. (4.10).



Figure 4.7: Minimum  $Q_v$  and  $C_{v,max}/C_{v,min}$  vs L



Figure 4.8: Equivalent input conductance of a cross-coupled pair

#### Cross-coupled transistor pair

The active transconductance  $g_{act}$  provided by each cross-coupled pair is equal to the equivalent negative conductance looking into the drain terminals as shown in Fig. 4.8.

The active transconductance  $g_{act}$  can be expressed as:

$$-g_{act} = \frac{1}{2} \left[ g_m (1 - j \frac{f}{f_T}) - g_{ds} \right]$$
(4.12)


where  $f_T = g_m/(2\pi C_{gs})$  is the transistor transit frequency.

For  $f \ll f_T$ , the active transconductance reduces to  $(g_m - g_{ds})/2$ . For higher frequencies (above few GHz), the degradation of the active transconductance due to the low impedance created by  $C_{gs}$  should be considered.

#### Optimization for maximum tuning-range

First, the dc biasing at the different nodes of the VCO should be decided. In our case, a 65 nm CMOS technology is used with a maximum supply voltage of 1.2 V. The output voltage is set at 0.5 V, while the source of the PMOS cross-coupled pair is set at 1.0 V. Minimum channel-length is selected for the cross-coupled transistors to minimize thermal noise contribution. The normalized-to-width parameters  $g_m/W$ ,  $g_{ds}/W$ ,  $g_{act}/W$ ,  $I_d/W$ , and  $C_{par}/W$  are evaluated under these dc biasing conditions as suggested in Section 2.4.2 [58]. From Eq. (4.12),  $g_{act} = (g_m - g_{ds})/2$ . The extracted values are shown in Table 4.1 for NMOS and PMOS transistors.

The same current flows in the NMOS and the PMOS transistors in each branch when the output differential voltage is zero. Therefore, the transistor width ratio  $W_p/W_n$  is set to the ratio of the normalized currents in Table 4.1,  $(I_d/W)_n/(I_d/W)_p = 97/38 \approx 2.55$ .

The design procedure is shown in Fig. 4.9. The procedure starts with a targeted maximum oscillation frequency  $f_{max}^T$ . The load capacitance due to the connected buffer(s) is chosen. Next, the passive components are designed. The inductor value is chosen, and the inductor is optimized for maximum quality factor at this value. The minimum varactor value  $C_{v,min}$  and the varactor ratio  $C_{v,max}/C_{v,min}$  are chosen, and the varactor is optimized to yield the maximum Q for that ratio. The losses equivalent conductance

| Parameter   | NMOS                              | PMOS                              |
|-------------|-----------------------------------|-----------------------------------|
| $g_m/W$     | $866~\mu\mathrm{S}/\mu\mathrm{m}$ | $341~\mu\mathrm{S}/\mu\mathrm{m}$ |
| $g_{ds}/W$  | $120~\mu\mathrm{S}/\mu\mathrm{m}$ | $63.4~\mu\mathrm{S}/\mu\mathrm{m}$ |
| $g_{act}/W$ | $373~\mu\mathrm{S}/\mu\mathrm{m}$ | $139~\mu\mathrm{S}/\mu\mathrm{m}$ |
| $I_d/W$     | $97~\mu\mathrm{A}/\mu\mathrm{m}$  | $38~\mu\mathrm{A}/\mu\mathrm{m}$  |
| $C_{par}/W$ | $1.51~\mathrm{fF}/\mu\mathrm{m}$  | $1.56~\mathrm{fF}/\mu\mathrm{m}$  |

Table 4.1: Normalized parameters for  $V_{GS} = V_{DS} = 0.5$  V



Figure 4.9: Design procedure to optimize the VCO tuning range

of the inductor and varactor is evaluated, and the tank losses equivalent conductance can be expressed as:

$$2g_{tank} = g_L + g_v \cong \frac{1}{Q_L \omega L} + \frac{\omega C_v}{Q_v} \tag{4.13}$$

After choosing a sufficient oscillation factor,  $g_{act} = g_{act,n} + g_{act,p}$  is calculated from Eq. (3.8). The normalized active transconductance provided by the two cross-coupled pairs  $g_{act}/W_n$  is equal to  $g_{act,n}/W_n + (W_p/W_n)(g_{act,p}/W_p)$ .

The widths of the cross-coupled transistors are determined by dividing the calculated  $g_{act}$  by the normalized  $g_{act}$ . The bias current is then calculated by multiplying the transistor width by  $I_d/W$ . The parasitic capacitance  $C_{par}$  is calculated by multiplying the transistor width by  $C_{par}/W$ . The resulting maximum oscillation frequency  $f_{max}^R$  is then calculated. If  $f_{max}^R < f_{max}^T$ , increase the initial targeted value of  $f_{max}^T$ .

The combination of the inductor L and  $C_{v,min} + C_L$  dictates the maximum oscillation frequency. To maximize the tuning range, start with minimum L to maximize  $C_{v,min}$ , and thus reduce the  $C_{fix}/C_{v,min}$  ratio in Eq. (4.3). If the resulting current exceeds the power consumption budget, the value of L is increased gradually at the expense of reduced tuning range and higher phase noise.
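One pass through the sizing steps above can be sketched numerically as follows. This is a minimal sketch under stated assumptions, not the design flow used for the fabricated chip: the passive values in the example call (inductance, quality factors, capacitances, target frequency) are hypothetical, the tank losses are evaluated at the targeted frequency, and the tank capacitance is taken as the plain sum of the varactor, load, and parasitic capacitances. The device numbers come from Table 4.1 and the width ratio of 2.55.

```python
import math

# Normalized parameters from Table 4.1 (per um of NMOS or PMOS width).
GACT_N, GACT_P = 373e-6, 139e-6        # S/um
ID_N = 97e-6                           # A/um
CPAR_N, CPAR_P = 1.51e-15, 1.56e-15    # F/um
WP_OVER_WN = 2.55


def design_pass(ind, q_ind, c_var_min, q_var, c_load, f_target, k_osc):
    """One pass of the sizing steps; all arguments are in SI units."""
    omega = 2.0 * math.pi * f_target
    g_l = 1.0 / (q_ind * omega * ind)        # inductor loss conductance
    g_v = omega * c_var_min / q_var          # varactor loss conductance
    g_tank = 0.5 * (g_l + g_v)               # Eq. (4.13): 2*g_tank = g_L + g_v
    g_act = k_osc * g_tank                   # from k_osc = g_act / g_tank
    w_n = g_act / (GACT_N + WP_OVER_WN * GACT_P)   # NMOS width in um
    w_p = WP_OVER_WN * w_n                         # PMOS width in um
    i_branch = w_n * ID_N                    # bias current per branch
    c_par = w_n * CPAR_N + w_p * CPAR_P      # device parasitic capacitance
    c_total = c_var_min + c_load + c_par     # assumption: simple lumped sum
    f_max = 1.0 / (2.0 * math.pi * math.sqrt(ind * c_total))
    return {
        "w_n": w_n, "w_p": w_p, "i_branch": i_branch,
        "c_par": c_par, "f_max": f_max,
        "meets_target": f_max >= f_target,
    }


# Hypothetical starting point: 0.5 nH, Q_L = 15, C_v,min = 200 fF, Q_v = 30,
# C_L = 100 fF, 11 GHz target, k_osc = 3.
result = design_pass(0.5e-9, 15.0, 200e-15, 30.0, 100e-15, 11e9, 3.0)
for key, value in result.items():
    print(key, value)
```

When `meets_target` is false, the procedure is repeated with an adjusted target or inductor value, as described above.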



Figure 4.10: Tuning-range and power vs. inductor value for different VCOs

Note that for low oscillation frequencies (<5 GHz)  $2g_{tank} \cong g_L = 1/Q_L \omega L$  and the tank losses are dominated by the quality factor of the inductor, whereas for high oscillation frequencies (>20 GHz)  $2g_{tank} \cong g_v = \omega C_v/Q_v$  and the tank losses are dominated by the quality factor of the varactor.

Appendix A provides further details on using the simulator to extract the losses of the LC-VCO passive components, the negative transconductance provided by the active components, and the oscillation factor  $k_{osc} = g_{act}/g_{tank}$ .

#### Design example

To demonstrate how to design a wide tuning-range VCO, a VCO covering the range of 6 to 10 GHz is designed. First, the dc biasing is set. The output voltage is set at 0.5 V, while the source of the PMOS cross-coupled pair is set at 1.0 V. Minimum channel-length transistors are used to minimize thermal noise. Therefore, the transistor parameters of Table 4.1 can be used. A bank of three varactors, which creates eight bands of operation, is used to reduce the VCO gain. The three varactors are binary weighted and can be turned on and off to switch between bands using a 1.5 V supply voltage to exploit the maximum allowable voltage of the varactors in this technology. The design steps were followed as suggested. Three VCOs were designed using similar layouts, and post-layout simulations were carried out. The three designs have the same minimum oscillation frequency and the same  $k_{osc} \cong 3$  to ensure a consistent comparison. As can be noted



Figure 4.11: Carrier frequency versus control voltage of the VCO



Figure 4.12: Micrograph of the fabricated wide tuning-range VCO

in Fig. 4.10, using a smaller inductor resulted in a wider tuning range, while the power consumption increased. This is because reducing the inductor value results in a larger  $g_L$  as shown in Fig. 4.5. In addition, reducing the inductor value requires a larger varactor value  $C_v$  for the same oscillation frequency, which in turn increases  $g_v$  as shown in Eq. (4.11).

The design with the maximum tuning range was fabricated, and the chip performance was measured using an Agilent E4445A spectrum analyzer, with an Agilent MXA N9020A signal analyzer used for the phase noise measurements. The micrograph of the chip is shown in Fig.

| Reference       | Technology       | Frequency range | TR (%) | PN@1MHz (dBc/Hz) | Power (mW) |
|-----------------|------------------|-----------------|--------|------------------|------------|
| JSSC'07 [59]    | 90 nm bulk CMOS  | 4.5-7.1 GHz     | 45     | -109             | 14         |
| ESSCIRC'00 [60] | 180 nm SiGe      | 5.6-7.3 GHz     | 26     | -117             | 2.4        |
| JSSC'03 [61]    | 130 nm SOI CMOS  | 3.0-5.6 GHz     | 58     | -109             | 3          |
| CSIC'06 [62]    | 90 nm bulk CMOS  | 9.3-10.9 GHz    | 16     | -109             | 7.5        |
| JSSC'05 [63]    | 180 nm bulk CMOS | 6.2-9.1 GHz     | 38     | N/A              | 14         |
| This VCO        | 65 nm bulk CMOS  | 5.6-10.2 GHz    | 58     | -83              | 11         |

Table 4.2: Comparison between wide tuning range VCO designs

4.12. The frequency ranges for the eight different bands are shown in Fig. 4.11. The overlap between the bands in this design is about 40 MHz. The resulting tuning range extends from 5.6 to 10.2 GHz, which is about 58% according to Eq. (4.1). A comparison with other wide tuning-range VCOs in the literature is given in Table 4.2. Due to a malfunction in the output buffer, the worst-case phase noise, measured at a carrier frequency of 9.16 GHz, is -83 dBc/Hz and -91 dBc/Hz at 1 MHz and 10 MHz offsets, respectively. This design error was avoided later in the wide tuning-range VCOs of the PLL. Nevertheless, the concept of wide tuning-range optimization of the LC-VCO was successfully demonstrated.
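The quoted tuning-range figure can be reproduced with a short calculation. Eq. (4.1) is not repeated in this section, so the sketch below assumes the common definition in which the frequency span is normalized to the centre frequency; that assumption is consistent with the 58% quoted above. The function name is hypothetical.

```python
def tuning_range_percent(f_min_ghz, f_max_ghz):
    """Tuning range in percent, normalized to the centre frequency.

    Assumed form of Eq. (4.1): TR = 2 * (f_max - f_min) / (f_max + f_min).
    """
    return 200.0 * (f_max_ghz - f_min_ghz) / (f_max_ghz + f_min_ghz)


# Frequency limits (GHz) taken from Table 4.2.
designs = {
    "JSSC'07 [59]": (4.5, 7.1),
    "CSIC'06 [62]": (9.3, 10.9),
    "This VCO": (5.6, 10.2),
}

for name, (f_lo, f_hi) in designs.items():
    print(f"{name}: {tuning_range_percent(f_lo, f_hi):.1f} %")
```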

#### • Design of VCO Bank for the Frequency-Range 5-10 GHz

In a general linear-model of a PLL, the VCO block is expected to generate an output frequency that is dependent on an input control voltage  $V_{CTRL}$ . In a VCO with a limited frequency range, the values of  $V_{CTRL}$  are usually limited to the range wherein the



Figure 4.13: Schematic of the VCO bank: Two VCOs with selector buffers

relationship between  $V_{CTRL}$  and the output frequency  $f_{out}$  is approximately linear, i.e.

$$f_{out} = K_{VCO} \cdot V_{CTRL} \tag{4.14}$$

where  $K_{VCO}$  is the VCO gain and is assumed to be constant.

Another important consideration in evaluating the performance of an oscillator is noise. Oscillator noise is usually viewed in the frequency domain as phase noise around the oscillation frequency. The LC-VCO phase noise  $\mathcal{L} \{\Delta f\}$  in dBc/Hz in the  $1/f^2$  region at an offset frequency  $\Delta f$  [38] can be expressed as

$$\mathcal{L}\left\{\Delta f\right\} = 10\log\left|\phi_{n,VCO}(\Delta f)\right|^2 \tag{4.15}$$

where

$$|\phi_{n,VCO}(\Delta f)|^2 = \frac{1}{8\pi^2 \Delta f^2 C_{tank}^2 V_{max}^2} \sum_n \left(\frac{i_n^2}{\Delta f} \Gamma_{rms,n}^2\right) , \tag{4.16}$$

 $V_{max}$  is the maximum voltage swing of the tank,  $C_{tank}$  is the output capacitance of the tank,  $\Gamma_{rms,n}$  is the root-mean-square value of the impulse sensitivity function (ISF) due to each noise source, and the  $i_n^2/\Delta f$  terms represent the noise spectral density due to transistor drain current and tank losses.

In the time domain, oscillator jitter is usually characterized by the cycle (or period) rms jitter  $\sigma_c$  [19], that is, the standard deviation of the time period  $T_n$  about the average time period  $T_{avg}$ , given by

$$\sigma_c = \sqrt{\lim_{N \to \infty} \left[ \frac{1}{N} \cdot \sum_{n=1}^{N} (T_n - T_{avg})^2 \right]}.$$
(4.17)

In the absence of flicker noise, the cycle jitter and the phase noise are related [18], [19] as follows:

$$\sigma_c = \sqrt{|\phi_{n,VCO}(\Delta f)|^2 \frac{\Delta f^2}{f_o^3}}$$
(4.18)

where  $|\phi_{n,VCO}(\Delta f)|^2$  is the phase noise magnitude in rad<sup>2</sup>/Hz at an offset frequency  $\Delta f$  from the oscillation frequency  $f_o$ .

In a PLL that operates over a limited frequency range, the phase noise, and thus the timing jitter, is assumed to be the same over the entire range of the input control voltage, i.e.

$$\sigma_c = \sigma_{c_0} \tag{4.19}$$

where  $\sigma_{c_0}$  is constant and independent of the oscillation frequency and the control voltage.

While these assumptions may be acceptable in a PLL that operates over a limited tuning-range, the variations in the VCO gain and jitter pose a great challenge in predicting the performance of a wide-range PLL where multiple frequency bands and multiple VCOs are possibly needed. Some techniques were proposed in the literature to linearize the VCO and reduce the jitter variations [64], [65]. However, this comes most of the time at the expense of reduced frequency range and extra circuitry. Therefore, a practical solution is to allow these variations in the VCO performance while accounting for them in the behavioral model of the VCO to ensure accurate prediction of the overall PLL performance [66]. To account for the variations in the VCO gain and the noise, both the output frequency and the cycle jitter are expressed in each frequency band as high-order non-linear functions of the control voltage, i.e.

$$f_{out} = a_0 + a_1 V_{CTRL} + a_2 V_{CTRL}^2 + a_3 V_{CTRL}^3 \dots$$
(4.20)



Figure 4.14: Block diagram of the wide-range VCO modeling procedure

and

$$\sigma_c = b_0 + b_1 V_{CTRL} + b_2 V_{CTRL}^2 + b_3 V_{CTRL}^3 \dots$$
(4.21)

where  $a_i$  and  $b_i$  are the high-order polynomial regression coefficients.

The cycle jitter causes a variation  $\Delta T$  in the time period of the oscillation  $T = 1/f_{out}$ , thus dithering the output frequency, i.e.,  $\hat{f}_{out} = 1/(T + \Delta T)$ . As shown in Chapter 3, for a square waveform the time-period variation  $\Delta T$  is related to the cycle jitter  $\sigma_c$  as  $\Delta T = \sqrt{2}\,\sigma_c\,\delta$ , where  $\delta$  is a zero-mean unit-variance Gaussian random process. Thus, the dithered frequency becomes

$$\hat{f}_{out} = \frac{f_{out}}{1 + \sqrt{2}\,\sigma_c\,\delta\,f_{out}}.$$
(4.22)

The procedure followed in accounting for wide-range variations in the VCO is shown in Fig. 4.14. The output frequency and the cycle jitter are calculated as functions of the input control voltage and the frequency band using Eqs. (4.20) and (4.21), respectively. The calculated cycle jitter is applied to the output frequency through a Gaussian random process using Eq. (4.22). The phase of the output waveform is generated from the dithered frequency  $\hat{f}_{out}$  by integration followed by a modulus operation that limits the generated phase to a  $2\pi$  range. A square voltage waveform is then generated from the phase by comparing the phase to two threshold values that represent the transitions in the output waveform, i.e.,  $\pm \pi/2$  in this case.
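The effect of Eqs. (4.20) to (4.22) can be illustrated with a small Monte Carlo sketch. It is not the Verilog-A model itself: the phase integration is replaced by summing two half periods, the Gaussian sample is assumed to be redrawn at every output transition, and the femtosecond scaling is assumed to apply to the whole jitter polynomial. The cubic coefficients are those listed for band 0 in Listing 4.1; the function names are hypothetical.

```python
import math
import random
import statistics


def band0_frequency(v_ctrl):
    """Eq. (4.20) with the band-0 coefficients of Listing 4.1 (Hz)."""
    return (-3.85e8 * v_ctrl**3 + 5.44e8 * v_ctrl**2
            + 1.75e8 * v_ctrl + 4.88e9)


def band0_cycle_jitter(v_ctrl):
    """Eq. (4.21) with the band-0 coefficients of Listing 4.1 (s)."""
    return (-17.10 * v_ctrl**3 + 5.06 * v_ctrl**2
            + 15.44 * v_ctrl + 17.36) * 1e-15


def simulate_periods(v_ctrl, n_periods, seed=1):
    """Return a list of jittered oscillation periods in seconds."""
    rng = random.Random(seed)
    f_out = band0_frequency(v_ctrl)
    sigma_c = band0_cycle_jitter(v_ctrl)
    periods = []
    for _ in range(n_periods):
        period = 0.0
        for _half in range(2):
            delta = rng.gauss(0.0, 1.0)   # zero-mean, unit-variance sample
            # Dithered frequency, Eq. (4.22).
            f_hat = f_out / (1.0 + math.sqrt(2.0) * sigma_c * delta * f_out)
            period += 0.5 / f_hat         # duration of one half period
        periods.append(period)
    return periods


periods = simulate_periods(v_ctrl=0.5, n_periods=20000)
print("nominal frequency (GHz):", band0_frequency(0.5) / 1e9)
print("target cycle jitter (fs):", band0_cycle_jitter(0.5) * 1e15)
print("simulated cycle jitter (fs):", statistics.pstdev(periods) * 1e15)
```

Because each period is built from two independently dithered half periods, the factor of  $\sqrt{2}$  in Eq. (4.22) makes the standard deviation of the full period come out close to  $\sigma_c$ .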

In this design, two LC-tank VCOs are optimized for wide tuning-range to cover the frequency range 5-10 GHz [23]. The first VCO oscillates between 4.9-7.4 GHz, while the second VCO oscillates between 7.1-11 GHz when the control voltage  $V_{CTRL}$  is swept



Figure 4.15: The frequency bands of the VCO with respect to  $V_{CTRL}$ 

between 0.2 and 0.9 V. Some margin around the frequency limits and sufficient overlap between frequency bands were considered to account for process variations. Fig. 4.13 shows the schematic of the proposed VCO bank. The enable switch EN ensures that only one VCO and one selector buffer are working at a time to drive the load resistors.

The VCO bank operates in 16 different frequency bands that are selected using the binary inputs  $B_0$ - $B_3$  as shown in Fig. 4.13. Fig. 4.15 shows the 16 frequency bands of the VCO with respect to  $V_{CTRL}$ . When covering such a wide range, variations in the VCO gain  $K_{VCO}$  and phase noise (and jitter) are inevitable. Fig. 4.16 shows the resulting variations in the VCO gain and the cycle jitter over the frequency bands of the VCO bank from transistor-level simulations. By using Eqs. (4.20) and (4.21), the proposed behavioral model of the VCO bank covers the desired frequency range while incorporating the variations in VCO gain and jitter.

VCO noise dominates the out-of-band noise of the PLL. Thus, the PLL loop bandwidth should be large enough to suppress the VCO noise in the vicinity of the output oscillation frequency. The phase noise corner frequency is around 400 kHz for the two VCOs in this design. Thus, the phase noise in the  $1/f^3$  region can be neglected if the PLL loop bandwidth remains above 400 kHz [19]. A behavioral model of the two VCOs, based on the procedure described in this section, is presented in Verilog-A in Listing 4.1. The VCO module has two inputs, the control voltage and the frequency-band code, and generates an output waveform whose frequency corresponds to these inputs. The module parameters such as the low and high voltage levels as well as simulation parameters such



Figure 4.16: Variations in (a) VCO gain and (b) cycle jitter with respect to frequency

as tolerance time and transition time are defined in the module.

Depending on the band code and the control voltage, the module decides which frequency band the VCO operates in. Based on which of the 16 bands shown in Fig. 4.15 is the operating frequency band, the oscillation frequency and the resulting cycle jitter are calculated. To account for the wide-range variations in VCO gain and jitter, the oscillation frequency and the cycle jitter in each frequency band are described with respect to  $V_{CTRL}$  using a third-order fitting curve. The timing jitter is added to the transitions in a way similar to that detailed in Listing 3.3.

#### 4.2.2 Reference Oscillator

The reference oscillator has a phase noise profile similar to that of a VCO. However, the reference oscillator noise contributes to the in-band noise of the PLL. It is very important to use a clean reference oscillator to ensure a low noise contribution even at large frequency divider ratios. The 100 MHz off-chip crystal oscillator used to drive the input has a phase noise as low as -104 dBc/Hz at 100 Hz. The measured phase noise profile of the reference oscillator is shown in Fig. 4.17. If we assume a dominant  $1/f^2$  region in the vicinity of the oscillation frequency, then using Eq. (4.18) the cycle jitter is approximately 0.6 fs.
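The quoted jitter value follows directly from Eq. (4.18). The sketch below simply evaluates that expression for the stated phase noise of -104 dBc/Hz at a 100 Hz offset from a 100 MHz carrier; the function name is hypothetical, and the  $1/f^2$  assumption is the one made in the text.

```python
import math


def cycle_jitter_from_phase_noise(pn_dbc_hz, offset_hz, f_osc_hz):
    """Cycle jitter (s) from Eq. (4.18), valid in the 1/f^2 region."""
    pn_linear = 10.0 ** (pn_dbc_hz / 10.0)   # dBc/Hz to linear (rad^2/Hz)
    return math.sqrt(pn_linear * offset_hz**2 / f_osc_hz**3)


sigma_c = cycle_jitter_from_phase_noise(-104.0, 100.0, 100e6)
print(f"cycle jitter = {sigma_c * 1e15:.2f} fs")
```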

The behavioral model of the reference oscillator is similar to that of a VCO but with a fixed-jitter single oscillation-frequency and without the dependence on an input control voltage or frequency bands. The behavioral model of the reference oscillator is shown in Listing 4.2.

```
`include "constants.vams"
`include "disciplines.vams"
module VCO_model(in, bits_freq, out);
input in; input [0:3] bits_freq; output out;
electrical out, in, delta_in;
electrical [0:3] bits_freq;
parameter real Vlo = 0, Vhi = 1.1;       // output low and high levels (V)
parameter real tt = 1e-13 from (0:inf);  // output transition time (s)
parameter real jitter_En = 1 from [0:1];
// enable or disable VCO jitter
parameter real ttol = 1f;                // time tolerance (s)
real Kvco, freq, phase, dT, delta, prev,
     Vout, freq_band, jitter;
integer n, seed, fp;
analog begin
  @(initial_step) begin
    seed = -561;
    fp = $fopen("~/periods_PLL.m");
    // save periods to a MATLAB file
    Vout = Vlo;
  end
  $discontinuity(0);
  // check frequency band: decode the four band-select bits
  freq_band = floor(V(bits_freq[0]) + 0.4)
            + 2*(floor(V(bits_freq[1]) + 0.4))
            + 4*(floor(V(bits_freq[2]) + 0.4))
            + 8*(floor(V(bits_freq[3]) + 0.4));
  // curve fitting of frequency & jitter (third-order fit per band)
  case (freq_band)
    0: begin
      freq = -3.85e8*V(in)*V(in)*V(in)
           + 5.44e8*V(in)*V(in) + 1.75e8*V(in) + 4.88e9;
      jitter = (-17.10*V(in)*V(in)*V(in) + 5.06*V(in)*V(in)
             + 15.44*V(in) + 17.36)*1E-15*jitter_En;
    end
    // the remaining bands use the same form with band-specific coefficients
    15: begin
    end
  endcase
  // add phase noise: dither the frequency as in Eq. (4.22)
  freq = freq / (1 + dT*freq);
  V(out) <+ transition(Vout, 0, tt);
end
endmodule
```

Listing 4.1: Verilog-A model of the wide-range VCOs



Figure 4.17: Measured phase noise of the reference oscillator

```
`include "constants.vams"
`include "disciplines.vams"
module ref_osc(out);
output out;
electrical out;
parameter real freq = 1 from (0:inf);               // oscillation frequency (Hz)
parameter real Vlo = -1, Vhi = 1;
parameter real tt = 0.01/freq from (0:inf);         // transition time (s)
parameter real jitter = 0.6e-15 from [0:0.1/freq);  // cycle jitter (s)
integer n, seed;
real next, dT;
analog begin
  @(initial_step) begin
    seed = 286;
    next = 0.5/freq + $abstime;
  end
  // toggle the output every half period, with jitter added to the edge time
  @(timer(next)) begin
    n = !n;
    dT = jitter*$rdist_normal(seed, 0, 1);
    next = next + 0.5/freq + 0.707*dT;
  end
  V(out) <+ transition(n ? Vhi : Vlo, 0, tt);
end
endmodule
```

Listing 4.2: Verilog-A model of the reference oscillator



Figure 4.18: Reference oscillator input buffer

#### 4.2.3 Input Buffer

The off-chip oscillator generates a 3.3 V square wave at 100 MHz. This signal needs to be converted down to the PFD voltage level, i.e., 1.2 V, and buffered before feeding the PFD. To achieve that, the level-shifter input buffer shown in Fig. 4.18 is used. The first inverter consists of a thick-oxide complementary pair of transistors to handle the large input signal, while the second inverter uses regular thin-oxide transistors. Both inverters operate from a 1.2 V supply. If not designed properly, the input buffer contribution to the in-band noise can be significant [67, 68]. Considering thermal noise only, the input-referred noise of each inverter stage is given by

$$|\phi_{n,buf}(f)|^2 = \frac{4kT}{(g_{m,N} + g_{m,P})^2} \left[ (\gamma g_{d_0})_N + (\gamma g_{d_0})_P \right]$$
(4.23)

where  $g_m$  is the transconductance of the transistor,  $g_{d_0}$  is the output conductance at zero drain-to-source voltage, and  $\gamma$  is the white noise gamma factor. Therefore, the widths of the input buffer transistors are made large to increase the transconductance of the transistors to ensure that the noise contribution of the buffer is negligible [68].

The jitter of the input buffer is characterized by the edge-to-edge rms jitter  $\sigma_{ee}$  [18] that relates the variance of the noise amplitude  $n_v(t)$  to the slew rate of the periodic signal at the threshold crossing  $t_c$  as

$$\sigma_{ee} = \frac{\sqrt{var(n_v(t_c))}}{dv(t_c)/dt}.$$
(4.24)



Figure 4.19: Edge-to-edge jitter of the input-buffer

```
`include "constants.vams"
`include "disciplines.vams"
module H_L_Level_Shifter(out, in);
input in; output out; electrical in, out;
parameter real Vlo = 0, Vhi = 1.1;
parameter integer dir = 1;
// dir=1 for positive edge trigger
parameter real tt = 1n from (0:inf);
parameter real td = 0 from [0:inf);
parameter real jitter = 50e-15;
// include edge-to-edge jitter
parameter real ttol = 10f;
parameter real shifting_factor = 3;
// Vout = Vin/shifting_factor
integer count, n, seed;
real dt, Vout;
analog begin
  @(initial_step) seed = -311;
  // draw a new jitter sample at each threshold crossing
  @(cross(V(in) - (Vhi + Vlo)/2, dir, ttol))
    begin
      dt = jitter*$rdist_normal(seed, 0, 1);
    end
  Vout = V(in)/shifting_factor;
  V(out) <+ transition(Vout, td + dt, tt);
end
endmodule
```

Listing 4.3: Verilog-A model of the input buffer

To extract the edge-to-edge jitter of the designed buffer, the simulator's periodic steady-state (PSS) noise analysis is used. To do so, the buffer is driven by a representative input, and the simulator computes the input-to-output delay variations as well as the output slew rate at each threshold crossing. If the noise voltage amplitude  $n_v(t_c)$  is defined as the difference between the jittery voltage amplitude at the crossing time  $t_c$  and a reference threshold voltage, the variance of the noise amplitude at the threshold crossing points is obtained by integrating its power spectral density  $|n_v(f, t_c)|^2$  [18] as

$$var(n_v(t_c)) = \int_0^{f_{out}/2} |n_v(f, t_c)|^2 df$$
(4.25)

where  $f_{out}$  is the frequency at the output of the buffer, which in this case is equal to the reference oscillation frequency. Applying Eq. (4.24), the extracted edge-to-edge jitter is 50 fs. The extracted jitter should then be included in the behavioral model of the input buffer. Shown in Listing 4.3, the behavioral model of the input buffer divides the voltage-level by the shifting factor, and the edge-to-edge jitter is added at the threshold crossing of each rising edge of the output.

#### 4.2.4 Phase-Frequency Detector

The PFD needs to operate at a frequency as high as the reference frequency; that is 100 MHz. Dynamic-logic PFD architectures provide an attractive solution due to their high speed operation and dead-zone elimination. The dynamic-logic PFD shown in Fig. 4.20 was proposed in [39]. The circuit eliminates the dead-zone by ensuring that the outputs are directly used to reset the PFD without any intermediate logic.

Furthermore, the modification proposed in [69] eliminates the blind zone by inserting a delay element in the path of the input signal. The modified PFD is shown in Fig. 4.21. The blind zone occurs when an input rising edge arrives while the PFD is in the reset mode, which causes the PFD to miss the transition. By creating delayed versions of the reference and feedback signals with a delay larger than the reset time, the delayed signals arrive after the reset operation has been executed. This allows the PFD to properly detect the input rising edges.



Figure 4.20: Circuit diagram of a dynamic-logic PFD



Figure 4.21: Circuit diagram of the modified dynamic-logic PFD

#### 4.2.5 Charge Pump

In a PLL that operates over a limited range, the CP operating range is limited and the up and down output currents of the CP are assumed to be equal and have approximately a constant value regardless of the output voltage over the specified range. In order to allow modeling of wide range operation of the CP, the model should incorporate the variation of the CP output current  $I_{CP}$  as well as the mismatch between the up-current  $I_{UP}$  and the down-current  $I_{DN}$  of the CP with respect to the output voltage.

The CP proposed in [70] is shown in Fig. 4.22. This structure utilizes the switch on-resistance as a degeneration resistor to increase the output impedance of the up and



Figure 4.22: CP schematic



Figure 4.23: Output CP current vs. output voltage

down current sources. The CP proposed in [70] is modified by using low-threshold-voltage ( $V_{TH}$ ) transistors  $M_{N_3}$  and  $M_{P_3}$  in the feedback loop, which extends the usable output range to 0.2-0.9 V as shown in Fig. 4.23. The output resistance of the charge pump is

$$R_{out} \simeq g_{m_2} r_{o_2} r_{o_1} \times g_{m_3}(r_{o_3} || r_{o_4}) .$$
(4.26)


This high output resistance reduces the effect of channel-length modulation without the use of extra cascoding, which can limit the minimum power supply and the operating range.

In order to include the current variations in the behavioral model of the CP, the up-current  $I_{UP}$  and the down-current  $I_{DN}$  at the output of the CP need to be expressed as functions of the output voltage. To do so, each of the two currents is modeled as two linear segments: one in the saturation region and one in the triode region as shown in Fig. 4.23.

Thus, the up-current and the down-current are given by

$$I_{UP} = \begin{cases} I_{UP_{max}} - \frac{I_{UP_{max}} - I_{UP_{sat}}}{V_{DD} - |V_{D_{sat_P}}|} V_{out} & : V_{out} \le V_{DD} - |V_{D_{sat_P}}| \\ I_{UP_{sat}} - \frac{I_{UP_{sat}}}{|V_{D_{sat_P}}|} (V_{out} - V_{DD} + |V_{D_{sat_P}}|) & : V_{out} > V_{DD} - |V_{D_{sat_P}}| \end{cases}$$

$$I_{DN} = \begin{cases} I_{DN_{sat}} + \frac{I_{DN_{max}} - I_{DN_{sat}}}{V_{DD} - V_{D_{sat_N}}} (V_{out} - V_{D_{sat_N}}) & : V_{out} \ge V_{D_{sat_N}} \\ \frac{I_{DN_{sat}}}{V_{D_{sat_N}}} V_{out} & : V_{out} < V_{D_{sat_N}} \end{cases}$$

$$(4.27)$$

where  $V_{D_{sat_N(P)}}$  is the saturation voltage of the N(P)MOS current source transistor of the charge pump, and the parameters  $I_{UP_{max}}$ ,  $I_{UP_{sat}}$ ,  $I_{DN_{max}}$ , and  $I_{DN_{sat}}$  are defined as in Fig. 4.23. The linearized expressions are used in the behavioral model of the CP to account for  $I_{CP}$  variations. Depending on whether the reference input or the feedback input is high, the current is pumped into or out of the output node. The behavioral model should also account for the timing jitter that the CP adds to the overall PLL. Thus, the timing jitter is added at the crossing time of the output current waveform.
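The two-segment description can be explored numerically with the linear fits that appear in the PFD/CP listing (Listing 4.4). The sketch below is illustrative only: the function names are hypothetical, and attributing the segment that falls to zero at low output voltage to the down-current (and the one that drops near the supply to the up-current) is an assumption made here for the example.

```python
def i_down(v_out):
    """Down-current (A): triode segment below 0.2 V, saturation above."""
    if v_out < 0.2:
        return 6e-3 * v_out
    return 0.061e-3 * v_out + 1.188e-3


def i_up(v_out):
    """Up-current (A): saturation segment up to 0.94 V, triode above."""
    if v_out > 0.94:
        return -7.04e-3 * v_out + 8.09e-3
    return -0.08e-3 * v_out + 1.25e-3


def mismatch(v_out):
    """Difference between the up- and down-currents (A)."""
    return i_up(v_out) - i_down(v_out)


# Sweep the usable output range of 0.2 V to 0.9 V in 10 mV steps.
sweep = [0.2 + 0.01 * k for k in range(71)]
worst = max(abs(mismatch(v)) for v in sweep)

print(f"I_UP(0.55 V) = {i_up(0.55) * 1e3:.3f} mA")
print(f"I_DN(0.55 V) = {i_down(0.55) * 1e3:.3f} mA")
print(f"worst-case mismatch over the sweep = {worst * 1e6:.1f} uA")
```

Both currents sit near 1.2 mA around the middle of the supply, and the mismatch changes sign within the swept range, which is the behavior the linearized model is meant to capture.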

The CP is a major contributor to both the reference spurs and the in-band noise at the output of the PLL. The reference spurs occur at the output due to mismatch in the up and down currents. The extra charge supplied to the loop filter due to this mismatch needs to be compensated for at the next reference cycle edge, which results in voltage ripples on the VCO control voltage. The magnitude of the reference spurs in dBc with respect



Figure 4.24: Output current noise of the CP versus frequency

to the carrier in a second-order loop filter PLL is given by

$$P_s/P_c = 20 \log \left[ \frac{\Delta t_{RST}^2 \Delta I K_{VCO}}{4\pi C_2} \left(1 + \frac{\Delta I}{I_{CP}}\right) \right]$$
(4.29)

where  $\Delta t_{RST}$  is the PFD reset time,  $I_{CP}$  is the CP current,  $\Delta I$  is the CP current mismatch, and  $C_2$  is the second-pole capacitor of the loop filter.

The noise contribution of the CP is significant only during the locked state on-time; that is when both transistors are on. The input-referred noise of the CP is given by

$$|\phi_{n,CP}(f)|^{2} = \left(\frac{2\pi}{I_{CP}}\right)^{2} \left(\overline{i_{n,N}^{2}} + \overline{i_{n,P}^{2}}\right) \Delta t_{RST} \left(1 + \frac{\Delta I}{I_{CP}}\right) f_{ref}$$
(4.30)

where  $\overline{i_{n,N}^{2}}$  and  $\overline{i_{n,P}^{2}}$  are the noise sources that represent the down (NMOS) and up (PMOS) current sources of the CP, respectively.

Increasing the CP current can reduce both reference spurs and in-band noise contribution at the expense of increased power consumption. The charge pump current is designed to be around 1.2 mA for output voltage around the middle of the supply.

To extract the edge-to-edge jitter of the PFD/CP cascade, the PFD/CP is driven with two representative inputs with a fixed time delay, and the PSS noise analysis is applied at the output current. Taking into account the loop filter attenuation at higher frequencies, the noise variance var(n) is calculated by integrating the output current noise spectral density until the attenuated out-of-band noise is negligible. The output current noise over the frequency range of interest is shown in Fig. 4.24. Derived from Eq. (4.24), the edge-to-edge jitter of the PFD/CP is then given by

$$J_{ee,PFD/CP} = \frac{\sqrt{var(n_v(t_c))/2}}{I_{CP}f_{ref}}.$$
(4.31)

The extracted jitter of the PFD/CP is 144 fs. The behavioral model of the PFD/CP should include both the jitter effect and the mismatch effect. The Verilog-A model of the PFD/CP used in this design is given in Listing 4.4.

```
`include "constants.vams"
`include "disciplines.vams"
module PFD_CP (ref, vco, out);
input ref, vco; inout out;
electrical ref, vco, out;
parameter integer dir = +1;
parameter real vth = 0.55; // threshold
parameter real tt = 4e-10 from (0:inf);
parameter real td = 1e-12 from [0:inf);
parameter real jitter = 144e-15;
parameter real ttol = 1e-15 from (0:inf);
real iout, state, dt;
integer seed;
analog begin
  dt = jitter*$rdist_normal(seed, 0, 1);
  // feedback (VCO) edge: step the state down and evaluate the
  // two-segment current fit at the present output voltage
  @(cross(V(vco) - vth, dir, ttol))
    begin
      if (state > -1)
        state = (floor(state + 0.5) - 1);
      if (V(out) < 0.2) begin
        iout = 6e-3*V(out);
      end
      else
        iout = 0.061e-3*V(out) + 1.188e-3;
    end
  // reference edge: step the state up and evaluate the
  // two-segment current fit at the present output voltage
  @(cross(V(ref) - vth, dir, ttol))
    begin
      if (state < 1)
        state = (state + 1);
      if (V(out) > 0.94) begin
        iout = -7.04e-3*V(out) + 8.09e-3;
      end
      else
        iout = -0.08e-3*V(out) + 1.25e-3;
    end
  I(out) <+ transition(iout*state, td + dt, tt, ttol);
end
endmodule
```

Listing 4.4: Verilog-A model of the PFD/CP

#### 4.2.6 Frequency Dividers

The divider ratio N in a wide-range PLL is a very important parameter because N can vary significantly over the operating range. In this design, N varies between 50 and 100. This corresponds to large variations in noise, phase margin, loop bandwidth, and other loop dynamics of the overall PLL. The effect of these variations should be considered from the early stages to ensure accurate prediction of the performance of the wide-range PLL. A frequency divider model must also account for the jitter contribution of the divider to the overall PLL.



Figure 4.25: Six-bit programmable counter schematic



Figure 4.26: Control-logic circuit of the programmable counter

Several frequency dividers are needed in the proposed PLL architecture. A programmable frequency divider is used for frequency synthesis. Due to the complexity of its programming circuitry, the programmable divider speed is limited. The programmable divider is therefore preceded by a fixed divide-by-2 prescaler to relax the maximum operating frequency requirement. In addition, multiple divide-by-2 blocks are used to generate the lower frequency bands as shown in Fig. 4.2, where a multiplexer is used to select the required band.

The six-bit programmable divider shown in Fig. 4.25 was used. Similar to the divide-by-N architecture discussed in Section 3.4.2, the divider consists of a cascade of six flip-flops (FFs) and control logic circuitry that generates the timing signals required for making the proper transitions. As suggested in [71], the programmable divider shown in Fig. 4.25 includes several modifications to the conventional divide-by-N programmable divider. To simplify the circuit and enhance the speed of the front-end of the divider, the first flip-flop (FF<sub>1</sub>) is designed with no set or reset capabilities, whereas both FF<sub>2</sub> and FF<sub>3</sub> are designed with no reset capability. In addition, the programming control circuitry shown in Fig. 4.26 ensures that all the set and reset signals (except ST<sub>3</sub>) are active for two clock cycles, while ST<sub>3</sub> is active for one clock cycle. This architecture can operate at high speed (above 5 GHz) and provides a wide range of division ratios, i.e., from 2 to  $2^{M}$ -1, where M is the number of stages.



Figure 4.27: CML-based prescaler



Figure 4.28: Edge-to-edge jitter of the frequency dividers

The prescaler is made of two cascaded D-latches (forming a D-FF), with the output of the second D-latch connected to the input of the first latch with reversed polarity as shown in Fig. 4.27. The use of current-mode logic (CML) allows extending the operating frequency above 10 GHz [45].

The fixed divide-by-2 blocks operating at lower frequencies are designed in a similar way using D-FFs. However, the D-FFs used in this case are based on CMOS logic to save power due to relaxed speed requirement.

The thermal and flicker noise of the transistors used in the divider circuitry results in timing jitter at the output of the divider. The jitter from the prescaler and the programmable divider can be referred to the input of the PLL and represented as one jitter source $\phi_{n,div}^2(f)$. The edge-to-edge jitter of the frequency dividers can be extracted in a similar way to that performed with the input buffer. The edge-to-edge jitter at the output of the dividers is shown in Fig. 4.28. The extracted jitter from the cascade of the prescaler and the programmable divider is 252 fs. The behavioral model of a programmable divider must include both the variation in the divider ratio and the extracted edge-to-edge jitter [18]. The timing jitter is added at the threshold crossing of each rising edge of the output waveform. The behavioral model of the programmable divider used in this design is described in Verilog-A in Listing 4.5. The flicker noise corner frequency of the programmable divider is about 930 kHz; the corresponding flicker noise was added to the resulting phase noise spectrum to account for that extra noise source.

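```
`include "constants.vams"
`include "disciplines.vams"

module Divider(out, in);
input in; output out; electrical in, out;
parameter real Vlo=-1, Vhi=1;            // output logic levels
parameter integer ratio=50 from [2:inf); // divider ratio
parameter integer dir=1;                 // +1: count rising edges, -1: falling edges
parameter real tt=1n;                    // output transition time
parameter real td=0;                     // nominal output delay; choose td well above
                                         // jitter so that td + dt stays non-negative
parameter real jitter=252e-15;           // edge-to-edge jitter
parameter real ttol=10f;                 // time tolerance of the crossing detection
integer count, n, seed;
real dt;

analog begin
  @(initial_step) seed = -311;
  @(cross(V(in) - (Vhi + Vlo)/2, dir, ttol)) begin
    // count input transitions and wrap around at the divider ratio
    count = count + 1;
    if (count >= ratio)
      count = 0;
    // output is high during the second half of each count cycle
    n = (2*count >= ratio);
    // draw a new random timing error for this edge
    dt = jitter * $rdist_normal(seed, 0, 1);
  end
  V(out) <+ transition(n ? Vhi : Vlo, td + dt, tt);
end
endmodule
```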

Listing 4.5: Verilog-A model of the programmable divider
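To make the counting logic of Listing 4.5 easier to follow outside a circuit simulator, the sketch below mirrors it in Python. It is only an illustration: the function name `divider_levels` is hypothetical, Python's `random.gauss` stands in for the simulator's random number generator, and the returned time offsets are not applied to any waveform.

```python
import random

def divider_levels(ratio, n_edges, jitter_s=0.0, seed=311):
    """Mimic the edge-counting logic of Listing 4.5 for n_edges input edges.

    Returns one (logic_level, timing_offset_s) pair per counted input edge.
    """
    rng = random.Random(seed)
    count = 0
    out = []
    for _ in range(n_edges):
        count += 1
        if count >= ratio:
            count = 0                      # wrap around at the divider ratio
        level = 1 if 2 * count >= ratio else 0
        dt = jitter_s * rng.gauss(0.0, 1.0)  # random timing error for this edge
        out.append((level, dt))
    return out

levels = [lvl for lvl, _ in divider_levels(ratio=4, n_edges=8)]
print(levels)  # prints [0, 1, 1, 0, 0, 1, 1, 0]
```

For a ratio of 4, the output level repeats every four input edges with a 50% duty cycle, as expected from a divide-by-4.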

### 4.2.7 Loop Filter

Since the loop filter is to be integrated on-chip, area is a major concern. Therefore, a second-order RC loop filter is used as shown in Fig. 4.29. The thermal noise of the resistor $R_1$ is the loop filter's only major contribution to the PLL noise. In order to find the noise contribution from the loop filter only, the charge pump and the VCO are both assumed to be open circuit. A noise current $i_n$ is developed from the thermal noise voltage $v_n$ associated with the resistor $R_1$ [16]. The loop filter noise referred to the input of the PLL is given by

$$|\phi_{n,LF}(f)|^2 = \left(\frac{2\pi}{I_{CP}}\right)^2 \frac{4kT \cdot (2\pi f)^2}{(2\pi f)^2 + \left(\frac{C_1 + C_2}{R_1 C_1 C_2}\right)^2}.$$
(4.32)

To ensure simplicity of the design, a loop filter with fixed component values is used to avoid switching between filters or components based on operating conditions. Therefore, the performance of the PLL using the selected loop filter components must be examined carefully over the entire operating range.

#### 4.2.8 Frequency Calibration

In order to reduce the VCO gain, a varactor bank was used in the LC tank of each VCO as shown in Fig. 4.13. Reducing the gain of the VCOs improves both the reference spurs attenuation, as implied by Eq. (4.29), and the phase noise of the PLL. This requires a VCO calibration circuit to set the coarse tuning of the varactor bank. The calibration method suggested in [72] is employed here. The circuit is shown in Fig. 4.30. The control voltage of the VCO bank is compared to reference voltages that bound the range within which $V_{CTRL}$ is allowed to settle, in this case 0.2-0.9 V. If $V_{CTRL}$ is outside this range, the logic counter is instructed to count up or down, depending on whether $V_{CTRL}$ is greater than the maximum threshold voltage $V_H$ or lower than the minimum threshold voltage $V_L$, and thereby switches between the bands until $V_{CTRL}$ settles within the allowed range.
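The band-search behavior described above can be sketched as a simple loop. This is a behavioral illustration only, not the circuit of Fig. 4.30: the counting direction, the linear toy VCO model, and all numerical values (base frequency, band spacing, VCO gain) are assumptions, and the names `calibrate_band` and `make_toy_vco` are hypothetical.

```python
def calibrate_band(vctrl_for_band, n_bands, v_lo=0.2, v_hi=0.9, start_band=0):
    """Step the band counter until the settled control voltage is in range.

    vctrl_for_band(band) returns the control voltage the loop would settle
    to in that band (a stand-in for letting the PLL settle).
    """
    band = start_band
    for _ in range(n_bands):           # at most one pass over all bands
        v = vctrl_for_band(band)
        if v > v_hi and band < n_bands - 1:
            band += 1                  # assumed: counting up raises the band frequency
        elif v < v_lo and band > 0:
            band -= 1                  # assumed: counting down lowers it
        else:
            break
    return band

def make_toy_vco(f_target_hz, f_base_hz=5.0e9, band_step_hz=0.3e9,
                 kvco_hz_per_v=0.5e9, vdd=1.2):
    """Toy linear tuning curve per band; every value here is hypothetical."""
    def vctrl_for_band(band):
        v = (f_target_hz - (f_base_hz + band * band_step_hz)) / kvco_hz_per_v
        return min(max(v, 0.0), vdd)   # the control voltage saturates at the rails
    return vctrl_for_band

vco = make_toy_vco(6.0e9)
band = calibrate_band(vco, n_bands=16)
print(band, vco(band))
```

With these toy numbers the search starts in the lowest band, where the control voltage saturates at the upper rail, and steps up until the settled voltage falls inside the 0.2-0.9 V window.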



Figure 4.29: Loop filter noise

Wide-range PLLs require the allowed settling range of $V_{CTRL}$ to be wide as well, which places the threshold voltages near the supply rails. Therefore, the two comparators use different designs, both based on a Schmitt trigger topology, to operate under the two extreme conditions and to account for the variations in $V_{CTRL}$ before settling. The two circuits are shown in Fig. 4.31, where the hysteresis region is set by the sizing ratio of the load transistors and the cross-coupled transistors [73].

# 4.3 Design Optimization

Several design parameters need to be considered when optimizing the performance of a wide-range PLL. These include noise, reference spurs attenuation, and phase margin.

Figure 4.30: Frequency calibration circuit for (a)  $V_H$  and (b)  $V_L$ 

Figure 4.31: Schmitt trigger comparators

Each PLL component contributes to the total noise at the output in a different way. The noise generated by the PFD/CP, frequency divider, and the loop filter can be referred back to the input, while the VCO noise is usually referred to its output. The transfer function with respect to the input-referred noise is given by

$$H_{in}(s) = \frac{\phi_{n,out}(s)}{\phi_{n,in}(s)} = N \frac{\frac{K_{VCO}K_P}{N}Z(s)}{s + \frac{K_{VCO}K_P}{N}Z(s)}$$
(4.33)

where $s = j2\pi f$, $K_{VCO}$ is the VCO gain in rad/(s$\cdot$V), $K_P = I_{CP}/(2\pi)$ is the charge pump gain in A/rad, N is the frequency divider ratio where $N = 2 \times M$, and Z(s) is the loop filter impedance. It is important to note here that $K_{VCO}$, $K_P$, and N are variable. As implied by Eq. (4.33), the input-referred noise is low-pass filtered and appears amplified by the divider ratio at the output for frequencies below the loop bandwidth. It is therefore called in-band noise.

The transfer function with respect to the VCO noise is given by

$$H_{VCO}(s) = \frac{\phi_{n,out}(s)}{\phi_{n,VCO}(s)} = \frac{s}{s + \frac{K_{VCO}K_P}{N}Z(s)}.$$
(4.34)

As implied by Eq. (4.34), the VCO noise is high-pass filtered and dominates the out-of-band noise, that is, the noise at frequencies above the loop bandwidth.
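The low-pass and high-pass behavior of Eqs. (4.33) and (4.34) can be checked numerically with a short sketch. It assumes the common second-order passive topology ($R_1$ in series with $C_1$, shunted by $C_2$) for Z(s), which is consistent with the pole in Eq. (4.32) but is not read from Fig. 4.29; all component and gain values are hypothetical and chosen only to place the loop bandwidth in the MHz range.

```python
import math

def loop_filter_z(s, r1, c1, c2):
    """Impedance of (R1 in series with C1) in parallel with C2 (assumed topology)."""
    return (1 + s * r1 * c1) / (s * (c1 + c2) * (1 + s * r1 * c1 * c2 / (c1 + c2)))

def pll_transfer(f_hz, kvco, icp, n, r1, c1, c2):
    """Return (H_in, H_VCO) of Eqs. (4.33) and (4.34) at offset frequency f_hz."""
    s = 2j * math.pi * f_hz
    kp = icp / (2 * math.pi)                 # charge pump gain in A/rad
    g = kvco * kp * loop_filter_z(s, r1, c1, c2) / n
    return n * g / (s + g), s / (s + g)

# Hypothetical values: K_VCO = 2*pi*500 MHz/V, I_CP = 100 uA, N = 80.
params = dict(kvco=2 * math.pi * 500e6, icp=100e-6, n=80,
              r1=30e3, c1=7e-12, c2=0.7e-12)
for f in (1e3, 1e5, 1e7, 1e9):
    h_in, h_vco = pll_transfer(f, **params)
    print(f"{f:8.0e} Hz  |H_in|/N = {abs(h_in) / params['n']:.3e}"
          f"  |H_VCO| = {abs(h_vco):.3e}")
```

At low offsets $|H_{in}|/N$ approaches one while $|H_{VCO}|$ is small, and the roles reverse at high offsets. The two responses are complementary: $H_{in}(s)/N + H_{VCO}(s) = 1$ at every frequency.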

The total noise at the output of the PLL is the summation of the input-referred noise, scaled by the transfer function in Eq. (4.33), and the VCO noise scaled by the transfer function in Eq. (4.34); that is

$$|\phi_{n,out}(f)|^2 = |H_{in}(s)|^2 \cdot |\phi_{n,in}(f)|^2 + |H_{VCO}(s)|^2 \cdot |\phi_{n,VCO}(f)|^2$$
(4.35)



Figure 4.32: Simulated variations in PLL parameters.



Figure 4.33: Simulated noise contribution from different PLL components

where

$$|\phi_{n,in}|^2 = |\phi_{n,PFD/CP}|^2 + |\phi_{n,div}|^2 + |\phi_{n,buf}|^2 + |\phi_{n,LF}|^2.$$
(4.36)

The integral rms phase error (in degrees) at the output of the PLL is given by

$$\sigma_{\phi} = \frac{180^{\circ}}{\pi} \sqrt{\int_{f_{min}}^{f_{max}} 2|\phi_{n,out}(f)|^2 \, df},$$
(4.37)

which corresponds to an rms phase jitter (in seconds) given by:

$$\sigma_t = \frac{1}{f_o} \frac{\sigma_\phi}{360^\circ} \tag{4.38}$$

where  $f_o$  is the oscillation frequency.
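Eqs. (4.37) and (4.38) can be evaluated numerically from a sampled phase noise profile, as in the sketch below. It assumes that $|\phi_{n,out}(f)|^2$ is supplied as a single-sideband profile in dBc/Hz and integrates it with the trapezoidal rule; the flat -100 dBc/Hz profile and the 8 GHz carrier are hypothetical inputs used only to exercise the formulas, and the function names are illustrative.

```python
import math

def rms_phase_error_deg(freqs_hz, l_dbc_hz):
    """Eq. (4.37): trapezoidal integration of a single-sideband noise profile."""
    lin = [10 ** (l / 10) for l in l_dbc_hz]       # dBc/Hz to linear units
    area = 0.0
    for i in range(len(freqs_hz) - 1):
        area += 0.5 * (lin[i] + lin[i + 1]) * (freqs_hz[i + 1] - freqs_hz[i])
    return math.degrees(math.sqrt(2 * area))

def rms_jitter_s(sigma_phi_deg, f_o_hz):
    """Eq. (4.38): convert an rms phase error in degrees to seconds."""
    return sigma_phi_deg / 360.0 / f_o_hz

# Hypothetical flat profile between 100 Hz and 20 MHz, 8 GHz carrier.
sigma_deg = rms_phase_error_deg([100.0, 20e6], [-100.0, -100.0])
jitter_s = rms_jitter_s(sigma_deg, 8e9)
print(f"{sigma_deg:.3f} deg, {jitter_s * 1e12:.3f} ps")
```

For a measured profile, the same functions apply once the spectrum is sampled densely enough (for example on a logarithmic frequency grid) for the trapezoidal rule to be accurate.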

In order to optimize the noise performance of the PLL, the 3-dB frequency of the PLL transfer function should be chosen at the cross point of the VCO output noise spectrum and the output-referred in-band noise spectrum [20]. If the 3-dB bandwidth is chosen narrower than the optimum value, the VCO noise will prevail; if it is chosen larger than the optimum value, the in-band noise will dominate at the output of the PLL. The choice of the bandwidth also affects the locking time and reference spurs attenuation [74]. A large bandwidth results in a fast response and reduced locking time, but deteriorates the reference spurs attenuation. To further attenuate the reference spurs, an additional pole can be added to the loop by increasing the loop filter to third order. All these trade-offs should be considered when optimizing the PLL performance. The design of wide-range PLLs poses an additional challenge. Since the loop dynamics vary substantially over the wide frequency range (due to variations in the divider ratio, the VCO gain, the charge pump gain, etc.), the performance metrics need to be evaluated over the whole range to ensure that the specifications are met. A sufficient phase margin over the entire range is needed to ensure the stability of the loop.
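The cross-point criterion can be illustrated with idealized straight-line spectra. The sketch below assumes a flat output-referred in-band floor and a VCO profile falling at -20 dB/decade; both the shapes and the numbers are hypothetical simplifications, and `crossover_offset_hz` is an illustrative name.

```python
def crossover_offset_hz(inband_dbc_hz, vco_dbc_hz_at_ref, f_ref_offset_hz,
                        slope_db_per_dec=-20.0):
    """Offset at which a flat in-band floor meets a straight-line VCO profile."""
    decades = (inband_dbc_hz - vco_dbc_hz_at_ref) / slope_db_per_dec
    return f_ref_offset_hz * 10 ** decades

# Hypothetical: -100 dBc/Hz in-band floor, VCO at -120 dBc/Hz at 1 MHz offset.
f_cross = crossover_offset_hz(-100.0, -120.0, 1e6)
print(f"cross point near {f_cross / 1e3:.0f} kHz")
```

In a wide-range design this cross point moves with the carrier frequency, since both the in-band floor (through N) and the VCO profile change from band to band, which is why a fixed loop filter must be checked over the whole range.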

Parameters such as phase margin, loop bandwidth, and reference spurs attenuation can be evaluated in MATLAB by including the variations in the VCO gain and divider ratio in the loop transfer function. Fig. 4.32 shows the response over the entire frequency range and bands. The noise performance of the PLL is best examined using the behavioral models developed in this chapter. The contribution of each block at the output of the PLL is shown in Fig. 4.33. Depending on the offset frequency from the carrier, different regions are dominated by different noise sources. In Fig. 4.33, region A is dominated by reference noise, region B is dominated by the thermal and flicker noise of the frequency dividers, and region C is dominated by the VCO noise. In addition to verifying the operation of the PLL over the entire range, the models are expected to predict the phase noise of the PLL in the different regions. The MATLAB code used to calculate the phase noise from the periods of the output signal of the PLL is described in Appendix B.

### 4.4 Measurement Results

To verify the methodology and the models presented in this chapter, the proposed wide-range PLL was fabricated in a 65 nm general-purpose CMOS technology. Fig. 4.34 shows a microphotograph of the die. The PLL occupies an area of $1.5 \times 1.5 \text{ mm}^2$ including the pads. The core VCOs cover a frequency range between 5 and 10 GHz, and are shown in the bottom right corner. Each VCO is placed in a p-type well inside an n-well, and surrounded with two guard rings, one on the p side and one on the n-well, to ensure maximum decoupling from substrate and digital noise. The digital components, i.e., PFD/CP, dividers, and MUX, are located to the left of the VCOs and each is surrounded with a guard ring as well. The loop filter is implemented on-chip and can be seen in the top right corner. Output buffers are matched to 50 $\Omega$ to drive the high-frequency probes used in the measurements. On-chip decoupling capacitors are used extensively in free spaces around the chip to allow effective filtering of high-frequency noise from the supplies.

Figure 4.34: A microphotograph of the fabricated frequency synthesizer PLL

Measurements were carried out on a Cascade Microtech probing station using SG Z-probes (up to 20 GHz) and Agilent MXA E4445A spectrum analyzer (up to 13.2 GHz). The measurements were performed by directly probing on the unbonded pads shown in Fig. 4.34. Fig. 4.35 shows the measurement set-up using the high-frequency probe. Fig. 4.36 shows the measured output spectrum and the simulated and measured phase noise at 8 GHz carrier frequency.



Figure 4.35: Measurement test-bench for high-frequency probing



Figure 4.36: Simulated and measured phase noise for 8 GHz carrier frequency.



Figure 4.37: Simulated and measured phase noise vs. carrier frequency

To examine the accuracy of the predictions obtained from the behavioral models, Fig. 4.37 compares the measured phase noise with the simulated results from the models for the core frequency range (5-10 GHz) at different offset frequencies from the carrier frequency. To ensure overall accuracy, the selected offset frequencies belong to regions dominated by different noise sources as shown in Fig. 4.33. The shadowed area in Fig. 4.37 represents the uncertainty range of the simulated phase noise from the PLL behavioral model without considering the variations in $K_{VCO}$, N, and cycle jitter. Depending on the initial values of $K_{VCO}$, N, and cycle jitter selected in the design, the predicted phase noise can be anywhere in this range. The black dotted markers represent the simulated phase noise from the behavioral model of the PLL taking these variations into account. The phase noise of five different chips was measured. The red and blue markers represent the average and the median values of the measured phase noise at the different offset frequencies, respectively. Besides being important for ensuring the PLL stability and predicting other loop parameters, the behavioral models evidently provide a more accurate prediction of the noise performance of the PLL, as seen in Fig. 4.37. When no variations are considered, the phase noise prediction error can be quite significant. When the variations are considered, the measured data follows the trend predicted by simulations. The VCO generating the lower frequencies exhibited lower phase noise than expected from simulation, which resulted in some discrepancies at 10 MHz and 20 MHz offset frequencies. This can be attributed mainly to process variations and measurement errors. The phase noise measured in the region dominated by the in-band noise, although it follows the same trend, is higher than that predicted by simulations. This is likely due to process variations and contributions from supply and substrate noise, which were unaccounted for. With the variations considered, the average error between the measured and simulated phase noise is 2.2 dB, 5.0 dB, 2.4 dB, 2.4 dB, and 2.2 dB at 100 Hz, 1 kHz, 1 MHz, 10 MHz, and 20 MHz offset frequencies, respectively.

To complete the measurement results, Fig. 4.38 shows the measured phase noise as a function of the carrier frequency for frequency offsets of 10 kHz, 100 kHz, and 10 MHz over the entire range of the PLL (156.25 MHz - 10 GHz). The measured rms jitter in seconds, integrated between 100 Hz and 20 MHz for each carrier frequency, is shown in Fig. 4.39. Fig. 4.40 shows the measured reference spurs with respect to the carrier (in dBc) versus the carrier frequency. The measured spurs range between -25 and -55 dBc.

Table 4.3 summarizes the PLL performance and compares it to other wide-range PLLs in the literature. The proposed PLL operates from a low supply voltage, i.e., 1.2 V, and consumes 42 mW, which is lower than all but one of the other wide-range PLLs in Table 4.3. In addition, the proposed PLL has the largest loop bandwidth, which implies the fastest switching response. The proposed PLL uses CMOS technology to cover a continuous frequency range between 156.25 MHz and 10 GHz, and provides a noise performance that is comparable to that achieved by SiGe BiCMOS technology.

Figure 4.38: Measured phase noise vs. carrier frequency



Figure 4.39: Measured integrated rms phase error in seconds vs. carrier frequency.



Figure 4.40: Measured reference spurs attenuation in dBc vs. carrier frequency.

| Reference                            | JSSC'10 [75]                               | JSSC'11 [76]              | VLSI'14 [77]             | TCAS'15 [78]          | This work                 |
|--------------------------------------|--------------------------------------------|---------------------------|--------------------------|-----------------------|---------------------------|
| Technology                           | 0.25 $\mu$m SiGe BiCMOS                    | 0.18 $\mu$m SiGe BiCMOS   | 65 nm CMOS               | 28 nm CMOS            | 65 nm CMOS                |
| Supply voltage                       | 2.5 V, 3.3 V, 5 V                          | 1.8 V, 3.3 V              | 1.8 V                    | 1.2 V                 | 1.2 V                     |
| Chip size                            | 4.0 mm $\times$ 1.2 mm                     | 2.1 mm $\times$ 2.1 mm    | 0.32 mm $\times$ 0.22 mm | 0.03 mm$^2$           | 1.5 mm $\times$ 1.5 mm    |
| Core frequency range                 | 20.4-27.6 GHz                              | 4-8 GHz                   | 2.7-7 GHz                | 8-16 GHz              | 5-10 GHz                  |
| Total frequency range                | 0.6-4.6, 5.1-6.9, 10.2-13.8, 20.4-27.6 GHz | 0.125-32 GHz (continuous) | 2.7-7 GHz (continuous)   | 2-16 GHz (continuous) | 0.156-10 GHz (continuous) |
| Reference frequency                  | 100 MHz                                    | 20 MHz                    | 54 MHz                   | 22.6 MHz              | 100 MHz                   |
| Loop bandwidth                       | 10-200 kHz                                 | 100 kHz                   | 350 kHz                  | 1 MHz                 | 2.9-6.6 MHz               |
| Output frequency $f_c$ (phase noise) | 3 GHz                                      | 6 GHz                     | 7 GHz                    | 8 GHz                 | 8 GHz                     |
| Phase noise @ 10 kHz                 | -105 dBc/Hz                                | -81 dBc/Hz                | N/A                      | N/A                   | -90 dBc/Hz                |
| Phase noise @ 1 MHz                  | -122 dBc/Hz                                | -117 dBc/Hz               | -108 dBc/Hz              | N/A                   | -98 dBc/Hz                |
| Phase noise @ 10 MHz                 | -142 dBc/Hz                                | -140 dBc/Hz               | N/A                      | -132 dBc/Hz           | -121 dBc/Hz               |
| rms phase jitter                     | 0.28 ps                                    | 1.36 ps                   | 0.56/1.1 ps              | < 0.68 ps             | 0.7 ps                    |
| Reference spurs                      | < -70 dBc                                  | < -70 dBc                 | N/A                      | < -48 dBc             | -25 to -55 dBc            |
| Loop filter                          | on-chip                                    | off-chip                  | on-chip                  | on-chip               | on-chip                   |
| Power consumption                    | 680 mW                                     | 273 mW                    | 14 mW                    | 129 mW                | 42 mW                     |

Table 4.3: Comparison between wide-range frequency synthesizer PLLs

# 4.5 Summary

An approach to design wide tuning-range frequency synthesizer PLLs was demonstrated. The design methodology is based on a top-down approach that bridges the high-level behavioral modeling of the PLL building blocks and the transistor-level design of each block. The suggested behavioral models capture the variations in the performance of the PLL building blocks and their effect on the overall performance of the PLL. The methodology is capable of predicting the noise performance and loop dynamics of the PLL, while avoiding lengthy and impractical brute-force closed-loop simulations of the PLL at transistor-level. To verify the design approach, an integer-N frequency synthesizer PLL that covers a continuous frequency operating range from 156.25 MHz to 10 GHz was designed, modeled, and fabricated in a 65 nm general-purpose CMOS technology. The measurement results from the fabricated chip are in accordance with the simulated results from the high-level behavioral models of the PLL building blocks.

# Chapter 5

# Frequency Synthesizer PLL for Ultra-Low-Voltage Operation

In accordance with the prediction of continuous downscaling of the supply voltage in CMOS technology, several attempts have been made to design frequency synthesizer PLLs operating from sub-1 V power-supply. It is highly desired to generate a wide range of frequencies from a single frequency synthesizer to save power and area. In general, PLLs tend to be one of the most power consuming components. Several PLLs that operate from a sub-1 V power-supply have been reported [79–83].

In this chapter, we extend the use of the top-down methodology to ultra-low-voltage PLLs to achieve low noise performance. We present an ultra-low-voltage ultra-low-power frequency synthesizer PLL that covers the frequency range from 860 MHz to 1.22 GHz and operates from a supply voltage of 0.55 V. Due to reduced power supply and relatively high threshold voltage, different techniques need to be employed to tackle these challenges.

# 5.1 Circuit Design of PLL Components

A general architecture of an integer-N PLL is shown in Fig. 5.1. In order to adapt the PLL design to sub-1 V operation, several low-voltage design techniques are employed. The voltage-controlled oscillator (VCO) is based on a reduced version of the conventional LC-tank VCO that is made more adaptable to low-voltage operation. The charge pump (CP) uses a gate-switching-based topology to provide well-matched currents over a wider range, while the phase-frequency detector (PFD) is a conventional architecture. The prescaler uses a dynamic-logic circuit to allow high-speed operation and relax the speed requirement of the programmable divider. The design of the building blocks of the PLL is explained in detail in this section.

Figure 5.1: A general architecture of an integer-N PLL

#### 5.1.1 Ultra-Low-Voltage VCO

The VCO is a critical component in determining the performance of the PLL. As the power supply voltage decreases, the degradation in the VCO performance becomes more pronounced. The decision between using a ring-based VCO and an LC-based VCO must be made based on the application requirements in terms of noise, area, and power. On one hand, ring VCOs offer an attractive solution in terms of area and power. The absence of passive components allows a compact design of ring VCOs, while their easy-to-achieve start-up condition allows much lower power consumption than LC-VCOs. On the other hand, ring VCOs fail to offer a practical solution compared to LC-VCOs in applications such as RF communication systems where noise requirements are highly stringent. Several techniques were proposed to allow low noise along with low-voltage operation in LC-VCOs. The transformer feedback LC-VCO suggested in [84] replaces the inductor in a conventional LC-VCO with a transformer. That increases the output voltage swing by allowing the drain and the source of the cross-coupled transistors to swing beyond the supply rails, which in turn reduces phase noise. The class-C LC-VCO suggested in [85] improves the phase noise by ensuring saturation operation of the transistors while employing a large capacitor in the tail current. This architecture is also applicable to low-voltage VCO design [86].



Figure 5.2: Low-voltage LC-VCO schematic



Figure 5.3: Simulated phase noise of the low-voltage VCO for 1 GHz carrier frequency

Figure 5.4: PFD architecture for low-voltage PLL

In our design, we use an LC-VCO due to its superior noise performance compared to a ring-VCO-based architecture. The architecture shown in Fig. 5.2 was suggested in [87] to reduce phase noise through forward biasing of the bulk of the cross-coupled transistors while working from a nominal supply voltage. The tail current transistor is replaced with a resistor to prevent the supply current from reaching high values. This also has the advantage of reducing the noise in the $1/f^3$ region. In addition to its improved noise performance, this architecture is a good candidate for ultra-low-voltage operation due to the reduced threshold voltage of the transistors and the reduced number of stacked transistors. An accumulation-mode MOS varactor is used to control the oscillation frequency of the VCO. The bulk of the transistors is biased through a 5 k$\Omega$ resistor to prevent any unexpected high current that may forward bias the junction. The VCO consumes 2 mW, and covers the frequency range of 0.86-1.22 GHz. The simulated phase noise profile of the VCO for 1 GHz carrier frequency is shown in Fig. 5.3. The simulated phase noise of the VCO at 1 MHz offset frequency from the carrier is -114 dBc/Hz, which corresponds to a cycle jitter of 63 fs.
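The quoted cycle jitter can be cross-checked from the phase noise figure with a short calculation. The sketch below assumes the widely used white-noise relation for a free-running oscillator, $J = \sqrt{cT}$ with $c = \mathcal{L}(\Delta f)\,\Delta f^2 / f_o^2$ and $T = 1/f_o$; this section does not state which relation was used, so the formula is an assumption that happens to reproduce the quoted value. The function name is illustrative.

```python
import math

def period_jitter_s(l_dbc_hz, offset_hz, f_o_hz):
    """rms period jitter from one phase noise point in the 1/f^2 region."""
    c = 10 ** (l_dbc_hz / 10) * offset_hz ** 2 / f_o_hz ** 2
    return math.sqrt(c / f_o_hz)       # J = sqrt(c * T), with T = 1 / f_o

# -114 dBc/Hz at 1 MHz offset from a 1 GHz carrier, as quoted above.
jitter = period_jitter_s(-114.0, 1e6, 1e9)
print(f"{jitter * 1e15:.1f} fs")
```

The result is close to 63 fs, in agreement with the value stated in the text. The relation only holds where the phase noise falls at 20 dB/decade, so the 1 MHz point must lie outside the flicker-dominated $1/f^3$ region.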

#### 5.1.2 PFD/CP

With the reference oscillator frequency used in this design as low as 10 MHz, a conventional PFD architecture based on two D-flip-flops and a delayed RESET AND-gate can work successfully. The schematic of the PFD is shown in Fig. 5.4.

The design of a low-voltage CP, however, poses a greater challenge. Since cascoding transistors should be avoided in low-voltage operation, gate-switched CPs are preferred. Selecting the proper CP current is very important. A small CP current results in increased CP mismatch and noise contribution, while a large current results in high power consumption and requires large transistors, which drastically increases the switching time of the CP. In low-voltage operation, the CP usually dominates the in-band noise. The CP architecture shown in Fig. 5.5 was proposed in [79]. It is based on a gate-switched CP, with the modification of adding two feedback transistors $M_{Fn}$ and $M_{Fp}$ between the control voltage $V_{CTRL}$ and the gates of the current source transistors of the CP. The CP output current is about 550 $\mu$A in the mid-range of the supply voltage. The feedback transistors reduce the mismatch between the charging and discharging currents and thus widen the usable frequency tuning range. As shown in Fig. 5.6, the maximum mismatch is less than 15% over the output voltage range of 0.1-0.45 V.

Figure 5.5: Low-voltage CP schematic

Figure 5.6: Mismatch between up and down currents of the CP in Fig. 5.5

Figure 5.7: Output current noise of the low-voltage PFD/CP

The variance of the output current noise over the frequency range of interest is calculated by integrating the output current spectrum shown in Fig. 5.7. The extracted jitter of the PFD/CP is about 50 ps.



Figure 5.8: Prescaler architecture and schematic



Figure 5.9: Structure of the used 6-bit programmable counter

#### 5.1.3 Frequency Dividers

A frequency synthesizer PLL needs a programmable divider circuit to synthesize the required oscillation frequency at the output of the PLL. The divider chain in this design consists of a divide-by-2 prescaler followed by a 6-bit programmable counter. The prescaler circuit should work at high speed (above 1 GHz) and under low-voltage conditions. The prescaler circuit shown in Fig. 5.8 was proposed in [80]. The circuit achieves a high operating frequency by utilizing dynamic logic to charge and discharge the internal nodes during the precharge and evaluation periods. The circuit is made suitable for low-voltage operation by using a maximum stack of three transistors and forward biasing the bulk of the PMOS transistors. The prescaler relaxes the speed requirement of the programmable counter, which is usually speed-limited due to the complexity of its programming circuitry. The programmable counter used here is similar to that used in Chapter 4. The circuit is designed to operate at input frequencies above 500 MHz. Its structure is shown in Fig. 5.9. The noise amplitude spectrum at the output of the dividers is shown in Fig. 5.10. The extracted jitter from the cascade of the two dividers is 3.8 ps.

Figure 5.10: Variance of the dividers noise amplitude  $n_v(t)$  versus frequency

Figure 5.11: Third-order loop filter

#### 5.1.4 Loop Filter

The loop filter should be optimized for the desired PLL performance in terms of noise, locking time, reference spurs attenuation, and stability. Based on these parameters, one can decide on the loop filter topology and order [74]. In our design, a third-order RC off-chip filter is used. The filter design is shown in Fig. 5.11. The loop filter components must be selected such that the transitional frequency of the PLL is around the frequency value at which the noise spectral density of the VCO is equal to that of the in-band noise (mostly dominated by the CP). This approach optimizes the noise performance and ensures minimum jitter at the output of the PLL [20].

Figure 5.12: PLL noise components

## 5.2 Noise Contribution

Since the PLL components do not show large variations over the frequency range covered by the PLL, simple behavioral models like the ones shown in Chapter 3 can be utilized. The reference oscillator used in this design generates a 10 MHz signal with very low jitter. The phase noise of the reference oscillator is as low as -145 dBc/Hz at 1 kHz offset from the carrier. Therefore, the reference oscillator contribution to the output noise is negligible. As simulations showed in the previous section, the input-referred jitter of the PFD/CP is the highest. Thus, the dominant source of in-band phase noise at the output of the PLL comes from the PFD/CP block. On the other hand, the out-of-band noise is dominated by the VCO. The simulated total phase noise profile of the PLL is shown in Fig. 5.12. Region A in Fig. 5.12 is dominated by in-band PFD/CP noise, whereas region C is dominated by VCO noise. Phase noise in region B is a combination of both PFD/CP noise and VCO noise.

Figure 5.13: Micrograph of the fabricated chip

# 5.3 Experimental Measurements

Figure 5.14: Measured output spectrum of the PLL at 1.2 GHz

Figure 5.15: Phase noise of the PLL at 1.2 GHz

The circuit was fabricated in a 65 nm CMOS technology. The chip micrograph is shown in Fig. 5.13. The active area of the PLL is 1.2 mm<sup>2</sup>. The third-order RC filter was implemented off-chip. The measurements were performed using an Agilent E4445A spectrum analyzer. The measured output spectrum at 1.2 GHz is shown in Fig. 5.14. Fig. 5.15 shows the simulated and measured phase noise of the PLL at 1.2 GHz. It can be seen that at 1.2 GHz the measured phase noise in the PFD/CP-dominated region is slightly higher than that predicted from simulations. The discrepancy between simulated and measured results was higher for lower carrier frequencies. These deviations can be due to process variations, supply/substrate noise, and/or noise coupling between components that was not accounted for in noise simulation at the system level. The PLL covers the frequency range from 860 MHz to 1.22 GHz, and consumes 3 mW dc power. Table 5.1 compares the PLL performance to other ultra-low-voltage ultra-low-power PLLs in the literature with supply voltages below 0.6 V. A figure-of-merit (FoM) that combines phase noise and frequency tuning range TR(%) is used to compare the different PLLs as defined in [88]

$$FoM = \mathcal{L}\left\{\Delta f\right\} - 20\log(\frac{f_0}{\Delta f} \cdot \frac{TR}{10}) + 10\log(\frac{P_{diss}}{1\text{mW}})$$
(5.1)

where lower FoM signifies better PLL performance.
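As a worked instance of Eq. (5.1), the sketch below evaluates the FoM for the TCAS'11 [81] entries of Table 5.1. Two conventions are assumed here because the text does not state them: $f_0$ is taken as the upper end of the output range, and TR is the frequency span divided by the center frequency, in percent. The function name `pll_fom` is illustrative.

```python
import math

def pll_fom(l_dbc_hz, offset_hz, f_o_hz, tuning_range_pct, power_w):
    """Eq. (5.1): FoM combining phase noise, tuning range, and power."""
    return (l_dbc_hz
            - 20 * math.log10(f_o_hz / offset_hz * tuning_range_pct / 10)
            + 10 * math.log10(power_w / 1e-3))

# TCAS'11 [81] row of Table 5.1: 0.40-2.24 GHz, -87 dBc/Hz @ 1 MHz, 2.08 mW.
f_lo, f_hi = 0.40e9, 2.24e9
tr_pct = 100 * (f_hi - f_lo) / ((f_hi + f_lo) / 2)   # assumed TR convention
fom = pll_fom(-87.0, 1e6, f_hi, tr_pct, 2.08e-3)
print(f"TR = {tr_pct:.1f} %, FoM = {fom:.1f}")
```

Under these assumed conventions the result rounds to the tabulated value of -174 for [81]; other choices of $f_0$ or of the TR definition shift the FoM by a few dB.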

The measurement results show that the achieved performance is competitive with the state-of-the-art ultra-low-voltage PLLs. The design in [81] achieves an excellent FoM based on Eq. (5.1). It should be noted, however, that this was achieved through an emphasis on wide tuning range using a ring oscillator, at the expense of deteriorated phase noise compared to other low-voltage PLLs. In this design, we aimed at achieving a balanced performance between phase noise and tuning range.

## 5.4 Summary

In this chapter, an ultra-low-voltage PLL that operates from a supply voltage of 0.55 V was presented. The design choices of the building blocks of the PLL were discussed. Bulk-biasing technique was exploited in the design of the VCO and the prescalar to allow sub-1 V operation. The design of the CP utilizes a gate-switched architecture with negative feedback to reduce mismatch between the up and down currents. To verify the validity of the design, the PLL was fabricated in a general-purpose 65 nm CMOS technology. The PLL covers a frequency range from 860 MHz to 1.22 GHz, and consumes 3 mW while operating from a 0.55 V supply. The measured phase noise at 1 MHz offset from a 1 GHz carrier is -107 dBc/Hz, and the rms jitter is 6.1 ps.

| Reference                 | VLSI'07 [79]         | TCAS'09 [80]         | TCAS'11 [81]          | JSSC'12 [82]          | ASSCC'12 [83]        | This work           |
|---------------------------|----------------------|----------------------|-----------------------|-----------------------|----------------------|---------------------|
| Technology                | 180 nm               | 130 nm               | 90 nm                 | 130 nm                | 65 nm                | 65 nm               |
| Supply voltage            | 0.5 V                | 0.5 V                | 0.5 V                 | 0.5 V                 | 0.5 V                | 0.55 V              |
| Chip area                 | 1.32 mm<sup>2</sup>  | 0.04 mm<sup>2</sup>  | 0.074 mm<sup>2</sup>  | 0.074 mm<sup>2</sup>  | 0.64 mm<sup>2</sup>  | 1.2 mm<sup>2</sup>  |
| VCO topology              | LC                   | Ring                 | Ring                  | Ring                  | LC                   | LC                  |
| Output frequency range    | 1.90-1.94 GHz        | 360-610 MHz          | 0.40-2.24 GHz         | 400-433 MHz           | 5.54 GHz             | 0.86-1.22 GHz       |
| Phase noise @1 MHz offset | -120 dBc/Hz          | -95 dBc/Hz           | -87 dBc/Hz            | -92 dBc/Hz            | -105 dBc/Hz          | -105 dBc/Hz         |
| rms phase jitter          | N/A                  | 8.0 ps               | 9.6 ps                | 5.5 ps                | N/A                  | 6.1 ps              |
| Reference spurs           | -44 dBc              | N/A                  | N/A                   | -38 dBc               | -65 dBc              | -30 dBc             |
| Power consumption         | 4.5 mW               | 1.25 mW              | 2.08 mW               | 440 µW                | 1.6 mW               | 3 mW                |
| FoM                       | -165                 | -164                 | -174                  | -146                  | N/A                  | -167                |

Table 5.1: Comparison between ultra-low voltage PLLs in the literature

# Chapter 6

# Peripheral Circuits for Sub-1 V Operation

Both analog and digital parts of any IC chip require a stable and clean supply voltage. More specifically, analog and RF circuits are often sensitive to supply noise and fluctuations, which places stringent requirements on supply specifications. Therefore, low-dropout (LDO) voltage regulators are often used in these applications. Operating from ultra-low supply voltages poses some extra challenges in the design of peripheral circuits such as op-amps, bandgap references, and voltage regulators.

A typical circuit that provides a stable supply voltage using an LDO voltage regulator is shown in Fig. 6.1. It consists of a PMOS pass-transistor and an error amplifier (EA) that ensure that the output voltage is a scaled value of the reference voltage. The reference voltage is usually generated from a temperature and supply insensitive circuit called bandgap reference (BGR).

In this chapter, we demonstrate the design and implementation of various building blocks for ultra-low voltage applications in a 65 nm CMOS technology. This includes op-amps, bandgap references, and voltage regulators. The proposed circuits combine several techniques to address the challenges that arise from sub-1V operation.

# 6.1 Ultra-low-Voltage Op-Amps

The op-amp is the most ubiquitous building block in analog circuits. Therefore, it is crucial to design an ultra-low-voltage op-amp that can be used as a building block in other peripheral circuits for sub-1 V operation. The continuous downscaling of the supply voltage and transistor channel length in modern CMOS technologies has drastically deteriorated the performance of CMOS op-amps. Most notably, the device intrinsic gain, output voltage swing, and input common-mode range were reduced. To increase the gain, the vertical approach of using conventional cascoding techniques is no longer usable with today's low supply voltages. Instead, moving horizontally using multi-stage cascading seems inevitable despite the challenges in the required frequency compensation schemes to maintain stability. To improve the usable range at the input and the output, several low-voltage design techniques were proposed such as bulk-biasing, self-cascoding, floating-gate transistors, and voltage shifting [89, 90]. Some low-voltage analog circuit designs based on these techniques were reported [91–102].

Figure 6.1: A conventional LDO voltage regulator

In this section, we propose an ultra-low-voltage operational-transconductance amplifier (OTA) that operates at a supply voltage as low as 0.35 V while providing acceptable performance. The low-voltage techniques deployed in the design are discussed in detail. The proposed OTA combines two different ultra-low-voltage techniques to meet design requirements at the input stage. Specifically, a pseudo-differential amplifier technique and bulk-driven MOS transistors are used to achieve a rail-to-rail input range with an ultra-low-voltage power supply. A novel biasing technique is also proposed to enhance the performance of the OTA. The proposed technique eliminates the need for extra biasing circuitry and ensures robustness against process variations under ultra-low-voltage conditions. Furthermore, the proposed technique substantially enhances the common-mode rejection and power-supply rejection of the OTA.



Figure 6.2: Rail-to-rail input stage using complementary differential pairs

#### 6.1.1 Design of Input-Stage

The design of an input-stage with a rail-to-rail input common-mode range (ICMR) is a real challenge in low-voltage circuits. A common technique to achieve rail-to-rail ICMR utilizes complementary differential pairs as shown in Fig. 6.2. Over the full input common-mode supply range of 0 to  $V_{DD}$ , at least one of the two complementary differential pairs is always on to provide sufficient transconductance. Near the middle of the supply range, both differential pairs are fully on boosting the overall transconductance level. The minimum supply voltage using this architecture is expressed as  $V_{DD_{min}} = V_{GS_n} + |V_{GS_p}| + 2|V_{D_{sat}}|$ , where  $V_{GS_{n(p)}}$  is the gate-to-source voltage of the input N(P)MOS transistor and  $V_{D_{sat}}$  is the saturation voltage of the biasing current source of each differential pair. If the supply voltage drops below  $V_{DD_{min}}$ , both differential pairs may simultaneously turn off creating a "dead-zone" for values of input common-mode near the middle of the supply range [99]. Another suggested technique to achieve input rail-to-rail operation is through the use of depletion-mode transistors. Several designs that utilize an NMOS depletion-mode differential pair as an input stage have been reported [95, 99]. Nevertheless, depletion-mode transistors are not always available in a standard CMOS technology. A bulk-driven differential pair, however, offers an alternative to a gate-driven input stage. If the supply voltage for the circuit is less than a single p-n junction diode drop, a bulk-driven differential input-stage can operate over the entire input range of the circuit without the risk of forward-biasing this diode. The p-n junction is formed at the interface of the n-well and the source region of the PMOS transistor. Despite the low transconductance obtained from bulk-driven transistors, access to the bulk terminal is often available for at least the PMOS transistors.

In this section, we discuss two low-voltage techniques, namely the pseudo differential pair and the bulk-driven transistor. The first technique helps reduce the minimum limit on the operating supply voltage  $V_{DD_{min}}$ , while the second technique achieves the desired rail-to-rail operation at the input.

#### Pseudo Differential Pair

The pseudo differential pair, shown in Fig. 6.3(a), is similar in architecture to the conventional differential pair but has the tail current source removed [103], [104]. The resulting topology enables a differential stage with a maximum transistor stack-up of two. By eliminating the  $V_{Dsat}$  constraint of the tail current source, the minimum supply voltage is reduced while the input common-mode range and the output voltage swing are enhanced. The main drawback of this circuit is the severe deterioration in common-mode and power-supply rejection. In order to ameliorate this degradation, a common-mode feedforward (CMFF) replica circuit, shown in Fig. 6.3(b), can be used to sense the input voltage level and set the voltage at the gate of the current source load [104]. The circuit common-mode rejection ratio (CMRR) and power-supply rejection ratio (PSRR) will then be given by

$$CMRR = PSRR \simeq \frac{1}{2} g_{m_2}(r_{o_1} || r_{o_2}),$$
 (6.1)

where  $g_m$  is the transconductance of the transistor and  $r_o$  is its output resistance.

#### **Bulk-Driven MOS Transistor**

Similar to a conventional gate-driven MOS transistor, the gate-source voltage of a bulk-driven transistor is fixed to a level slightly above the threshold level to create a conducting channel between the source and drain regions. However, the channel is modulated by the ac input signal applied to the bulk terminal instead of the gate terminal, as shown in Fig. 6.4(a). Thus, the minimum input voltage is not limited by the threshold voltage of the transistor. This technique circumvents the threshold voltage requirement, thereby extending the allowable operating range [90, 91]. In many ways, the operating principle of a bulk-driven transistor is very similar to that of a junction field-effect transistor (JFET) operating in its depletion mode. For illustration purposes, a cross section of the bulk-driven transistor is shown in Fig. 6.4(b).

Figure 6.3: Pseudo differential pair (a) without CMFF (b) with CMFF.

Figure 6.4: Bulk-driven PMOS transistor: (a) circuit operation (b) cross section.

Despite the elimination of the threshold requirement, the resulting structure suffers from several drawbacks compared to its gate-driven counterpart. First, the transconductance of the bulk-driven transistor  $g_{m_b}$  is considerably less than the transconductance of the gate-driven transistor, i.e.,  $g_{m_b} \simeq \eta g_m$, where  $\eta$  is between 0.2 and 0.4. Second, the transition frequency of the bulk-driven transistor is lower than that of a gate-driven transistor, which results in reduced speed and bandwidth. Third, the noise performance of the bulk-driven transistor is worse than that of a corresponding gate-driven transistor, mainly due to the lower transconductance of the resulting JFET structure. Finally, access to the bulk terminal is available in a standard N(P)-well process through the P(N)MOS transistors only. Care must be taken not to exceed the pn-junction voltage of about 0.6 V when the bulk is forward biased.

Figure 6.5: (a) Circuit schematic (b) Block diagram representation of proposed OTA

## 6.1.2 The Proposed Biasing Technique

Fig. 6.5 shows the proposed three-stage OTA without biasing and without frequency compensation. Fig. 6.5(a) shows the circuit schematic whereas Fig. 6.5(b) shows a block diagram representation of the amplifying stages. As depicted in Fig. 6.5(a), the OTA consists of three stages. The first stage uses a PMOS bulk-driven pseudo differential pair to simultaneously exploit the advantages of both techniques described in Section 6.1.1. The second stage is a common-source stage with a current-mirror load, while the third stage is a common-source stage with a current-source load. The minimum supply voltage is  $V_{DD_{min}} = |V_{GS_1}| + V_{GS_2}$ . Thus, if the transistors operate in the subthreshold region, i.e.,  $|V_{GS}| < |V_{TH}|$ , then  $V_{DD_{min}}$  can be even less than the sum of the threshold voltages of the NMOS and PMOS transistors. In this technology, the threshold voltage of the transistors used in the design is 0.3 V for both NMOS and PMOS transistors.

In Fig. 6.5, the biasing voltages  $V_{B_1}$ ,  $V_{B_2}$ , and  $V_{B_3}$  can be generated using a separate biasing circuit such as a current mirror or a voltage reference, which in this case corresponds to ac ground as shown in the equivalent ac representation in Fig. 6.6(a). An alternative biasing technique is proposed in Fig. 6.6(b), which does not require a biasing circuit to generate these node voltages. In this scheme, three short connections are applied. The first is from the output of the CMFF circuit  $v_{oFF}$  to  $v_{B_1}$ , the second is between  $v_{B_1}$  and  $v_{B_2}$ , and the third is between  $v_{out_1}$  and  $v_{B_3}$ .

The proposed biasing is applied to the OTA in Fig. 6.5(a). The implementation of the proposed biasing technique is shown in Fig. 6.7. The proposed OTA uses the CMFF circuit of the first stage to bias the gates of transistors  $M_1$  and  $M_5$,  and uses  $v_{out_1}$  to bias the gate of transistor  $M_8$,  as indicated by the dashed lines in Fig. 6.7. Consequently, the OTA becomes self-biased without the use of any extra biasing circuitry. The proposed self-biasing technique results in a 27% reduction in power consumption compared to a circuit that uses separate biasing circuitry (assuming 1:1 current-mirror biasing and the availability of a constant reference current). In addition to saving area and power with this biasing approach, the OTA sensitivity to common-mode voltage, supply noise, and process variations is significantly reduced.

### Impact on Common-Mode Rejection

Figure 6.6: ac representation of OTA with (a) separate biasing (b) proposed biasing

Figure 6.7: Implementation of the proposed biasing technique on the three-stage OTA.

Assume that  $A_{DM_1}$  and  $A_{CM_1}$  are the differential-mode gain and common-mode gain of the first stage, and  $A_{DM_{FF}}$  and  $A_{CM_{FF}}$  are the differential-mode gain and common-mode gain of the CMFF circuit. One can write

$$v_{out_1} = A_{DM_1} v_{in_{DM}} + A_{CM_1} v_{in_{CM}}$$
(6.2)

$$v_{B_1} = A_{DM_{FF}} v_{in_{DM}} + A_{CM_{FF}} v_{in_{CM}}$$
(6.3)

where  $v_{in_{DM}} = v_{in}^+ - v_{in}^-$  and  $v_{in_{CM}} = (v_{in}^+ + v_{in}^-)/2$ .

By subtracting Eq. (6.2) from Eq. (6.3) and rearranging the result, one can express  $v_{B_1}$  as

$$v_{B_1} = v_{out_1} - (A_{DM_1} - A_{DM_{FF}})v_{in_{DM}} - (A_{CM_1} - A_{CM_{FF}})v_{in_{CM}}$$
(6.4)

If the CMFF circuit is designed such that  $A_{DM_1} \gg A_{DM_{FF}}$  and  $A_{CM_1} = A_{CM_{FF}}$ , then Eq. (6.4) is reduced to

$$v_{B_1} = v_{out_1} - A_{DM_1} v_{in_{DM}} \tag{6.5}$$

In differential mode,  $A_{DM_1}v_{in_{DM}} = v_{out_1}$,  which reduces  $v_{B_1}$  to zero. Since  $v_{B_2}$  is shorted to  $v_{B_1}$ ,  $v_{B_2}$  is also equal to zero. Therefore, the biasing nodes  $v_{B_1}$  and  $v_{B_2}$  remain at ac ground in response to a differential input. In common mode,  $v_{in_{DM}} = 0$,  which reduces Eq. (6.5) to  $v_{B_1} = v_{out_1}$ . By connecting the output of the first stage to both the inverting and non-inverting terminals of the second stage in common mode, the differential-mode gain of the second stage becomes zero and the common-mode signal is multiplied by the common-mode gain only. Further, by ensuring that the second stage is inverting in common mode, the output of the second stage will have opposite polarity to that of the first stage, and the two can be canceled out by summing  $v_{out_1}$  and  $v_{out_2}$  in the third stage.
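The two limiting cases above can be checked numerically with a short sketch. The gain values are hypothetical, chosen only to satisfy the stated design conditions ($A_{DM_1} \gg A_{DM_{FF}}$ and $A_{CM_1} = A_{CM_{FF}}$), and the first-stage output is taken in the form implied by Eq. (6.4).

```python
def first_stage_output(a_dm1, a_cm1, v_dm, v_cm):
    """First-stage output v_out1, in the form implied by Eq. (6.4)."""
    return a_dm1 * v_dm + a_cm1 * v_cm


def cmff_output(a_dmff, a_cmff, v_dm, v_cm):
    """CMFF output v_B1 of Eq. (6.3)."""
    return a_dmff * v_dm + a_cmff * v_cm


# Hypothetical gains (not from the thesis): an idealized CMFF replica with no
# differential gain and the same common-mode gain as the first stage.
A_DM1, A_CM1 = -12.0, -0.8
A_DMFF, A_CMFF = 0.0, A_CM1

# Pure differential input: v_B1 stays at ac ground.
vb1_diff = cmff_output(A_DMFF, A_CMFF, v_dm=1e-3, v_cm=0.0)

# Pure common-mode input: v_B1 follows v_out1.
vb1_cm = cmff_output(A_DMFF, A_CMFF, v_dm=0.0, v_cm=1e-3)
vout1_cm = first_stage_output(A_DM1, A_CM1, v_dm=0.0, v_cm=1e-3)

# Eq. (6.4) holds for any mix of differential and common-mode input.
v_dm, v_cm = 2e-3, 5e-3
lhs = cmff_output(A_DMFF, A_CMFF, v_dm, v_cm)
rhs = (first_stage_output(A_DM1, A_CM1, v_dm, v_cm)
       - (A_DM1 - A_DMFF) * v_dm - (A_CM1 - A_CMFF) * v_cm)

print(vb1_diff, vb1_cm, vout1_cm, lhs - rhs)
```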

Figure 6.8: Equivalent circuit for common-mode (a) separate biasing (b) self-biasing

To illustrate the enhancement in the common-mode rejection capability due to self-biasing, a comparison of the amplifier of Fig. 6.5(b) with and without self-biasing is performed. Specifically, Fig. 6.8 illustrates the common-mode half-circuit [105] equivalent representation for this amplifier with and without self-biasing. In the case of the half-circuit of the amplifier with separate biasing shown in Fig. 6.8(a), the gates of transistors  $M_1$ ,  $M_5$ , and  $M_8$  are connected to ac ground. Assuming that the current-mirror transistors  $M_4$  and  $M_6$  have the same aspect ratio, the common-mode gain  $A_{CM}$  is found to be

$$A_{CM} = \frac{v_{out}}{v_{in,CM}} \simeq \left(\frac{g_{m_{b_1}}g_{m_3}g_{m_7}}{g_{m_2}}\right) (r_{o_5}||r_{o_6})(r_{o_7}||r_{o_8})$$
(6.6)

and the corresponding CMRR is

$$CMRR = \left|\frac{A_{DM}}{A_{CM}}\right| = \frac{1}{2}g_{m_2}(r_{o_1}||r_{o_2}) \tag{6.7}$$

In the case of self-biasing circuit of Fig. 6.8(b), the gates of transistors  $M_1$  and  $M_5$  are biased by the CMFF circuit. Thus, the gates of transistors  $M_1$ ,  $M_5$ , and  $M_8$  are effectively connected to  $v_{out_1}$  in response to common-mode signals. This results in significant improvement in the CMRR partly due to the reduced output impedance of transistor  $M_1$  of the first stage. However, most of the common-mode rejection occurs in the second and the third stages. The second stage acts as a pseudo-differential amplifier where the output of the first stage  $v_{out_1}$  is a common-mode input to transistors  $M_3$  and  $M_5$ . If transistors  $M_3$  and  $M_5$  are designed to have the same aspect ratio and, thereby realize the same transconductance, the differential gain will be zero while the common-mode gain can be approximated as  $-g_{m_3}/g_{m_4}$ . In the third stage, the inverted signal  $v_{out_1}$  and the non-inverted signal  $v_{out_2}$  are summed at the output node  $v_{out}$ . The total common-mode gain can be obtained by analyzing the equivalent small-signal for the circuit in Fig. 6.8(b) and is given by

$$A_{CM} \simeq \left(\frac{g_{m_{b_1}}}{g_{m_1} + g_{m_{b_1}} + g_{m_2}}\right) \left(\frac{g_{m_4}g_{m_8} - g_{m_3}g_{m_7}}{g_{m_4}}\right) (r_{o_7}||r_{o_8}) \tag{6.8}$$

and the CMRR is now given by

$$CMRR = \frac{1}{2} \frac{(g_{m_1} + g_{m_{b_1}} + g_{m_2})g_{m_3}g_{m_4}g_{m_7}}{g_{m_4}g_{m_8} - g_{m_3}g_{m_7}} (r_{o_1}||r_{o_2})(r_{o_5}||r_{o_6})$$
(6.9)

The expressions in Eqs. (6.8) and (6.9) show that the common-mode gain can be theoretically reduced to zero and the CMRR set to infinity if the condition  $g_{m_4}g_{m_8} = g_{m_3}g_{m_7}$  is met. The improvement in the CMRR is obtained by dividing the expression in Eq.(6.9) by that in Eq.(6.7), which results in

$$\Delta CMRR = \frac{(g_{m_1} + g_{m_{b_1}} + g_{m_2})g_{m_3}g_{m_4}g_{m_7}}{g_{m_2}(g_{m_4}g_{m_8} - g_{m_3}g_{m_7})}(r_{o_5}||r_{o_6})$$
(6.10)

Figure 6.9: Improvement in CMRR due to self-biasing technique

It is interesting to evaluate the CMRR of these two amplifier configurations by using the small-signal parameters listed in Table 6.1. In the case of the self-biasing amplifier, the CMRR is expected to be 61 dB. In contrast, the expected CMRR for the amplifier with separate biasing is found to be 18 dB. This is a 43 dB improvement in CMRR using the self-biasing technique. This result is verified through Spectre simulations of these two amplifiers. Despite the large improvement in the CMRR, Eq. (6.9) reveals that the improvement comes about through a cancellation of two large quantities. Such a cancellation mechanism is known to be sensitive to transistor mismatch. Thus, a more realistic prediction of the CMRR enhancement must take into account the effect of mismatches. Running a Monte Carlo analysis with 10,000 cases, using the manufacturer's statistical models of the transistors, indicates that the average value of the expected CMRR is 44 dB, and 99.7% of the samples will have a CMRR of no less than 30 dB under mismatch conditions; that is an improvement of at least 12 dB. Fig. 6.9 shows the CMRR improvement without mismatch and under different mismatch conditions. It is reasonable to conclude that a self-biasing circuit approach will enhance the CMRR of the amplifier.
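The hand calculation can be reproduced with a short script that evaluates Eqs. (6.7) and (6.9) using the Table 6.1 entries. This is a sketch only: the variable names are illustrative, the magnitude of the cancellation term is taken before converting to dB, and because the expressions are first-order approximations the results may differ slightly from the quoted figures.

```python
import math

# Table 6.1 values (V_DD = 0.5 V), converted to A/V.
GM = {1: 735e-6, 2: 879e-6, 3: 763e-6, 4: 680e-6, 7: 1.49e-3, 8: 1.33e-3}
GDS = {1: 32.1e-6, 2: 24.0e-6, 5: 20.9e-6, 6: 29.4e-6}
GMB1 = 109e-6


def r_parallel(i, j):
    """r_oi || r_oj, computed from the output conductances g_ds."""
    return 1.0 / (GDS[i] + GDS[j])


def to_db(ratio):
    """Voltage ratio in dB (magnitude only)."""
    return 20.0 * math.log10(abs(ratio))


# Eq. (6.7): separate biasing.
cmrr_separate = 0.5 * GM[2] * r_parallel(1, 2)

# Eq. (6.9): self-biasing; the denominator is the cancellation term.
cancellation = GM[4] * GM[8] - GM[3] * GM[7]
cmrr_self = (0.5 * (GM[1] + GMB1 + GM[2]) * GM[3] * GM[4] * GM[7] / cancellation
             * r_parallel(1, 2) * r_parallel(5, 6))

cmrr_separate_db = to_db(cmrr_separate)
cmrr_self_db = to_db(cmrr_self)
print(f"separate biasing: {cmrr_separate_db:.1f} dB")
print(f"self-biasing:     {cmrr_self_db:.1f} dB")
```

Because the result depends on the difference $g_{m_4}g_{m_8} - g_{m_3}g_{m_7}$, small changes in any of these four transconductances shift the self-biased CMRR noticeably, which is the mismatch sensitivity discussed above.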

### Impact on Power-Supply Rejection

Figure 6.10: Equivalent circuit under supply noise (a) separate biasing (b) self-biasing

Figure 6.11: Small-signal model of the circuit in Fig. 6.10(b).

A similar improvement in power-supply rejection is also achieved by the self-biasing technique. To illustrate this, consider the equivalent circuit shown in Fig. 6.10 for the OTA subject to power-supply noise. The inputs to the OTA are assumed to be ac grounded in this analysis. In the case of separate biasing (Fig. 6.10(a)), the gates of transistors  $M_1$ ,  $M_5$ , and  $M_8$  are connected to ac ground. Transistor  $M_1$  acts as a common-gate amplifier to the supply noise, and the amplified noise propagates through the second and third stages, making this the dominant path for the supply noise. Assuming that the current-mirror transistors  $M_4$  and  $M_6$  have the same aspect ratio, the noise gain from the supply input to the amplifier output can be found to be

$$A_{dd} = \frac{v_{out}}{v_{dd}} \simeq -\left(\frac{g_{m_1} + g_{m_{b_1}}}{g_{m_2}}\right) g_{m_3} g_{m_7}(r_{o_5}||r_{o_6})(r_{o_7}||r_{o_8})$$
(6.11)

Consequently, the PSRR is found to be

$$PSRR = \left|\frac{A_{DM}}{A_{dd}}\right| = \frac{1}{2} \left(\frac{g_{m_{b_1}}g_{m_2}}{g_{m_1} + g_{m_{b_1}}}\right) (r_{o_1}||r_{o_2}) \tag{6.12}$$

In the case of self-biasing (Fig. 6.10(b)), the gates of transistors  $M_1$ ,  $M_5$ , and  $M_8$  are effectively connected to  $v_{out_1}$ . Fig. 6.11 shows an equivalent small-signal model of the circuit in Fig. 6.10(b). The second stage has two comparable input noise sources: one is from the supply line to  $v_{out_2}$ , and the second is from  $v_{out_1}$  to  $v_{out_2}$ . The two signals are correlated and have opposite polarities when added at node  $v_{out_2}$ . Another noise subtraction occurs in the third stage due to the opposite polarity between the signals  $v_{out_1}$  and  $v_{out_2}$ . The total supply noise gain can be obtained by analyzing the small-signal model in Fig. 6.11, and for  $g_{m_3} = g_{m_5}$  it is given by

$$A_{dd} = \left[ b.g_{m_7} - (1-a)g_{m_8} - \frac{1}{r_{o_8}} \right] (r_{o_7} || r_{o_8})$$
(6.13)

where  $a = (g_{m_1} + g_{m_{b_1}})/(g_{m_1} + g_{m_2})$  and  $b = 1 - a g_{m_3}/g_{m_4}$ .

The PSRR in this case is given by

$$PSRR = \frac{1}{2} \left( \frac{g_{m_{b_1}} g_{m_3} g_{m_7}}{b.g_{m_7} - (1-a)g_{m_8} - \frac{1}{r_{o_8}}} \right) (r_{o_1} || r_{o_2}) (r_{o_5} || r_{o_6})$$
(6.14)



Figure 6.12: Improvement in PSRR due to self-biasing technique

The improvement in PSRR is obtained by dividing the expression in Eq.(6.14) by that in Eq.(6.12), which results in

$$\Delta PSRR = \left(\frac{g_{m_1} + g_{m_{b_1}}}{g_{m_2}}\right) \left(\frac{g_{m_3}g_{m_7}}{b.g_{m_7} - (1-a)g_{m_8} - \frac{1}{r_{o_8}}}\right) (r_{o_5}||r_{o_6}) \tag{6.15}$$

The PSRR improvement can be maximized by setting the term  $[b.g_{m_7} - (1-a)g_{m_8} - 1/r_{o_8}]$  to be close to zero. By substituting the parameter values from Table 6.1 in Eq. (6.14), the expected PSRR is 49 dB. Monte Carlo simulations show that under mismatch conditions the average PSRR is 47 dB, and 99.7% of the samples will have a PSRR of no less than 33 dB; that is an improvement of at least 32 dB. Fig. 6.12 shows the PSRR improvement without mismatch and under different mismatch conditions. The design can be optimized for maximum CMRR or maximum PSRR depending on the application. In this particular design, reasonable rejection of both common-mode and supply noise is targeted.
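The substitution can be sketched numerically as follows, evaluating Eqs. (6.12) and (6.14) with the Table 6.1 entries and the definitions of $a$ and $b$ given with Eq. (6.13). The variable names are illustrative, and the magnitude of the bracketed term is taken before converting to dB.

```python
import math

# Table 6.1 values (V_DD = 0.5 V), converted to A/V.
GM = {1: 735e-6, 2: 879e-6, 3: 763e-6, 4: 680e-6, 7: 1.49e-3, 8: 1.33e-3}
GDS = {1: 32.1e-6, 2: 24.0e-6, 5: 20.9e-6, 6: 29.4e-6, 8: 57.6e-6}
GMB1 = 109e-6


def r_parallel(i, j):
    """r_oi || r_oj, computed from the output conductances g_ds."""
    return 1.0 / (GDS[i] + GDS[j])


def to_db(ratio):
    """Voltage ratio in dB (magnitude only)."""
    return 20.0 * math.log10(abs(ratio))


# Eq. (6.12): separate biasing.
psrr_separate = 0.5 * (GMB1 * GM[2] / (GM[1] + GMB1)) * r_parallel(1, 2)

# Coefficients a and b as defined with Eq. (6.13).
a = (GM[1] + GMB1) / (GM[1] + GM[2])
b = 1.0 - a * GM[3] / GM[4]

# Bracketed term of Eq. (6.13); 1/r_o8 equals g_ds of M8.
bracket = b * GM[7] - (1.0 - a) * GM[8] - GDS[8]

# Eq. (6.14): self-biasing.
psrr_self = (0.5 * (GMB1 * GM[3] * GM[7] / bracket)
             * r_parallel(1, 2) * r_parallel(5, 6))

psrr_separate_db = to_db(psrr_separate)
psrr_self_db = to_db(psrr_self)
print(f"separate biasing: {psrr_separate_db:.1f} dB")
print(f"self-biasing:     {psrr_self_db:.1f} dB")
```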

#### Impact of Manufacturing Process Variations

In addition to enhancing CMRR and PSRR, the deployed technique also reduces the circuit sensitivity to process variations. To illustrate this, a Spectre simulation was performed on the two amplifiers with different biasing, whereby the widths of the input transistors were varied between  $W_1/2$  and  $2W_1$ . Both the DC gain in dB and the output DC voltage were tracked. The results are shown in Fig. 6.13, where the left vertical axis lists the DC gain in dB and the right vertical axis lists the DC voltage in V. As is evident from the DC gain results, the self-biasing amplifier experiences only a  $\pm 2.2$  dB change with respect to changes in transistor widths, whereas the separate biasing approach experiences catastrophic gain changes. Furthermore, the self-biasing amplifier output experiences a minor  $\pm 80$  mV change with respect to its desired operating point of  $V_{DD}/2$ , whereas the amplifier with separate biasing sees the output move over the full range of the supply voltage. Therefore, the self-biasing CMFF circuit ensures that the effect of this variation is significantly reduced.

To illustrate the robustness of the proposed OTA to process variations in general, Monte Carlo analysis was run for 10,000 samples while the output dc voltage and dc gain of the OTA were observed. Fig. 6.14 shows the expected distribution of the two quantities. The output dc voltage has a normal distribution with an average value near mid-supply and a standard deviation of less than 7 mV, whereas the dc gain has a normal distribution with an average of 46.7 dB and a standard deviation of only 0.4 dB. The extra robustness introduced by the self-biasing technique is of paramount importance in low-voltage design, where the supply voltage level and the threshold voltage of the transistors are comparable and the risk of circuit non-functionality is normally high.

### 6.1.3 Frequency Compensation

Figure 6.13: DC gain and output DC voltage versus  $W_1$  for self-biasing and separate biasing

Figure 6.14: Output DC voltage and DC gain under process variations

The proposed OTA uses a frequency compensation network based on the damping factor control (DFC) compensation scheme proposed in [106]. The implementation of the DFC compensation in the proposed OTA is shown in Fig. 6.15. The compensation network is composed of two nested Miller capacitors ( $C_{C_1}$  and  $C_{C_2}$ ) along with a DFC stage  $G_{m_C}$ . The second stage is non-inverting, while the first stage and the third stage are inverting. The third stage needs to be inverting to ensure negative feedback in the inner loops. For that reason, the second stage of the OTA in Fig. 6.5 was loaded with a current mirror, whereas the third stage was loaded with a current source. The DFC compensation significantly enhances the bandwidth compared to the conventional nested Miller technique and requires smaller compensation capacitors. Simulations of both schemes show an enhancement of 3.5 times in unity-gain frequency for the same phase margin (70°), the same power consumption (excluding the compensation stage), and the same load, in this case  $C_L = 3$  pF. The compensation capacitors needed in conventional Miller compensation are  $C_{C_1} = 550$  fF and  $C_{C_2} = 5$  pF, as opposed to only 300 fF for both  $C_{C_1}$  and  $C_{C_2}$  using DFC compensation. This also improves the large-signal behavior of the OTA since, for the same current, a smaller capacitance is charged or discharged, resulting in a faster slew rate, i.e.,  $SR = I/C_C$ . In addition, the need for large transconductance in the third stage (and hence large current) to drive large capacitive loads is obviated compared to conventional Miller compensation [107]. A stack-up of only two transistors can be used to implement the transconductance block  $G_{m_C}$,  which makes the DFC scheme applicable to low-voltage applications.

Figure 6.15: A block diagram of the proposed OTA with frequency compensation.

### 6.1.4 OTA Design and Analysis

The full schematic of the proposed OTA with self-biasing and frequency compensation is shown in Fig. 6.16. The DFC block  $G_{m_C}$  is implemented using transistors  $M_{C_1}$  and  $M_{C_2}$  where the biasing of the gate of transistor  $M_{C_2}$  is provided by the CMFF circuit. Therefore, the circuit is self-biased and the need for extra biasing circuitry is eliminated.



Figure 6.16: Full schematic of the proposed OTA with self-biasing and compensation



Figure 6.17: Small-signal equivalent model of the OTA

The channel length of the transistors in the three amplifying stages is set to about 2.5 times minimum length to provide sufficient intrinsic gain. The transistor widths are set to provide the current level required to achieve the desired specifications.

### Voltage Gain and Bandwidth

The differential-mode dc gain  $A_{DM}$  of the proposed OTA can be obtained from the equivalent small-signal model in Fig. 6.17, and is given by

$$A_{DM} = \frac{v_{out}}{v_{in}} \simeq \frac{1}{2} g_{m_{b_1}} g_{m_3} g_{m_7}(r_{o_1} || r_{o_2})(r_{o_5} || r_{o_6})(r_{o_7} || r_{o_8})$$
(6.16)

The poles and zeros of the circuit can be obtained by further analyzing the small-signal model. Assuming that  $g_{m_{C_1}}(r_{o_{C_1}}||r_{o_{C_2}})$  is greater than unity,  $C_{C_1} = C_{C_2}$ ,  $C_{p_2} \ll C_{C_2}$ , and the parasitic capacitances  $(C_{p_1}, C_{p_2}, \text{ and } C_{p_3})$  are smaller than the load and compensation capacitances, one can express the dominant pole as

$$p_1 = -\frac{1}{g_{m_3}g_{m_7}(r_{o_1}||r_{o_2})(r_{o_5}||r_{o_6})(r_{o_7}||r_{o_8})C_{C_1}}$$
(6.17)

and the non-dominant poles as

$$p_{2,3} = -\frac{g_{m_{C_1}}C_L \mp \sqrt{g_{m_{C_1}}^2 C_L^2 - 4C_{p_2}C_L(g_{m_2}g_{m_3} + g_{m_{C_1}}g_{m_8})}}{2C_{p_2}C_L}$$
(6.18)

In addition, the circuit has two complex zeros which are always at higher frequencies than the two complex non-dominant poles. The effect of the high-frequency zeros can be neglected provided that  $C_{C_1} < C_L$ .

The location of the non-dominant poles can be controlled by adjusting the transconductance of the DFC circuit, i.e.,  $g_{m_{C_1}}$ , as shown by Eq. (6.18).

If proper frequency compensation is applied to push the non-dominant poles to sufficiently high frequency, the unity-gain frequency  $f_T$  of the proposed OTA can be approximated as

$$f_T = \frac{g_{m_{b_1}}}{4\pi C_{C_1}}.$$
(6.19)
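As a rough numerical check, Eqs. (6.16) and (6.19) can be evaluated with the Table 6.1 parameters and the 300 fF compensation capacitor quoted for the DFC scheme. The script below is a sketch with illustrative variable names; it evaluates the first-order expressions only and is not a substitute for circuit simulation.

```python
import math

# Table 6.1 values (V_DD = 0.5 V), converted to A/V.
GM3, GM7 = 763e-6, 1.49e-3
GMB1 = 109e-6
GDS = {1: 32.1e-6, 2: 24.0e-6, 5: 20.9e-6, 6: 29.4e-6, 7: 40.5e-6, 8: 57.6e-6}
C_C1 = 300e-15  # compensation capacitor C_C1 in farads


def r_parallel(i, j):
    """r_oi || r_oj, computed from the output conductances g_ds."""
    return 1.0 / (GDS[i] + GDS[j])


# Eq. (6.16): differential-mode dc gain.
a_dm = (0.5 * GMB1 * GM3 * GM7
        * r_parallel(1, 2) * r_parallel(5, 6) * r_parallel(7, 8))
a_dm_db = 20.0 * math.log10(a_dm)

# Eq. (6.19): unity-gain frequency.
f_t = GMB1 / (4.0 * math.pi * C_C1)

print(f"A_DM = {a_dm_db:.1f} dB, f_T = {f_t / 1e6:.1f} MHz")
```

Both expressions scale linearly with $g_{m_{b_1}}$, so the low bulk transconductance of the input device sets the gain and the bandwidth at the same time.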

### **Transistors Sizing**

All the transistors in the proposed OTA operate in the subthreshold (weak inversion) region. The drain current of an MOS transistor in the sub-threshold region [108], [109] is given by

$$I_D = I_0 \frac{W}{L} \exp\left[\frac{|V_{GS}| + (n-1)|V_{BS}|}{nV_T}\right] \left[1 - \exp\left(-\frac{|V_{DS}|}{V_T}\right)\right].$$
 (6.20)

where  $n$  is the gate-coupling constant,  $V_T$  is the thermal voltage, and  $I_0$  is a process-dependent constant.

|           | W/L              | $I_D$   | $g_m$        | $g_{ds}$  |
|-----------|------------------|---------|--------------|-----------|
| $M_1$     | 65 µm / 160 nm   | 33.1 µA | 735 µA/V \*  | 32.1 µA/V |
| $M_2$     | 64 µm / 160 nm   | 33.1 µA | 879 µA/V     | 24.0 µA/V |
| $M_3$     | 52 µm / 160 nm   | 28.7 µA | 763 µA/V     | 20.9 µA/V |
| $M_4$     | 128 µm / 160 nm  | 28.7 µA | 680 µA/V     | 29.4 µA/V |
| $M_5$     | 52 µm / 160 nm   | 28.7 µA | 763 µA/V     | 20.9 µA/V |
| $M_6$     | 128 µm / 160 nm  | 28.7 µA | 680 µA/V     | 29.4 µA/V |
| $M_7$     | 96.6 µm / 160 nm | 56.3 µA | 1.49 mA/V    | 40.5 µA/V |
| $M_8$     | 264 µm / 160 nm  | 56.3 µA | 1.33 mA/V    | 57.6 µA/V |
| $M_{C_1}$ | 46 µm / 60 nm    | 45.1 µA | 1.01 mA/V    | 126 µA/V  |
| $M_{C_2}$ | 87.4 µm / 60 nm  | 45.1 µA | 906 µA/V     | 169 µA/V  |

Table 6.1: Transistors dimensions and parameters for  $V_{DD} = 0.5$  V

\*  $g_{m_{b_1}} = 109~\mu\mathrm{A/V}$

In the proposed design there is a stack-up of only two transistors. By selecting the biasing of all the intermediate nodes to be at  $V_{DD}/2$ , all gate-driven transistors will have  $|V_{GS}| = |V_{DS}| = V_{DD}/2$  and  $|V_{BS}| = 0$ . By substituting these values in Eq. (6.20), while observing that the second bracketed term is approximately unity for transistors operating in saturation, one can solve for the aspect ratio  $\left(\frac{W}{L}\right)_k$  of transistors  $M_{2-8}$  and  $M_{C_{1,2}}$  to achieve these bias conditions and find

$$\left(\frac{W}{L}\right)_{k} = \frac{I_{D_{k}}}{I_{0}} \exp\left(-\frac{V_{DD}}{2nV_{T}}\right). \tag{6.21}$$

On the other hand, the bulk-driven transistors, i.e. $M_1$, will have $|V_{GS}| = |V_{DS}| = |V_{BS}| = V_{DD}/2$ for an input common-mode level of $V_{DD}/2$. The exponent in Eq. (6.20) then reduces to $V_{DD}/(2V_T)$, and the aspect ratio is given by

$$\left(\frac{W}{L}\right)_1 = \frac{I_{D_1}}{I_0} \exp\left(-\frac{V_{DD}}{2V_T}\right) \tag{6.22}$$

In this technology, $I_0 = 1.1 \times 10^{-11}$ A for NMOS transistors, $I_0 = 4.1 \times 10^{-12}$ A for PMOS transistors, and $n \simeq 1.1$ for both NMOS and PMOS transistors. Further details, including the circuit set-up used to extract these parameters with the circuit simulator, are given in Appendix C.
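As a rough numerical illustration of Eqs. (6.21) and (6.22), the sketch below evaluates both aspect-ratio expressions with the $I_0$ and $n$ values quoted above. The temperature of 300 K, the function names, and the pairing of each drain current with the NMOS or PMOS $I_0$ are assumptions made for illustration only; they are not taken from the design itself.

```python
import math

K_BOLTZMANN = 1.380649e-23      # J/K
Q_ELECTRON = 1.602176634e-19    # C


def thermal_voltage(temp_kelvin):
    """Thermal voltage V_T = kT/q in volts."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def aspect_ratio_gate_driven(i_d, i_0, v_dd, n, v_t):
    """Eq. (6.21): W/L for |V_GS| = |V_DS| = V_DD/2 and |V_BS| = 0."""
    return (i_d / i_0) * math.exp(-v_dd / (2.0 * n * v_t))


def aspect_ratio_bulk_driven(i_d, i_0, v_dd, v_t):
    """Eq. (6.22): W/L for |V_GS| = |V_DS| = |V_BS| = V_DD/2."""
    return (i_d / i_0) * math.exp(-v_dd / (2.0 * v_t))


V_T = thermal_voltage(300.0)  # assumed temperature of 300 K

# Assumed pairing: a 28.7 uA gate-driven device sized with the NMOS I_0.
wl_gate = aspect_ratio_gate_driven(28.7e-6, 1.1e-11, 0.5, 1.1, V_T)
# Assumed pairing: the 33.1 uA bulk-driven device sized with the PMOS I_0.
wl_bulk = aspect_ratio_bulk_driven(33.1e-6, 4.1e-12, 0.5, V_T)

print(f"gate-driven W/L ~ {wl_gate:.0f}")
print(f"bulk-driven W/L ~ {wl_bulk:.0f}")
```

Because the expressions are first-order and the device pairing is assumed, the results should be read as order-of-magnitude estimates rather than as a reproduction of the final simulated dimensions.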

The current, transconductance, output conductance, and dimensions of each transistor in the proposed design are listed in Table 6.1.



Figure 6.18: Chip microphotograph of the fabricated OTA

## 6.1.5 Measurement Results

To confirm the validity of the theoretical results, the OTA circuit with self-biasing of Fig. 6.5(b) was fabricated in a 65 nm general-purpose CMOS technology. The OTA was designed for a nominal DC gain of 47 dB and a unity-gain frequency of 28 MHz when operating from a 0.5 V supply. The transistor dimensions and operating conditions are listed in Table 6.1. The compensation capacitors $C_{C_1}$ and $C_{C_2}$ are each equal to 300 fF. The OTA drives a 3 pF load capacitance. A microphotograph of the fabricated OTA is shown in Fig. 6.18. The OTA occupies an active core area of $55~\mu\mathrm{m} \times 90~\mu\mathrm{m}$.

Fig. 6.19 shows the measured open-loop frequency-response of the OTA at  $V_{DD} = 0.5$  V. The input and output signals were acquired using active FET probes (Tektronix TAP1500 and TAP2500) and time-domain measurements were carried out using a digital oscilloscope (Tektronix DPO7254). The measured low-frequency gain is 46 dB, the unity-gain frequency is 38 MHz, and the phase margin is 57°.

Figure 6.19: Measured open-loop frequency-response at $V_{DD} = 0.5$ V

Fig. 6.20 shows the maximum output voltage swing, defined here as the peak-to-peak output voltage at which the low-frequency gain drops by 3 dB. The maximum output swing is 0.45 V. The measured low-frequency CMRR and PSRR are 35 dB and 37 dB, respectively, for a supply voltage of 0.5 V.

Furthermore, the supply voltage was swept from 0.3 to 1 V, and the low-frequency gain was observed. As shown in Fig. 6.21, the dc gain remains higher than 42 dB for a supply voltage as low as 0.35 V. This is due to the self-biasing mechanism of the circuit. The OTA consumes a total dc power  $P_{dc}$  of 182  $\mu$ W at  $V_{DD} = 0.5$  V, and 17  $\mu$ W at  $V_{DD} = 0.35$  V.



Figure 6.20: Measured DC gain in dB vs. output swing at  $V_{DD} = 0.5$  V



Figure 6.21: Measured DC gain vs. supply-voltage  $V_{DD}$ 

| Parameter        | $V_{DD} = 0.5$ V, simulated | $V_{DD} = 0.5$ V, measured | $V_{DD} = 0.35$ V, simulated | $V_{DD} = 0.35$ V, measured |
|------------------|-----------------------------|----------------------------|------------------------------|-----------------------------|
| DC gain          | 47 dB                       | 46 dB                      | 41 dB                        | 43 dB                       |
| Unity-gain freq. | 28 MHz                      | 38 MHz                     | 3.4 MHz                      | 3.6 MHz                     |
| Slew rate        | 48 V/$\mu$s                 | 43 V/$\mu$s                | 4.2 V/$\mu$s                 | 5.6 V/$\mu$s                |
| Phase margin     | $70^{\circ}$                | $57^{\circ}$               | $76^{\circ}$                 | $56^{\circ}$                |
| Output swing     | 0.37 V                      | 0.45 V                     | 0.24 V                       | 0.31 V                      |
| CMRR             | >30 dB                      | 35 dB                      | >30 dB                       | 46 dB                       |
| PSRR             | >33 dB                      | 37 dB                      | >33 dB                       | 35 dB                       |
| Power            | 146 $\mu$W                  | 182 $\mu$W                 | 13 $\mu$W                    | 17 $\mu$W                   |

Table 6.2: Comparison between simulated and measured performance of the op-amp

To verify closed-loop operation of the fabricated OTA, a unity-gain configuration set-up was used. A rail-to-rail sinusoidal input was applied. The input and output waveforms are shown in Fig. 6.22 for both $V_{DD} = 0.5$ V and $V_{DD} = 0.35$ V. To verify the operation of the OTA at common-mode levels near the supply rails, a small-signal input was applied with input common-mode voltages $V_{CM}$ of 0.05 V and 0.45 V for $V_{DD} = 0.5$ V. The input and output waveforms are shown in Fig. 6.23. The worst-case measured slew rate is 43 V/$\mu$s for $V_{DD} = 0.5$ V and 5.6 V/$\mu$s for $V_{DD} = 0.35$ V. The measured low-frequency CMRR is 35 dB for $V_{DD} = 0.5$ V and 46 dB for $V_{DD} = 0.35$ V, whereas the measured low-frequency PSRR is 37 dB for $V_{DD} = 0.5$ V and 35 dB for $V_{DD} = 0.35$ V.

Table 6.2 compares the performance of the OTA from simulations with the measured performance at $V_{DD} = 0.5$ V and at $V_{DD} = 0.35$ V. The fabricated OTA consumes slightly more power than expected from simulations, which results in an increase in the unity-gain bandwidth and a reduction in the phase margin. This deviation was not predicted by the simulations. However, because enough phase margin was ensured in the initial design, it did not affect the stability of the closed-loop OTA. The reported values of the CMRR and PSRR are the average of four sample chips. The measured values are consistent with the results predicted from simulations under mismatch due to process variations.

Figure 6.22: Rail-to-rail input/output waveforms for (a) $V_{DD} = 0.5$ V (b) $V_{DD} = 0.35$ V

Figure 6.23: Input/output for $V_{DD} = 0.5$ V with (a) $V_{CM} = 0.05$ V (b) $V_{CM} = 0.45$ V

Table 6.3 compares the performance of the OTA with other ultra-low voltage OTA designs in the literature. Only ultra-low voltage OTAs with rail-to-rail input range are considered. The proposed OTA operates from a supply voltage as low as 0.35 V while providing acceptable performance. A Figure of Merit (FoM) that is commonly used to compare OTAs based on their speed, capacitive driving capability, and power consumption [93] is defined as

$$FoM_1 = 100 \times \frac{f_T \times C_L}{I_{dc}} \tag{6.23}$$

where  $f_T$  is the unity-gain frequency,  $C_L$  is the load capacitance, and  $I_{dc}$  is the current consumption of the OTA.

Another FoM was suggested in [102] to incorporate the threshold-voltage of the technology to reflect the extent of low-voltage operation, and is defined as

$$FoM_2 = 100 \times \frac{f_T \times C_L}{I_{dc}} \times \frac{V_{TH_n} + |V_{TH_p}|}{V_{DD}} \tag{6.24}$$

where  $V_{TH_n}$  and  $V_{TH_p}$  are the threshold voltages of the NMOS and PMOS transistors that are used in the design, respectively.
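The two figures of merit can be checked numerically against the measured values of this work. The sketch below is only an illustration: it infers the supply current from the measured power as $I_{dc} = P_{dc}/V_{DD}$, and it uses a threshold-voltage sum of 0.6 V, which is an assumed illustrative value and is not stated in the text. The function names are hypothetical.

```python
def fom1(f_t_hz, c_load_f, power_w, v_dd):
    """Eq. (6.23): 100 * f_T * C_L / I_dc, in 1/V with SI inputs."""
    i_dc = power_w / v_dd  # supply current inferred from P_dc = V_DD * I_dc
    return 100.0 * f_t_hz * c_load_f / i_dc


def fom2(f_t_hz, c_load_f, power_w, v_dd, vth_sum):
    """Eq. (6.24): FoM_1 scaled by (V_THn + |V_THp|) / V_DD."""
    return fom1(f_t_hz, c_load_f, power_w, v_dd) * vth_sum / v_dd


VTH_SUM = 0.6  # assumed V_THn + |V_THp| in volts (illustrative, not from the text)

# Measured values of this work: f_T, C_L, P_dc, V_DD.
fom1_050 = fom1(38e6, 3e-12, 182e-6, 0.50)
fom1_035 = fom1(3.6e6, 3e-12, 17e-6, 0.35)
fom2_050 = fom2(38e6, 3e-12, 182e-6, 0.50, VTH_SUM)
fom2_035 = fom2(3.6e6, 3e-12, 17e-6, 0.35, VTH_SUM)

print(f"FoM_1: {fom1_050:.1f} 1/V at 0.5 V, {fom1_035:.1f} 1/V at 0.35 V")
print(f"FoM_2: {fom2_050:.1f} 1/V at 0.5 V, {fom2_035:.1f} 1/V at 0.35 V")
```

With these inputs the $FoM_1$ values come out at about 31.3 V$^{-1}$ and 22.2 V$^{-1}$, in line with the entries for this work in Table 6.3. The $FoM_2$ values depend directly on the assumed threshold-voltage sum.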

# 6.2 Ultra-low Voltage Bandgap Reference

A bandgap reference (BGR) is a crucial building block in both the analog and digital parts of an IC, providing an accurate, temperature-insensitive, and supply-independent voltage reference. BGRs are omnipresent in applications that require a high level of precision, such as power supply regulators, current sources, data converters, and digital memory.

The trend towards smaller transistors and lower supply voltages has led to new techniques in BGR design. Earlier BGRs generated an output voltage of 1.25 V, which is close to the bandgap of silicon, and the minimum supply voltage of a conventional BGR is limited by this value. To overcome this limit, several design techniques were developed to generate output voltages below 1.25 V [110]. Most BGRs fabricated in CMOS technology utilize the base-emitter voltage ($V_{BE}$) of diode-connected parasitic vertical bipolar-junction-transistors (BJTs) to generate temperature-dependent voltages or currents. The nominal value of $V_{BE}$ is around 0.7 V, which sets a lower limit on the supply voltage of BGRs that use parasitic BJTs. BGRs that operate from supply voltages near 0.7 V often eliminate the parasitic BJTs to lower this limit [111–113].

|                      | JSSC'02 [99]    | JSSC'05 [100]             | TCAS'07 [101]             | TCAS'14 [102]                | This work                               | This work                               |
|----------------------|-----------------|---------------------------|---------------------------|------------------------------|-----------------------------------------|-----------------------------------------|
| Technology           | CMOS 2.5 $\mu$m | CMOS 0.18 $\mu$m          | CMOS 0.35 $\mu$m          | CMOS 0.13 $\mu$m             | CMOS 65 nm                              | CMOS 65 nm                              |
| Power supply         | 0.90 V          | 0.50 V                    | 0.60 V                    | 0.25 V                       | 0.50 V                                  | 0.35 V                                  |
| DC gain              | 70 dB           | 52 dB                     | 69 dB                     | 60 dB                        | 46 dB                                   | 43 dB                                   |
| Unity-gain frequency | 5.6 kHz         | 1.2 MHz                   | 11.4 kHz                  | 1.88 kHz                     | 38 MHz                                  | 3.6 MHz                                 |
| Slew rate            | -               | 2.89 V/$\mu$s             | 14.6 mV/$\mu$s            | 0.64 mV/$\mu$s               | 43 V/$\mu$s                             | 5.6 V/$\mu$s                            |
| THD                  | -               | 1 %                       | 0.08 %                    | 0.2 %                        | 0.4 %                                   | 0.6 %                                   |
| Input-referred noise | -               | 280 nV/$\sqrt{\rm Hz}$    | 290 nV/$\sqrt{\rm Hz}$    | 3.3 $\mu$V/$\sqrt{\rm Hz}$   | 938 nV/$\sqrt{\rm Hz}$ $^{\dagger}$     | 926 nV/$\sqrt{\rm Hz}$ $^{\dagger}$     |
| Load capacitance     | 12 pF           | 20 pF                     | 15 pF                     | 15 pF                        | 3 pF                                    | 3 pF                                    |
| Phase margin         | $62^{\circ}$    | -                         | $65^{\circ}$              | $53^{\circ}$                 | $57^{\circ}$                            | $56^{\circ}$                            |
| Power consumption    | 0.32 $\mu$W     | 110 $\mu$W                | 550 nW                    | 18 nW                        | 182 $\mu$W                              | 17 $\mu$W                               |
| Die area             | 0.5 mm$^2$      | 0.026 mm$^2$              | 0.06 mm$^2$               | 0.083 mm$^2$                 | 0.005 mm$^2$                            | 0.005 mm$^2$                            |
| $FoM_1$              | 13.4 V$^{-1}$   | 22.7 V$^{-1}$             | 18.7 V$^{-1}$             | 39.2 V$^{-1}$                | 31.3 V$^{-1}$                           | 22.2 V$^{-1}$                           |
| $FoM_2$              | -               | 45.4 V$^{-1}$             | 44.6 V$^{-1}$             | 67.4 V$^{-1}$                | 37.6 V$^{-1}$                           | 38.1 V$^{-1}$                           |

Table 6.3: Comparison of OTA performance with reported ultra-low voltage OTAs

 $^{\dagger}$  simulated



Figure 6.24: The concept of bandgap reference

Here, we present a BGR that operates from a 0.6 V power supply in 65 nm CMOS technology. The design combines several techniques to address the challenges that arise from sub-1V operation.

## 6.2.1 BGR Fundamentals

Traditionally, a BGR generates a temperature-independent output voltage by summing two scaled voltages (or currents): one that is proportional to absolute temperature (PTAT), and another that is complementary to absolute temperature (CTAT). The $V_{BE}$ of a single BJT is often used as the CTAT voltage, while the PTAT voltage is often generated from the difference $\Delta V_{BE}$ between the base-emitter voltages of two BJTs with different emitter areas, where $\Delta V_{BE} \propto V_T = kT/q$. The concept is depicted in Fig. 6.24.

A conventional BGR based on the concept illustrated in Fig. 6.24 generates a reference voltage $V_{REF} = V_{BE} + \beta V_T$, where $\beta$ is a scaling factor. At room temperature, the PTAT temperature coefficient is $\partial V_T / \partial T \simeq +0.087 \text{ mV/}^\circ\text{C}$, while the CTAT temperature coefficient is $\partial V_{BE} / \partial T \simeq -1.5 \text{ mV/}^\circ\text{C}$. Therefore, to obtain a zero temperature coefficient at room temperature, i.e. $\partial V_{REF} / \partial T \simeq 0$, $\beta$ must be set to 17.2. This results in $V_{REF} = V_{BE} + 17.2 \times V_T \simeq 1.25$ V, which is the minimum reference voltage obtained from this BGR [105]. Clearly, this topology is not suitable for sub-1 V CMOS technologies.
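The value of $\beta$ follows directly from setting the sum of the two temperature coefficients to zero. The short sketch below reproduces that arithmetic with the coefficients quoted above; the function name is hypothetical and the calculation is first-order only.

```python
def ptat_scaling_factor(dvbe_dt, dvt_dt):
    """beta that nulls d(V_BE + beta * V_T)/dT to first order."""
    return -dvbe_dt / dvt_dt


DVBE_DT = -1.5e-3    # CTAT coefficient of V_BE in V/degC (from the text)
DVT_DT = 0.087e-3    # PTAT coefficient of V_T in V/degC (from the text)

beta = ptat_scaling_factor(DVBE_DT, DVT_DT)
residual_tc = DVBE_DT + beta * DVT_DT  # should vanish by construction

print(f"beta ~ {beta:.1f}")
print(f"residual TC = {residual_tc:.2e} V/degC")
```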

An alternative way to circumvent the silicon bandgap limitation is to sum temperature-dependent currents instead of voltages. Fig. 6.25 shows a BGR that was proposed by Banba *et al.* [114] to realize current-mode summation using resistive subdivision. The CTAT current $I_2$ and the PTAT current $I_3$ are combined in transistor $M_2$ and mirrored to transistor $M_3$, where the temperature-independent current is converted to a voltage through the resistor $R_4$. The resulting voltage reference is given by

$$V_{REF} = R_4 \left(\frac{V_{BE}}{R_2} + \frac{\Delta V_{BE}}{R_3}\right) \tag{6.25}$$

where the values of resistors $R_2$ and $R_3$ can be chosen to nullify the temperature dependence around a certain temperature, whereas $R_4$ is chosen to scale the voltage to the desired level. Due to this added degree of freedom, this topology allows reference voltages below the limit set by the silicon bandgap, i.e. 1.25 V. However, this design suffers from several shortcomings as the supply voltage scales down. From Fig. 6.25, the minimum supply voltage is

$$V_{DD_{min}} = V_{BE_1} + V_{D_{sat}} \tag{6.26}$$

where $V_{D_{sat}}$ is the drain-source voltage needed to keep the transistor in saturation. For $V_{BE_1} \simeq 0.7$ V and $V_{D_{sat}} \simeq 0.1$ V, $V_{DD_{min}} \simeq 0.8$ V, which is the minimum supply voltage for this topology.

## 6.2.2 The Proposed BGR

In order to allow BGR realization with a lower supply voltage, BJTs should be eliminated to avoid the $V_{BE}$ drop. One possible solution is to replace the BJT with a diode-connected MOS transistor that behaves like a diode and provides a negative temperature coefficient. BGRs that use MOS-only implementations can operate from a supply voltage as low as 0.6 V [111–113]. Although removing the BJTs allows supply voltages below 0.8 V in principle, there are various challenges in the realization of such BGRs. In addition to increased sensitivity to supply and process variations, the design of the BGR op-amp becomes a great challenge. Providing high gain to regulate the loop, achieving a wide input common-mode range, and the need for a start-up circuit are all important considerations when designing a BGR op-amp at such low supply voltages.

Figure 6.25: Low voltage BGR proposed by Banba et al

Figure 6.26: The Proposed BGR

The schematic of the proposed BGR circuit is shown in Fig. 6.26. Diode-connected BJTs in a conventional BGR are replaced with diode-connected NMOS transistors that operate in the subthreshold region, where the gate-to-source voltage $V_{GS}$ is around the NMOS threshold voltage $V_{TH_n}$. The saturation current of a MOS transistor in the subthreshold region [108] is given by

$$I_D = I_0 \frac{W}{L} \exp\left(\frac{V_{GS}}{nV_T}\right) \tag{6.27}$$

where

$$I_0 = 2\mu_n C_{ox} V_T^2 n \exp\left(-\frac{V_{TH_n}}{nV_T}\right) , \tag{6.28}$$

 $\mu_n$  is the electron mobility,  $C_{ox}$  is the oxide capacitance, and  $n \simeq 1.1$  is the gate-coupling constant.

For  $V_S = 0$ , we can rewrite the gate voltage

$$V_G = nV_T \ln\left(\frac{I_D/I_0}{W/L}\right) . \tag{6.29}$$

The threshold voltage of an NMOS transistor has a negative temperature coefficient. Its temperature dependence can be linearized around a temperature $T_0$ such that

$$V_{TH_n}(T) = V_{TH_n}(T_0) + \frac{\partial V_{TH_n}}{\partial T}(T - T_0) \tag{6.30}$$

where  $V_{TH_n}(T_0)$  is the threshold voltage of the NMOS transistor at  $T = T_0$ , and  $\partial V_{TH_n}/\partial T$  is the temperature coefficient of the threshold voltage at  $T = T_0$ .

Since the gate of transistor  $M_{N_1}$  is biased near the threshold voltage and the bias current is very low, we can write the temperature-dependent gate-voltage  $V_{G_1}$  as

$$V_{G_1}(T) \simeq V_{TH_n}(T) = V_{TH_n}(T_0) + \frac{\partial V_{TH_n}}{\partial T}(T - T_0) , \tag{6.31}$$

and its temperature coefficient is

$$\frac{\partial V_{G_1}(T)}{\partial T} \simeq \frac{\partial V_{TH_n}}{\partial T} . \tag{6.32}$$

On the other hand, the difference between the gate voltage of  $M_{N_1}$  and that of  $M_{N_2}$ is given by

$$\Delta V_G = n V_T \ln \left[ \frac{I_{D_1}}{I_{D_2}} \frac{(W/L)_2}{(W/L)_1} \right] , \tag{6.33}$$

and its temperature coefficient is

$$\frac{\partial \Delta V_G(T)}{\partial T} \simeq n \frac{k}{q} \ln \left[ \frac{I_{D_1}}{I_{D_2}} \frac{(W/L)_2}{(W/L)_1} \right] . \tag{6.34}$$



Figure 6.27: The low-voltage BGR proposed by Ytterdal

The PTAT current generated by  $\Delta V_G$  is added to the CTAT current generated by  $V_{G_1}$ , and resistor values  $R_2$  and  $R_3$  are used to scale the two quantities. The reference voltage at the output of the BGR is

$$V_{REF} = R_4 \left( \frac{V_{G_1}}{R_2} + \frac{\Delta V_G}{R_3} \right) . \tag{6.35}$$

To eliminate temperature dependence around  $T = T_0$ , the condition

$$\frac{1}{R_2}\frac{\partial V_{G_1}(T)}{\partial T} + \frac{1}{R_3}\frac{\partial \Delta V_G(T)}{\partial T} = 0 \tag{6.36}$$

must be satisfied, which requires

$$\frac{R_3}{R_2} = -\frac{k}{q} \frac{n}{\partial V_{TH_n} / \partial T} \ln \left[ \frac{I_{D_1}}{I_{D_2}} \frac{(W/L)_2}{(W/L)_1} \right]. \tag{6.37}$$
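To make the resistor-ratio condition concrete, the sketch below evaluates Eq. (6.37) and then checks that the result satisfies the zero-temperature-coefficient condition of Eq. (6.36). All numerical inputs other than $n \simeq 1.1$ are hypothetical placeholders chosen for illustration (a threshold-voltage temperature coefficient of $-1$ mV/K, equal drain currents, and a size ratio of 8); they are not the values used in the actual design.

```python
import math

K_OVER_Q = 1.380649e-23 / 1.602176634e-19  # Boltzmann constant over charge, V/K


def resistor_ratio(n, dvth_dt, current_ratio, size_ratio):
    """Eq. (6.37): R3/R2 that nulls the first-order TC of V_REF.

    current_ratio is I_D1/I_D2 and size_ratio is (W/L)_2/(W/L)_1.
    """
    return -K_OVER_Q * n / dvth_dt * math.log(current_ratio * size_ratio)


# Hypothetical illustrative inputs (not design values).
N = 1.1              # gate-coupling constant quoted in the text
DVTH_DT = -1.0e-3    # assumed threshold-voltage TC in V/K
CURRENT_RATIO = 1.0  # assumed I_D1/I_D2
SIZE_RATIO = 8.0     # assumed (W/L)_2/(W/L)_1

ratio = resistor_ratio(N, DVTH_DT, CURRENT_RATIO, SIZE_RATIO)

# Check against Eq. (6.36) with R2 normalized to 1 and R3 = ratio.
ptat_tc = N * K_OVER_Q * math.log(CURRENT_RATIO * SIZE_RATIO)  # Eq. (6.34)
residual = DVTH_DT / 1.0 + ptat_tc / ratio

print(f"R3/R2 ~ {ratio:.3f}, residual TC term = {residual:.2e}")
```

Since the CTAT slope is negative, the leading minus sign in Eq. (6.37) yields a positive resistor ratio whenever the logarithm is positive, i.e. when the product of the current and size ratios exceeds one.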

## 6.2.3 The Proposed BGR

A similar circuit that uses diode-connected MOS transistors was proposed by Ytterdal in [113], with simulation results to verify its functionality at low supply voltages. The circuit is shown in Fig. 6.27. However, the circuit in [113] suffers from several shortcomings that may affect the robustness of the design. The threshold voltage of the PMOS transistors of the BGR and the op-amp in [113] was reduced by pulling a constant biasing current out of the bulk of the transistor. This technique is called Current-Driven Bulk (CDB) [115]. The bias current in the circuit proposed in [113] is generated from a diode-connected transistor, which makes it very sensitive to process and supply variations. Furthermore, the op-amp suggested in [113] for ultra-low voltage operation uses low-$V_{TH}$ input NMOS transistors to increase the input common-mode range of the op-amp and generate enough gain at low input voltage. However, unless native (or depletion) transistors are used, the gain drops when the input voltage is near 0 V, and a start-up circuit is inevitable to ensure operation at the correct bias point. Native, depletion, and sometimes low-$V_{TH}$ transistors are not always available in standard CMOS technologies. In addition, the op-amp suggested in [113] is biased using a simple current source that is derived directly from the raw supply voltage, whereas most BGR op-amps use a bias current derived from the BGR itself to ensure reliability [116].

|           | $W/L$                               |
|-----------|-------------------------------------|
| $M_1$     | $20~\mu\mathrm{m}/800~\mathrm{nm}$  |
| $M_2$     | $14~\mu\mathrm{m}/800~\mathrm{nm}$  |
| $M_3$     | $20~\mu\mathrm{m}/800~\mathrm{nm}$  |
| $M_4$     | $12~\mu\mathrm{m}/800~\mathrm{nm}$  |
| $M_5$     | $20~\mu\mathrm{m}/800~\mathrm{nm}$  |
| $M_6$     | $12~\mu\mathrm{m}/800~\mathrm{nm}$  |
| $M_7$     | $22~\mu\mathrm{m}/800~\mathrm{nm}$  |
| $M_8$     | $120~\mu\mathrm{m}/800~\mathrm{nm}$ |
| $M_{C_1}$ | $10~\mu\mathrm{m}/500~\mathrm{nm}$  |
| $M_{C_2}$ | $50~\mu\mathrm{m}/500~\mathrm{nm}$  |

Table 6.4: Transistor dimensions of the OTA used in the BGR design

In the proposed design shown in Fig. 6.26, the bulk of each of the PMOS transistors $M_{P_1}$, $M_{P_2}$, and $M_{P_3}$ is shorted to its gate to lower the threshold voltage. A MOS transistor that uses this technique is called a Dynamic-Threshold MOS (DTMOS) transistor [117]. Using DTMOS transistors provides a robust way to bias the bulk of the PMOS transistor without the need for an extra biasing circuit, and allows the PMOS transistor to turn on at a very low voltage.

The OTA used in the design of the proposed BGR is based on the architecture proposed in Section 6.1, with different transistor dimensions to reduce power consumption and ensure proper operation. The dimensions of the OTA used in the proposed BGR design are shown in Table 6.4.

Figure 6.28: Simulated reference voltage versus temperature

## 6.2.4 Simulation Results

Fig. 6.28 shows the simulated reference voltage over the temperature range from -40°C to 100°C. The average temperature coefficient is 19 $\mu$V/°C, or 64 ppm/°C. Fig. 6.29 shows the simulated reference voltage versus supply voltage. The proposed BGR provides a stable reference voltage with a supply voltage as low as 0.45 V. The power consumption is 66 $\mu$W at $V_{DD} = 0.6$ V, and the PSRR at low frequency is 54 dB. The PSRR frequency response is shown in Fig. 6.30. Table 6.5 shows a performance summary of BGR designs in the literature that are capable of operating from a supply voltage near or lower than $V_{BE}$.
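The relation between the two ways of quoting the temperature coefficient is a simple normalization by the reference voltage. The sketch below applies it to the rounded figures of this work and to the slope reported for [112] in Table 6.5; the function name is hypothetical, and the 0.300 V and 0.530 V reference voltages are the rounded values from that table.

```python
def tc_ppm_per_degc(slope_v_per_degc, v_ref):
    """Convert an absolute TC in V/degC to a relative TC in ppm/degC."""
    return slope_v_per_degc / v_ref * 1e6


tc_this_work = tc_ppm_per_degc(19e-6, 0.300)   # 19 uV/degC on a 0.300 V reference
tc_ref_112 = tc_ppm_per_degc(0.02e-3, 0.530)   # 0.02 mV/degC on a 0.530 V reference

print(f"this work: {tc_this_work:.1f} ppm/degC")
print(f"[112]:     {tc_ref_112:.1f} ppm/degC")
```

With these rounded inputs the conversion gives roughly 63 ppm/°C and 38 ppm/°C, which agrees with the quoted 64 ppm/°C and 38 ppm/°C to within the rounding of the inputs.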

# 6.3 LDO Voltage Regulator

Finally, we demonstrate the design and implementation of a complete LDO regulator for ultra-low voltage ultra-low power applications with supply voltage as low as 0.65 V in a 65 nm CMOS technology. The design combines several techniques to address the challenges that arise from sub-1V operation. We also discuss the fundamentals of LDO voltage regulators as well as the circuit design theory and details of the designed circuit.



Figure 6.29: Simulated reference voltage versus supply voltage



Figure 6.30: Simulated PSRR of the BGR

## 6.3.1 Fundamentals of LDO voltage regulators

| Reference      | CMOS technology | $V_{DD}$ nom. (V) | $V_{DD}$ min. (V) | $V_{REF}$ (V) | Temperature range ($^{\circ}$C) | TC (ppm/$^{\circ}$C) | Power ($\mu$W) |
|----------------|-----------------|-------------------|-------------------|---------------|---------------------------------|----------------------|----------------|
| [111]          | 0.6 $\mu$m      | 0.8               | 0.8               | 0.356         | -40 to 120                      | 18                   | 2              |
| [112]          | SOI             | 1.0               | 0.6               | 0.530         | 25 to 80                        | 38 $^1$              | 100            |
| [113] $^2$     | 0.13 $\mu$m     | 0.6               | 0.55              | 0.400         | -40 to 100                      | 93                   | -              |
| This work $^3$ | 65 nm           | 0.6               | 0.45              | 0.300         | -40 to 100                      | 64                   | 66             |

Table 6.5: Performance comparison between sub-1V BGR designs in the literature

$^1$ calculated from reported 0.02 mV/°C over specified temperature range

$^{2,3}$ based on simulation results

A conventional LDO voltage regulator similar to the one shown in Fig. 6.1 is often designed to have two dominant poles and one dominant LHP zero that is sometimes exploited to enhance stability [118]. Therefore, the open-loop gain of the system can be given as

$$A_{ol}(s) = A_{EA} A_{M_P} \frac{(1+s/z_1)}{(1+s/p_1)(1+s/p_2)} \tag{6.38}$$

where  $A_{EA}$  is the gain of the error amplifier, and  $A_{M_P}$  is the gain of the pass-transistor which is given by  $A_{M_P} = g_{m_p} \left[ R_L || r_{o_p} || (R_1 + R_2) \right] \simeq g_{m_p} R_L$  where  $R_L$  represents the load resistance which is directly related to the load current  $I_L$  by  $R_L = V_{OUT}/I_L$ .

The first dominant pole  $p_1$  controlled by the compensation capacitance  $C_C$  is given by

$$p_1 = -\frac{1}{[R_L||r_{o_p}||(R_1 + R_2)]C_C} \simeq -\frac{1}{R_L C_C} \tag{6.39}$$

The second pole $p_2$ is often the dominant pole of the error amplifier. Other non-dominant poles in the system include the pole caused by the load capacitance $C_L$, given by $p_3 \simeq -1/(R_{ESR}C_L)$, the pole caused by the input capacitance at the non-inverting terminal of the error amplifier, and the non-dominant poles of the error amplifier. Therefore, the system can become unstable if the non-dominant poles are not placed far beyond the unity-gain frequency $f_U$. The LHP zero $z_1$, which is equal to $-1/(R_{ESR}C_C)$, can be utilized to remedy the situation by adding positive phase shift to the response and enhancing the phase margin. This usually comes at the expense of ripples in the transient response of the regulator when the load current changes.

A closer look at Eq. (6.38) reveals that $R_L$, $A_{M_P}$, and $p_1$ all depend on the load current, and changes in their values cause the open-loop frequency response to vary. Since the load current can range between zero in the no-load condition and the maximum current $I_{L_{max}}$, it is imperative to ensure that the system is stable over the entire range. The first pole is directly proportional to the load current since $p_1 \simeq -1/(R_L C_C) = -I_L/(V_{OUT}C_C)$, while $A_{M_P}$ is proportional to $1/\sqrt{I_L}$ since $g_{m_P} \propto \sqrt{I_L}$ and $R_L \propto 1/I_L$. Thus, the unity-gain frequency, approximated by $f_U \simeq A_{M_P} p_1$, increases as the load current increases, i.e. $f_U \propto \sqrt{I_L}$, which causes a reduction in the phase margin and potential instability. The LDO regulator must be stable under the worst-case condition, that is, at the maximum load current. A depiction of the open-loop frequency response of an LDO regulator under minimum and maximum load current conditions is shown in Fig. 6.31.
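The load-current scaling described above can be illustrated numerically. The sketch below is a first-order model only: it uses $R_L = V_{OUT}/I_L$, the approximation $f_U \simeq A_{M_P} p_1$ from the text, and a square-law transconductance $g_{m_P} = \kappa\sqrt{I_L}$ whose coefficient $\kappa$ is a hypothetical placeholder. The output voltage and compensation capacitance are taken from the design values quoted in Section 6.3.2, and a nonzero minimum load of 1 mA is assumed to avoid the degenerate no-load case.

```python
import math


def dominant_pole_hz(i_load, v_out, c_c):
    """|p_1| / (2*pi) from Eq. (6.39) with R_L = V_OUT / I_L."""
    return i_load / (2.0 * math.pi * v_out * c_c)


def pass_gain(i_load, v_out, gm_coeff):
    """A_MP ~ g_mp * R_L, assuming g_mp = gm_coeff * sqrt(I_L)."""
    return gm_coeff * math.sqrt(i_load) * (v_out / i_load)


def unity_gain_hz(i_load, v_out, c_c, gm_coeff):
    """f_U ~ A_MP * |p_1|, the approximation used in the text."""
    return pass_gain(i_load, v_out, gm_coeff) * dominant_pole_hz(i_load, v_out, c_c)


V_OUT = 0.55      # V, regulator output
C_C = 5e-6        # F, off-chip compensation capacitor
GM_COEFF = 0.5    # A^0.5/V, hypothetical square-law coefficient
I_MIN, I_MAX = 1e-3, 5e-3  # A, assumed minimum and maximum load

gain_ratio = pass_gain(I_MAX, V_OUT, GM_COEFF) / pass_gain(I_MIN, V_OUT, GM_COEFF)
pole_ratio = dominant_pole_hz(I_MAX, V_OUT, C_C) / dominant_pole_hz(I_MIN, V_OUT, C_C)
fu_ratio = unity_gain_hz(I_MAX, V_OUT, C_C, GM_COEFF) / unity_gain_hz(I_MIN, V_OUT, C_C, GM_COEFF)

print(f"A_MP ratio: {gain_ratio:.3f}, p_1 ratio: {pole_ratio:.3f}, f_U ratio: {fu_ratio:.3f}")
```

The ratios are independent of the assumed coefficient: a fivefold increase in load current lowers the pass-transistor gain by $\sqrt{5}$, raises the dominant pole by a factor of 5, and therefore raises the unity-gain frequency by $\sqrt{5}$.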

Figure 6.31: Frequency response of LDO regulator for minimum and maximum $I_L$

Figure 6.32: Simulated $V_{OUT}$ vs. $V_{DD}$

An LDO regulator filters out power supply noise and fluctuations, effectively shielding the load from supply perturbations. The regulator should achieve this over a wide range of variations in supply voltage, load current, and temperature. It should maintain its performance using minimum quiescent current $(I_Q)$ and minimum drop-out voltage $(V_{IN}-V_{OUT})$ to enhance power efficiency. The regulator should also have a high power-supply rejection ratio (PSRR) to ensure sufficient supply noise isolation. An LDO regulator that has good power supply shielding should have low *line regulation* and high PSRR. Line regulation is defined as the ratio of the change in the output voltage $\Delta V_{OUT}$ to the change in the supply voltage $\Delta V_{IN}$ at a specific load current, and is given by

$$\frac{\Delta V_{OUT}}{\Delta V_{IN}} \simeq \frac{g_{m_p} r_{o_p}}{A\beta} + \frac{1}{\beta} \left( \frac{\Delta V_{REF}}{\Delta V_{IN}} \right) \tag{6.40}$$

where $\beta = R_2/(R_1 + R_2)$, and $\Delta V_{REF}/\Delta V_{IN}$ is the supply sensitivity of the BGR circuit that generates $V_{REF}$. Line regulation is evaluated at dc by plotting $V_{OUT}$ versus swept values of $V_{IN}$.

PSRR is the voltage gain from $V_{IN}$ to $V_{OUT}$, and is evaluated in ac by plotting the ac gain $\Delta v_{OUT}/\Delta v_{IN}$ over the frequency range of interest [119]. At low frequency, the PSRR of a regulator is given by

$$PSRR = -\frac{\frac{r_{o_p}||(R_1 + R_2)}{A_{ol}\beta}}{r_{o_p} + \frac{r_{o_p}||(R_1 + R_2)}{A_{ol}\beta}} \simeq -\frac{1}{A_{ol}\beta} . \tag{6.41}$$

Output voltage insensitivity to variations in the load is described using *load regulation*. Load regulation is defined as the ratio of the change in the output voltage $\Delta V_{OUT}$ to the change in the load current $\Delta I_L$, and is given by

$$\frac{\Delta V_{OUT}}{\Delta I_L} = -\frac{r_{o_p}}{1+A\beta} \ . \tag{6.42}$$

Another important metric for evaluating the regulator's insensitivity to load variations is the transient response, which is measured by applying a load-current step from $I_{L_{min}}$ to $I_{L_{max}}$ and vice versa. The corresponding variations in the output voltage, such as rise/fall time, settling time, and overshoot, are then measured.

Power efficiency of an LDO regulator is defined as

$$\frac{P_{OUT}}{P_{IN}} = \frac{V_{OUT}}{V_{IN}} \frac{I_{OUT}}{I_{IN}} = \frac{V_{OUT}}{V_{IN}} \frac{I_L}{I_L + I_Q} \tag{6.43}$$

where  $I_Q$  is the quiescent current, which is the current drawn from the supply by the error amplifier and the feedback resistors when the load current is zero. Eq. (6.43) signifies the importance of reducing the drop-out voltage and the quiescent current to improve the regulator efficiency.

## 6.3.2 Simulation Results

A conventional LDO voltage regulator similar to the one shown in Fig. 6.1 was designed in 65 nm CMOS technology to provide a maximum load current of 5 mA. The width of the pass-transistor $M_P$ is set to 2.7 mm and its channel length to 540 nm. The bulk of the pass-transistor $M_P$ is shorted to its gate to help lower the threshold voltage and reduce the transistor size. The resistors $R_1$ and $R_2$ are set to 41 k$\Omega$ and 49 k$\Omega$, respectively. The resistor ratio is chosen to yield 0.55 V at the output from a voltage reference of 0.3 V. The LDO regulator operates from a minimum supply voltage of 0.6 V. The compensation capacitor $C_C$ is an off-chip capacitor and is set to 5 $\mu$F. The error amplifier and the BGR circuits are similar to those discussed in Section 6.1 and Section 6.2.
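The quoted output level can be checked against the resistor values with the standard feedback-divider relation. The sketch below assumes that $R_1$ is the upper resistor (between the output and the feedback node) and $R_2$ the lower one; this orientation is an assumption about Fig. 6.1, chosen because it reproduces the stated 0.55 V. The function name is hypothetical.

```python
def ldo_output_voltage(v_ref, r_top, r_bottom):
    """V_OUT = V_REF * (1 + R_top / R_bottom) for an ideal high-gain loop."""
    return v_ref * (1.0 + r_top / r_bottom)


V_REF = 0.3   # V, reference voltage from the BGR
R1 = 41e3     # ohm, assumed upper resistor
R2 = 49e3     # ohm, assumed lower resistor

v_out = ldo_output_voltage(V_REF, R1, R2)
feedback_factor = R2 / (R1 + R2)      # fraction of V_OUT fed back
divider_current = v_out / (R1 + R2)   # current drawn by the divider

print(f"V_OUT ~ {v_out:.3f} V, feedback factor ~ {feedback_factor:.3f}")
print(f"divider current ~ {divider_current * 1e6:.1f} uA")
```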

Fig. 6.32 shows the simulated output voltage versus the input voltage (excluding the BGR) at $I_L = I_{L_{max}} = 5$ mA, which results in a line regulation of 36 mV/V. Fig. 6.33 shows the simulated output voltage as the load current varies between 1 mA and 5 mA, which results in a load regulation of 0.08 mV/mA. The simulated low-frequency PSRR is 47 dB. The PSRR frequency response of the LDO regulator is shown in Fig. 6.34. Fig. 6.35 shows the simulated transient response to a load current changing between 0 mA and 5 mA. The total quiescent power consumption of the voltage regulator is 116 $\mu$W, which corresponds to a power efficiency of nearly 90 %.

Figure 6.34: Simulated PSRR of the LDO voltage regulator

Figure 6.35: Simulated transient response of the regulator

#### 6.4 Summary

In this chapter, the design of the peripheral circuits needed for ultra-low-voltage operation in the sub-1 V range was demonstrated. An OTA for ultra-low-voltage operation was proposed to tackle the challenges of low-voltage operation in modern CMOS technologies. The input stage of the proposed OTA utilizes two low-voltage techniques, namely the pseudo-differential pair and the bulk-driven MOS transistor. By combining the two techniques, the proposed OTA simultaneously allows both minimum supply voltage operation and rail-to-rail input common-mode range. The proposed OTA deploys a self-biasing technique that significantly enhances the common-mode and power-supply rejection and also ensures robustness under expected levels of process variations. The proposed self-biasing technique eliminates the need for extra biasing circuitry, saving area and power consumption. The enhanced insensitivity to process variations introduced by the self-biasing technique helps increase the yield for low-voltage designs, where the risk of circuit non-functionality is usually high. To verify the theoretical findings, a three-stage OTA for low-voltage applications was designed and fabricated in a 65 nm CMOS technology. The proposed OTA provides a gain of 46 dB at a supply voltage of 0.5 V and a gain of 43 dB at a supply voltage of 0.35 V. The measured CMRR is 35 dB at VDD = 0.5 V and 46 dB at VDD = 0.35 V, whereas the measured PSRR is 37 dB at VDD = 0.5 V and 35 dB at VDD = 0.35 V. Furthermore, a bandgap reference and a voltage regulator that operate from a sub-1 V power supply were designed utilizing the proposed ultra-low-voltage OTA. Simulation results were provided to verify the operating principles of the two circuits.

#### Chapter 7

### **Conclusion and Future Work**

The work presented in this thesis has discussed solutions for two main demands in modern CMOS technology. The first of these demands is the need for wide frequency-range frequency synthesizer phase-locked loops (PLLs), whereas the second demand is the need for ultra-low-voltage circuit blocks that can operate from a sub-1 V power supply. These blocks include PLLs as well as biasing circuitry such as op-amps, bandgap references, and voltage regulators. In addition, the thesis has tackled a few other topics related to CMOS circuit design and PLL design, such as MOS transistor characterization, loop filter design in PLLs, and behavioral modeling of PLL components.

This chapter is divided into two sections. The first section concludes and summarizes the work presented in this thesis, whereas the second section suggests potential areas of improvement and expansion of this work for the future.

#### 7.1 Summary and Conclusion

An introduction to the scope of the thesis was presented in Chapter 1. The challenges posed by the advance of CMOS technology were discussed. The focus of this work was highlighted by discussing two technology demands in further detail, namely the need for wide tuning-range frequency synthesizers and the downscaling of the supply voltage in CMOS technology. The primary contributions and the thesis overview were also presented in that chapter.

Chapters 2 and 3 serve as background material for the chapters that follow. In Chapter 2, we covered several topics related to our top-down approach to PLL design. An overview of the PLL from a system perspective was presented. The behavior and representation of the main building blocks of the PLL were discussed from a system point of view. Key parameters and definitions to quantify the performance of the PLL were described. Mapping the system-level specifications of the PLL into circuit parameters was briefly discussed. A short introduction to the Verilog-A language was presented to enable behavioral modeling of the PLL building blocks in the subsequent chapters. Finally, a simple tool to characterize the MOS transistor using a set of normalized parameters was developed. This tool was used in the transistor-level design of the CMOS circuits used throughout the thesis.

In Chapter 3, we discussed the design of the individual building blocks of the PLL, namely the VCO, PFD/CP, frequency dividers, and loop filter. For each of the VCO, PFD/CP, and frequency dividers, we discussed the principle of operation, non-idealities, performance metrics, and behavioral modeling using Verilog-A. Examples of transistor-level implementations of each of these blocks were presented and discussed. In another contribution of this thesis, we ended the chapter by discussing the various choices available for the design of the loop filter, which directly impact the performance of the PLL. An analytical comparison between different loop filter topologies and orders was presented with detailed qualitative and quantitative analysis.

Chapter 4 presented a complete methodology to model, design, and implement wide tuning-range PLLs using a top-down approach. Mathematical equations that illustrate the contribution of the different sources of noise in the PLL were discussed. Behavioral models that encompass the non-idealities of the PLL components were described using the Verilog-A language. The PLL components were designed and the noise performance of each component was evaluated using transistor-level simulations. The jitter extracted from the individual blocks was used to find the overall system noise. The proposed methodology takes into account the variations in the loop dynamics due to changes in the VCO gain and noise, frequency divider ratio, and charge pump current. While optimizing the PLL for maximum tuning range, the methodology also considers the trade-off between noise, speed, and reference-spur attenuation. The design and implementation of an integer-N frequency synthesizer PLL that covers a continuous frequency range from 156.25 MHz to 10 GHz using a 65 nm CMOS technology was demonstrated. Measurement results were provided to verify the accuracy of the models and validate the predictions made by simulations.

The design of ultra-low-voltage PLLs that operate from a sub-1 V power supply was discussed in Chapter 5. Designing frequency synthesizer PLLs that can operate in the GHz range from a sub-1 V supply is a great challenge. The design of the different building blocks of the PLL and the trade-offs between the different design choices were discussed in detail. The design of a 1-GHz frequency synthesizer PLL that operates from a 0.55 V power supply was presented. The PLL was fabricated in 65 nm CMOS technology and measurement results were presented to verify the design.

The design of an ultra-low-voltage, ultra-low-power operational transconductance amplifier (OTA) was presented in Chapter 6. The input stage of the proposed OTA utilizes a bulk-driven pseudo-differential pair to allow minimum supply voltage while achieving a rail-to-rail input range. All the transistors in the proposed OTA operate in the subthreshold region. Using a novel self-biasing technique to bias the OTA obviates the need for extra biasing circuitry and enhances the performance of the OTA. The proposed technique ensures the robustness of the OTA to process variations and increases design feasibility under ultra-low-voltage conditions. Moreover, the proposed biasing technique significantly improves the common-mode and power-supply rejection of the OTA. To further enhance the bandwidth and allow the use of smaller compensation capacitors, a compensation network based on a damping-factor control circuit was exploited. The OTA was fabricated in a 65 nm CMOS technology. Measurement results show that the OTA can operate at a supply voltage as low as 0.35 V. The proposed OTA was used as a building block in the design of an ultra-low-voltage bandgap reference and a low-dropout (LDO) voltage regulator that operate from a supply voltage as low as 0.6 V.

#### 7.2 Future Work

The design of the wide-range frequency synthesizer PLL can be improved in various ways. The frequency range can be enhanced by using frequency multipliers to generate output frequencies greater than 10 GHz. The frequency step-size can be further reduced using a Delta-Sigma ( $\Delta\Sigma$ ) modulator in the frequency divider. Furthermore, the effect of supply-voltage noise can be reduced if on-chip LDO voltage-regulators are used. This can improve the phase noise of the PLL and reduce the total jitter.

Some modifications can be applied to the design of the ultra-low-voltage PLL. Because the performance of bulk-biased transistors is difficult to predict, the measurement results showed that the PLL produced more jitter than predicted. Careful layout should ensure complete isolation of the transistors and the blocks that use bulk-biasing. This can be done by extensively using guard rings and separate supplies. On-chip voltage regulators can help further improve the noise performance of the PLL. In addition, owing to space limitations, the loop filter was implemented off-chip. Ideally, an on-chip loop filter is preferred to ensure full integration of the system.

In the area of ultra-low-voltage building blocks, there remain myriad possibilities for future work. The field is open to many applications that can operate from sub-1 V power supplies, such as solar-cell-powered and energy-harvesting applications. There are different implementations in the literature for ultra-low-voltage circuits and there are still many challenges to overcome. Analog circuits that can benefit from ultra-low-voltage operation include active filters, oscillators, rectifiers, and PLLs.

As future work, the novel biasing technique proposed for the design of the ultra-low-voltage OTA in Chapter 6 can be applied to conventional OTA designs. The proposed biasing technique was shown to enhance common-mode rejection, power-supply rejection, and robustness against process variations. Although the technique was proposed as a solution for ultra-low-voltage operation, its underlying concept can be applied to conventional multi-stage OTAs as a method to enhance common-mode rejection and power-supply rejection.

# Appendix A LC-VCO Design Parameters

To evaluate the equivalent small-signal parameters of the LC-VCO discussed in Section 3.1.1 and Section 4.2.1, the simulator needs to evaluate $g_L$, $g_v$, $g_{tank}$, $g_{act}$, and $k_{osc}$. Fig. A.1 shows the circuit simulation set-up used to evaluate these parameters.

Since small-signal parameters are our main interest, ac analysis is used. The differential output of the VCO is excited by an ac input current, and four different zero-dc voltage sources are used as current probes to evaluate the ac current flowing in the different branches. The small-signal parameters are evaluated as follows:

$$g_{act_n} = \frac{i(V_{T1})}{V_{OP} - V_{ON}} \tag{A.1}$$

$$g_{act_p} = \frac{i(V_{T2})}{V_{OP} - V_{ON}} \tag{A.2}$$

$$g_L = \frac{i(V_{T3})}{V_{OP} - V_{ON}} \tag{A.3}$$

$$g_v = \frac{i(V_{T4})}{V_{OP} - V_{ON}} \tag{A.4}$$

and the following parameters can be easily inferred:

$$g_{act} = g_{act_n} + g_{act_p} \tag{A.5}$$

$$g_{tank} = g_L + g_v \tag{A.6}$$

$$k_{osc} = \frac{g_{act}}{g_{tank}} \tag{A.7}$$
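The post-processing implied by Eqs. (A.1) to (A.7) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the returned dictionary are hypothetical, the probe currents and node voltages are assumed to be exported from the ac analysis as (possibly complex) phasors, and taking the real part of each ratio as the conductance is an assumption of this sketch rather than something stated in the text.

```python
def lc_vco_small_signal(i_t1, i_t2, i_t3, i_t4, v_op, v_on):
    """Evaluate Eqs. (A.1)-(A.7) from ac-analysis phasors.

    i_t1..i_t4 are the ac currents through the probe sources V_T1..V_T4,
    and v_op, v_on are the ac voltages at the two VCO outputs.
    Assumption: each conductance is the real part of the current/voltage ratio.
    """
    v_diff = v_op - v_on
    if v_diff == 0:
        raise ValueError("differential output voltage must be non-zero")
    g_act_n = (i_t1 / v_diff).real   # Eq. (A.1)
    g_act_p = (i_t2 / v_diff).real   # Eq. (A.2)
    g_l = (i_t3 / v_diff).real       # Eq. (A.3)
    g_v = (i_t4 / v_diff).real       # Eq. (A.4)
    g_act = g_act_n + g_act_p        # Eq. (A.5)
    g_tank = g_l + g_v               # Eq. (A.6)
    k_osc = g_act / g_tank           # Eq. (A.7)
    return {"g_act": g_act, "g_tank": g_tank, "k_osc": k_osc}


# Made-up probe readings, used only to exercise the function.
params = lc_vco_small_signal(1e-3, 2e-3, 0.5e-3, 0.5e-3, v_op=0.5, v_on=-0.5)
print(params)
```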



Figure A.1: Simulation set-up to evaluate LC VCO design parameters

## Appendix B

## Calculation of Phase Noise from Time Periods

Timing jitter causes the frequency at the output of the VCO to dither. The VCO behavioral models in Listings 3.3 and 4.1 save the time periods in a ".m" file. Further processing using MATLAB is needed to generate the phase noise profile from the dithered time periods. To ensure enough averaging, more than $10^7$ periods were often saved. The MATLAB code shown in Listing B.1 uses the *psd* function to compute the phase noise profile from the time periods over the frequency range $f_{out}/nfft$ to $f_{out}/2$, where nfft is the number of Fast Fourier Transform (FFT) points. Using a small nfft limits the frequency range of the computed phase noise profile, whereas a large nfft may result in more uncertainty because of the reduced amount of averaging. Finally, the spectrum should be divided by the resolution bandwidth rbw used in the calculation.
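The trade-off in choosing nfft can be made concrete with a short Python sketch. It assumes 50 % overlapping segments of length nfft and a window noise bandwidth of 1.5 bins, as in Listing B.1; the function name, the segment-counting formula, and the 1 GHz output frequency in the example are illustrative assumptions.

```python
def psd_settings(n_periods, f_out, nfft, win_nbw=1.5):
    """Frequency span, resolution bandwidth, and number of averaged segments.

    Assumes segments of length nfft with 50 % overlap.
    """
    if nfft > n_periods:
        raise ValueError("nfft must not exceed the number of saved periods")
    overlap = nfft // 2
    n_segments = (n_periods - overlap) // (nfft - overlap)
    return {
        "f_min": f_out / nfft,           # lowest offset frequency resolved
        "f_max": f_out / 2,              # highest offset frequency
        "rbw": win_nbw * f_out / nfft,   # same as winNBW/(T*nfft) with T = 1/f_out
        "n_segments": n_segments,        # spectra averaged together
    }


# Hypothetical case: 10^7 saved periods of a 1 GHz output.
small = psd_settings(10**7, 1e9, 2**12)
large = psd_settings(10**7, 1e9, 2**16)
print(small)
print(large)
```

A larger nfft lowers the minimum offset frequency but leaves fewer segments to average, so the estimated spectrum is noisier.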

To speed up the simulations at the system level, the frequency divider noise can be referred to the input. This relaxes the required simulator time tolerance since all jitter disturbances are added at the same transition points. To further enhance the speed of the simulation, the VCO can be combined with the noiseless frequency dividers to generate an output frequency of $f_{out}/N$, where N is the divider ratio. The reduced maximum frequency of the system allows for faster simulation and smaller step size. If the jitter of a certain building block is too small with respect to the simulator tolerance, the jitter can be scaled up in the behavioral models. In Chapter 4, we referred all in-band noise sources to the input and multiplied the jitter values of the components by a jitter scale factor of 100 to make the jitter more pronounced and reduce the required tolerance. This was accounted for in the MATLAB code by scaling down the noise spectrum.

```
nfft = 2^16; % should be a power of two
winLength = nfft;
overlap = nfft/2;
winNBW = 1.5; % noise bandwidth of the window, given in bins
N = 1; % divider-ratio scaling if the divider is merged with the VCO
% jitter-scale correction if scaling was used to relax the simulator time tolerance
Jitter_Scale = 1;
% load the periods from the file generated by the VCO model
periods = load('periods.m');
% estimate the mean period
T = mean(periods);
% compute the cumulative phase of each transition
phases = 2*pi*cumsum(periods)/T;
% compute the power spectral density of the phase
[Sphi, f] = psd(phases, nfft, 1/T, winLength, overlap, 'linear');
% correct for the FFT window, jitter scale, and divider ratio
Sphi = (N/Jitter_Scale)^2*winNBW*Sphi/nfft;
rbw = winNBW/(T*nfft);
% remove the dc component
K = length(f);
f = f(2:K);
Sphi = Sphi(2:K);
% correct for the resolution bandwidth
Sphi = Sphi/rbw;
% plot the results (the dc point has already been removed)
semilogx(f, 10*log10(Sphi));

Listing B.1: Computation of phase noise from time periods using MATLAB
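The factor $(N/\text{Jitter\_Scale})^2$ applied in Listing B.1 corresponds to a fixed offset on a dB scale. The following Python sketch, with a hypothetical function name, shows the size of that offset for the jitter scale factor of 100 mentioned above.

```python
import math


def phase_noise_correction_db(divider_ratio=1, jitter_scale=1):
    """Offset in dB applied to the phase PSD: 10*log10((N / Jitter_Scale)^2)."""
    if divider_ratio <= 0 or jitter_scale <= 0:
        raise ValueError("divider_ratio and jitter_scale must be positive")
    return 20.0 * math.log10(divider_ratio / jitter_scale)


# Jitter scaled up by 100 in the behavioral models, no merged divider (N = 1).
offset_db = phase_noise_correction_db(divider_ratio=1, jitter_scale=100)
print(f"correction = {offset_db:.1f} dB")
```

Because phase noise power scales with the square of the jitter, scaling the jitter up by 100 raises the simulated spectrum by 40 dB, which the correction removes.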

## Appendix C

## Extraction of MOS Transistors Subthreshold Parameters

The drain current of a PMOS transistor in the sub-threshold region is given by

$$I_{D_n} = I_0 \frac{W}{L} \exp\left[\frac{|V_{GS}| + (n-1)|V_{BS}|}{nV_T}\right] \left[1 - \exp\left(-\frac{|V_{DS}|}{V_T}\right)\right], \tag{C.1}$$

where n is the gate-coupling constant, and $I_0$ is a process-dependent constant.

To evaluate the parameters n and $I_0$ of the PMOS transistor for this particular technology, we used the circuit simulation set-up shown in Fig. C.1. Assuming that $|V_{DS}| > 3V_T$, the last bracketed term in Eq. (C.1) is close to unity, and we rewrite Eq. (C.1) as

$$I_{D_n} = I_0 \frac{W}{L} \exp\left[\frac{|V_{GS}| + (n-1)|V_{BS}|}{nV_T}\right] \tag{C.2}$$

To evaluate  $I_0$ , the circuit simulation set-up in Fig. C.1(a) is used. The gate, drain, and bulk of the transistor are connected to  $V_{DD}/2$ , whereas the source is grounded. This reduces Eq. (C.2) to

$$I_0 = \frac{I_{D_n}}{\frac{W}{L} \cdot \exp\left(\frac{V_{DD}}{2V_T}\right)} \tag{C.3}$$

Figure C.1: Simulation set-up to extract PMOS transistor subthreshold parameters

To evaluate n, the circuit simulation set-up in Fig. C.1(b) is used. The gate and drain of the transistor are connected to $V_{DD}/2$, whereas the bulk and the source are grounded. This reduces Eq. (C.2) to

$$n = \frac{V_{DD}}{2V_T \cdot \ln\left(\frac{I_{D_n}}{I_0 \frac{W}{L}}\right)} \tag{C.4}$$

Similar circuits can be used to extract n and  $I_0$  of NMOS transistors in the subthreshold region.

For the application of Section 6.1, $V_{DD} = 0.5$ V and the channel length is 160 nm. The extracted parameter *n* for both NMOS and PMOS transistors is 1.1, whereas the extracted $I_0$ is $1.12 \times 10^{-11}$ A for NMOS transistors and $4.10 \times 10^{-12}$ A for PMOS transistors.
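The two-step extraction in Eqs. (C.3) and (C.4) can be checked with a small round-trip sketch in Python. The drain currents here are generated from the model of Eq. (C.2) itself rather than from a circuit simulator, so the sketch only verifies the algebra. The function names, the aspect ratio $W/L = 10$, and the thermal voltage of 25.85 mV (roughly room temperature) are assumptions made for illustration.

```python
import math


def subthreshold_current(i0, n, w_over_l, v_gs, v_bs, v_t):
    """Drain-current magnitude from Eq. (C.2), valid for |V_DS| > 3*V_T."""
    return i0 * w_over_l * math.exp((abs(v_gs) + (n - 1.0) * abs(v_bs)) / (n * v_t))


def extract_i0(i_d, w_over_l, v_dd, v_t):
    """Eq. (C.3): gate, drain, and bulk at V_DD/2, source grounded."""
    return i_d / (w_over_l * math.exp(v_dd / (2.0 * v_t)))


def extract_n(i_d, i0, w_over_l, v_dd, v_t):
    """Eq. (C.4): gate and drain at V_DD/2, bulk and source grounded."""
    return v_dd / (2.0 * v_t * math.log(i_d / (i0 * w_over_l)))


V_T = 0.02585      # assumed thermal voltage in volts
V_DD = 0.5         # supply voltage quoted for Section 6.1
W_OVER_L = 10.0    # hypothetical aspect ratio
I0_REF, N_REF = 4.10e-12, 1.1   # PMOS values quoted in the text

# Emulate the two set-ups of Fig. C.1 with the model of Eq. (C.2).
i_d_a = subthreshold_current(I0_REF, N_REF, W_OVER_L, V_DD / 2, V_DD / 2, V_T)
i_d_b = subthreshold_current(I0_REF, N_REF, W_OVER_L, V_DD / 2, 0.0, V_T)

i0_est = extract_i0(i_d_a, W_OVER_L, V_DD, V_T)
n_est = extract_n(i_d_b, i0_est, W_OVER_L, V_DD, V_T)
print(i0_est, n_est)
```

The order matters: $I_0$ is extracted first because, with the bulk tied to the gate, the exponent in Eq. (C.2) no longer depends on n.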

### Bibliography

- [1] Y. Taur et al. Cmos scaling into the nanometer regime. Proceedings of the IEEE, 85(4):486–504, 1997.
- [2] L. Benini and G. De Micheli. Networks on chips: a new soc paradigm. Computer, 35(1):70–78, 2002.
- [3] International Technology Roadmap for Semiconductors 2013. http://www.itrs.net. 2013.
- [4] S. Chatterjee, K. Pun, N. Stanic, Y. Tsividis, and P. Kinget. Analog circuit design techniques at 0.5 V. Springer Science and Business Media, 2010.
- [5] R. Muller, S. Gambini, and J. Rabaey. A 0.013 mm2, 5 uw, dc-coupled neural signal acquisition ic with 0.5 v supply. *Solid-State Circuits, IEEE Journal of*, 47(1):232–243, 2012.
- [6] S Kim, J-Y. Lee, S-J. Song, N. Cho, and H-J. Yoo. An energy-efficient analog front-end circuit for a sub-1-v digital hearing aid chip. *Solid-State Circuits, IEEE Journal of*, 41(4):876–882, 2006.
- [7] N. Guilar, T. Kleeburg, A. Chen, D. Yankelevich, and R. Amirtharajah. Integrated solar energy harvesting and storage. *Very Large Scale Integration (VLSI) Systems, IEEE Transactions on*, 17(5):627–637, 2009.
- [8] S. Sarwana, D. Kirichenko, V. Dotsenko, A. Kirichenko, S. Kaplan, and D. Gupta. Multi-band digital-rf receiver. *Applied Superconductivity, IEEE Transactions on*, 21(3):677–680, 2011.
- [9] J. Mitola. The software radio architecture. Communications Magazine, IEEE, 33(5):26–38, 1995.

- [10] A. Abidi. The path to the software-defined radio receiver. *Solid-State Circuits*, *IEEE Journal of*, 42(5):954–966, 2007.
- [11] S. Roy, J. Foerster, V. Somayazulu, and D. Leeper. Ultrawideband radio design: The promise of high-speed, short-range wireless connectivity. *Proceedings of the IEEE*, 92(2):295–311, 2004.
- [12] B. Razavi. Cognitive radio design challenges and techniques. Solid-State Circuits, IEEE Journal of, 45(8):1542–1553, 2010.
- [13] F. Gardner. *Phase-lock techniques*. John Wiley & Sons, 2005.
- [14] M. Van-Paemel. Analysis of a charge-pump pll: a new model. Communications, IEEE Transactions on, 42(7):2490–2498, 1994.
- [15] H. Rategh and T. H. Lee. Multi-GHz frequency synthesis and division: frequency synthesizer design for 5 GHz wireless LAN systems. Springer Science & Business Media, 2001.
- [16] J. W. Rogers, C. Plett, and F. Dai. Integrated circuit design for high-speed frequency synthesis. Artech House Boston, London, 2006.
- [17] K. Kundert. Predicting the phase noise and jitter of pll-based frequency synthesizers. Available from www. designers-guide. com, 2003.
- [18] K. Kundert. Modeling jitter in pll-based frequency synthesizers. www. designersguide. org, 2006.
- [19] R. Poore. Phase noise and jitter. Agilent EEs of EDA, 2001.
- [20] X. Gao, E. Klumperink, P. Geraedts, and B. Nauta. Jitter analysis and a benchmarking figure-of-merit for phase-locked loops. *Circuits and Systems II: Express Briefs, IEEE Transactions on*, 56(2):117–121, 2009.
- [21] Radio Regulations. International telecommunication union. *Radiocommunication* Sector. ITU-R. Geneva, 2008.
- [22] B. Razavi. *RF microelectronics*, volume 1. Prentice Hall New Jersey, 2011.

- [23] O. Abdelfattah, I. Shih, G. W. Roberts, and Y. Shih. Optimization of lc-vco tuning range under different inductor/varactor losses limitations. In *Electrical and Computer Engineering (CCECE), 2014 IEEE 27th Canadian Conference on*, pages 1–5. IEEE, 2014.
- [24] K. Kundert and O. Zinke. The designers guide to Verilog-AMS. Springer Science & Business Media, 2004.
- [25] BSIM4v4.7 users manual. BSIM Research Group, Department of Electrical Engineering, University of California, Berkeley, http://wwwdevice. eecs.berkeley.edu/ bsim.
- [26] T. Sakurai and A. R. Newton. Alpha-power law mosfet model and its applications to cmos inverter delay and other formulas. *Solid-State Circuits, IEEE Journal of*, 25(2):584–594, 1990.
- [27] Y. Tsividis, K. Suyama, and K. Vavelidis. Simple reconciliation mosfet model valid in all regions. *Electronics letters*, 31(6):506–508, 1995.
- [28] F. Silveira, D. Flandre, and P. Jespers. A gm/id based methodology for the design of cmos analog circuits and its application to the synthesis of a silicon-on-insulator micropower ota. *Solid-State Circuits, IEEE Journal of*, 31(9):1314–1319, 1996.
- [29] F. Klaassen and J. Prins. Thermal noise of mos transistors. *Philips Research Reports*, 22(5):505, 1967.
- [30] A. Scholten, L. Tiemeijer, R. Van Langevelde, R. Havens, A. Zegers-van Duijnhoven, and V. Venezia. Noise modeling for rf cmos circuit simulation. *Electron Devices, IEEE Transactions on*, 50(3):618–632, 2003.
- [31] P. Jespers. The gm/I<sub>D</sub> Methodology, a sizing tool for low-voltage analog CMOS Circuits: The semi-empirical and compact model approaches, volume 29. Springer Science & Business Media, 2009.
- [32] M. del Mar Hershenson, S. Boyd, and T. H. Lee. Gpcad: A tool for cmos opamp synthesis. In Computer-Aided Design, 1998. ICCAD 98. Digest of Technical Papers. 1998 IEEE/ACM International Conference on, pages 296–303. IEEE, 1998.
- [33] J. McNeill and D. Ricketts. *The designer's guide to jitter in ring oscillators*. Springer Science & Business Media, 2009.

- [34] B. Paul. Industrial electronics and control. PHI Learning Pvt. Ltd., 2004.
- [35] A. Abidi and R. Meyer. Noise in relaxation oscillators. *IEEE Journal of Solid-State Circuits*, 18(6):794–802, 1983.
- [36] T. Lee and A. Hajimiri. Oscillator phase noise: a tutorial. Solid-State Circuits, IEEE Journal of, 35(3):326–336, 2000.
- [37] Ali Hajimiri and Thomas H Lee. Design issues in cmos differential lc oscillators. Solid-State Circuits, IEEE Journal of, 34(5):717–724, 1999.
- [38] A. Hajimiri and T. Lee. A general theory of phase noise in electrical oscillators. Solid-State Circuits, IEEE Journal of, 33(2):179–194, 1998.
- [39] S. Li and M. Ismail. A high-performance dynamic-logic phase-frequency detector. Trade-Offs in Analog Circuit Design, pages 821–842, 2002.
- [40] C-H. Hung and O. K.K. A fully integrated 1.5-v 5.5-ghz cmos phase-locked loop. Solid-State Circuits, IEEE Journal of, 37(4):521–525, 2002.
- [41] W. Rhee. Design of high-performance cmos charge pumps in phase-locked loops. In Circuits and Systems, 1999. ISCAS'99. Proceedings of the 1999 IEEE International Symposium on, volume 2, pages 545–548. IEEE, 1999.
- [42] T-N. Luo and Y-J. Chen. A 0.8-mw 55-ghz dual-injection-locked cmos frequency divider. *Microwave Theory and Techniques, IEEE Transactions on*, 56(3):620–625, 2008.
- [43] B. Razavi. A study of injection locking and pulling in oscillators. IEEE Journal of Solid-State Circuits, 39(9):1415–1424, 2004.
- [44] J. Lee and B. Razavi. A 40-ghz frequency divider in 0.18-um cmos technology. *IEEE Journal of Solid State Circuits*, 39(4):594–601, 2004.
- [45] U. Singh and M. Green. High-frequency cml clock dividers in 0.13-um cmos operating up to 38 ghz. Solid-State Circuits, IEEE Journal of, 40(8):1658–1661, 2005.
- [46] C. Vaucher, I. Ferencic, M. Locher, S. Sedvallson, U. Voegeli, and Z. Wang. A family of low-power truly modular programmable dividers in standard 0.35-um cmos technology. *Solid-State Circuits, IEEE Journal of*, 35(7):1039–1045, 2000.

- [47] H-H. Chang and J-C. Wu. A 723-mhz 17.2-mw cmos programmable counter. Solid-State Circuits, IEEE Journal of, 33(10):1572–1575, 1998.
- [48] C. Vaucher and D. Kasperkovitz. A wide-band tuning system for fully integrated satellite receivers. Solid-State Circuits, IEEE Journal of, 33(7):987–997, 1998.
- [49] T. H. Lee. The design of CMOS radio-frequency integrated circuits. Cambridge university press, 2004.
- [50] D. Banerjee. *PLL performance, simulation and design.* Dog Ear Publishing, 2006.
- [51] L. P. Huelsman. Active and Passive Analog Filter Design. New York: McGraw-Hill, 1993.
- [52] C. Y Lau and M. Perrott. Fractional-n frequency synthesizer design at the transfer function level using a direct closed loop realization algorithm. In *Proceedings of* the 40th annual Design Automation Conference, pages 526–531. ACM, 2003.
- [53] S. Aouini, K. Chuai, and G. W. Roberts. Anti-imaging time-mode filter design using a pll structure with transfer function dft. *Circuits and Systems I: Regular Papers, IEEE Transactions on*, 59(1):66–79, 2012.
- [54] B. Wang and E. Ngoya. Integer-n plls verification methodology: Large signal steady state and noise analysis. *Circuits and Systems I: Regular Papers, IEEE Transactions on*, 59(11):2738–2748, 2012.
- [55] Q. Wu, S. Elabd, J. McCue, and W. Khalil. Analytical and experimental study of tuning range limitation in mm-wave cmos lc-vcos. In *Circuits and Systems* (ISCAS), 2013 IEEE International Symposium on, pages 2468–2471. IEEE, 2013.
- [56] P. Andreani and S. Mattisson. On the use of mos varactors in rf vcos. Solid-State Circuits, IEEE Journal of, 35(6):905–910, 2000.
- [57] S. Song and H. Shin. An rf model of the accumulation-mode mos varactor valid in both accumulation and depletion regions. *Electron Devices, IEEE Transactions* on, 50(9):1997–1999, 2003.
- [58] O. Abdelfattah, I. Shih, and G. W. Roberts. A simple analog cmos design tool using transistor dimension-independent parameters. In *Circuits and Systems (ISCAS)*, 2013 IEEE International Symposium on, pages 1067–1070. IEEE, 2013.

- [59] B. Soltanian, H. Ainspan, W. Rhee, D. Friedman, and P. Kinget. An ultra-compact differentially tuned 6-ghz cmos lc-vco with dynamic common-mode feedback. *Solid-State Circuits*, *IEEE Journal of*, 42(8):1635–1641, 2007.
- [60] H. Ainspan and J. Plouchart. A comparison of mos varactors in fully-integrated cmos lc vco's at 5 and 7 ghz. In Solid-State Circuits Conference, 2000. ESSCIRC'00. Proceedings of the 26th European, pages 447–450. IEEE, 2000.
- [61] N. Fong, J. Plouchart, N. Zamdmer, D. Liu, L. Wagner, C. Plett, and N. Tarr. Design of wide-band cmos vco for multiband wireless lan applications. *Solid-State Circuits, IEEE Journal of*, 38(8):1333–1342, 2003.
- [62] K. Tang, S. Leung, N. Tieu, P. Schvan, and S. Voinigescu. Frequency scaling and topology comparison of millimeter-wave cmos vcos. In *Compound Semiconductor Integrated Circuit Symposium*, 2006. CSIC 2006. IEEE, pages 55–58. IEEE, 2006.
- [63] G. Y. Tak, S. B. Hyun, T. Kang, B. Choi, and S. Park. A 6.3-9-ghz cmos fast settling pll for mb-ofdm uwb applications. *Solid-State Circuits, IEEE Journal of*, 40(8):1671–1679, 2005.
- [64] J. Mira, T. Divel, S. Ramet, J. Begueret, and Y. Deval. Distributed mos varactor biasing for vco gain equalization in 0.13 μm cmos technology. In *Radio Frequency Integrated Circuits (RFIC) Symposium, 2004. Digest of Papers. 2004 IEEE*, pages 131–134. IEEE, 2004.
- [65] J. Kim, J. Shin, S. Kim, and H. Shin. A wide-band cmos lc vco with linearized coarse tuning characteristics. *Circuits and Systems II: Express Briefs, IEEE Transactions on*, 55(5):399–403, 2008.
- [66] G. Gal, O. Abdelfattah, and G. W. Roberts. A 30-40 ghz fractional-n frequency synthesizer development using a verilog-a high-level design methodology. In *Circuits and Systems (MWSCAS), 2012 IEEE 55th International Midwest Symposium* on, pages 57–60. IEEE, 2012.
- [67] S. Osmany, F. Herzel, and J. Scheytt. A fractional-n synthesizer for software-defined radio with reduced level of spurious tones. In *Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), 2011 IEEE*, pages 21–24. IEEE, 2011.

- [68] F. Herzel, S. Osmany, K. Schmalz, W. Winkler, J. Scheytt, T. Podrebersek, R. Follmann, and H. Heyer. An integrated 18 ghz fractional-n pll in sige bicmos technology for satellite communications. In *Radio Frequency Integrated Circuits Symposium*, 2009. RFIC 2009. IEEE, pages 329–332. IEEE, 2009.
- [69] C. Zhang and M. Syrzycki. Modifications of a dynamic-logic phase frequency detector for extended detection range. *Circuits and Systems (MWSCAS)*, 2010 53rd *IEEE International Midwest Symposium on*, pages 105–108, 2010.
- [70] Y-S. Choi and D-H. Han. Gain-boosting charge pump for current matching in phase-locked loop. *Circuits and Systems II: Express Briefs, IEEE Transactions* on, 53(10):1022–1025, 2006.
- [71] S-H. Lee and H-J. Park. A cmos high-speed wide-range programmable counter. Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on, 49(9):638–642, 2002.
- [72] T-H. Lin and W. Kaiser. A 900-mhz 2.5-ma cmos frequency synthesizer with an automatic sc tuning loop. *Solid-State Circuits*, *IEEE Journal of*, 36(3):424–431, 2001.
- [73] R. Gregorian. Introduction to CMOS op-amps and comparators. J Wiley & Sons, 1999.
- [74] O. Abdelfattah, I. Shih, and G. W. Roberts. Analytical comparison between passive loop filter topologies for frequency synthesizer plls. In New Circuits and Systems Conference (NEWCAS), 2013 IEEE 11th International, pages 1–4. IEEE, 2013.
- [75] S. Osmany, F. Herzel, and J. Scheytt. An integrated 0.6–4.6 ghz, 5–7 ghz, 10–14 ghz, and 20–28 ghz frequency synthesizer for software-defined radio applications. Solid-State Circuits, IEEE Journal of, 45(9):1657–1668, 2010.
- [76] S-A. Yu, Y. Baeyens, J. Weiner, U-V. Koc, M. Rambaud, F-R. Liao, Y-K. Chen, and P. Kinget. A single-chip 125-mhz to 32-ghz signal source in 0.18-um sige bicmos. *Solid-State Circuits, IEEE Journal of*, 46(3):598–614, 2011.
- [77] C-H. Lee, L. Kabalican, Y. Ge, H. Kwantono, G. Unruh, M. Chambers, and I. Fujimori. A 2.7 ghz to 7 ghz fractional-n lcpll utilizing multimetal layer soc technology in 28nm cmos. In VLSI Circuits Digest of Technical Papers, 2014 Symposium on, pages 1–2. IEEE, 2014.

- [78] M. Caruso, M. Bassi, A. Bevilacqua, and A. Neviani. A 2-16 ghz 65 nm cmos stepped-frequency radar transmitter with harmonic rejection for high-resolution medical imaging applications. *Circuits and Systems I: Regular Papers, IEEE Transactions on*, 62(2):413–422, 2015.
- [79] H-H. Hsieh, C-T. Lu, and L-H. Lu. A 0.5-v 1.9-ghz low-power phase-locked loop in 0.18-um cmos. VLSI Circuits, 2007 IEEE Symposium on, pages 164–165, 2007.
- [80] Y-L. Lo, W-B. Yang, T-S. Chao, and K-H. Cheng. Designing an ultralow-voltage phase-locked loop using a bulk-driven technique. *Circuits and Systems II: Express Briefs, IEEE Transactions on*, 56(5):339–343, 2009.
- [81] K-H. Cheng, Y-C. Tsai, Y-L. Lo, and J-S. Huang. A 0.5-v 0.4-2.24 ghz inductorless phase-locked loop in a system-on-chip. *Circuits and Systems I: Regular Papers*, *IEEE Transactions on*, 58(5):849–859, 2011.
- [82] W-H. Chen, W-F. Loke, and B. Jung. A 0.5-v, 440-uw frequency synthesizer for implantable medical devices. *Solid-State Circuits, IEEE Journal of*, 47(8):1896– 1907, 2012.
- [83] S. Ikeda, T. Kamimura, S. Lee, N. Kanemaru, H. Ito, N. Ishihara, and K. Masu. A 0.5-v 5.5-ghz class-c-vco-based pll with ultra-low-power ilfo in 65 nm cmos. Asian Solid State Circuits Conference (A-SSCC), IEEE, pages 357–360, 2012.
- [84] K. Kwok and H. Luong. Ultra-low-voltage high-performance cmos vcos using transformer feedback. Solid-State Circuits, IEEE Journal of, 40(3):652–660, 2005.
- [85] A. Mazzanti and P. Andreani. Class-c harmonic cmos vcos, with a general result on phase noise. *Solid-State Circuits, IEEE Journal of*, 43(12):2716–2729, 2008.
- [86] K. Okada, Y. Nomiyama, R. Murakami, and A. Matsuzawa. A 0.114-mw dual-conduction class-c cmos vco with 0.2-v power supply. In VLSI Circuits, 2009 Symposium on, pages 228–229. IEEE, 2009.
- [87] D. Siprak, M. Tiebout, and P. Baumgartner. Reduction of vco phase noise through forward substrate biasing of switched mosfets. In *Solid-State Circuits Conference*, 2008. ESSCIRC 2008. 34th European, pages 326–329. IEEE, 2008.
- [88] J. Kim, J.-O. Plouchart, N. Zamdmer, R. Trzcinski, K. Wu, B.J. Gross, and M. Kim. A 44-ghz differentially tuned vco with 4ghz tuning range in 0.12 um soi cmos. In *Solid-State Circuits Conference, 2005. Digest of Technical Papers. ISSCC. 2005 IEEE International*, pages 416–607, vol. 1. IEEE, 2005.

- [89] Y. Shouli and E. Sanchez-Sinencio. Low voltage analog circuit design techniques: A tutorial. *IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences*, 83(2):179–196, 2000.
- [90] S. Rajput and S. Jamuar. Low voltage analog circuit design techniques. *Circuits and Systems Magazine, IEEE*, 2(1):24–42, 2002.
- [91] B. Blalock, P. Allen, and G. Rincon-Mora. Designing 1-v op amps using standard digital cmos technology. *Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on*, 45(7):769–780, 1998.
- [92] M. Taherzadeh-Sani and A. Hamoui. A 1-v process-insensitive current-scalable two-stage opamp with enhanced dc gain and settling behavior in 65-nm digital cmos. *Solid-State Circuits, IEEE Journal of*, 46(3):660–668, 2011.
- [93] L. Zuo and S. Islam. Low-voltage bulk-driven operational amplifier with improved transconductance. *IEEE Trans. Circuits Syst. I, Reg. Papers*, 60(8):2084–2091, 2013.
- [94] J. Fonderie, M. Maris, E. Schnitger, and J. Huijsing. 1-v operational amplifier with rail-to-rail input and output ranges. *Solid-State Circuits, IEEE Journal of*, 24(6):1551–1559, 1989.
- [95] R. Griffith, R. Vyne, R. Dotson, and T. Petty. A 1-v bicmos rail-to-rail amplifier with n-channel depletion mode input stage. *Solid-State Circuits, IEEE Journal of*, 32(12):2012–2022, 1997.
- [96] L. Yao, M. Steyaert, and W. Sansen. A 0.8-v, 8-uw, cmos ota with 50-db gain and 1.2-mhz gbw in 18-pf load. In *Solid-State Circuits Conference, 2003. ESSCIRC'03. Proceedings of the 29th European*, pages 297–300. IEEE, 2003.
- [97] J. Wang, T-Y. Lee, D-G. Kim, M. Toshimasa, and T. Kenji. Design of a 0.5 v op-amp based on cmos inverter using floating voltage sources. *IEICE transactions on electronics*, 91(8):1375–1378, 2008.

- [98] T-H. Lin, C-K. Wu, and M-C. Tsai. A 0.8-v 0.25-mw current-mirror ota with 160-mhz gbw in 0.18-um cmos. *Circuits and Systems II: Express Briefs, IEEE Transactions on*, 54(2):131–135, 2007.
- [99] T. Stockstad and H. Yoshizawa. A 0.9-v 0.5-ua rail-to-rail cmos operational amplifier. *Solid-State Circuits, IEEE Journal of*, 37(3):286–292, 2002.
- [100] S. Chatterjee, Y. Tsividis, and P. Kinget. 0.5-v analog circuit techniques and their application in ota and filter design. *Solid-State Circuits, IEEE Journal of*, 40(12):2373–2387, 2005.
- [101] L. Ferreira, T. Pimenta, and R. Moreno. An ultra-low-voltage ultra-low-power cmos miller ota with rail-to-rail input/output swing. *Circuits and Systems II: Express Briefs, IEEE Transactions on*, 54(10):843–847, 2007.
- [102] L. Ferreira and S. Sonkusale. A 60-db gain ota operating at 0.25-v power supply in 130-nm digital cmos process. *Circuits and Systems I: Regular Papers, IEEE Transactions on*, 61(6):1609–1617, June 2014.
- [103] F. Rezzi, A. Baschirotto, and R. Castello. A 3 v 12-55 mhz bicmos pseudo-differential continuous-time filter. *Circuits and Systems I: Fundamental Theory and Applications, IEEE Transactions on*, 42(11):896–903, 1995.
- [104] A. Mohieldin, E. Sanchez-Sinencio, and J. Silva-Martinez. A fully balanced pseudo-differential ota with common-mode feedforward and inherent common-mode feedback detector. *Solid-State Circuits, IEEE Journal of*, 38(4):663–668, 2003.
- [105] B. Razavi. Design of Analog CMOS Integrated Circuits. McGraw-Hill, 2001.
- [106] K. Leung, P. Mok, W-H. Ki, and J. Sin. Three-stage large capacitive load amplifier with damping-factor-control frequency compensation. *Solid-State Circuits, IEEE Journal of*, 35(2):221–230, 2000.
- [107] A. Grasso, G. Palumbo, and S. Pennisi. Analytical comparison of frequency compensation techniques in three-stage amplifiers. *International Journal of circuit theory and applications*, 36(1):53–80, 2008.
- [108] E. Vittoz and J. Fellrath. Cmos analog integrated circuits based on weak inversion operations. *Solid-State Circuits, IEEE Journal of*, 12(3):224–231, 1977.

- [109] R. Harrison. A wide-linear-range subthreshold cmos transconductor employing the back-gate effect. *Circuits and Systems, 2002. ISCAS 2002. IEEE International Symposium on*, 3:III–727, 2002.
- [110] C. Fayomi, G. Wirth, H. Achigui, and A. Matsuzawa. Sub 1 v cmos bandgap reference design techniques: a survey. *Analog Integrated Circuits and Signal Processing*, 62(2):141–157, 2010.
- [111] A. Pletersek. A compensated bandgap voltage reference with sub-1-v supply voltage. *Analog Integrated Circuits and Signal Processing*, 44(1):5–15, 2005.
- [112] M. Ugajin and T. Tsukahara. A 0.6-v voltage reference circuit based on ΣVth architecture in cmos/simox. *VLSI Circuits, 2001. Digest of Technical Papers. 2001 Symposium on*, pages 141–142, 2001.
- [113] T. Ytterdal. Cmos bandgap voltage reference circuit for supply voltages down to 0.6 v. *Electronics Letters*, 39(20):1427–1428, 2003.
- [114] H. Banba, H. Shiga, A. Umezawa, T. Miyaba, T. Tanzawa, S. Atsumi, and K. Sakui. A cmos bandgap reference circuit with sub-1-v operation. *Solid-State Circuits, IEEE Journal of*, 34(5):670–674, 1999.
- [115] T. Lehmann and M. Cassia. 1-v power supply cmos cascode amplifier. *Solid-State Circuits, IEEE Journal of*, 36(7):1082–1086, 2001.
- [116] A. Boni. Op-amps and startup circuits for cmos bandgap references with near 1-v supply. *Solid-State Circuits, IEEE Journal of*, 37(10):1339–1343, 2002.
- [117] F. Assaderaghi, D. Sinitsky, S. Parke, J. Bokor, P. Ko, and C. Hu. A dynamic threshold voltage mosfet (dtmos) for ultra-low voltage operation. In *Electron Devices Meeting, 1994. IEDM'94. Technical Digest., International*, pages 809–812. IEEE, 1994.
- [118] R. Tantawy and E. Brauer. Performance evaluation of cmos low drop-out voltage regulators. *MWSCAS: Midwest symposium on circuits and systems*, 1:141–1443, 2004.
- [119] V. Gupta, G. Rincon-Mora, and P. Raha. Analysis and design of monolithic, high psr, linear regulators for soc applications. In *SOC Conference, 2004. Proceedings. IEEE International*, pages 311–315. IEEE, 2004.