


# Clock and Data Recovery Circuitry for High Speed Communication Systems

Wen Tsern Ho



Department of Electrical & Computer Engineering, McGill University, Montreal, Canada

December 2004

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Engineering.

© 2004 Wen Tsern Ho




ISBN: 0-494-12609-4


### Abstract

The maturing of the telecommunications industry has seen the development and implementation of devices that work at high frequencies of the electromagnetic spectrum. With the rapid deployment of optical networks, there is an increasing demand for low-cost and efficient communications circuitry. In order to interface with such high frequency signals at lower cost, there has been a recent push for very high frequency circuits using low-cost fabrication technologies like digital CMOS.

This thesis investigates the use of legacy architectures and the implementation of different topologies in digital CMOS technology. Various clock and data recovery phase-locked loops have been implemented in a $0.18\,\mu\mathrm{m}$ CMOS technology, and the process from modeling to actual implementation is presented. The design of the loop components, layout issues, and the performance of the various designs are discussed. New fully differential CMOS designs, optimized for high-speed operation yet providing stable lock with minimal jitter over a targeted operating range of 1 GHz to 7 GHz, are described in detail, together with their operation and optimization.

## Résumé

The evolution of telecommunications has driven the development and implementation of components that operate at high frequencies. With the rapid deployment of optical networks, there is a growing demand for efficient, low-cost circuits. To interface with such circuits, a movement has emerged toward inexpensive fabrication technologies capable of high-frequency operation, such as digital CMOS.

This thesis studies the use of architectures in a low-cost CMOS technology. Implementations of different CMOS topologies are also realized. Using a $0.18\,\mu\mathrm{m}$ CMOS process, several phase-locked loops for clock and data recovery were implemented, and the methodology, from modeling to implementation, is presented. The design of the loop components, the layout details and the performance analysis of the different circuits are discussed. Fully differential CMOS circuits, optimized for operation from 1 GHz to 7 GHz yet capable of locking with minimal noise, are described in detail, including their operation and optimization.

#### Acknowledgments

The submission of this thesis would not have been possible without the help of many others during my years of graduate studies. First and foremost, I would especially like to thank my supervisor, Prof. Mourad El-Gamal, for his many years of patience and fortitude with me. In addition, his advice and help have been invaluable in guiding me to complete this work. Throughout the years, his grasp of the technical aspects of difficult problems has been important in helping me get over the major stumbling blocks in my research. I am grateful for the time I worked with him, and wish him success in future years.

I would also like to thank the other students in the Microelectronics group who have helped me over the years, including David Hong, Frederic Nabki, Tommy Tsang, Rola Baki, Koon Lee, Hanzen Zhang, Francis Beaudoin, Alexandre Marsolai, Tang Tat Hung, Mohamed Shaheen, Xiaozhong Zhao, Hyung-Seuk Kim, Mouna Safi-Harb and Peter Levine. The advice of the various people who have been with me over my years as a student helped me through the arduous process of finishing my research.

In addition, this would not have been possible without the help of the staff at McGill University. The computer network administrators, especially Michele Perucic and Carl Jorgensen, and the administrative staff of the department have been very helpful in solving the countless issues that cropped up during my years here. The input of the various professors that I encountered in my courses at McGill University has also been valuable to me.

Last but not least, I want to thank the friends and family who have been with me throughout my life and have supported me through the years in words and in spirit. I want to thank those closest to me, the friends that I met at McGill University, especially Eugene Lim, Derek Young, Emerson Don and Ryan Lee, for standing by me even through the difficult periods. Finally, I thank my parents Cheok Yuen Ho and Soh Liew Ho, and my sister Wen Li Ho, who were always my source of inspiration and support. Despite the distance that separated my family during my years at McGill University as a foreign student, they have always been there for me. The support of all these people is what made the completion of this thesis possible.

# Contents

| Section | Title |
|---------|-------|
| 1 | Introduction |
| 1.1 | Background and Technologies |
| 1.2 | Thesis Overview |
| 1.2.1 | Objectives |
| 1.2.2 | Organization |
| 1.2.3 | Contributions |
| 1.3 | General Phase Locked Loop Structure |
| 1.3.1 | Introduction |
| 1.3.2 | Phase Locked Loop Topologies |
| 1.3.3 | Concept of Clock and Data Recovery |
| 2 | High-Level Modelling |
| 2.1 | Feedback Loop Theory |
| 2.2 | First-Order Loops |
| 2.3 | Second-Order Loops |
| 2.4 | Third-Order Loops |
| 2.5 | Charge Pump Based Loops |
| 2.6 | CDR Loops |
| 2.6.1 | Random Data Generation |
| 2.7 | Loop Filter Optimization |
| 2.7.1 | Stability Issues |
| 2.7.2 | Damping Issues |
| 2.8 | Non Linearities |
| 2.9 | Noise |
| 3 | Clock and Data Recovery Loop Components |
| 3.1 | Phase Detectors |
| 3.1.1 | Principle of Phase Detection |
| 3.1.2 | Mixers |
| 3.1.3 | Sample and Hold |
| 3.1.4 | |
| 3.2 | The Charge Pump |
| 3.2.1 | Single-Ended Charge Pump |
| 3.2.2 | The Differential Charge Pump |
| 3.3 | Voltage Controlled Oscillators |
| 3.3.1 | Ring Oscillators |
| 3.3.2 | LC Tank Oscillators |
| 3.4 | Varactors |
| 3.4.1 | Varactor Diodes |
| 3.4.2 | MOS Varactors |
| 3.4.3 | N+ Nwell Varactors |
| 3.4.4 | Varactor Comparison |
| 3.4.5 | Modelling Varactors |
| 3.4.6 | Three Terminal Varactors |
| 3.5 | Loop Filters |
| 3.5.1 | Passive Loop Filters |
| 3.5.2 | Active Loop Filters |
| 3.6 | Common Mode Feedback |
| 3.7 | High Speed Architecture Considerations |
| 3.7.1 | MOS Current Mode Logic Circuits |
| 3.7.2 | VCO Operating Frequency |
| 4.1.2 | Simulations and Measurements |
| 4.2 | Full-Rate 5.5 Gbps Quadrature Quasi-Mixer Implementation |
| 4.2.1 | Design Components |
| 4.2.2 | Simulations and Measurements |
| 4.3 | Modified Half-Rate 13 Gbps Quadrature Quasi-Mixer Clock Recovery (CRC) Implementation |
| 4.3.1 | Design Components |
| 4.3.2 | Simulations and Measurements |
| 4.4 | Comparison of Chip Implementations |
| 5 | Conclusion |
| 5.1 | Future Improvements |
| A | Matlab Code |
| A.1 | PRBS Generation using LFSR |
| A.2 | Balun Calculations |
| | References |

# List of Figures

| Figure | Title |
|--------|-------|
| 1.1 | Concept of Phase in Periodic Signals |
| 1.2 | Locking of Phase Between Signals |
| 1.3 | Basic Phase Locked-Loop Architecture |
| 1.4 | NRZ data |
| 1.5 | Clock and Data Recovery Architecture |
| 2.1 | Closed Loop Transfer Function |
| 2.2 | Phase Locked Loop Model |
| 2.3 | Bode Plot for a First-Order PLL |
| 2.4 | Passive Filter Types |
| 2.5 | Bode Plot for Second-Order PLL |
| 2.6 | Bode plot with increasing $\zeta$ |
| 2.7 | Loop Filter for a Third-order Loop |
| 2.8 | Bode Plot for a Third-Order PLL |
| 2.9 | Charge Pump Based Loop |
| 2.10 | Simplified Diagram of PFD, Charge Pump, and Loop Filter |
| 2.11 | Power Spectrum of NRZ Data of 1 Gb/s |
| 2.12 | 5-Bit LFSR Generator |
| 2.13 | Typical Feedback Loop |
| 2.14 | Root Locus Plot of the Open Loop Gain of a Second Order System |
| 2.15 | Nyquist Plot of the Open Loop Gain of a Second Order System |
| 2.16 | Noise Spectrum of the Different Noise Sources |
| 2.17 | Modeling Various Noise Sources in a CDR |
| 2.18 | Output Noise Spectrum of the Different Noise Sources |
| 3.1 | Basic XOR Gate Phase Detector |
| 3.2 | A Gilbert cell can be Used as a Phase Detector |
| 3.3 | Basic Mixer-Based Phase Detector |
| 3.4 | Quadricorrelator Enables CDR Phase Detection |
| 3.5 | Operation of a Sample and Hold CDR Phase Detector |
| 3.6 | Waveform Diagram of a Sample and Hold Phase Detector |
| 3.7 | Circuit Diagram of a Sample and Hold CDR Phase Detector |
| 3.8 | Alexander CDR Phase Detector |
| 3.9 | Hogge CDR Phase Detector |
| 3.10 | Single-Ended Charge Pump |
| 3.11 | Differential Charge Pump |
| 3.12 | Differential Ring Oscillator |
| 3.13 | Single Delay Cell of a Ring Oscillator |
| 3.14 | LC Tank Based Oscillator |
| 3.15 | Varactor Diode |
| 3.16 | PMOS (left) and NMOS (right) Based Varactors |
| 3.17 | N+ Nwell Varactor |
| 3.18 | Tuning Characteristics of VCOs using a Diode Varactor, a PMOS Varactor, and an N+ Nwell Varactor |
| 3.19 | Varactors Modelling |
| 3.20 | Three Terminal Varactor |
| 3.21 | Passive Loop Filters: Single Ended (left), and Differential (right) |
| 3.22 | Examples of Active Loop Filters |
| 3.23 | Common Mode Feedback Circuit |
| 3.24 | MOS Current-Mode Logic Inverter |
| 4.1 | Overall Chip Architecture Block Diagram |
| 4.2 | Sample and Hold Phase Detector |
| 4.3 | Voltage-to-Current Charge Pump |
| 4.4 | Delay Cell in VCO Charge Pump |
| 4.5 | Chip Micrograph and Test Board |
| 4.6 | Transient Signal at the Output |
| 4.7 | Spectrum Analyzer Graph of Output Signal |
| 4.8 | Phase Noise Measurement of the Output Signal |
| 4.9 | Quadrature Mixer Half-Rate Architecture |
| 4.10 | Phase Detector |
| 4.11 | Frequency Detector |
| 4.12 | Charge Pump |
| 4.13 | Quadrature Voltage Controlled Oscillator |
| 4.14 | Single-Ended to Differential Converter |
| 4.15 | Chip Micrograph |
| 4.16 | Control Signal Lock Simulation |
| 4.17 | Transient Signal at the Output |
| 4.18 | Spectrum Analyzer Graph of Output Signal |
| 4.19 | Phase Noise Measurement of the Output Signal |
| 4.20 | Phase Detector |
| 4.21 | Complementary Quadrature VCO |
| 4.22 | Differential Varactor Control in VCO |
| 4.23 | VCO Output Buffer |
| 4.24 | Chip Micrograph |
| 4.25 | Control Signal Lock Simulation |
| 4.26 | Test PCB and Balun |
| 4.27 | Differential Voltage Control |
| 4.28 | Control Using a Single Varactor |
| 4.29 | Output Signal of the CRC |
| 4.30 | Spectrum Analyzer Graph of the Output Signal of the CRC |
| 4.31 | Phase Noise Measurement for the Output Signal |

# List of Tables

| Table | Title |
|-------|-------|
| 1.1 | Operating Frequencies of Various Telecommunications Networks |
| 1.2 | Characteristics of High-Speed Wireless Networks [1] |
| 2.1 | Parameters and typical values |
| 2.2 | Steady State Response for First Order Loops |
| 2.3 | Steady State Response for Second Order Loops |
| 3.1 | Logic Table for Alexander Detector |
| 3.2 | Summary of Three Terminal Varactor Operation |
| 4.1 | Charge Pump Operation |
| 4.2 | Effect of Varactor Control on VCO |
| 4.3 | Design Comparison Table |
| 4.4 | Comparison of to the Literature |
| 4.5 | Performance Summary of the Chip in Section 4.3 |

## Chapter 1

# Introduction

The most significant development of the modern electronics age has been the information revolution of the past few decades. Electronic circuitry is ubiquitous in many countries and plays a role in many aspects of our lives; it is found in every communication device of the average modern home. Many innovations have brought the information revolution to its current stage, but the invention that sparked this burst of development was the transistor. From this basic device, a wide range of circuit designs emerged to implement different functions.

With the invention of television in the 1930s, it became necessary to have a method of synchronizing the scans in cathode ray tubes. This led to the development of the phase-locked loop as a control circuit for devices in which clock synchronization is important. Today, this basic building block has found its way into signal generators, radios, televisions, deep-space receivers and wireless devices. The changes that these inventions have brought to daily life can be staggering, as we seek to adapt to technological change.

#### 1.1 Background and Technologies

With the advent and growth of the telecommunications industry, and the deployment of large-scale local-area computer networks in recent decades, the tangible benefits of these innovations have reached many people. This growth has come in leaps and bounds since the 1960s, and is a direct result of improvements in the field of integrated circuits. Central to all this technological advancement is the ability of scientists and engineers to pattern complex circuitry on slices of silicon so small and thin that a microscope is needed to view the designs placed on them.

The telecommunications industry has built huge networks that span different transmission media, such as twisted-pair cable, coaxial cable, terrestrial radio channels, satellite radio channels and optical fiber. These communication connections rely on the ability to transmit and receive information across a link. Fundamental to these transmitters and receivers is the ability to synchronize to, and recover, the transmitted data. Clock and data recovery circuits based on phase-locked loop (PLL) architectures are often a crucial component in performing this function. In essence, clock and data recovery circuits recover the clock signal and then use it to re-time and recover the data signal using signal processing techniques. Clock recovery using PLL techniques enables more robust data regeneration and reduces the effect of transmission distortion.

Table 1.1 Operating Frequencies of Various Telecommunications Networks

| Medium        | Standard       | Speed             |
|---------------|----------------|-------------------|
| Twisted-Cable | Modem          | up to 56 kbps     |
| Twisted-Cable | Ethernet 802.3 | 10 Mbps to 1 Gbps |
| Wireless      | IEEE 802.11b   | 2.4 GHz           |
| Wireless      | IEEE 802.11a   | 5-6 GHz           |
| Wireless      | Bluetooth      | 2.45 GHz          |

Tremendous theoretical and engineering research has been poured into building more effective and higher-speed phase-locked loop architectures. With the current demand for high-speed links, it is crucial to produce architectures that can satisfy these requirements. The push in recent years to develop wireless networks makes high-speed circuit designs with good power efficiency all the more important. For example, Table 1.1 shows various standards and their corresponding operating frequencies, and the trend towards higher frequencies is apparent.

Table 1.2 compares the different wireless network standards. Wireless standards have become ever more complex, demanding and difficult to implement, so the requirement for accurate, efficient, and effective circuitry is more important than ever. The emergence of the Bluetooth, 802.11 and HiperLAN2 standards, and their realization in actual hardware devices, has created a new growth market in wireless consumer electronics in recent years.

Table 1.2 Characteristics of High-Speed Wireless Networks [1]

| Characteristic             | 802.11                 | 802.11b                | 802.11a                | HiperLAN2                                       |
|----------------------------|------------------------|------------------------|------------------------|-------------------------------------------------|
| Spectrum                   | 2.4 GHz                | 2.4 GHz                | 5 GHz                  | 5 GHz                                           |
| Max physical rate          | 2 Mb/s                 | 11 Mb/s                | 54 Mb/s                | 54 Mb/s                                         |
| Max data rate, layer 3     | 1.2 Mb/s               | 5 Mb/s                 | 32 Mb/s                | 32 Mb/s                                         |
| Medium access control      | Carrier sense, CSMA/CA | Carrier sense, CSMA/CA | Carrier sense, CSMA/CA | Central resource control, TDMA/TDD              |
| Connectivity               | Connectionless         | Connectionless         | Connectionless         | Connection-oriented                             |
| Multicast                  | Yes                    | Yes                    | Yes                    | Yes                                             |
| QoS support                | (PCF) *2               | (PCF) *2               | (PCF) *2               | ATM/802.1p/RSVP/DiffServ                        |
| Frequency selection        | Frequency-hopping/DSSS | DSSS                   | Single carrier         | Single carrier with Dynamic Frequency Selection |
| Authentication             | No                     | No                     | No                     | NAI/IEEE address/X.509                          |
| Encryption                 | 40-bit RC4             | 40-bit RC4             | 40-bit RC4             | DES, 3DES                                       |
| Handover support           | (No) *3                | (No) *3                | (No) *3                | (No) *4                                         |
| Fixed network support      | Ethernet               | Ethernet               | Ethernet               | Ethernet, IP, ATM, UMTS, FireWire, PPP *5       |
| Management                 | 802.11 MIB             | 802.11 MIB             | 802.11 MIB             | HiperLAN2 MIB                                   |
| Radio link quality control | No                     | No                     | No                     | Link adaptation                                 |

The maturing of wireless technologies and a highly competitive market have largely driven the development of more efficient, lower-power circuits with smaller silicon die sizes. In addition, giant leaps towards wireless standards in consumer markets will increase the need for high-speed circuitry at lower, more affordable costs. In this respect, the greatest cost reduction will come from building entire systems on chip, as integrated implementations in low-cost CMOS technologies.

### 1.2 Thesis Overview

This thesis investigates the specific application of high-speed phase-locked loops to clock and data recovery. It covers design and specification, as well as the eventual realization of the actual prototypes in silicon.

#### 1.2.1 Objectives

The theoretical specification of a phase-locked loop typically assumes ideal operating conditions and models lower frequencies of operation. However, once the envelope of high-speed operation is pushed, circuit parasitics become significant. Design specifications become harder to attain, and device restrictions are imposed on the circuit designer. The methods and techniques used to circumvent these barriers will be discussed, and the specifics of high-speed design in CMOS technology will be examined in the context of clock and data recovery.

#### 1.2.2 Organization

This thesis is organized by chapter in a manner that approximates a typical design flow. Chapter 1 gives a basic introduction to phase-locked loops and clock and data recovery circuits. A background of the subject is given, followed by a discussion of the basics of the design structure.

A design flow typically proceeds from the high level down to the low level, and the next few chapters follow this organization. Chapter 2 discusses high-level modelling: different types of loops, from first to third order, are examined, as well as the specifics of feedback loop optimization using high-level modelling.

Next, Chapter 3 takes the individual component blocks in the loop and looks at each one in depth. Different designs of each component, with their advantages, disadvantages and suitability, are considered for integration in CDR designs. Considerations specific to high-speed operation are also covered.

Finally, Chapter 4 examines two actual implementations in silicon, and experimental results are presented and analyzed. The specifics of the sample-and-hold and quadrature mixer half-rate architectures, which have been implemented in a $0.18\,\mu\mathrm{m}$ CMOS technology, are dissected.

#### 1.2.3 Contributions

The thesis work is presented from conception to final implementation. The contributions that this work makes to research are given in this section.

1. The sample-and-hold architecture is tested first. The quadrature mixer half-rate architecture is then adapted from bipolar technologies, with improvements for high-speed operation. The methodology of this adaptation to a CMOS technology is presented, with three prototypes implemented on chip. Following the adaptation flow, each prototype is an improvement over the previous one, and they are examined in this framework.
2. A full on-chip implementation of a high-speed clock and data recovery architecture in CMOS technology makes it feasible to consider future entire system-on-chip designs. A novel 13 Gbps architecture implementation is presented. It has the following characteristics:
   - The high-speed, quadrature LC-tank VCO has two cross-coupled pairs of complementary NMOS and PMOS transistors. It runs at 7 GHz with a low current consumption of 28 mA.
   - A differential varactor configuration allows full control of the VCO frequency and a wide tuning range of 1.5 GHz.
   - A quadrature-based phase detector makes use of a mixer configuration to implement high-speed phase detection.
   - A frequency detector in a mixer configuration helps the CDR to lock.
   - A fully differential charge pump has two sets of differential controls, with highly linear output currents.
   - Submitted for publication: Wen Tsern Ho and Mourad N. El-Gamal, "Fully Differential 13 Gbps Clock Recovery Circuit for OC255 SONET," Proc. IEEE International Symposium on Circuits and Systems (ISCAS'05), May 2005.

#### 1.3 General Phase Locked Loop Structure

In this section, the basic topology that forms the basis of clock and data recovery is discussed. First, general phase-locked loop topologies are introduced, followed by the concept of clock and data recovery. As clock and data recovery is a specific application of the general PLL structure, it is essential that the basics of PLL structures are well understood before delving into the realm of clock and data recovery. Further details of these concepts are given by Kroupa [2].

#### 1.3.1 Introduction

Phase-locked loops consist of various components placed in a feedback loop for the purpose of locking the phases of two signals together. In essence, this means forcing one signal, generated at the output of the loop, to move in tandem with the input signal. Intuitively, it might seem that synchronizing two signals only requires locking their frequencies together. However, locking the phase of the two signals is an equally important consideration.



Fig. 1.1 Concept of Phase in Periodic Signals

The signals discussed here are generally periodic in nature, with a specific frequency and period. One period of a signal is defined to span $2\pi$ radians. The phase of a signal then describes, at a particular point in time, which portion of its cycle the wave has reached. If a signal has a phase of $\phi$ radians, it is shifted by a fraction $\phi/2\pi$ of a period relative to the reference signal with zero phase. This is represented in mathematical form in Eq. (1.1)

$$x(t) = \sin(\omega t + \phi) \quad . \tag{1.1}$$

The purpose of locking the phase of two signals is illustrated in Fig. 1.2, with the two signals locked at a fixed phase difference of  $\phi$ . Then, the PLL can help accomplish the task of clock regeneration and clock synthesis, for which it is mainly used. By synchronizing the two signals, it is possible to generate a clocking signal that transitions at a fixed phase away from the input signal. The PLL loop measures the phase difference between the input and output signal, and fixes this phase difference. Basically, the PLL loop uses an error signal produced from a measurement of the phase difference between the input and output signal, and minimizes this error signal through the use of a negative feedback mechanism.



Fig. 1.2 Locking of Phase Between Signals

#### 1.3.2 Phase Locked Loop Topologies

There are many different types of PLL structures, but the general concept is usually based on the same basic topology. The most basic abstraction of the PLL architecture is shown in Fig. 1.3, which consists of a phase detector, a loop filter and a voltage-controlled oscillator (VCO). This is a basic loop, on which more complex feedback loop designs are commonly based. Only the most fundamental circuit blocks involved in a basic PLL function are shown, so re-timing circuits and dividers are not shown.



Fig. 1.3 Basic Phase Locked-Loop Architecture

The basic operation of this PLL depends on the feedback loop that allows the VCO to produce a clock signal at a phase and frequency that are matched with the input signal. The phase detector measures the phase difference, and the loop filter converts this measurement into a control signal for the VCO. The feedback in the loop tends to lock the phases of the input signal and the VCO output, by keeping the phase measurement at a constant value.

The phase detector is a circuit that compares the phase of two inputs, and creates a voltage signal based on the difference. The loop filter, passive or active, dampens this voltage signal, and gives an output voltage to control the VCO. The VCO is a device that gives a stable output clock frequency depending on the input control voltage.

#### 1.3.3 Concept of Clock and Data Recovery



Fig. 1.4 NRZ data

Clock and data recovery (CDR) circuits are a subset of phase locked loop circuits, and have applications in most receivers [3]. A data signal that arrives at a receiver is not a pure clock signal, but consists of pulses (or other wave shapes) of random bits representing data. CDR circuits are used to synchronize an internal clock with this input data signal, in order for the data to be sampled correctly for data recovery.

Typically, the data signals have different encoding formats. Currently, the data fed into a CDR circuit is usually in the Non-Return-to-Zero (NRZ) format as shown in Fig. 1.4. This is in contrast with the Return to Zero (RZ) format that requires the signal to return to zero before representing the next data bit. It should be noted that the CDR has to be able to lock onto the data edges of the NRZ data despite there being a lack of a guaranteed voltage transition in the NRZ data with each clock cycle. This distinction requires stricter design of the phase detector in CDR circuits.

CDR circuits are usually based on PLL architectures, with the same feedback circuitry to ensure stability and accurate lock, and with stringent criteria for optimization and design. They also include various additional components. A typical block diagram for a full clock and data recovery architecture is shown in Fig. 1.5, which includes other components that are commonly used in CDR circuits.

Fig. 1.5 Clock and Data Recovery Architecture

The re-timing circuit is used to re-sample the input data using the locked CDR clock. The charge pump is a circuit that pushes current or pulls current at the output node depending on the charge pump input signal. The common-mode feedback (CMFB) is a circuit that is used for differential architectures; in this case, the CMFB circuit maintains the average voltage of the differential output signal of the charge pump. This type of topology is more difficult to model fully at the system level than the basic PLL.

# Chapter 2

# **High-Level Modelling**

First and foremost in PLL and CDR designs is the stage of high-level modelling. It is essential to have a good understanding of the dynamics of this unique feedback loop, in order to ensure loop stability and correct loop bandwidth optimizations. A discussion is given from first-order to third-order loops, followed by their optimization, and non-linearity and noise considerations. High-level loop modelling is described by Rohde in [4] and [5].

### 2.1 Feedback Loop Theory

The general configuration of a linear feedback system is shown in Fig. 2.1, with A(s) representing the system function of the forward path, and B(s) representing the system function of the feedback path.

Fig. 2.1 Closed Loop Transfer Function

The input signal and the output signal generated by the loop are considered as the phase signals $\phi_{IN}$ and $\phi_{OUT}$ respectively. The error signal $\phi_{ERR}$ is the difference between the input and output signals, as given by Eq. (2.1)

$$\phi_{ERR} = \phi_{IN} - \phi_{OUT} \quad . \tag{2.1}$$

By feeding back the error signal $\phi_{ERR}$, the feedback loop suppresses the difference between $\phi_{IN}$ and $\phi_{OUT}$. Hence, the loop serves to generate a $\phi_{OUT}$ as close as possible to $\phi_{IN}$. The overall closed loop gain of the system is given by H(s) in Eq. (2.2)

$$H(s) = \frac{\text{forward gain}}{1 + \text{open loop gain}} = \frac{A(s)}{1 + A(s)B(s)} \quad . \tag{2.2}$$

A typical phase locked loop is shown in Fig. 2.2, where the modelling parameters of each component are given. Notice that the phase detector produces a voltage signal proportional to the amount of the phase error. The voltage controlled oscillator (VCO) creates the frequency signal that the loop tries to match in terms of phase and frequency, to the input signal.

Also,  $K_D$  represents the gain of the phase detector, F(s) represents the transfer function of the loop filter and  $K_{VCO}$  represents the gain of the VCO.



Fig. 2.2 Phase Locked Loop Model

Typical specifications for a PLL are given in Table 2.1.

**Table 2.1** Parameters and typical values

| Symbol       | Parameter                     | Units                   | Typical Values                                   |
|--------------|-------------------------------|-------------------------|--------------------------------------------------|
| $K_D$        | Phase detector gain           | Volts/radian            | 2 V/rad                                          |
| $K_{VCO}$    | VCO gain                      | Radians/second per volt | $1\ \text{Grad}\cdot\text{s}^{-1}/\text{V}$      |
| F(s)         | Loop filter transfer function | Dimensionless           | Function                                         |
| $V_{INPUT}$  | Input signal                  | Volts                   | 1.8 V sine wave at 1 GHz                         |
| $R_S$        | Source impedance              | Ohms                    | $50\ \Omega$                                     |
| $V_{supply}$ | Power supply                  | Volts                   | 1.8 V                                            |

From Fig. 2.2, the open loop gain is given by

$$\text{Open Loop Gain} = A(s)B(s) = \frac{K_D K_{VCO} F(s)}{s} \quad . \tag{2.3}$$

This would give an overall transfer function

$$H_{overall}(s) = \frac{K_D K_{VCO} F(s)}{s + K_D K_{VCO} F(s)} \quad . \tag{2.4}$$

The system noise bandwidth is defined in terms of the closed loop response $H_{overall}(j\omega)$ of Eq. (2.4) as

$$B_{noise} = \frac{1}{2\pi} \int_0^\infty |H_{overall}(j\omega)|^2 d\omega \quad . \tag{2.5}$$

The error function is shown in Eq. (2.6), and is the ratio of the phase error to the input phase

$$H_{error}(s) = \frac{\theta_{error}}{\theta_{input}} = 1 - H_{overall}(s) = \frac{s}{s + K_D K_{VCO} F(s)} \quad . \tag{2.6}$$

The steady-state error is the error that remains once all transients have decayed. It is found by applying the final value theorem to the error function of Eq. (2.6), as shown in Eq. (2.7)

$$\theta_{error(steady\_state)} = \lim_{t \to \infty} \theta_{error}(t) = \lim_{s \to 0} s \cdot H_{error}(s) \cdot \theta_{input} = \lim_{s \to 0} \frac{s^2 \cdot \theta_{input}}{s + K_D K_{VCO} F(s)} \quad . \tag{2.7}$$

#### 2.2 First-Order Loops

First order loops are loops without a loop filter, which is equivalent to setting the loop filter function to F(s) = 1. This results in a first-order transfer function, as can be seen from Eq. (2.8) below. Varying the loop filter, as described in subsequent sections, results in transfer functions of higher order [6]

$$H_{first\_order}(s) = \frac{K_D K_{VCO}}{s + K_D K_{VCO}} \quad . \tag{2.8}$$

A Bode plot of a first-order phase locked loop is shown in Fig. 2.3. Notice that there is only one pole, so the magnitude drops by 20 dB/decade as the frequency increases, and the phase lag approaches 90 degrees.
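The single-pole behaviour described above can be checked numerically. The following is a minimal Python sketch that evaluates Eq. (2.8) at $s = j\omega$; the gain values are assumed from the typical entries of Table 2.1, and the function name is illustrative only.

```python
import cmath
import math

def first_order_response(omega, k_d, k_vco):
    """Closed-loop response of Eq. (2.8) evaluated at s = j*omega."""
    k = k_d * k_vco          # loop gain K_D * K_VCO in rad/s
    s = 1j * omega
    return k / (s + k)

# Assumed values, taken from the typical entries of Table 2.1.
K_D = 2.0        # V/rad
K_VCO = 1.0e9    # rad/s per volt
K = K_D * K_VCO

for ratio in (0.1, 1.0, 10.0, 100.0):
    h = first_order_response(ratio * K, K_D, K_VCO)
    mag_db = 20.0 * math.log10(abs(h))
    phase_deg = math.degrees(cmath.phase(h))
    print(f"omega = {ratio:6.1f} K : {mag_db:7.2f} dB, {phase_deg:7.2f} deg")
```

At $\omega = K_D K_{VCO}$ the magnitude is 3 dB down and the phase lag is 45 degrees, and between $10K$ and $100K$ the magnitude falls by very nearly 20 dB.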

Without the use of a filter, phase changes in the input signal are not well suppressed by the loop. Hence, the first order loop has more difficulty in achieving a lock on the input signal. There is a limit to the operating range of the first order loop, and the loop may need to be set with an initial condition close to the locking region, in order for a lock to be possible.



Fig. 2.3 Bode Plot for a First-Order PLL

In fact, the maximum frequency deviation from the VCO natural frequency at which locking is still feasible is termed the capture range. The capture range of a first order loop is

$$\text{Capture range in rad/s} = K_D K_{VCO} \quad . \tag{2.9}$$

In order to achieve proper lock for first order loops without the use of a filter, high loop gain is desirable. But at the same time, high loop gains will render the loop more unstable, and more susceptible to noise. These limitations render the first-order loop rather unusable, and confined to academic discussion.

Although first order loops are impractical, they can still be analyzed in ways that are useful for the further study of phase locked loops. For example, the loop noise bandwidth of a first-order loop can be determined by evaluating Eq. (2.5) with the closed loop response of Eq. (2.8), which simplifies (in Hz) to

$$B_{noise} = \frac{K_D K_{VCO}}{4} \quad . \tag{2.10}$$
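The closed-form result of Eq. (2.10) can be cross-checked by numerical integration. The sketch below assumes that the noise bandwidth is taken as $\frac{1}{2\pi}\int_0^\infty |H(j\omega)|^2 d\omega$ with $H$ the closed-loop response of Eq. (2.8); the gain value is assumed from Table 2.1, and the helper names are hypothetical.

```python
import math

def noise_bandwidth_hz(mag_squared, omega_scale, steps=20000):
    """Approximate (1/2pi) * integral_0^inf |H(jw)|^2 dw with a midpoint rule.

    The substitution w = omega_scale * tan(u) maps [0, inf) onto [0, pi/2).
    """
    du = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        w = omega_scale * math.tan(u)
        dw_du = omega_scale / math.cos(u) ** 2
        total += mag_squared(w) * dw_du * du
    return total / (2.0 * math.pi)

K = 2.0 * 1.0e9   # assumed K_D * K_VCO from Table 2.1, in rad/s

def first_order_mag_squared(w):
    """|H(jw)|^2 for the first-order loop of Eq. (2.8)."""
    return K * K / (w * w + K * K)

b_numeric = noise_bandwidth_hz(first_order_mag_squared, K)
b_closed_form = K / 4.0
print(b_numeric, b_closed_form)
```

The same helper accepts any other magnitude-squared function, so it could also be applied to the higher-order loops of the later sections.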

Examining the error function in response to various inputs is also useful. Using Eq. (2.7), with F(s) = 1 for a first-order loop, we get the following

$$\theta_{error(steady\_state)} = \lim_{s \to 0} \frac{s^2 \cdot \theta_{input}}{s + K_D K_{VCO}} \quad . \tag{2.11}$$

Using Eq. (2.11), we can make a table of the following useful results, summarizing the effect of different inputs to the loop. The effects of a phase step, a phase ramp, and a frequency ramp are presented.

**Table 2.2** Steady State Response for First Order Loops

| Input Type        | $\theta_{input}$                | $\theta_{error(steady\_state)}$      |
|-------------------|---------------------------------|--------------------------------------|
| Step in Phase     | $\frac{\Delta \theta}{s}$       | 0                                    |
| Ramp in Phase     | $\frac{\Delta\omega}{s^2}$      | $\frac{\Delta \omega}{K_D K_{VCO}}$  |
| Ramp in Frequency | $\frac{\Delta \omega/dt}{s^3}$  | $\infty$                             |

As can be seen from the table summary of steady state errors of first-order loops, the first-order loop does not respond well to some changes in the input signal. If the first-order topology is used, it will require an initial condition to force the loop to start locking at a frequency that is close to the input signal. As such, the first-order loop is not really useful.
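The first two rows of the steady-state table can be reproduced with a simple time-domain simulation. This is a minimal sketch under stated assumptions: the first-order loop is modelled in the phase domain as $d\theta_{out}/dt = K_D K_{VCO}(\theta_{input} - \theta_{out})$, integrated with a forward-Euler step, and the step and offset sizes are hypothetical.

```python
def simulate_first_order(theta_in, k, t_end, dt):
    """Forward-Euler integration of d(theta_out)/dt = k * (theta_in - theta_out).

    Returns the phase error theta_in - theta_out at the end of the run.
    """
    theta_out = 0.0
    steps = int(round(t_end / dt))
    for n in range(steps):
        t = n * dt
        error = theta_in(t) - theta_out
        theta_out += k * error * dt
    return theta_in(steps * dt) - theta_out

K = 2.0e9              # assumed K_D * K_VCO from Table 2.1, rad/s
DT = 0.01 / K          # time step, small compared with 1/K
T_END = 20.0 / K       # long enough for the transient to decay

DELTA_THETA = 0.5      # rad, hypothetical phase step
DELTA_OMEGA = 1.0e8    # rad/s, hypothetical frequency offset (phase ramp)

err_step = simulate_first_order(lambda t: DELTA_THETA, K, T_END, DT)
err_ramp = simulate_first_order(lambda t: DELTA_OMEGA * t, K, T_END, DT)
print(err_step)                       # decays towards zero
print(err_ramp, DELTA_OMEGA / K)      # settles near delta_omega / K
```

A frequency ramp can be tried in the same way by passing a quadratic phase function, in which case the error keeps growing with the simulated time.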

#### 2.3 Second-Order Loops

This section follows with a discussion of second order loops using passive filters. Most PLL designers place filters in the PLL in order to introduce a second pole that will aid in the locking of the PLL. In general, a passive filter has a transfer function F(s) of the form

$$F(s) = \frac{1 + s\tau_1}{1 + s\tau_2} \quad . \tag{2.12}$$

Note that if $\tau_2$ is very large, the pole of F(s) at $s = -1/\tau_2$ lies close to zero frequency, and F(s) can be approximated by

$$\text{If } \tau_2 \text{ is large,} \quad F(s) \approx \frac{A}{s}, \text{ and } F(0) \to \infty \quad . \tag{2.13}$$

Different types of filters will have different parameters for $\tau_1$ and $\tau_2$. Different examples are shown in Fig. 2.4. In certain types of passive filters of second-order loops, $\tau_1$ can be zero.



Fig. 2.4 Passive Filter Types

Substituting the loop filter transfer function into Eq. (2.4), we get the result in Eq. (2.14). Hence, with the simple passive loop filter of Fig. 2.4, the transfer function becomes second-order.

$$\text{Substitute } K = K_D K_{VCO}, \quad H_{second\_order}(s) = \frac{K(s\tau_1 + 1)}{s^2 \tau_2 + s(1 + K\tau_1) + K} \quad . \tag{2.14}$$

The bode plot for the second-order loop, using the filter on the right of Fig. 2.4, is shown in Fig. 2.5. As the overall transfer function has one zero and two poles, the net effect at  $\infty$  is as if there was a single pole, like a first-order loop. And there is an overall phase change of 90 degrees, and a final 20dB/decade drop at high frequencies.



Fig. 2.5 Bode Plot for Second-Order PLL

In order to simplify the analysis in typical second order transfer analysis, substitutions for the undamped natural frequency  $\omega_n$  and damping ratio  $\zeta$  of the system can be made as follows

$$\text{Substitute } \omega_n = \sqrt{\frac{K}{\tau_2}} \text{ and } \zeta = \frac{\omega_n (1 + K\tau_1)}{2K},$$

$$H(s) = \frac{s\frac{K\tau_1}{\tau_2} + \frac{K}{\tau_2}}{s^2 + s\frac{1 + K\tau_1}{\tau_2} + \frac{K}{\tau_2}} = \frac{s(2\zeta\omega_n - \frac{\omega_n^2}{K}) + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2} \quad . \tag{2.15}$$

The magnitude of the frequency response is given by

$$|H(j\omega)| = \sqrt{\frac{1 + \left[\left(\frac{\omega}{\omega_n}\right)\left(2\zeta - \frac{\omega_n}{K}\right)\right]^2}{\left[1 - \left(\frac{\omega}{\omega_n}\right)^2\right]^2 + \left(\frac{2\zeta\omega}{\omega_n}\right)^2}} \quad . \tag{2.16}$$

The phase response is given by

$$\measuredangle H(j\omega) = \tan^{-1}\left[\left(\frac{\omega}{\omega_n}\right)\left(2\zeta - \frac{\omega_n}{K}\right)\right] - \tan^{-1}\left[\frac{2\zeta\frac{\omega}{\omega_n}}{1 - \left(\frac{\omega}{\omega_n}\right)^2}\right] \quad . \tag{2.17}$$
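As a consistency check on the substitutions above, the magnitude expression of Eq. (2.16) can be compared with a direct complex evaluation of Eq. (2.14). The sketch below is illustrative only: the time constants are hypothetical values, the gain is assumed from Table 2.1, and the function names are not from the original design flow.

```python
import math

def second_order_params(k, tau1, tau2):
    """Natural frequency and damping ratio from the substitutions of Eq. (2.15)."""
    omega_n = math.sqrt(k / tau2)
    zeta = omega_n * (1.0 + k * tau1) / (2.0 * k)
    return omega_n, zeta

def h_direct(omega, k, tau1, tau2):
    """Eq. (2.14) evaluated at s = j*omega."""
    s = 1j * omega
    return k * (s * tau1 + 1.0) / (s * s * tau2 + s * (1.0 + k * tau1) + k)

def h_magnitude(omega, k, omega_n, zeta):
    """Magnitude of the frequency response according to Eq. (2.16)."""
    x = omega / omega_n
    num = 1.0 + (x * (2.0 * zeta - omega_n / k)) ** 2
    den = (1.0 - x * x) ** 2 + (2.0 * zeta * x) ** 2
    return math.sqrt(num / den)

K = 2.0e9        # assumed K_D * K_VCO from Table 2.1, rad/s
TAU1 = 0.2e-9    # hypothetical filter time constant, seconds
TAU2 = 2.0e-9    # hypothetical filter time constant, seconds

OMEGA_N, ZETA = second_order_params(K, TAU1, TAU2)
for omega in (0.1 * OMEGA_N, OMEGA_N, 10.0 * OMEGA_N):
    direct = abs(h_direct(omega, K, TAU1, TAU2))
    formula = h_magnitude(omega, K, OMEGA_N, ZETA)
    print(f"{omega:.3e} rad/s: direct {direct:.6f}, Eq. (2.16) {formula:.6f}")
```

With these assumed values the natural frequency is $10^9$ rad/s and the damping ratio is 0.35, so the response shows visible peaking near $\omega_n$.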

Examining the responses to different inputs requires the analysis of Eq. (2.7), which can be restated as follows

$$\text{Since } F(s) = \frac{1 + s\tau_1}{1 + s\tau_2}, \quad \theta_{error(steady\_state)} = \lim_{s \to 0} \frac{s^2 \cdot \theta_{input}}{s + K_D K_{VCO} F(s)} \quad . \tag{2.18}$$



**Fig. 2.6** Bode plot with increasing  $\zeta$ 

The plot in Fig. 2.6 shows the effect of varying the value of $\zeta$ on the Bode plot. The similarity to a typical second order graphical analysis can be observed. Typically, in loop optimizations, it is desirable that the loop be critically damped.

Recalling the possible situation in Eq. (2.13) with a pole as $s \to 0$, Table 2.3 can be constructed.

**Table 2.3** Steady State Response for Second Order Loops

| Input Type        | $\theta_{input}$                | $\theta_{error(steady\_state)}$ as $s \to 0$       | $\theta_{error(steady\_state)}$ if $F(s) \approx \frac{A}{s}$ |
|-------------------|---------------------------------|----------------------------------------------------|---------------------------------------------------------------|
| Step in Phase     | $\frac{\Delta \theta}{s}$       | $\frac{s \Delta \theta}{s + K_D K_{VCO} F(0)}$     | 0                                                             |
| Ramp in Phase     | $\frac{\Delta \omega}{s^2}$     | $\frac{\Delta\omega}{s + K_D K_{VCO} F(0)}$        | 0                                                             |
| Ramp in Frequency | $\frac{\Delta \omega/dt}{s^3}$  | $\frac{\Delta \omega/dt}{s(s + K_D K_{VCO} F(0))}$ | $\frac{\Delta \omega/dt}{K_D K_{VCO} A}$                      |
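As a worked check of the last column of the table, the limit of Eq. (2.18) can be evaluated with the integrator approximation $F(s) \approx A/s$ of Eq. (2.13). The symbol $\dot{\omega}$ is shorthand introduced here for the frequency ramp rate written as $\Delta\omega/dt$ in the table.

$$\theta_{error(steady\_state)} = \lim_{s \to 0} \frac{s^2 \cdot \theta_{input}}{s + K_D K_{VCO}\frac{A}{s}} = \lim_{s \to 0} \frac{s^3 \cdot \theta_{input}}{s^2 + K_D K_{VCO} A}$$

$$\begin{aligned}
\theta_{input} = \frac{\Delta\theta}{s} &: \quad \lim_{s \to 0} \frac{s^2 \Delta\theta}{s^2 + K_D K_{VCO} A} = 0 \\
\theta_{input} = \frac{\Delta\omega}{s^2} &: \quad \lim_{s \to 0} \frac{s \Delta\omega}{s^2 + K_D K_{VCO} A} = 0 \\
\theta_{input} = \frac{\dot{\omega}}{s^3} &: \quad \lim_{s \to 0} \frac{\dot{\omega}}{s^2 + K_D K_{VCO} A} = \frac{\dot{\omega}}{K_D K_{VCO} A}
\end{aligned}$$

The extra factor of $s$ contributed by the integrator is what removes the finite error for a phase ramp and leaves only a finite error for a frequency ramp.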

#### 2.4 Third-Order Loops

The second-order loop has inherent jumps and rippling at the output of the loop filter, and additional filtering is required. A second capacitor is often added to achieve this additional filtering. An example of a typical loop filter for a third-order loop is shown in Fig. 2.7.



Fig. 2.7 Loop Filter for a Third-order Loop

The transfer function for this loop filter is given in Eq. (2.19).

$$F(s) = \frac{1 + sRC}{1 + s(RC + R_1C + R_1C_1) + s^2RR_1CC_1} \quad . \tag{2.19}$$

The overall transfer function is given by

$$H_{third\_order}(s) = \frac{K_D K_{VCO}(1+sRC)}{K_D K_{VCO} + s(1+K_D K_{VCO} RC) + s^2 (RC + R_1 C + R_1 C_1) + s^3 RR_1 CC_1} \quad . \tag{2.20}$$

A bode plot of the third-order loop transfer function is shown in Fig. 2.8. The overall transfer function has one zero and three poles, and the net effect at  $\infty$  is as if there were two poles. There is an overall phase drop of 180 degrees, and a final 40dB/decade drop at high frequencies.



Fig. 2.8 Bode Plot for a Third-Order PLL

Most loops used in practice will be of the third order, as the introduction of the additional pole gives the designer an added degree of freedom in designing the loop transfer function. Usually, the added pole is placed at a frequency such as to lessen the effect of the zero in the transfer function.

### 2.5 Charge Pump Based Loops

In some cases, a charge pump (CP) may be added to the circuit design. This is primarily used when a phase and frequency detector (PFD) is added to the loop. A diagram of such a modified loop is shown in Fig. 2.9.



Fig. 2.9 Charge Pump Based Loop

Modeling a loop with a charge pump entails only a few differences. A simplified implementation of the PFD, charge pump, and loop filter is shown in Fig. 2.10.



Fig. 2.10 Simplified Diagram of PFD, Charge Pump, and Loop Filter

In order to model this type of loop, a substitution can be made in the transfer functions discussed in previous sections. Making a substitution into Eq. (2.4), we get the following

$$\text{Substitute } K_D F(s) = \frac{I_{CP}}{2\pi} Z(s), \quad H_{overall}(s) = \frac{\frac{I_{CP}}{2\pi} K_{VCO} Z(s)}{s + \frac{I_{CP}}{2\pi} K_{VCO} Z(s)} \quad . \tag{2.21}$$

The charge pump provides a charging current  $I_{CP}$ , and Z(s) refers to the impedance of the loop filter. Notice that, in comparison, F(s) refers to the transfer function of the loop filter. In the case of Fig. 2.10, the impedance of the loop filter is given by

$$Z(s) = (R + \frac{1}{sC}) \| (\frac{1}{sC_1}) \quad . \tag{2.22}$$
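To make the parallel combination of Eq. (2.22) concrete, the sketch below evaluates it directly and compares it with the same impedance expanded into a ratio of polynomials in $s$. The component values are hypothetical and chosen only for illustration; the expanded form is an algebraic rearrangement, not an expression taken from the text.

```python
def parallel(z1, z2):
    """Impedance of two branches connected in parallel."""
    return z1 * z2 / (z1 + z2)

def filter_impedance(s, r, c, c1):
    """Eq. (2.22): (R + 1/(sC)) in parallel with 1/(sC1)."""
    return parallel(r + 1.0 / (s * c), 1.0 / (s * c1))

def filter_impedance_expanded(s, r, c, c1):
    """The same impedance with its zero and poles made explicit."""
    c_total = c + c1
    return (1.0 + s * r * c) / (s * c_total * (1.0 + s * r * c * c1 / c_total))

R = 1.0e3       # ohms, hypothetical
C = 10.0e-12    # farads, hypothetical
C1 = 1.0e-12    # farads, hypothetical

for omega in (1.0e6, 1.0e8, 1.0e10):
    s = 1j * omega
    z_a = filter_impedance(s, R, C, C1)
    z_b = filter_impedance_expanded(s, R, C, C1)
    print(f"{omega:.0e} rad/s: |Z| = {abs(z_a):.4e} ohms, mismatch {abs(z_a - z_b):.2e}")
```

The expanded form shows a pole at the origin, a zero at $1/RC$ and a second pole at $(C + C_1)/(RCC_1)$, which is the extra pole that distinguishes the third-order loop.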

### 2.6 CDR Loops

Modelling clock and data recovery loops requires the additional step of characterizing the input data. As explained previously, NRZ data is typically received by CDR circuits. The first step involves modelling the spectrum of this input NRZ data. As the input data involves a random bit sequence, it can be represented by Eq. (2.23) below. The function p(t) represents any arbitrary unit pulse of a data bit. Then, x(t) is a summation of the individual pulses $p(t - kT_a)$ at each time interval

$$x(t) = \sum_{k} a_{k} p(t - kT_{a}) \text{ where } a_{k} = \pm 1 \text{ and } p(t) = u\left(t + \frac{T_{a}}{2}\right) - u\left(t - \frac{T_{a}}{2}\right) \quad . \tag{2.23}$$

If p(t) is a unit pulse of width $T_a$, the spectrum of the random binary data can be analyzed using the Fourier transform of Eq. (2.23). The resulting normalized power spectrum is given by Eq. (2.24), and is plotted in Fig. 2.11.

$$X(f) = \left[\frac{\sin(\pi f T_a)}{\pi f T_a}\right]^2 \quad . \tag{2.24}$$



Fig. 2.11 Power Spectrum of NRZ Data of 1 Gb/s

For a square pulse signal with a data rate of 1 Gb/s, the fundamental frequency is 500 MHz. As noted from the power spectrum, there are nulls at the even harmonics of 1 GHz, 2 GHz, 3 GHz and higher, which are the integer multiples of the bit rate. The energy of the data signal is concentrated in the main lobe below 1 GHz and in the side lobes around the odd harmonics of 1.5 GHz, 2.5 GHz, 3.5 GHz and higher.
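The location of the spectral nulls can be confirmed by evaluating Eq. (2.24) at the harmonics of the 500 MHz fundamental. This is a small sketch assuming a bit period of 1 ns, which corresponds to the 1 Gb/s example; the function name is illustrative.

```python
import math

def nrz_power_spectrum(f, bit_period):
    """Normalized NRZ power spectrum of Eq. (2.24)."""
    x = math.pi * f * bit_period
    if x == 0.0:
        return 1.0          # limit of (sin x / x)^2 as x tends to zero
    return (math.sin(x) / x) ** 2

T_A = 1.0e-9    # bit period in seconds for a 1 Gb/s data rate

for f_ghz in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    value = nrz_power_spectrum(f_ghz * 1.0e9, T_A)
    print(f"{f_ghz:3.1f} GHz: {value:.3e}")
```

The values at 1 GHz, 2 GHz and 3 GHz are zero to within floating-point rounding, which is why a CDR cannot simply filter out a clock component at the bit rate from NRZ data.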

#### 2.6.1 Random Data Generation

For the purpose of testing actual CDR designs, the generation of random-bit data is required. This is usually accomplished using "pseudo-random" binary sequences (PRBS), which are implemented using linear feedback shift registers (LFSR). An example of a 5-bit LFSR generator is shown in Fig. 2.12. In addition, an example of MATLAB code that has been used for generating random data for simulation purposes has been included in Appendix A.1.



Fig. 2.12 5-Bit LFSR Generator
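The operation of a 5-bit LFSR can be sketched in a few lines of Python. This is not the MATLAB code of Appendix A.1, and the feedback taps are an assumption: the sketch XORs stages 5 and 3, which gives a maximal-length sequence of period $2^5 - 1 = 31$, whereas the taps used in Fig. 2.12 may differ.

```python
def prbs5(n_bits, seed=0b00001):
    """Generate n_bits from a 5-stage Fibonacci LFSR.

    Feedback taps at stages 5 and 3 are assumed for illustration.
    """
    if (seed & 0b11111) == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    state = [(seed >> i) & 1 for i in range(5)]   # state[0] is stage 1
    out = []
    for _ in range(n_bits):
        out.append(state[4])                 # output is taken from stage 5
        feedback = state[4] ^ state[2]       # XOR of stages 5 and 3
        state = [feedback] + state[:4]       # shift every bit by one stage
    return out

bits = prbs5(62)
nrz = [2 * b - 1 for b in bits]   # map {0, 1} to a_k = -1 / +1 as in Eq. (2.23)
print("".join(str(b) for b in bits[:31]))
```

Any non-zero seed produces the same 31-bit pattern, only starting from a different point in the cycle.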

## 2.7 Loop Filter Optimization

For the optimization of the phase-locked loop, the loop filter is usually adjusted to take into account stability and damping issues. Stability and damping are used to provide the typical framework for optimizing the loop filter. As loop filter optimization is one of the most important issues in PLL design [7], this section has been devoted to describing this process.

#### 2.7.1 Stability Issues

For stability, the feedback loop shown in Fig. 2.13 has to be checked. A detailed discussion of the stability of a PLL loop can be found in Gardner [8].


Fig. 2.13 Typical Feedback Loop

The open loop gain for Fig. 2.13 is given by

$$G(s) = A(s)B(s)$$
 . (2.25)

The closed loop gain is given by

$$H(s) = \frac{A(s)}{1 + A(s)B(s)} \quad . \tag{2.26}$$

For the loop to be stable, the following condition on the open loop response must be satisfied at all frequencies.

$$\text{Stability Condition,} \quad |A(s)B(s)| < 1 \text{ or } \measuredangle A(s)B(s) > -\pi \quad . \tag{2.27}$$

For the purpose of stability discussion, most loop filters can be approximated by a simple single pole system characterized by a time constant  $\tau$ . Then, the open loop gain of the typical PLL is given by

$$\text{Loop Filter Response } F(s) = \frac{1}{1+s\tau}, \quad \text{Open Loop Gain } A(s)B(s) = \frac{K_D K_{VCO}}{s(1+s\tau)} \quad . \tag{2.28}$$

The system of Eq. (2.28) has two poles, at $s = 0$ and $s = -1/\tau$. As a rough guide for this two pole system, if the open loop frequency response crosses the 0-dB line at -40 dB/decade, the stability condition of Eq. (2.27) would be violated, and instability may occur in the closed feedback loop. In order for the loop to be stable, it is usually required that the open-loop frequency response crosses the 0-dB line with a slope of -20 dB/decade. A consequence of this rule of thumb is the condition for stability given in Eq. (2.29) below. The loop 3-dB bandwidth is $K_D K_{VCO}$, and the loop filter 3-dB bandwidth is approximated as $1/\tau$ for this system. In other words, the loop filter bandwidth should be designed to be larger than the loop bandwidth.

$$K_D K_{VCO} \text{ (Loop Bandwidth)} < \frac{1}{\tau} \text{ (Loop Filter Bandwidth)} \quad . \tag{2.29}$$
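The effect of the bandwidth ratio on stability can be quantified through the phase margin of the open loop gain of Eq. (2.28). The sketch below is a standard phase-margin calculation rather than a procedure from the text; the gain is assumed from Table 2.1 and the ratios $K_D K_{VCO}\tau$ are hypothetical.

```python
import math

def phase_margin_deg(k, tau):
    """Phase margin of the open-loop gain K / (s (1 + s tau)) of Eq. (2.28)."""
    # Unity-gain crossover: omega^2 * (1 + omega^2 tau^2) = K^2
    omega_c_sq = (-1.0 + math.sqrt(1.0 + 4.0 * (k * tau) ** 2)) / (2.0 * tau ** 2)
    omega_c = math.sqrt(omega_c_sq)
    # The open-loop phase is -90 degrees - atan(omega tau)
    return 90.0 - math.degrees(math.atan(omega_c * tau))

K = 2.0e9   # assumed K_D * K_VCO from Table 2.1, rad/s

for ratio in (0.1, 0.5, 1.0, 10.0):
    tau = ratio / K
    print(f"K*tau = {ratio:5.1f}: phase margin {phase_margin_deg(K, tau):6.2f} deg")
```

The phase margin shrinks as $K_D K_{VCO}\tau$ grows, which is the quantitative counterpart of keeping the loop bandwidth below the loop filter bandwidth in Eq. (2.29).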

#### **Root Locus Plots**

To check the absolute stability of the closed loop, root locus techniques and the Nyquist criterion can also be used to ensure that the design is stable. As an example, a root locus diagram and a Nyquist diagram of the open loop gain A(s)B(s) of a second order system have been plotted in Fig. 2.14 and Fig. 2.15.



Fig. 2.14 Root Locus Plot of the Open Loop Gain of a Second Order System

To check the stability using the root locus plot, the root locus of the open loop response (as in Eq. (2.25)) should be plotted. The details of drawing the root locus are omitted here, but plot analysis is described. This plot basically solves for the movement of the roots of Eq. (2.30) below with respect to  $K_{tweak}$ . This equation is the denominator of the closed loop function, and solving for the roots gives the poles of the closed loop function.

$$1 + K_{tweak}A(s)B(s) = 0 \quad . \tag{2.30}$$

Therefore, by plotting the root locus of A(s)B(s), the stability of the closed loop with respect to the change in  $K_{tweak}$ , can be observed. In this case,  $K_{tweak}$  represents an additional gain factor that may be needed to optimize the open loop function, in order to influence the position of the closed loop function poles.

The lines represent the movement of the closed loop poles, and each position on these lines corresponds to a different value of  $K_{tweak}$ . The closed loop is considered stable, as long as all the roots corresponding to one value of  $K_{tweak}$  are all in the left-half-plane of the root locus plot.

As can be observed in the left plot of Fig. 2.14, the loop is stable for all positive values of  $K_{tweak}$ . From the right plot of Fig. 2.14, the closed loop is unstable for increasingly negative  $K_{tweak}$  values, as can be seen from the movement of the line into the right-half-plane.

#### Nyquist Plots

The Nyquist plot in Fig. 2.15 below is also a plot created from the open loop transfer function A(s)B(s). The details of constructing this plot are also omitted here.



Fig. 2.15 Nyquist Plot of the Open Loop Gain of a Second Order System

In order to ensure the stability of the system, it is necessary to ensure that the number of counterclockwise encirclements of the point (-1,0), is equal to the number of right-half-plane poles of the open loop function A(s)B(s).

However, in general, the range of points on the real axis which have the number of counterclockwise encirclements equaling the open-loop right-half-plane poles, corresponds to the range of points  $(-1/K_{tweak}, 0)$  which are stable. The stability of the closed loop function can be modified by changing the open loop transfer function, using  $K_{tweak}$ , in the same manner as in the previous discussion of root locus plots.

In the case of Fig. 2.15, the stable region for the points  $(-1/K_{tweak}, 0)$ , are the points outside the circle.

#### 2.7.2 Damping Issues

Using the second-order equations for PLLs from Eq. (2.15), and approximating that the time constant  $\tau_2$  is significantly greater than  $\tau_1$ , the following equation for the damping ratio can be written

$$\text{Loop Filter Response } F(s) = \frac{1}{1+s\tau}, \quad \text{Damping Ratio } \zeta = \frac{1}{2}\sqrt{\frac{1}{K_D K_{VCO}\tau}} \quad . \tag{2.31}$$

Designing the damping ratio to be within  $0.6 < \zeta < 0.8$ , will ensure that the loop will not be too overdamped or underdamped.
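As a rough numerical illustration, assuming the typical gains of Table 2.1 so that $K_D K_{VCO} = 2 \times 10^9$ rad/s, Eq. (2.31) can be inverted to find the filter time constant that corresponds to each end of this damping range.

$$\tau = \frac{1}{4\zeta^2 K_D K_{VCO}}$$

$$\zeta = 0.6 \;\Rightarrow\; \tau = \frac{1}{4(0.36)(2 \times 10^9)} \approx 0.35\ \text{ns}, \qquad \zeta = 0.8 \;\Rightarrow\; \tau = \frac{1}{4(0.64)(2 \times 10^9)} \approx 0.20\ \text{ns}$$

Both values give $K_D K_{VCO}\tau < 1$ (about 0.69 and 0.39 respectively), so this damping range is also consistent with the stability guideline of Eq. (2.29).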

## 2.8 Non Linearities

The linear model for the phase detector and the VCO, in the form of the specification of parameters  $K_D$  and  $K_{VCO}$ , may be insufficient for modelling actual circuit implementations.

Phase detectors can act as sinusoidal functions of the input phase signal. For example, the voltage output of the mixer-type or sinusoidal phase detector can be of the form in Eq. (2.32) below, which is nonlinear in nature. Usually, the analysis of this type of phase detector involves examining the small signal behavior, and we simply linearize using small-signal approximations.

$$V_{PD} = A\sin(\Delta\phi) \quad . \tag{2.32}$$
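The range over which the small-signal approximation $\sin(\Delta\phi) \approx \Delta\phi$ holds can be gauged with a short calculation. This is a minimal sketch; the phase values tried are arbitrary and the function name is illustrative.

```python
import math

def linearisation_error(delta_phi):
    """Relative error of approximating sin(delta_phi) by delta_phi."""
    return (delta_phi - math.sin(delta_phi)) / math.sin(delta_phi)

for delta_phi in (0.1, 0.5, 1.0):
    err_percent = 100.0 * linearisation_error(delta_phi)
    print(f"delta_phi = {delta_phi:3.1f} rad: error {err_percent:6.2f} %")
```

The error stays below one percent for phase differences of about 0.1 rad but grows to several percent by 0.5 rad, so the linear gain $K_D = A$ is a fair model only close to lock.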

For other types of non-linear behavior of actual implementations of the phase detector and VCO, higher order term expansions could be used to more accurately model PLL behavior

$$\begin{aligned}
K_{D}(\Delta\phi) &= K_{D0} + K_{D1} \cdot \Delta\phi + K_{D2} \cdot \Delta\phi^{2} + \dots \\
K_{VCO}(\Delta\phi) &= K_{V0} + K_{V1} \cdot \Delta\phi + K_{V2} \cdot \Delta\phi^{2} + \dots
\end{aligned} \tag{2.33}$$

## 2.9 Noise

Phase noise is an important issue that has to be discussed in the design of phase locked loops [9]. An introduction to phase noise sources and their effects in the PLL will be given. There are many noise sources in a circuit, but the main sources that can be modelled in a simulator are from resistors and transistors. These noise models can be extended to other components and taken into account during noise simulation [10].

There is also noise from the varactors that are used to control the VCO if they are present. The noise of the varactors can be significant depending on the Q factor of the device, and empirical methods may be needed to model this. However, if a varactor diode is used, noise models for diodes can be used.

#### 2.9.1 Noise Sources

Resistors create thermal noise and are modeled with the following Eq. (2.34).

$$\overline{i_n^2} = 4kT \frac{1}{R} \Delta f \quad [k = 1.38 \times 10^{-23}\ \text{J/K (Boltzmann's constant)}] \quad . \tag{2.34}$$
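To give a sense of scale for Eq. (2.34), the sketch below computes the noise current density of a resistor in a 1 Hz bandwidth. The temperature of 300 K is an assumption, and the 50 ohm value is taken from the source impedance of Table 2.1.

```python
import math

BOLTZMANN = 1.38e-23   # J/K, as quoted with Eq. (2.34)

def resistor_noise_current_density(r_ohm, temp_k=300.0):
    """sqrt(4kT/R) in A/sqrt(Hz), i.e. Eq. (2.34) with delta_f = 1 Hz."""
    return math.sqrt(4.0 * BOLTZMANN * temp_k / r_ohm)

density = resistor_noise_current_density(50.0)
print(f"{density * 1.0e12:.1f} pA/sqrt(Hz)")
```

Because the noise current density scales as $1/\sqrt{R}$, a larger resistor contributes less noise current, although its noise voltage is correspondingly larger.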

Transistors have various noise mechanisms that can be analyzed. Carriers in the transistor CMOS channel move randomly, and generate a noise current modeled by Eq. (2.35) below. The excess noise factor,  $\gamma$ , can vary from 2/3 (long channel) to 3 (short channel NMOS in PMOS substrate) depending on the type of channel. The value of  $\alpha$  is technology dependent, and is used to model the ratio of ideal to non-ideal  $g_m$ . Thermal noise is present regardless of the frequency, and tends to be present as wide-spectrum white noise.

$$\text{Thermal Noise,} \quad \overline{i_n^2} = 4kT\gamma \frac{g_m}{\alpha} \Delta f \quad . \tag{2.35}$$

A noise phenomenon called 1/f noise (also called flicker noise, and named for its inverse dependence on frequency) is observed when carrier traps at the interface between the channel and the gate oxide of the transistor capture and release charge carriers at random. It is modeled by Eq. (2.36) below, and tends to be more significant at lower frequencies.

$$\text{1/f Noise,} \quad \overline{i_n^2} = \frac{K g_m^2}{f W L C_{ox}^2}\Delta f \quad . \tag{2.36}$$

The gate and source voltages can be changing relative to each other, and this can induce noise in the channel by capacitive coupling. This gives a noise current modeled with Eq. (2.37). This type of noise has a direct squared dependence on frequency, and is more significant at higher frequencies.

$$\text{Induced Noise,} \quad \overline{i_n^2} = 4kT\delta \frac{\alpha}{g_m} \left(\frac{2\pi C_{gs}f}{\sqrt{5}}\right)^2 \Delta f \quad . \tag{2.37}$$

Figure 2.16 places the different transistor noise sources in graphical perspective.



Fig. 2.16 Noise Spectrum of the Different Noise Sources

#### 2.9.2 Noise Modeling in the Loop

Besides modeling the loop noise by a simulation where the noise of each device is considered, the overall loop noise can also be analyzed at the system level.

The noise of each subcircuit in the CDR loop can be considered, and the overall phase noise can be discussed qualitatively in this framework. The phase noise of the CDR input source $\Delta \phi_{IN}$, the inherent noise of the phase detector $\Delta \phi_{PD}$, and the phase noise of the VCO $\Delta \phi_{VCO}$ are shown as additive noise sources in the overall loop in Fig. 2.17.

Fig. 2.17 Modeling Various Noise Sources in a CDR

The overall transfer function for the phase noise at the output, with the presence of these additive noise sources, is given in Eq. (2.38). A few observations can be made from this equation. The noise of the CDR input source and the phase detector have the same transfer function, and can be treated as added together. These two noise sources are generally transferred towards the output phase noise with a low pass filter function. The VCO phase noise is transferred differently towards the output, and is generally transferred with a high pass filter function.

$$\Delta\phi_{OUT} = \frac{K_D K_{VCO} F(s)}{s + K_D K_{VCO} F(s)} (\Delta\phi_{IN} + \Delta\phi_{PD}) + \frac{s}{s + K_D K_{VCO} F(s)} (\Delta\phi_{VCO}) \quad . \tag{2.38}$$
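The low-pass and high-pass character of the two terms in Eq. (2.38) can be illustrated numerically. The sketch below assumes a single-pole loop filter with a hypothetical time constant and the typical gains of Table 2.1; the function names are illustrative.

```python
def noise_transfer(omega, k, loop_filter):
    """Return the (low-pass, high-pass) noise transfer functions of Eq. (2.38)."""
    s = 1j * omega
    kf = k * loop_filter(s)
    return kf / (s + kf), s / (s + kf)

K = 2.0e9        # assumed K_D * K_VCO from Table 2.1, rad/s
TAU = 0.25e-9    # hypothetical loop filter time constant, seconds

def single_pole_filter(s):
    """Loop filter F(s) = 1 / (1 + s tau), as in Eq. (2.28)."""
    return 1.0 / (1.0 + s * TAU)

for omega in (1.0e6, 1.0e9, 1.0e12):
    low_pass, high_pass = noise_transfer(omega, K, single_pole_filter)
    print(f"{omega:.0e} rad/s: input/PD path {abs(low_pass):.3e}, "
          f"VCO path {abs(high_pass):.3e}")
```

The two transfer functions always sum to one, so widening the loop bandwidth to suppress more VCO noise necessarily lets through more of the input and phase detector noise.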



Fig. 2.18 Output Noise Spectrum of the Different Noise Sources

The typical noise spectra of these various additive noise sources are shown in Fig. 2.18. Notice that the phase noise of the CDR input source and the phase detector are plotted on the same graph. Generally, the phase detector introduces wide spectrum white noise. On the other hand, the input source introduces noise in the form of source feed through, which manifests itself as a frequency spur. The combination of these two is clear in the left-hand graph of Fig. 2.18.

As for the noise contribution from the VCO, the VCO phase noise is highest at lower frequencies and becomes smaller towards higher frequencies. The main noise mechanisms in the VCO are thermal noise and 1/f noise, which are then shaped by the loop, resulting in  $\frac{1}{f^2}$  and  $\frac{1}{f^3}$  noise functions respectively, at the output of the loop.

## Chapter 3

# Clock and Data Recovery Loop Components

In this chapter, the various CDR components are described. Each component needs to fulfill design requirements that enable all the components to work together in a closed loop. For the designer, it is still possible that the overall loop may be inoperable after placing all the painstakingly optimized components together. Therefore, it is recommended to plan at the system level, and specify requirements for each individual component, before actually designing the components individually. The remainder of this chapter describes the common components of CDR design [11].

## 3.1 CDR Phase Detectors

There are many different types of phase detectors, which use different mechanisms to detect the phase difference between two input signals. In this section, a basic description of phase detectors is given, followed by a specific look at CDR phase detectors.

#### 3.1.1 Principle of Phase Detection

A phase detector is a circuit that is able to give an output that indicates the phase difference between its two inputs. There are a few commonly-used types of CDR phase detectors, which are i) the mixer phase detectors, ii) the digital logic phase detectors, and iii) the sample and hold phase detectors. A simple phase detector can consist theoretically of just a single XOR gate, or a more complicated XOR-based phase and frequency detector (PFD) circuit [12].



Fig. 3.1 Basic XOR Gate Phase Detector

The XOR gate detector of Fig. 3.1 produces an output signal whose average DC level is proportional to the phase difference between the two input signals. A loop filter following the detector extracts this average DC level, which then sets the VCO control voltage.
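The proportionality between the average XOR output and the phase difference can be illustrated with a short numerical sketch. It assumes two ideal square waves with 50% duty cycle and logic levels of 0 and 1; the function name and the sampling resolution are hypothetical choices made only for this illustration.

```python
def xor_pd_average(phase_deg, samples_per_period=3600):
    """Average XOR output for two 50% duty-cycle square waves.

    The second input lags the first by phase_deg (0 to 180 degrees).
    """
    half = samples_per_period // 2
    shift = round(phase_deg / 360.0 * samples_per_period)
    high = 0
    for n in range(samples_per_period):
        a = n < half                                   # first input
        b = ((n - shift) % samples_per_period) < half  # delayed second input
        high += (a != b)                               # XOR of the two inputs
    return high / samples_per_period


for phase in (0, 45, 90, 135, 180):
    print(f"phase = {phase:3d} deg  average output = {xor_pd_average(phase):.3f}")
```

In this idealized model the average output rises linearly from 0 at a 0° phase difference to 1 at 180°, which is the linear detector characteristic that a loop filter would extract.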

#### 3.1.2 Mixers

The XOR gate phase detector of the previous section can be implemented using a Gilbert cell, as shown in Fig. 3.2. Different types of CDR phase detectors can be designed using this basic mixer circuit.



Fig. 3.2 A Gilbert cell can be Used as a Phase Detector

This circuit acts as an XOR gate with digital square input wave signals. However, it behaves more like a mixer when operating at very high frequencies, especially when the input digital signals have a more sinusoidal nature than a square one. The operation of such a mixer is illustrated in Fig. 3.3.

$$V_{2}\cos[\omega_{2}t + \phi_{2}] \downarrow$$

$$V_{1}\sin[\omega_{1}t + \phi_{1}] \longrightarrow V_{out} = 0.5V_{1}V_{2}\sin[(\omega_{1} - \omega_{2})t + (\phi_{1} - \phi_{2})]$$

Fig. 3.3 Basic Mixer-Based Phase Detector

The final output of the mixer phase detector can be filtered by a low pass filter, to give the result in Eq. (3.1), which is proportional to the phase difference of the two input signals

$$V_{LPF}(t) = \frac{V_1 V_2}{2} \sin[\phi_1 - \phi_2] \approx \frac{V_1 V_2}{2} [\phi_1 - \phi_2] \quad . \tag{3.1}$$
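As a quick check of Eq. (3.1), the sketch below multiplies the two mixer inputs and averages the product over one period, which stands in for an ideal low pass filter. It assumes that both inputs are at the same frequency ($\omega_1 = \omega_2$); the function name and the amplitude and phase values are hypothetical.

```python
import math


def mixer_pd_dc(v1, v2, phi1, phi2, n=10000):
    """Average of V1*sin(wt + phi1) * V2*cos(wt + phi2) over one period."""
    total = 0.0
    for k in range(n):
        wt = 2.0 * math.pi * k / n
        total += v1 * math.sin(wt + phi1) * v2 * math.cos(wt + phi2)
    return total / n


V1, V2 = 1.0, 1.0  # assumed amplitudes
for dphi in (0.05, 0.1, 0.5, 1.0):
    exact = mixer_pd_dc(V1, V2, dphi, 0.0)
    linear = 0.5 * V1 * V2 * dphi  # small-angle approximation of Eq. (3.1)
    print(f"dphi = {dphi:.2f} rad  averaged = {exact:.5f}  linear = {linear:.5f}")
```

The averaged product follows $\frac{V_1 V_2}{2}\sin[\phi_1 - \phi_2]$, and the linear approximation is only accurate for small phase differences.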

The mixer itself can be used as a simple phase detector, but should be combined with a mixer-based frequency detector to provide CDR phase detection. An example of such a frequency detector, called a quadricorrelator, is shown in Fig. 3.4. However, quadrature outputs $CLK_I$ and $CLK_Q$ of the VCO clock will be needed for this circuit.



Fig. 3.4 Quadricorrelator Enables CDR Phase Detection

#### 3.1.3 Sample and Hold

Another type of CDR phase detector involves sample and hold (S&H) circuits. The typical operation of such a circuit is described with the schematic in Fig. 3.5.

The sample and hold CDR phase detector will sample the VCO clock on each NRZ data transition, and produce a zero-order hold output waveform. The waveforms in Fig. 3.6 illustrate the concept of this detector. The voltage level of the S&H output signal reflects the phase difference between the VCO clock and the NRZ data transitions. If there is no data transition in a certain bit period, the S&H output signal simply does not change.

Fig. 3.5 Operation of a Sample and Hold CDR Phase Detector



Fig. 3.6 Waveform Diagram of a Sample and Hold Phase Detector

The transfer function equation for the sample and hold phase detector can be modelled as follows

$$H_{PD}(s) = \frac{K_{PD}}{\Delta T} \frac{1 - e^{-s\Delta T}}{s} \quad . \tag{3.2}$$
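The frequency response implied by Eq. (3.2) can be evaluated directly. The sketch below is only illustrative: it assumes that $\Delta T$ is the hold interval between samples, and the values of $K_{PD}$ and $\Delta T$ are hypothetical.

```python
import cmath
import math


def sample_hold_pd(f_hz, k_pd, delta_t):
    """Evaluate H_PD(s) of Eq. (3.2) at s = j*2*pi*f."""
    if f_hz == 0.0:
        return complex(k_pd)  # limit of Eq. (3.2) as s -> 0
    s = 2j * math.pi * f_hz
    return (k_pd / delta_t) * (1.0 - cmath.exp(-s * delta_t)) / s


K_PD = 1.0       # detector gain (assumed)
DELTA_T = 1.0e-9  # hold interval of 1 ns (assumed)

for f in (1.0e6, 1.0e8, 5.0e8, 1.0e9):
    h = sample_hold_pd(f, K_PD, DELTA_T)
    print(f"f = {f:.1e} Hz  |H_PD| = {abs(h):.4f}")
```

With these assumed values the gain is close to $K_{PD}$ at low frequencies and falls to zero at $f = 1/\Delta T$, which is the usual zero-order hold response.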

A circuit example of the sample and hold architecture is shown in Fig. 3.7.



Fig. 3.7 Circuit Diagram of a Sample and Hold CDR Phase Detector

#### 3.1.4 Digital Logic

Digital logic is mainly used in lower frequency phase detection, as the clock transitions are less well defined at high frequencies. Alternatively, more robust edge detection techniques have to be applied, to compensate for the higher tolerance requirements for data transitions. However, the techniques used in digital logic phase detectors are very common in many implementations of CDR circuits, as in a recent bang-bang phase detector [13]. Two common architectures are presented here, the Hogge [14] and Alexander CDR [15] phase detectors.

#### The Alexander CDR Phase Detector



Fig. 3.8 Alexander CDR Phase Detector

The Alexander phase detector depends on the detection of the phase position of the clock with respect to the data. The data signal is sampled at three points, as illustrated on the right of Fig. 3.8. Depending on the values of the three samples, a logic table can be constructed as in Table 3.1.

| a | b | c | A | B | PD Out | early-late  |
|---|---|---|---|---|--------|-------------|
| 0 | 0 | 0 | 0 | 0 | 0      | No Decision |
| 0 | 0 | 1 | 1 | 0 | 1      | Late        |
| 0 | 1 | 0 | 1 | 1 | 0      | No Decision |
| 0 | 1 | 1 | 0 | 1 | -1     | Early       |
| 1 | 0 | 0 | 0 | 1 | -1     | Early       |
| 1 | 0 | 1 | 1 | 1 | 0      | No Decision |
| 1 | 1 | 0 | 1 | 0 | 1      | Late        |
| 1 | 1 | 1 | 0 | 0 | 0      | No Decision |

Table 3.1 Logic Table for Alexander Detector

This table shows the logic decision of whether the clock is late or early with respect to the data transitions. It should be noted that when there is no data transition, there is no decision in the Alexander detector. The output of the detector can then be passed through a low pass filter as in typical PLLs.
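The logic of Table 3.1 can be reproduced with two XOR operations. In the sketch below, the relations $A = b \oplus c$, $B = a \oplus b$, and PD Out $= A - B$ are inferred from the table entries rather than stated explicitly in the text, and the function name is hypothetical.

```python
def alexander_pd(a, b, c):
    """Return (A, B, pd_out, decision) for three consecutive data samples."""
    sig_a = b ^ c  # inferred from Table 3.1
    sig_b = a ^ b  # inferred from Table 3.1
    pd_out = sig_a - sig_b
    decision = {1: "Late", -1: "Early", 0: "No Decision"}[pd_out]
    return sig_a, sig_b, pd_out, decision


# Regenerate every row of the logic table.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            sig_a, sig_b, pd_out, decision = alexander_pd(a, b, c)
            print(a, b, c, sig_a, sig_b, f"{pd_out:2d}", decision)
```

The rows with $a = c$ (no net transition across the three samples) give a zero output, which corresponds to the no-decision entries of the table.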

#### The Hogge CDR Phase Detector

The Hogge detector is an architecture that makes use of a simple logic comparison to generate the output, and perform the function of a CDR phase detector.



Fig. 3.9 Hogge CDR Phase Detector

This phase detector, shown in Fig. 3.9, generates two signals A and B. Examining the transition diagram to the right of Fig. 3.9, signal A is a reference signal that has a pulse width directly related to the period of the clock. Signal B is high in between the reference A and the data signal. The difference of A and B is passed through a low pass filter to generate the clock control voltage. When the clock is not locked to the data, the two signals A and B have differing pulse widths, and thus cause overall movement in the clock control voltage. On the other hand, when the clock is locked to the data, the two signals A and B are identical, except A will be delayed compared to B. The difference of signal A and B averages to zero, and this will keep the clock control voltage constant to maintain lock.

## 3.2 The Charge Pump

The concept of the charge pump, and its effect on the transfer function was discussed in the previous chapter. In this section, the circuitry of the charge pump is presented.

#### 3.2.1 Single-Ended Charge Pump

The basic charge pump used in PLL loops is shown in Fig. 3.10. The left half depicts the concept of the charge pump as being composed of switched currents, and the right half shows an implementation using CMOS transistors. The switches affect whether the current source will pull current from, or push current onto, the charge capacitor. The upper half supplies the pull-up current and the lower half supplies the pull-down current.



Fig. 3.10 Single-Ended Charge Pump

The single-ended CMOS charge pump uses a PMOS transistor as the switch for the pull-up current, and an NMOS transistor as the switch for the pull-down current. This arrangement is more efficient and symmetric than using NMOS transistors for both switches, since an NMOS transistor in the upper branch may not switch off the current effectively. The problem with this arrangement is the difficulty in matching the current sources to accommodate process variations, and the problem of supply feedthrough affecting the ground of the capacitor.

Another difficulty is that the control signal for the upper branch has to be a complement of the UP signal. If an inverter is required for this, it will affect the waveform of the UP signal, and unbalance the pull-up branch, as well as introduce a delay that may be significant at higher frequencies. Overall at higher frequencies, there is a problem of balancing the pull-up and pull-down function in the single-ended charge pump.

#### 3.2.2 The Differential Charge Pump

One way to solve the problems of single-ended charge pumps is to use a fully-differential architecture [16]. With the use of proper layout techniques, the effect of process variation and transistor mismatch on charge pump operation can be minimized. A differential charge pump architecture is shown in Fig. 3.11.



Fig. 3.11 Differential Charge Pump

The differential design ensures that the UP and DOWN signals pull the same current at the output, as long as the control input signals are consistent with each other. The drawback of differential designs is that they are more complicated to implement in the overall PLL architecture.

## 3.3 Voltage Controlled Architectures

The VCO is an essential component of the PLL as it produces the clock signal that has to synchronize with the input data signal. Two types of VCOs are usually used, namely ring oscillators and LC tank oscillators.

#### 3.3.1 Ring Oscillators

The principle of a ring oscillator is depicted in Fig. 3.12, with three single-stage differential amplifiers in cascade. The idea is to have any circuit disturbance propagate and amplify through the feedback, creating enough instability in the oscillator feedback to produce a sustained endless oscillation.



Fig. 3.12 Differential Ring Oscillator

In order for oscillation to set in, the frequency-dependent phase shift around the ring has to reach 180°, in addition to the 180° provided by the signal inversion in the loop. Since a single-pole amplifier stage contributes less than 90° of phase shift, at least three amplifiers in cascade are typically needed in the feedback loop. The frequency of oscillation of the ring oscillator is controlled by adjusting the propagation delay of each amplifier stage. It is for this reason that each amplifier stage can be termed a delay cell. The delay adjustment can be implemented by controlling the current in the amplifier stage, and one such amplifier cell is shown in Fig. 3.13.
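The dependence of the oscillation frequency on the stage delay can be made concrete with the standard first-order relation $f_{osc} = 1/(2 N t_d)$ for a ring of $N$ identical stages, each with propagation delay $t_d$. This relation is not stated in the text above and is used here as an assumption; the delay values in the sketch are hypothetical.

```python
def ring_oscillator_frequency(num_stages, stage_delay_s):
    """First-order estimate f = 1 / (2 * N * t_d) for N identical delay cells."""
    if num_stages < 1 or stage_delay_s <= 0.0:
        raise ValueError("need at least one stage and a positive delay")
    return 1.0 / (2.0 * num_stages * stage_delay_s)


# Three-stage ring with hypothetical per-stage delays.
for delay_ps in (50.0, 100.0, 200.0):
    f = ring_oscillator_frequency(3, delay_ps * 1.0e-12)
    print(f"t_d = {delay_ps:5.1f} ps  f_osc = {f / 1.0e9:.3f} GHz")
```

Under this model, halving the delay of each cell doubles the oscillation frequency, which is why the delay cell current is an effective frequency control.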



Fig. 3.13 Single Delay Cell of a Ring Oscillator

By controlling the voltage of $V_{Control+}$ and $V_{Control-}$, the current flowing through the delay cell can be adjusted. Increasing $V_{Control+}$ will lead to an increase in current in the delay cell, and a corresponding decrease in the overall delay time. Increasing $V_{Control-}$ will increase the resistance in the resistive load of the differential amplifier pair, and result in a corresponding increase in the overall delay time. Thus the pair of $V_{Control+}$ and $V_{Control-}$ exert differential control over the frequency of the ring oscillator.

#### 3.3.2 LC Tank Oscillators

Another oscillator used in PLLs is the LC tank oscillator. The differential version is more commonly used than the single-ended one because of better supply rejection, and the higher Q factor achievable.



Fig. 3.14 LC Tank Based Oscillator

A simple LC Tank Oscillator in Fig. 3.14 shows the cross-coupled differential pair that provides the negative resistance required for oscillation. The diode-like symbol with the arrow across it represents the varactor. The inductors and varactors form the LC tank of the oscillator and control the resonant frequency of the oscillator. L is the total inductance, whereas C is the total capacitance of the varactors and parasitic capacitances at the oscillating nodes. The oscillation frequency is given in Eq. (3.3), and the values of L and C directly affect the resonant frequency.

$$f_o = \frac{1}{2\pi\sqrt{[LC]_{tank}}} \quad . \tag{3.3}$$

Usually, the value of L is fixed by employing two fixed-value inductors in the oscillator, and the frequency of this oscillator is adjusted only by controlling the varactor control voltage $V_{Control}$ to adjust the value of C in the tank. The Q factor of the varactor is typically not very high, and recent research has focused on improving the Q factor of this device, in order to improve signal purity in the oscillation [17].
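To illustrate how the varactor capacitance sets the tuning range through Eq. (3.3), the following sketch evaluates the resonant frequency for a fixed inductance and a range of tank capacitances. The component values are hypothetical and are not taken from any design in this work.

```python
import math


def lc_tank_frequency(l_henry, c_farad):
    """Resonant frequency of Eq. (3.3): f_o = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


L_TANK = 1.0e-9  # total tank inductance, 1 nH (assumed)

# Sweep the total tank capacitance (varactor plus parasitics), assumed values.
for c_pf in (0.5, 0.75, 1.0):
    f_o = lc_tank_frequency(L_TANK, c_pf * 1.0e-12)
    print(f"C = {c_pf:.2f} pF  f_o = {f_o / 1.0e9:.3f} GHz")
```

Because the frequency scales with $1/\sqrt{C}$, halving the tank capacitance raises the frequency by a factor of only $\sqrt{2}$, so the parasitic capacitance at the oscillating nodes directly limits the achievable tuning range.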

## 3.4 Varactors

Varactors are mainly used to control the frequency of oscillation of LC tank oscillators, and are not used in ring oscillators, because ring oscillators can have their frequencies controlled purely using circuit biasing techniques. There are various types of varactors in use: i) the varactor diode, ii) MOS varactors, and iii) N+ Nwell varactors, which are described in this section. All of the varactor topologies described throughout Section 3.4 can be implemented in the CMOS process.

#### 3.4.1 Varactor Diodes

A varactor diode is used in the manner depicted in Fig. 3.15, and was used in most early oscillator designs.



Fig. 3.15 Varactor Diode

The voltage input to the n+ doped region is used to control the capacitance at the output of the p+ doped region, and as  $V_{Control}$  increases with respect to  $V_{Output}$ , the capacitance decreases. The diode should not be forward biased, as that will negate the use of the diode as a varactor.

#### 3.4.2 MOS Varactors

The MOS varactors are derived from traditional PMOS and NMOS transistors, with the corresponding varactor configurations shown on the left hand side and right hand side of Fig. 3.16, respectively.



Fig. 3.16 PMOS (left) and NMOS (right) Based Varactors

The drain and source are tied together for both MOS transistor types and they constitute the control terminal  $V_{Control}$  of the device. The output of the varactor at which the LC tank is attached, is the gate of the transistor.

The body terminal of the PMOS transistor has to be biased to Vdd or the rail voltage. For the PMOS varactor,  $V_{Control}$  should be kept below this bulk terminal voltage. In addition, as  $V_{Control}$  increases, the capacitance of the varactor increases.

On the other hand, the body terminal of the NMOS transistor has to be biased to ground, and $V_{Control}$ has to be kept above this bulk terminal voltage. The behavior of the NMOS varactor is the opposite: as $V_{Control}$ increases, the capacitance of the varactor decreases.

#### 3.4.3 N+ Nwell Varactors

Another type of varactor is the n+ in Nwell type of varactor, shown in Fig. 3.17.



Fig. 3.17 N+ Nwell Varactor

The nwell is biased by the voltage of the n+ doped regions, which is the  $V_{Control}$  for this varactor. The output is also the gate of this device. This n+ Nwell configuration is derived by eliminating the p+ doped regions of the PMOS transistor. As  $V_{Control}$  increases with respect to  $V_{Output}$ , the capacitance decreases.

#### 3.4.4 Varactor Comparison

Figure 3.18 gives a graphical comparison of the effects of the various varactors discussed previously.



Fig. 3.18 Tuning Characteristics of VCOs using a Diode Varactor, a PMOS Varactor, and an N+ Nwell Varactor

The diode, PMOS, and N+ Nwell varactors are each used to tune a VCO, and the change in the VCO frequency with respect to the change in control voltage is plotted. A higher capacitance causes a decrease in the frequency of the oscillator. As can be seen in the figure, increasing the control voltage for the N+ Nwell varactor and the diode results in a decrease in capacitance, and hence an increase in the frequency of oscillation. The PMOS varactor has the inverse effect with a change in control voltage. Further information on the characteristics of these varactors is given in [18].

#### 3.4.5 Modelling Varactors

In general, varactors can be modelled with the schematic depicted in Fig. 3.19.



Fig. 3.19 Varactors Modelling

The parameters $R_P$, $L_P$ and $C_P$ model the parasitic resistance and parasitic inductance looking into the device, and the parasitic capacitance across the varactor. The parameter $R_G$ models the resistance looking into the gate, and $C_v$ and $R_v$ are used to model the variable nature of the varactor. More detailed modelling information can be found in [19].

#### 3.4.6 Three Terminal Varactors

A three terminal varactor is shown in Fig. 3.20, which is a combination of the previous types of varactors.



Fig. 3.20 Three Terminal Varactor

The operation of this three terminal varactor is summarized in Table 3.2.

| Capacitance seen at output node | Effect of increasing gate voltage | Effect of increasing drain (n+) voltage | Effect of increasing source (p+) voltage |
|---------------------------------|-----------------------------------|-----------------------------------------|------------------------------------------|
| Gate                            | Increase                          | Decrease                                | Increase                                 |
| Drain(n+)                       | Increase                          | Decrease                                | Increase                                 |
| Source(p+)                      | Increase                          | Decrease                                | Increase                                 |

Table 3.2 Summary of Three Terminal Varactor Operation

It is possible to use the source (p+) as the output node, and control the capacitance seen at this output node with the gate and drain voltages. As the gate and drain voltages each affect the output capacitance in an opposing manner, this method can be considered a form of differential control of the varactor. For further analysis of this varactor, refer to [20].

## 3.5 Loop Filters

Loop filters are used to filter out high frequency phase changes, and assist in locking the VCO control signal to the required control voltage. There are different types of loop filters, which can basically be classified into passive and active type loop filters.

In general, when comparing different types of loop filters, the biggest distinguishing factor would be whether they are passive or active filters. Active filters give better defined filtering characteristics and better control over the filtering function, but have more complexity as they use active amplifiers. In contrast, passive filters have more design limitations, but have fewer design variations, and are thus easier to design. Another design consideration is the order of the filter, and this is discussed in Section 2.1. There are more types of filter variation described in other works, but they are less crucial for the purpose of CDR design.

#### 3.5.1 Passive Loop Filters

In Fig. 3.21, the left circuit shows a single-ended loop filter, and the right circuit shows a differential loop filter. The effects on the overall transfer function, and the modelling specifics of this particular filter, have been covered in the previous chapter on third order loops. In order for the differential loop filter to be balanced, the resistors are divided in two, and placed symmetrically between the differential terminals.

Fig. 3.21 Passive Loop Filters: Single Ended (left), and Differential (right)

#### 3.5.2 Active Loop Filters

In Fig. 3.22, two examples of possible loop filters for a third-order PLL are shown. These are called active filters because they include amplifiers, which usually contain active devices like transistors, and do not just consist of passive components like resistors and capacitors.



Fig. 3.22 Examples of Active Loop Filters

The transfer function for the loop filter on the left of Fig. 3.22 is given by

$$F(s) = \frac{1}{R_1 C_1} \frac{1 + sR_2(C_1 + C_2)}{s(1 + sR_2 C_2)} \quad . \tag{3.4}$$

The transfer function for the loop filter on the right of Fig. 3.22 is given by

$$F(s) = \frac{1}{R_1 C_1} \frac{1 + sR_2 C_1}{s(1 + sR_3 C_2)} \quad . \tag{3.5}$$
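The corner frequencies of these filters follow directly from the transfer functions. As a sketch, the code below evaluates Eq. (3.4) and its zero at $1/(R_2(C_1 + C_2))$ and pole at $1/(R_2 C_2)$, both in rad/s. The component values and function names are hypothetical and serve only to illustrate the calculation.

```python
import math


def active_filter_eq34(f_hz, r1, c1, r2, c2):
    """Evaluate F(s) of Eq. (3.4) at s = j*2*pi*f."""
    s = 2j * math.pi * f_hz
    return (1.0 / (r1 * c1)) * (1.0 + s * r2 * (c1 + c2)) / (s * (1.0 + s * r2 * c2))


def corner_frequencies_eq34(r2, c1, c2):
    """Return (f_zero, f_pole) in Hz for Eq. (3.4)."""
    f_zero = 1.0 / (2.0 * math.pi * r2 * (c1 + c2))
    f_pole = 1.0 / (2.0 * math.pi * r2 * c2)
    return f_zero, f_pole


# Hypothetical component values.
R1, R2 = 10.0e3, 10.0e3    # ohms
C1, C2 = 10.0e-12, 1.0e-12  # farads

f_zero, f_pole = corner_frequencies_eq34(R2, C1, C2)
print(f"zero at {f_zero / 1.0e6:.3f} MHz, pole at {f_pole / 1.0e6:.3f} MHz")
for f in (1.0e4, f_zero, f_pole, 1.0e9):
    print(f"f = {f:.3e} Hz  |F| = {abs(active_filter_eq34(f, R1, C1, R2, C2)):.4e}")
```

For Eq. (3.4) the zero always lies below the pole, separated by the ratio $(C_1 + C_2)/C_2$, so the choice of $C_2$ relative to $C_1$ sets the width of the frequency band in which the integrator slope is flattened.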

## 3.6 Common Mode Feedback Circuits

When using differential circuits, it is often necessary to make use of common mode feedback circuits (CMFB) [21]. The common mode voltage refers to the average of the two voltages of a differential signal.

The reason for using these CMFB circuits is that the overall loop feedback determines the differential voltages, but does not affect the common mode voltage. Therefore, it may be necessary, especially before a differential loop filter, to maintain the common mode level of the filter. These CMFB circuits sense the change in common mode voltage levels, and adjust the DC biasing currents or voltages of the circuits charging the loop filter. An example of a CMFB circuit is given in Fig. 3.23.



Fig. 3.23 Common Mode Feedback Circuit

In this circuit, the feedback mechanism works by trying to balance the currents in each branch, as indicated in Fig. 3.23, to be equal to one unit of I (the biasing current of one current source). By seeking to keep the total of the currents in the transistors directly connected to  $V_{Diff}$  (the differential signal), equal to 2I, this circuit balances the average of the voltage levels of the differential signal. In this sense, it then generates a corresponding change by feedback, in the output signal  $V_{Out}$ , that controls the biasing voltages or currents in the previous stage feeding the CMFB circuit.

Many different types of CMFB circuits exist, but the general concept of averaging the differential signals and applying feedback is similar to that described above. The analysis of the CMFB circuit is outside the scope of this work, but can be examined in [22].

## 3.7 High-Speed Architecture Considerations

As signal frequencies are ramped to higher frequencies in circuits, the unity-gain frequency  $(f_T)$  of the transistors may not be high enough to provide sufficient bandwidth for signal propagation. As the periods of signals become smaller, the maximum possible rise and fall times of signals become stricter. Slew rate limiting and distortion may set in as the transistors become unable to keep up with ideal behavior. As such, there are some special considerations for high-speed designs [23].

#### 3.7.1 MOS Current-Mode Logic Circuits

In phase detectors, where the highest switching frequency is required in order to detect data edges and clock transitions, there is a high potential for failure due to excessive device parasitics and speed limitations in the design. A method that has been used to alleviate this situation in CMOS technologies is the use of MOS Current-Mode Logic (MCML) circuits [24]. An MCML inverter is shown in Fig. 3.24. This type of gate design is based on Current Mode Logic (CML), which was originally used in bipolar circuits.



Fig. 3.24 MOS Current-Mode Logic Inverter

The various parameters for tuning this MCML gate are the signal voltage swing, resistance R, and the bias voltage $V_{BIAS}$ of the current source. Increasing the value of $V_{BIAS}$ increases the current in the MCML gate, and speeds up the inverter, with a corresponding increase in gate power consumption as a tradeoff.

The idea behind MCML gates is that by keeping the signal voltage swing small, there will be less slew rate limiting of the signal at the output of the gate. The value of R determines the signal voltage swing of the signals passing through the gate. Therefore, the higher the resistance, the larger the maximum signal voltage swing, but as more slew rate limiting needs to be factored in, the gate will have a lower maximum frequency. There is thus another tradeoff, between a higher signal swing for better signal integrity and a higher operating frequency.
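The tradeoff between swing and speed can be sketched with a first-order model. The two relations used below are assumptions that are not derived in this work: the output swing is taken as the tail current times the load resistance, and the gate delay is taken as the 50% point of a single-pole RC response, $\ln(2) R C_L$. All component values are hypothetical.

```python
import math


def mcml_swing(i_bias_a, r_load_ohm):
    """Assumed first-order output swing: tail current times load resistance."""
    return i_bias_a * r_load_ohm


def mcml_rc_delay(r_load_ohm, c_load_f):
    """Assumed single-pole delay: 50% point of an RC step response."""
    return math.log(2.0) * r_load_ohm * c_load_f


I_BIAS = 400.0e-6  # tail current, 400 uA (assumed)
C_LOAD = 20.0e-15  # load capacitance, 20 fF (assumed)

for r in (500.0, 1000.0, 2000.0):
    swing = mcml_swing(I_BIAS, r)
    delay = mcml_rc_delay(r, C_LOAD)
    print(f"R = {r:6.0f} ohm  swing = {swing * 1e3:.0f} mV  delay = {delay * 1e12:.2f} ps")
```

In this simplified picture, doubling R doubles both the swing and the delay for a fixed bias current, which is one way to view the tradeoff between signal integrity and operating frequency.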

#### 3.7.2 VCO Operating Frequency

For the PLL to work at all at high frequencies, the VCO has to be able to generate high clock speeds. The MCML inverters can be cascaded to be used as ring oscillators for high frequency operation, but buffers will be required as these optimized MCML inverters tend to be unable to drive large capacitive loads. As explained previously, there is an important tradeoff between power consumption and oscillation frequency for such oscillators.

Usually, ring oscillators are unable to reach the highest design frequency specifications, and LC tank oscillators tend to be used to provide these high clocking frequencies. Specifically, higher currents can be used to boost the oscillating frequency of the LC tank oscillator. With this, there is also a design trade-off between frequency and power consumption. As a VCO designed for high frequencies can consume a high current, it is necessary to consider the power budget of the overall design before setting the oscillation frequency.

## Chapter 4

# Clock and Data Recovery Implementations

This chapter covers the different architectures implemented in this research work. The first architecture to be tested is a sample-and-hold architecture, followed by an adaptation into a quadrature mixer half-rate architecture, then a final improvement in the last implementation. Basically, half-rate architectures [25] enable data recovery at twice the data rate of full-rate architectures, while using the same VCO clock speed.

## 4.1 A 1Gbps Simple Sample-and-Hold Architecture

The principle of the sample and hold phase detector was explained in the previous chapter. This implementation is based on the design by Anand et al. [26].



Fig. 4.1 Overall Chip Architecture Block Diagram

A chip was implemented based on the functional block diagram of Fig. 4.1. The dual arrows between each block signify differential signalling. Basically in this implementation, the sample-and-hold phase detector is combined with a voltage-to-current-type charge pump and a ring oscillator.

#### 4.1.1 Design Components

#### Phase Detector

The circuit diagram of the sample-and-hold phase detector is given in Fig. 4.2.



Fig. 4.2 Sample and Hold Phase Detector

This phase detector operates similarly to a master-slave flip-flop, but uses the concept of sample-and-hold explained in Section 3.1.3. The first stage, on the left side of Fig. 4.2, converts the input differential voltage signal (DATA IN) into a control signal for the current sources of the following stages. Hence, the two differential signals are used to turn the following sampling circuits on and off.

The two sampling stages (second and third stage) are identical. Each stage consists of a differential pair, whose purpose is to hold the voltage sample on the holding capacitors. The differential pairs are activated for sampling by the simultaneous turning on of the current source below the pair, and the transistor loads above the pair. Therefore, the two differential stages are turned on alternately by voltage transitions in the data signal, and the sampled voltage level of the VCO clock signal reaches the output after two data signal transitions.

#### Charge Pump



Fig. 4.3 Voltage-to-Current Charge Pump

The charge pump used in this implementation is shown in Fig. 4.3. This voltage-to-current type charge pump converts the input signal into a control signal for the current source, which then moves current on and off the loop filter that is placed at the output of the charge pump. The left side of Fig. 4.3 shows the circuitry for this signal conversion, and the right side consists of the Common Mode Feedback (CMFB) circuit used to control the common mode voltage of the charge pump output signal. The CMFB circuit controls the common mode voltage by adjusting the biasing of the charge pump transistor loads. The common mode voltage of the charge pump output can be fixed by setting the voltage of *VBIAS* in the CMFB circuit.

#### Ring Oscillator VCO

A ring oscillator with three identical delay stages is used in this CDR implementation, with each stage corresponding to the delay cell shown in Fig. 4.4. This delay cell makes use of an interpolation technique to adjust the delay. There are two paths for the input signal in this delay cell, with *Path2* having a longer delay consisting of two cascaded differential pairs. In contrast, the signal path of *Path1* has only one differential pair stage.

Fig. 4.4 Delay Cell in VCO Charge Pump

Through the technique of delay interpolation, the fine control and coarse control signals adjust the delay of the delay cell, by adjusting the currents powering the two different delay paths. By turning one delay path on with higher current preferentially to the other delay path, that delay path can be activated more than the other one. The overall delay is then more influenced by the activated delay path.
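The effect of delay interpolation can be illustrated with an idealized linear model, in which the effective cell delay is the average of the two path delays weighted by the current steered into each path. This model is an assumption made only for illustration and is not derived from the circuit of Fig. 4.4; the delay and current values are hypothetical.

```python
def interpolated_delay(i_path1, i_path2, t_path1, t_path2):
    """Idealized delay interpolation: current-weighted average of two delays."""
    total = i_path1 + i_path2
    if total <= 0.0:
        raise ValueError("at least one path must carry current")
    weight1 = i_path1 / total
    return weight1 * t_path1 + (1.0 - weight1) * t_path2


T_PATH1 = 100.0e-12  # fast path, one differential pair (assumed delay)
T_PATH2 = 200.0e-12  # slow path, two cascaded pairs (assumed delay)
I_TOTAL = 1.0e-3     # total bias current shared by both paths (assumed)

for fraction in (1.0, 0.75, 0.5, 0.25, 0.0):
    i1 = fraction * I_TOTAL
    t_cell = interpolated_delay(i1, I_TOTAL - i1, T_PATH1, T_PATH2)
    print(f"Path1 share = {fraction:.2f}  cell delay = {t_cell * 1e12:.1f} ps")
```

In this model the cell delay moves continuously between the delay of the fast path and that of the slow path as the current is steered from one to the other, which is the tuning behaviour described above.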

#### 4.1.2 Simulations and Measurements

#### Fabrication

A micrograph of the first fabricated chip is shown in Fig. 4.5(a), which has been implemented using the CMOS  $0.18\mu$ m process. The on-chip capacitors occupy a large area of the chip on the left, and the ring oscillator can be seen in the marked right part of the micrograph. The three delay stages of the ring oscillator can be observed to be clearly divided on the chip layout. In addition, care was taken during the layout process, to ensure proper isolation between the different cells, to prevent interference and to minimize noise.

The chip was tested using the PCB test board shown in Fig. 4.5(b). This is a test package provided by CMC for chip prototype testing, and this test board is a generic design. Thus, custom connections are needed to route the power, ground supplies and signals to the appropriate SMA connectors. This PCB has average performance, and is less suitable for high speed operation as there are package losses at such frequencies.



Fig. 4.5 Chip Micrograph and Test Board

#### Results

The results of the measurements are given in Fig. 4.6, Fig. 4.7, and Fig. 4.8.



Fig. 4.6 Transient Signal at the Output

In Fig. 4.6, the transient waveform from an oscilloscope measurement is shown. The oscilloscope used was the Tektronix TDS 8000 Digital Sampling Oscilloscope. The output signal shown in Fig. 4.6 has an amplitude of about 300 mV, and a frequency of 900 MHz.

Figure 4.7 is a plot from a spectrum analyzer measuring the frequency of oscillation. The measured spectrum has a peak at 1.2935 GHz, with some observable sideband interference.

Fig. 4.7 Spectrum Analyzer Graph of Output Signal



Fig. 4.8 Phase Noise Measurement of the Output Signal

Lastly, in Fig. 4.8, the phase noise graph of the output VCO signal is shown. This phase noise plot is obtained using the Agilent E4440A PSA Series Spectrum Analyzer, and a measured phase noise of -60.44 dBc/Hz is observed at a 1 MHz offset, at a carrier frequency of 848.1 MHz.

## 4.2 Full-Rate 5.5Gbps Quadrature Quasi-Mixer Implementation

This architecture is an improvement over the previous architecture of Section 4.1. This novel implementation in CMOS technology can run at higher frequencies due to the use of high-speed mixer architectures [27] in the speed-critical components of the design. The basic block diagram of the overall architecture is shown in Fig. 4.9.



Fig. 4.9 Quadrature Mixer Half-Rate Architecture

The input data signal is fed into the two phase detectors. These two phase detectors are both clocked by the VCO, except one of them is clocked with a 90° phase-shifted VCO signal from the quadrature VCO clock. The outputs of the two phase detectors are then fed into a frequency detector.

The frequency detector helps pull the loop frequency towards the data frequency, whereas the phase detector helps maintain a constant phase difference between the data signal and the VCO clock signal. The frequency detector thus produces an additional charge pump control signal to aid the PLL lock. Therefore, the charge pump needs to have two sets of differential control signals, one from the phase detector and the other one from the frequency detector.

#### 4.2.1 Design Components

This section describes the details of this implementation. The VCO runs at a frequency of around 5.5 GHz, and the data runs at around 5.5 Gbps. The CDR components of this implementation that are represented in Fig. 4.9 are described next.

#### Phase Detector

Fig. 4.10 Phase Detector

The phase detector of the architecture is given in Fig. 4.10. This circuit is an improvement over the simpler sample-and-hold architecture of Fig. 4.2. It works as a type of dual-edge-triggered flip-flop that samples the voltage value of the clock at each transition of the input data. However, this circuit operates in a quasi-mixer mode, as the voltage signals are kept small to enable higher speeds of operation. As the phase detector is required to work at a high frequency, transistors of smaller widths are used throughout the flip-flop.

The first stage consists of two quasi-mixer cells which feed the second stage mixer cell. The NRZ data is thus mixed with the clock in these two stages. Overall, this circuit produces an output beat signal that has a voltage level corresponding to the phase difference between the data and clock signals. Finally, the third stage is a buffer stage that is required because the small transistors of the previous stage cannot sufficiently drive the output loads.



Fig. 4.11 Frequency Detector

#### **Frequency Detector**

The frequency detector of Fig. 4.11 acts as a high-speed detector of the difference in frequency, but only in combination with the two phase detectors fed with a VCO clock in quadrature. The two signals from the PD and QPD phase detectors are mixed, and used to produce a beat signal that depends on whether the frequency of the VCO clock is higher or lower than the clock of the data signal.

The circuits of the first stage mix the QPD and PD inputs, and their outputs are then multiplexed by the second stage circuit. The output of this frequency detector has three states. If the frequency of the VCO clock and the NRZ data are close enough to each other, the outputs are kept high. If the frequency of the VCO is higher or lower than that of the data, one of the outputs will oscillate.

Specifically, if the frequency of the VCO is greater than the frequency of the data signal, the signal of FDdown stays at 1.8V, while the signal of FDup beats at a specific frequency. This beat frequency is dependent on the magnitude of the difference in frequencies between the VCO and data. On the other hand, the situation is reversed when the frequency of the data signal is greater than the frequency of the VCO.
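The three output states described above can be sketched behaviourally in Python. This is an illustration only, not the transistor-level circuit: the function name `fd_outputs`, the `lock_tolerance_hz` threshold for "close enough", and the assumption that the beat frequency equals the absolute frequency difference are introduced here and are not taken from the design.

```python
def fd_outputs(f_vco_hz, f_data_hz, lock_tolerance_hz=1e6):
    """Return (FDup, FDdown, beat_hz) for a behavioural frequency detector.

    "high" means the output rests at the supply level and "beating" means
    it toggles at roughly the beat frequency. lock_tolerance_hz is a
    hypothetical threshold, not a value from the thesis.
    """
    delta = f_vco_hz - f_data_hz
    if abs(delta) <= lock_tolerance_hz:
        # Frequencies close enough: both outputs are kept high.
        return ("high", "high", 0.0)
    if delta > 0:
        # VCO faster than the data clock: FDdown stays high, FDup beats.
        return ("beating", "high", abs(delta))
    # VCO slower than the data clock: the roles are reversed.
    return ("high", "beating", abs(delta))


if __name__ == "__main__":
    for f_vco in (5.4e9, 5.5e9, 5.6e9):
        print(f_vco, fd_outputs(f_vco, 5.5e9))
```

In this sketch, only the sign of the frequency error decides which output beats; the magnitude of the error appears only in the beat frequency.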

#### Charge Pump



Fig. 4.12 Charge Pump

Charge pumps are needed to supply currents to, and sink currents from, the loop filter. Most charge pumps use digital signals to control the movement of charge. Therefore, work on charge pumps tends to emphasize the accuracy of the circuit under digital control. However, this is not necessarily suitable in high-speed phase-locked loops, where eliminating the digital control signals reduces voltage spiking. The charge pump of Fig. 4.12 instead moves incremental charges depending on the amplitude and frequency of the controlling input signals, in contrast to the operation of time-controlled fixed-current charge pumps.

An important design issue is the difficulty of matching the pull-up and pull-down currents of the charge pump. The use of complementary current mirrors, as in Fig. 4.12, keeps the pull-up and pull-down currents reasonably equal. This charge pump is fully balanced by the arrangement of the differential pairs, and is minimally affected by process variations and design imbalances.

| Condition            | Contu     | Contd     |
|----------------------|-----------|-----------|
| FDup and FDdown high | No change | No change |
| FDup drops low       | Rises     | Falls     |
| FDdown drops low     | Falls     | Rises     |
| PD high and PDb low  | Falls     | Rises     |
| PDb high and PD low  | Rises     | Falls     |

Table 4.1 Charge Pump Operation

The charge pump in Fig. 4.12 has two controlling differential inputs, one from the phase detector (PD and PDb) and one from the frequency detector (FDup and FDdown). The differential output of this charge pump is at Contu and Contd. The frequency detector moves differential charges as long as there is a frequency difference. By default, when the FD detects that no frequency shift is required, the FDup and FDdown signals are both high at the same time. However, if the frequency drifts, one of the voltages of FDup or FDdown will drop, which results in frequency correction. The PD moves charges depending on the amount of phase difference, and produces a beat signal that moves the VCO clock into phase with the input data signal. A summary of the charge pump operation is given in Table 4.1.
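The behaviour summarized in Table 4.1 can be written as a small lookup, sketched below in Python. The function names `fd_action` and `pd_action` are hypothetical, inputs are reduced to booleans (True for high), and two points are assumptions rather than statements from the text: that the two PD rows describe opposite input polarities, and that equal PD inputs produce no net change.

```python
def fd_action(fd_up_high, fd_down_high):
    """Frequency-detector contribution to (Contu, Contd), following Table 4.1."""
    if fd_up_high and fd_down_high:
        return ("no change", "no change")
    if not fd_up_high and fd_down_high:
        # FDup drops low.
        return ("rises", "falls")
    if fd_up_high and not fd_down_high:
        # FDdown drops low.
        return ("falls", "rises")
    # Both outputs low is not described in the text.
    raise ValueError("FDup and FDdown both low is not a documented state")


def pd_action(pd_high, pdb_high):
    """Phase-detector contribution to (Contu, Contd)."""
    if pd_high and not pdb_high:
        return ("falls", "rises")
    if pdb_high and not pd_high:
        # Assumed to be the mirror image of the row above.
        return ("rises", "falls")
    # Assumption: equal inputs leave the differential pair balanced.
    return ("no change", "no change")


if __name__ == "__main__":
    print(fd_action(False, True), pd_action(True, False))
```

In the real circuit both inputs are analog and act at the same time, so the lookup only indicates the direction in which each input pushes the differential output.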

The common mode feedback circuit is also shown in the middle of Fig. 4.12, and this circuit keeps the common mode level of the output of the charge pump at a constant level. The loop filter that is connected to the charge pump is on the right side of Fig. 4.12.



#### NMOS LC-tank oscillator

Fig. 4.13 Quadrature Voltage Controlled Oscillator

Instead of the ring oscillator used in the previous architecture of Section 4.1, the LC-tank oscillator of Fig. 4.13 is used. Higher Q factors are achievable with an LC-tank oscillator, and it is able to drive larger capacitive loads. There are four quadrature outputs, which are *Outa*, *Outb*, *Outc* and *Outd*. This quadrature VCO uses two tied NMOS cross-coupled pairs to create the signals in quadrature, which are then used to drive the two phase detectors.

Sufficiently high gain in the cross-coupled pair is required to make the circuit unstable and start oscillation. As the NMOS cross-coupled pairs generate the main part of the oscillation, they have larger widths than the PMOS pairs. The PMOS pairs in this oscillator are optional, and are placed there just to keep the VCO signal oscillating between the rail voltage (VDD) and ground. They are not sized large enough to contribute the gain ($g_m$) required for oscillation. If the PMOS pairs were omitted, the VCO signal would have a DC level equal to the rail voltage, and would require a following DC voltage shifter stage before the VCO signal could be used elsewhere in the PLL. Note that the inductors of this VCO are integrated on the same silicon chip.

#### Front Stage for Data Input



Fig. 4.14 Single-Ended to Differential Converter

The input stage for the data input is shown in Fig. 4.14. This circuit converts a single-ended signal into a differential signal. It is used as the input stage, as it is difficult to generate a differential input NRZ data signal. This stage uses a two-stage differential pair with active load to generate the fully differential signal. Finally, a pair of inverters is used to buffer the output signal to the following stage.

#### 4.2.2 Simulations and Measurements

#### Fabrication

This modified CDR was implemented in a cmosp18 process with the micrograph shown in Fig. 4.15.



Fig. 4.15 Chip Micrograph

The on-chip inductors are shown at the top-left, and the 100 pF capacitors are at the bottom of Fig. 4.15. The large capacitances of the loop filter and the large inductor required for the lower frequency dominate the area of the die.

#### Simulations



Fig. 4.16 Control Signal Lock Simulation

In Fig. 4.16, the simulation shows that the control signal of the VCO locks to the input data frequency. Notice that the movement of the control signal is linear; this is due to the improved linear charge pump, which provides constant current pumping to the loop filter. The locking time in this simulation is about 600 ns.

#### Measurements

The results of measurement for this implementation are shown in Fig. 4.17, Fig. 4.18, and Fig. 4.19.



Fig. 4.17 Transient Signal at the Output

In Fig. 4.17, the transient waveform displayed on the Tektronix TDS 8000 Digital Sampling Oscilloscope is shown. The output signal has an amplitude of about 14 mV, and a frequency of 5.586 GHz. The output signal has a lower voltage amplitude, as the output stage was designed to sacrifice amplitude for a higher speed of operation.



Fig. 4.18 Spectrum Analyzer Graph of Output Signal

In Fig. 4.18, a graphical plot from a spectrum analyzer measuring the frequency of oscillation shows a peak at 5.76 GHz, with small sideband interferences.



Fig. 4.19 Phase Noise Measurement of the Output Signal

In Fig. 4.19, the phase noise graph of the output VCO signal is shown. This plot shows a measured phase noise of -70.32 dBc/Hz at a 1 MHz offset for a carrier frequency of 5.762 GHz. Note the lower phase noise compared to the first design, despite a fivefold increase in the operating frequency.
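To put the comparison with the first design in numbers, the phase noise measured at one carrier frequency can be referred to another. The Python sketch below assumes the ideal frequency-scaling rule, in which phase noise at a fixed offset changes by 20·log10 of the carrier frequency ratio. This rule and the helper name `scale_phase_noise` are assumptions introduced here for illustration; they are not part of the measurements.

```python
import math


def scale_phase_noise(pn_dbc_hz, f_from_hz, f_to_hz):
    """Refer a phase-noise value to another carrier frequency.

    Assumes ideal frequency scaling: +20*log10(f_to / f_from) dB
    at the same offset frequency.
    """
    return pn_dbc_hz + 20.0 * math.log10(f_to_hz / f_from_hz)


# First design: -60.44 dBc/Hz at 1 MHz offset, 848.1 MHz carrier.
first_design_scaled = scale_phase_noise(-60.44, 848.1e6, 5.762e9)

# Second design: -70.32 dBc/Hz at 1 MHz offset, 5.762 GHz carrier.
improvement_db = first_design_scaled - (-70.32)

if __name__ == "__main__":
    print(f"first design referred to 5.762 GHz: {first_design_scaled:.2f} dBc/Hz")
    print(f"improvement of second design: {improvement_db:.2f} dB")
```

Under this assumption the first design would sit at roughly -43.8 dBc/Hz when referred to 5.762 GHz, so the second design is better by about 26.5 dB on a like-for-like basis.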

### 4.3 Modified Half-Rate 13Gbps Quadrature Quasi-Mixer Clock Recovery (CRC) Implementation

#### 4.3.1 Design Components

In order to achieve a higher data rate, modifications to the previous design of Section 4.2 were made. The phase detector and the VCO have been replaced to work at higher frequencies.

#### **Phase Detector**



Fig. 4.20 Phase Detector

The phase detector of Fig. 4.10 does not work at half rate, so a modified phase detector using the same sample-and-hold principle is used, as shown in Fig. 4.20. In Fig. 4.20, the non-return-to-zero (NRZ) signal is fed in as DATA, and the VCO signal is fed in as CLK. Nodes in block B1 of Fig. 4.20, such as Q1, Q1b, Vbias and CLKb, are connected to the nodes with the same labels in the two circuit blocks B2 and B3. Both the upper and lower circuits operate like mixers in the small-signal regime, mixing the data input with the clock signal. Proper sizing of the transistors attached to the clock signal ensures mixer operation.

Once more, a differential buffer B3 is required to buffer the output signal and regenerate the small voltage signals, as small transistor sizes have been used to increase the speed of operation. Overall, this circuit produces an output beat signal that has a voltage level corresponding to the phase difference between the data and clock signals.

#### Fully Complementary LC-Tank VCO



Fig. 4.21 Complementary Quadrature VCO

The VCO of Fig. 4.13 has been replaced with the faster VCO of Fig. 4.21 using a fully complementary architecture with PMOS cross-coupled pairs and buffers to balance the NMOS part of the circuit. In contrast to the VCO of Fig. 4.13, the PMOS cross-coupled pairs are sized large enough to provide sufficient gain  $(g_m)$  and ensure oscillation. By using properly gain-balanced complementary pairs in contrast to NMOS-only or PMOS-only oscillators, the largest transistor size is reduced. Overall, the widths of the NMOS and PMOS transistors will be similar, and this allows a more balanced VCO layout on chip. The frequency of this VCO is tuned using varactors that are controlled differentially, as described in the next section. The bias current of this VCO is also set higher to increase the frequency of oscillation. Again, the inductors used are on-chip inductors.

#### A Differential Varactor

Fig. 4.22 Differential Varactor Control in VCO

Typically, a VCO is tuned using only one type of varactor [28]. In this work, two types of varactors are combined, namely an N+ Nwell varactor and a PMOS varactor, set up as shown in Fig. 4.22. This enables a unique differential control, and the combination can be viewed as a new differential varactor. The operation of the varactor is summarized in Table 4.2.

Table 4.2 Effect of Varactor Control on VCO

| Control Line                     | Varactor Used | Capacitance at VCO | Frequency of VCO |
|----------------------------------|---------------|--------------------|------------------|
| $V_{\mathrm{Control\_Up}}$       | PMOS          | Increases          | Decreases        |
| $V_{\mathrm{Control\_Down}}$     | N+ Nwell      | Decreases          | Increases        |

One of the benefits of this varactor is the inherently wider effective range of voltage control. Instead of only having 1.8 V, as in the case of a single-ended control voltage, one now has effectively twice the voltage control range when using this differential varactor.
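The link between varactor capacitance and oscillation frequency follows from the standard LC-tank resonance formula, f = 1/(2π√(LC)). The Python sketch below uses that formula with hypothetical component values (the inductance, fixed capacitance and varactor range are not given in this chapter and are chosen only so that the result lands in a plausible range), and also works out the control-voltage span claimed above.

```python
import math


def tank_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC tank: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical values, for illustration only.
L_TANK = 0.5e-9                           # tank inductance (H)
C_FIXED = 0.8e-12                         # parasitic + fixed capacitance (F)
C_VAR_MIN, C_VAR_MAX = 0.1e-12, 0.4e-12   # varactor range (F)

# More varactor capacitance lowers the frequency, as in Table 4.2.
f_high = tank_frequency_hz(L_TANK, C_FIXED + C_VAR_MIN)
f_low = tank_frequency_hz(L_TANK, C_FIXED + C_VAR_MAX)

# Control-voltage span with a 1.8 V supply.
VDD = 1.8
single_ended_span = VDD        # 0 V to 1.8 V on one control line
differential_span = 2 * VDD    # -1.8 V to +1.8 V between two control lines

if __name__ == "__main__":
    print(f"f_low = {f_low / 1e9:.2f} GHz, f_high = {f_high / 1e9:.2f} GHz")
    print(f"control span: {single_ended_span} V vs {differential_span} V")
```

With these illustrative values the tank tunes over roughly 6.5 GHz to 7.5 GHz; the actual range depends on the real inductor and varactor sizes.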

#### **Output Buffer**



Fig. 4.23 VCO Output Buffer

As the VCO is sensitive to loading, an output buffer is required to transmit the signal off-chip. The circuit of Fig. 4.23 is used to buffer the VCO signal off-chip. It is a cascade stage using PMOS, that lowers the overall DC voltage level of the buffered off-chip VCO signal.

Using a current bias in the extra transistor on the right of Fig. 4.23 lowers the driving requirement of the cascade stage. These two design features effectively lower the power requirement of the output buffer, and hence allow the output stage to drive an off-chip output node at a higher frequency.

#### 4.3.2 Simulations and Measurements

#### Fabrication

The cmosp18 chip of this modified design of a clock recovery circuit (CRC) is shown in Fig. 4.24.



Fig. 4.24 Chip Micrograph

The on-chip inductors are at the bottom-right, and the 80 pF capacitors are at the left of Fig. 4.24. With the higher frequency of operation, the inductors have higher quality factors on chip. The loop filter capacitance is also smaller, as a higher VCO frequency allows a higher loop filter bandwidth.



#### Simulations

Fig. 4.25 Control Signal Lock Simulation

The circuit simulation in Fig. 4.25 shows a lock on two different input data rates. The first time period of 1 $\mu$sec shows a lock onto 13 Gbps input data, and the second 1 $\mu$sec period shows a lock onto 13.6 Gbps input data. As can be seen from Fig. 4.25, the CRC has a fast locking time of about 0.5 $\mu$sec. The small voltage oscillations on top of the control signal have an amplitude of about 90 mV, and are due to the constant beating of the phase detector signal.
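Because this design is half-rate, the VCO runs at half the data rate (13 Gbps corresponds to the 6.5 GHz VCO clock listed in Table 4.3). The short Python sketch below applies the same relation to both simulated data rates. The helper name `required_clock_ghz` and the use of the 5.9 to 7.5 GHz tuning range from Table 4.3 as a feasibility check are illustrative assumptions.

```python
def required_clock_ghz(data_rate_gbps, rate_divisor):
    """Clock frequency needed for a given data rate.

    rate_divisor is 1 for a full-rate and 2 for a half-rate architecture.
    """
    if rate_divisor <= 0:
        raise ValueError("rate_divisor must be positive")
    return data_rate_gbps / rate_divisor


# Tuning range of the Section 4.3 VCO as listed in Table 4.3 (GHz).
TUNING_RANGE_GHZ = (5.9, 7.5)


def within_tuning_range(clock_ghz, tuning_range=TUNING_RANGE_GHZ):
    """True if the clock frequency lies inside the tuning range."""
    low, high = tuning_range
    return low <= clock_ghz <= high


if __name__ == "__main__":
    for rate in (13.0, 13.6):
        clock = required_clock_ghz(rate, rate_divisor=2)
        print(rate, "Gbps ->", clock, "GHz, in range:", within_tuning_range(clock))
```

Both simulated data rates map to clock frequencies inside the listed tuning range, which is consistent with the lock shown in the simulation.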

#### Test Balun



Fig. 4.26 Test PCB and Balun

As a fully differential signal was required as the input to this chip for test purposes, and a fully differential signal source was unavailable, a custom PCB was required. The fabricated PCB is shown in Fig. 4.26, and a balun is visible at the top of the PCB. The custom-designed balun is used to convert a single-ended input into a differential signal. The radius of the balun ring is designed using

$$r = \frac{3 \times 10^8}{\frac{4\pi}{3} \, f_c \, \sqrt{\varepsilon_r}} , \tag{4.1}$$

where $r$ is the radius of the ring, $\varepsilon_r$ is the relative permittivity and $f_c$ is the central operating frequency of the balun.

The impedances of the balun ring and balun leg traces had to be 75 $\Omega$ and 50 $\Omega$ respectively. The widths of the ring and leg microstrips can be found using the microstrip impedance equation [29]

$$W = \frac{t}{\frac{e^k}{8} - \frac{1}{4e^k}} \quad \text{where} \quad k = \frac{Z_0 \sqrt{2(\varepsilon_r + 1)}}{119.9} + \frac{1}{2} \left(\frac{\varepsilon_r - 1}{\varepsilon_r + 1}\right) \left(\ln\frac{\pi}{2} + \frac{1}{\varepsilon_r} \ln\frac{4}{\pi}\right), \tag{4.2}$$

where $t$ is the thickness of the substrate and $Z_0$ is the microstrip trace impedance.

The values for the balun in the FR4-62" glass cloth base epoxy resin and flame retardant copper clad laminate are given below. The MATLAB code for performing the calculations is given in Appendix A.2.

```
Relative Permittivity = 4.5
Substrate Thickness = 62 mils
Central Operating Frequency = 6.7 GHz
Ring Radius = 198.38 mils
Width of Leg = 116.56 mils
Width of Ring = 54.29 mils
```
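As a cross-check, Eq. (4.1) and Eq. (4.2) can be evaluated directly. The Python sketch below is a port of the same arithmetic and is not the code used for the design; the function names `ring_radius_mils` and `microstrip_width` are introduced here, and the speed of light is taken as 3 × 10^8 m/s as in Eq. (4.1).

```python
import math

METRES_PER_MIL = 25.4e-6


def ring_radius_mils(fc_hz, er, c=3e8):
    """Ring radius from Eq. (4.1), converted from metres to mils."""
    radius_m = c / ((4.0 * math.pi / 3.0) * fc_hz * math.sqrt(er))
    return radius_m / METRES_PER_MIL


def microstrip_width(z0_ohm, er, thickness):
    """Microstrip width from Eq. (4.2), in the same unit as thickness."""
    k = (z0_ohm * math.sqrt(2.0 * (er + 1.0)) / 119.9
         + 0.5 * (er - 1.0) / (er + 1.0)
         * (math.log(math.pi / 2.0) + math.log(4.0 / math.pi) / er))
    return thickness / (math.exp(k) / 8.0 - 1.0 / (4.0 * math.exp(k)))


if __name__ == "__main__":
    er, t_mils, fc = 4.5, 62.0, 6.7e9
    print("ring radius (mils):", ring_radius_mils(fc, er))
    print("ring width  (mils):", microstrip_width(75.0, er, t_mils))
    print("leg width   (mils):", microstrip_width(50.0, er, t_mils))
```

With the parameters listed above, the sketch should reproduce the tabulated radius and widths to within rounding.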

#### **Measurement Results**

Fig. 4.27 shows the measured VCO frequency versus the differential voltage control signal. A wide VCO tuning range of 1.5 GHz is obtained.



Fig. 4.27 Differential Voltage Control

The graph of Fig. 4.28 shows the effect of controlling each varactor individually, with VContu controlling the PMOS varactor and the VContd controlling the N+ Nwell varactor. As shown on the graph, the varactors act in opposite manner to the changing voltage, thus enabling differential control when used in tandem.



Fig. 4.28 Control Using a Single Varactor

The measured performance of the CRC is summarized in Fig. 4.29, Fig. 4.30, and Fig. 4.31. Notice that the VCO signal is rather clean.

Fig. 4.29 Output Signal of the CRC

In Fig. 4.29, the transient waveform displayed on the oscilloscope is shown. The recovered clock signal has an amplitude of about 114 mV, and a frequency of 6.1 GHz. Once again, the output signal has a low amplitude, as the output stage was designed to compromise for higher speed.



Fig. 4.30 Spectrum Analyzer Graph of the Output Signal of the CRC

In Fig. 4.30, the spectrum analyzer plot measuring the frequency of oscillation shows a peak at 7.563 GHz, with minimal sideband interferences.



Fig. 4.31 Phase Noise Measurement for the Output Signal

In Fig. 4.31, the phase noise graph of the output VCO signal is shown. This phase noise plot has a measured phase noise of -103.18 dBc/Hz at a 1 MHz offset, at a carrier frequency of 7.58 GHz. This is considerably better than the -70 dBc/Hz obtained from the previous design of 5 GHz.

### 4.4 Comparison of Chip Implementations

A comparison of the different chip implementations is given in Table 4.3.

| Chip Presented in Section | Data Rate | VCO Clock | Tuning Range |
|---------------------------|-----------|-----------|--------------|
| 4.1                       | 1 Gbps    | 1 GHz     | 0.72-1.3 GHz |
| 4.2                       | 5.5 Gbps  | 5.5 GHz   | 5-6 GHz      |
| 4.3                       | 13 Gbps   | 6.5 GHz   | 5.9-7.5 GHz  |

| Chip Presented in Section | Voltage | Total Current | Die Size     |
|---------------------------|---------|---------------|--------------|
| 4.1                       | 1.8 V   | 20 mA         | 1.4 x 1.3 mm |
| 4.2                       | 1.8 V   | 23 mA         | 1.5 x 1.4 mm |
| 4.3                       | 1.8 V   | 36 mA         | 1.5 x 1.5 mm |

| Chip Presented in Section | Phase Noise at 100 kHz | Phase Noise at 1 MHz | Phase Noise at 10 MHz |
|---------------------------|------------------------|----------------------|-----------------------|
| 4.1                       | -57.60 dBc/Hz          | -60.44 dBc/Hz        | -106.50 dBc/Hz        |
| 4.2                       | -63.70 dBc/Hz          | -70.32 dBc/Hz        | -116.51 dBc/Hz        |
| 4.3                       | -56.8 dBc/Hz           | -103.78 dBc/Hz       | -121.57 dBc/Hz        |

Table 4.3 Design Comparison Table

As can be seen from Table 4.3, the last implementation has the highest data rate, and the best oscillator performance in terms of phase noise at the 1 MHz and 10 MHz offsets. However, the current in the last implementation is also the highest, with the majority of the current being consumed by the oscillator.

The fastest implementation, in Section 4.3, is comparable to recent work published in the literature. A comparison to other work is given in Table 4.4.

| Work       | Savoj'01 [30] | Hu'03 [31]   | Hiok-Tiaq'03 [32] | CRC in this work |
|------------|---------------|--------------|-------------------|------------------|
| Technology | 0.18 $\mu$m   | 0.25 $\mu$m  | 0.18 $\mu$m       | 0.18 $\mu$m      |
| Voltage    | 1.8 V         | 3.3 V        | 1.8 V             | 1.8 V            |
| Power      | 91 mW         | 132 mW       | 80 mW             | 69 mW            |
| Data Rate  | 10 Gb/s       | 1.25 Gb/s    | 3.125 Gb/s        | 13 Gb/s          |

Table 4.4 Comparison to the Literature
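One way to read Table 4.4 is to divide power by data rate, which gives the energy spent per bit. The Python sketch below does this for the four entries. Energy per bit is a figure of merit introduced here for illustration and is not used elsewhere in this thesis; it ignores differences in functionality (for example, clock recovery only versus full clock and data recovery) and in technology.

```python
# (power in mW, data rate in Gb/s) copied from Table 4.4.
DESIGNS = {
    "Savoj'01 [30]": (91.0, 10.0),
    "Hu'03 [31]": (132.0, 1.25),
    "Hiok-Tiaq'03 [32]": (80.0, 3.125),
    "CRC in this work": (69.0, 13.0),
}


def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ: mW / (Gb/s) = 1e-3 W / 1e9 b/s = 1e-12 J/bit."""
    return power_mw / data_rate_gbps


ENERGY_PJ = {name: energy_per_bit_pj(p, r) for name, (p, r) in DESIGNS.items()}
MOST_EFFICIENT = min(ENERGY_PJ, key=ENERGY_PJ.get)

if __name__ == "__main__":
    for name, energy in ENERGY_PJ.items():
        print(f"{name}: {energy:.2f} pJ/bit")
    print("lowest energy per bit:", MOST_EFFICIENT)
```

On this simple measure the CRC of Section 4.3 comes out at about 5.3 pJ/bit, the lowest of the four entries.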

A summary of the performance of the fastest chip of Section 4.3 is given in Table 4.5.

| Parameter                  | Value                 |
|----------------------------|-----------------------|
| Average Data Rate          | 13 Gbps               |
| VCO Frequency of Operation | 6.1 to 7.6 GHz        |
| Capture Range              | 1.0 GHz               |
| Power Supply               | 1.8 V                 |
| Locking Time               | 0.5 $\mu$sec          |
| Phase Noise @ 7.58 GHz     | -103.18 dBc/Hz @ 1 MHz |
| Power Consumption          | 69 mW                 |
| Die Area                   | 1.5 mm x 1.5 mm       |

Table 4.5 Performance Summary of the Chip in Section 4.3

# Chapter 5

# Conclusion

The work in this thesis presented the development of clock and data recovery architectures using a 0.18 $\mu$m TSMC process. The analysis of the CDR and the theoretical understanding of this unique circuit were presented first, followed by the actual circuit implementations, the fabricated chips, and their measurements.

The first prototype used a simple sample-and-hold architecture with a ring oscillator, and the second prototype used a quadrature mixer architecture with an LC-tank oscillator. The last prototype was an improvement over the second, with a fully differential architecture working at half-rate with 13 Gbps NRZ input data, and was tested using a custom-made PCB using FR4-62" laminate from Nan Ya Plastic Corporation.

The work proved the feasibility of realizing a complete very high speed CDR on a monolithic chip, and the possibility of its implementation in CMOS. Also, we demonstrated that it was possible to have entire system clocking circuits on a single, relatively small-sized die, without having to use off-chip inductors or capacitors. Overall, this opens up the possibility of further larger implementations with more functionality.

### 5.1 Future Improvements

There are further enhancements that can be made to this work. For example, a retiming circuit to resample the data was omitted to ease testability, but could be added in future iterations of the design. Also, the architecture of the last prototype can be extended to a quarter-rate architecture by doubling the phase detector and using an eight-phase oscillator. These improvements could further add functionality and push the performance envelope of the CDR architectures investigated in this thesis.

# Appendix A

# Matlab Code

### A.1 PRBS Generation using LFSR

This MATLAB code uses the linear feedback shift register (LFSR) technique to generate a pseudorandom binary sequence. The LFSR uses a Fibonacci-type maximum sequence length polynomial for a 16-bit register. The polynomial used in the LFSR code is given in Eq. (A.1).

$$p(x) = X^{16} + X^{12} + X^3 + X^1 \tag{A.1}$$

```
b_length = 16;        % bit length of the shift register
seq_length = 40000;   % number of cycles
Test_bit = 0;         % counts how often Test_value occurs
Test_value = 20113;   % register value used to test for repetition

% Specifying output vector
High_V = 1.2;
Low_V = -1.2;
Frequency = 1.82e9;
Rise_time = 10e-12;
Pulse_time = 1/(2*Frequency) - Rise_time;

% Initial vector (must be non-zero)
store = ones(1,b_length);

out_index = 1;
time = 0;
for indexa = 1:seq_length,
    % Testing for no repetition
    value = 0;
    for indexd = 1:b_length,
        value = value + 2^(indexd-1)*store(indexd);
    end;
    if value == Test_value,
        Test_bit = Test_bit+1;
    end;
    % Storing LFSR sequence
    sequence(indexa) = value;
    % Output file: two (time, voltage) points per bit
    output(out_index,1) = time;
    output((out_index+1),1) = time + Pulse_time;
    if store(16) == 1
        output(out_index,2) = High_V;
        output((out_index+1),2) = High_V;
    else
        output(out_index,2) = Low_V;
        output((out_index+1),2) = Low_V;
    end;
    time = time + Pulse_time + Rise_time;
    out_index = out_index + 2;

    for indexb = 1:b_length,
        % Fibonacci LFSR algorithm: XOR the taps, then shift by one
        temp = xor(store(16),store(12));
        temp = xor(temp,store(3));
        temp = xor(temp,store(1));
        for indexc = 1:(b_length-1),
            store(b_length-indexc+1) = store(b_length-indexc);
        end;
        store(1) = temp;
    end;
end;
save pwlfile_2us.txt output -ascii
```

### A.2 Balun Calculations

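```
clear;
zring = 75;   % impedance of the ring trace (ohms)
zleg = 50;    % impedance of the leg trace (ohms)
fc = 6.7e9    % central operating frequency (Hz)
er = 4.5      % relative permittivity
t = 62        % substrate thickness (mils)
% Widths and radius are given in mils
k = (zring/119.9)*(2*(er+1))^0.5 + (0.5 * (er-1)/(er+1) ...
    * (log(pi/2) + log(4/pi)/er));
wring = t *(1/(exp(k)/8 - 1/(4*exp(k))))
k = (zleg/119.9)*(2*(er+1))^0.5 + (0.5 * (er-1)/(er+1) ...
    * (log(pi/2) + log(4/pi)/er));
wleg = t *(1/(exp(k)/8 - 1/(4*exp(k))))
% 25.4e-6 converts the radius from metres to mils
radius = 3e8/(25.4e-6 * fc * sqrt(er) * (4*pi/3))
```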

## References

- [1] H. G. F. Martin Johnsson, "Hiperlan2 the broadband radio transmission technology." http://www.hiperlan2.com/MJBroadBandOverview.asp, 2001.
- [2] V. F. Kroupa, Frequency Synthesis; Theory, Design & Applications. John Wiley, second ed., 1973.
- [3] B. Razavi, Monolithic phase-locked loops and clock recovery circuits. IEEE Press, New York, 1996.
- [4] U. L. Rohde, *Microwave and Wireless Synthesizers*. John Wiley, second ed., 1997.
- [5] U. L. Rohde, *Digital PLL Frequency Synthesizers: Theory and Design*. Prentice-Hall, second ed., 1983.
- [6] W. F. Egan, Frequency Synthesis by Phase Lock. John Wiley, second ed., 2000.
- [7] J. Craninckx and M. Steyaert, Wireless CMOS Frequency Synthesizer Design. Kluwer Academic Publishers, 1998.
- [8] F. Gardner, "Charge-pump phase-locked loops," *IEEE Trans. Comm.*, vol. COM-28, pp. 1849–1858, Nov. 1980.
- [9] J. A. Crawford, Frequency Synthesizer Design Handbook. Artech House, 1994.
- [10] A. Mehrotra, "Noise analysis of phase-locked loops," in IEEE/ACM International Conference on Computer Aided Design, pp. 277–282, Nov. 2000.
- [11] B. Razavi, Design of Integrated Circuits for Optical Communications. McGraw-Hill, 2002.
- [12] J.-K. Kang and D.-H. Kim, "A CMOS clock and data recovery with two-XOR phase-frequency detector circuit," in *IEEE International Symposium on Circuits and Systems*, vol. 4, pp. 266–269, May 2001.

- [13] M. Ramezani and C. Salama, "An improved bang-bang phase detector for clock and data recovery applications," in *IEEE International Symposium on Circuits and Systems*, vol. 1, pp. 715–718, May 2001.
- [14] C. R. Hogge, Jr., "A self correcting clock recovery circuit," *IEEE Journal of Lightwave Technology*, vol. LT-3, pp. 1312–1314, Dec. 1985.
- [15] J. D. H. Alexander, "Clock recovery from random binary signals," *Electronics Letters*, vol. 11, pp. 541–542, Oct. 1975.
- [16] H. Djahanshahi and C. Salama, "Differential CMOS circuits for 622-Mhz/933-Mhz clock and data recovery applications," *IEEE Journal of Solid-State Circuits*, vol. 35, pp. 847–855, June 2000.
- [17] A. Dec and K. Suyama, "A 1.9-GHz CMOS VCO with micromachined electromechanically tunable capacitors," *IEEE Journal of Solid-State Circuits*, vol. 35, pp. 1231–1237, Aug. 2000.
- [18] P. Andreani and S. Mattisson, "On the use of MOS varactors in RF VCOs," IEEE Journal of Solid-State Circuits, vol. 35, pp. 905–910, June 2000.
- [19] J. Maget, R. Kraus, and M. Tiebout, "A physical model of a CMOS varactor with high capacitance tuning range and its application to simulate a voltage controlled oscillator," *International Semiconductor Device Research Symposium*, pp. 609–612, Dec. 2001.
- [20] W. Wong, P. S. Hui, Z. Chen, K. Shen, J. Lau, P. Chan, and P.-K. Ko, "A wide tuning range gated varactor," *IEEE Journal of Solid-State Circuits*, vol. 35, pp. 773–779, May 2000.
- [21] D. A. Johns and K. Martin, Analog Integrated Circuit Design. John Wiley, first ed., 1996.
- [22] P. R. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, *Analysis and Design of Analog Integrated Circuits*. John Wiley, fourth ed., 2000.
- [23] M. Perrott, "High speed communication circuits and systems, Spring 2003." http://ocw.mit.edu/OcwWeb/Electrical-Engineering-and-Computer-Science/ 6-976High-Speed-Communication-Circuits-and-SystemsSpring2003/ CourseHome/index.htm, 2003.
- [24] A. Ismail and M. Elmasry, "A low power design approach for MOS current mode logic," in *IEEE International Systems-on-Chip Conference*, pp. 143–146, Sept. 2003.

- [25] P. Larsson, "An offset-cancelled CMOS clock-recovery/demux with a half-rate linear phase detector for 2.5 Gb/s optical communication," in *ISSCC*, pp. 74–75, 434, Feb. 2001.
- [26] S. Anand and B. Razavi, "A CMOS clock recovery circuit for 2.5-Gb/s NRZ data," *IEEE Journal of Solid-State Circuits*, vol. 36, pp. 432–439, Mar. 2001.
- [27] A. Pottbacker, U. Langmann, and H.-U. Schreiber, "A Si bipolar phase and frequency detector IC for clock extraction up to 8 Gb/s," *IEEE Journal of Solid-State Circuits*, vol. 27, pp. 1747–1751, Dec. 1992.
- [28] P. Andreani and S. Mattisson, "A 1.8-GHz CMOS VCO tuned by an accumulation-mode MOS varactor," in *IEEE International Symposium on Circuits and Systems*, vol. 1, pp. 315–318, May 2000.
- [29] T. Edwards and M. Steer, *Foundations of Interconnect and Microstrip Design*. John Wiley, third ed., 2000.
- [30] J. Savoj and B. Razavi, "A 10-Gb/s CMOS clock and data recovery circuit with a half-rate linear phase detector," *IEEE Journal of Solid-State Circuits*, vol. 36, pp. 761–768, May 2001.
- [31] Y. Hu and Z.-G. Wang, "1.25-Gb/s 0.25-um CMOS clock recovery based on phase and frequency locked loop," in *IEEE Conference on Electron Devices and Solid-State Circuits*, pp. 179–182, Dec. 2003.
- [32] H.-T. Ng, R. Farjad-Rad, M.-J. Lee, W. Dally, T. Greer, J. Poulton, J. Edmondson, R. Rathi, and R. Senthinathan, "A second-order semidigital clock recovery circuit based on injection locking," *IEEE Journal of Solid-State Circuits*, vol. 38, pp. 2101–2110, Dec. 2003.
- [33] W.-H. Lee, J.-D. Cho, and S.-D. Lee, "A high speed and low power phase-frequency detector and charge-pump," in *Proceedings of the ASP-DAC*, vol. 1, pp. 269–272, Jan. 1999.
- [34] R. Chang and L.-C. Kuo, "A differential type CMOS phase frequency detector," in *Proceedings of the Second IEEE Asia Pacific Conference on ASICs*, pp. 61–64, Aug. 2000.
- [35] M. Rau, T. Oberst, R. Lares, A. Rothermel, R. Schweer, and N. Menoux, "Clock/data recovery PLL using half-frequency clock," *IEEE Journal of Solid-State Circuits*, vol. 32, pp. 1156–1159, July 1997.

- [36] A. Mostafa and M. El-Gamal, "A 12.5 GHz back-gate tuned CMOS voltage controlled oscillator," in *IEEE International Conference on Electronics, Circuits and Systems*, vol. 1, pp. 243–247, Dec. 2000.
- [37] K.-H. Chen, H.-S. Liao, and L.-J. Tzou, "A low-jitter and low-power phase-locked loop design," in *IEEE International Symposium on Circuits and Systems*, vol. 2, pp. 257–260, May 2000.
- [38] H. Wang and R. Nottenburg, "A CMOS low-jitter phase frequency detector for gigabit/s clock recovery," in *International Workshop on Design of Mixed-Mode Integrated Circuits and Applications*, pp. 91–93, July 1999.
- [39] W. Hui and R. Nottenburg, "A 0.7-1 Gb/s CMOS clock recovery circuit," in *IEEE Asia Pacific Conference on ASICs*, pp. 291–294, Aug. 1999.
- [40] J. Hansryd, P. Andrekson, and B. Bakhshi, "Prescaled clock recovery based on small timing misalignment of data pulses," *Journal of Lightwave Technology*, vol. 19, pp. 105–113, Jan. 2001.
- [41] T.-Y. Hsu, B.-J. Shieh, and C.-Y. Lee, "An all-digital phase-locked loop (ADPLL)-based clock recovery circuit," *IEEE Journal of Solid-State Circuits*, vol. 34, pp. 1063–1073, Aug. 1999.
- [42] A. Demir and P. Feldmann, "Stochastic modeling and performance evaluation for digital clock and data recovery circuits," in *Proceedings, Design, Automation and Test in Europe Conference and Exhibition*, pp. 340–344, Mar. 2000.
- [43] M. Meghelli, B. Parker, H. Ainspan, and M. Soyuer, "SiGe BiCMOS 3.3-V clock and data recovery circuits for 10-Gb/s serial transmission systems," *IEEE Journal of Solid-State Circuits*, vol. 35, pp. 1992–1995, Dec. 2000.
- [44] K. Kishine, N. Ishihara, K. Takiguchi, and H. Ichino, "A 2.5-Gb/s clock and data recovery IC with tunable jitter characteristics for use in LANs and WANs," *IEEE Journal of Solid-State Circuits*, vol. 34, pp. 805–812, June 1999.
- [45] K. Ishii, K. Kishine, and H. Ichino, "A jitter suppression technique for a 2.48832 Gb/s clock and data recovery circuit," in *IEEE International Symposium on Circuits and Systems*, vol. 5, pp. 261–264, May 2000.
- [46] A. Pallotta, F. Centurelli, and A. Trifiletti, "A low-power clock and data recovery circuit for 2.5 Gb/s SDH receivers," in *International Symposium on Low Power Electronics and Design*, pp. 67–72, July 2000.

- [47] K. Murata and T. Otsuji, "A novel clock recovery circuit for fully monolithic integration," *IEEE Transactions on Microwave Theory and Techniques*, vol. 47, pp. 2528–2533, Dec. 1999.
- [48] M. Wurzer, J. Bock, H. Knapp, W. Zirwas, F. Schumann, and A. Felder, "A 40-Gb/s integrated clock and data recovery circuit in a 50-GHz $f_T$ silicon bipolar technology," *IEEE Journal of Solid-State Circuits*, vol. 34, pp. 1320–1324, Sept. 1999.
- [49] L. Wu, H. Chen, S. Nagavarapu, R. Geiger, E. Lee, and W. Black, "A monolithic 1.25 Gbits/sec CMOS clock/data recovery circuit for fibre channel transceiver," in *IEEE International Symposium on Circuits and Systems*, vol. 2, pp. 565–568, May 1999.