# Circuits for Time and Frequency Domain Characterization of Jitter

by
Antonio Chan, B. Eng. 1999

Department of Electrical Engineering

McGill University, Montréal



July 2002

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfilment of the requirements for the degree of Master of Engineering



National Library of Canada

Acquisitions and Bibliographic Services

395 Wellington Street Ottawa ON K1A 0N4 Canada

Bibliothèque nationale du Canada

Acquisitions et services bibliographiques

395, rue Wellington Ottawa ON K1A 0N4 Canada

> Your file Votre référence ISBN: 0-612-85883-9 Our file Notre référence ISBN: 0-612-85883-9

The author has granted a nonexclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

L'auteur a accordé une licence non exclusive permettant à la Bibliothèque nationale du Canada de reproduire, prêter, distribuer ou vendre des copies de cette thèse sous la forme de microfiche/film, de reproduction sur papier ou sur format électronique.

L'auteur conserve la propriété du droit d'auteur qui protège cette thèse. Ni la thèse ni des extraits substantiels de celle-ci ne doivent être imprimés ou autrement reproduits sans son autorisation.



## **Abstract**

Jitter characterization has become significantly more important for systems running at multi-gigahertz data rates. Time and frequency domain characterization of jitter is thus a crucial element of system specification testing. Time domain jitter measurement on a data signal with sub-gate timing resolution can be achieved using two delay chains feeding the clock and data lines of a series of D-latches, a structure known as a Vernier Delay Line (VDL). An important drawback of the VDL structure is that its measurement accuracy depends on the matching of the various delay elements. Although careful layout techniques can help to minimize these mismatches, they cannot eliminate them completely. In addition, due to the nature of the design, a relatively large silicon area is required for implementation. In this work, a novel technique is developed that reduces the silicon area requirement by two orders of magnitude and enables the measurement device to be synthesized from a register transfer level (RTL) description. A custom IC was designed and fabricated in a 0.18 µm CMOS process as a first proof of concept. The design requires a silicon area of 0.12 mm<sup>2</sup>, and measured results indicate a timing resolution of 19 ps. The synthesizable nature of the design is demonstrated using an FPGA implementation. In addition, another custom IC was designed and fabricated in a 0.35 µm CMOS process as a frequency characterization circuit to process and extract information from the data obtained from the VDL. This design occupies a silicon area of 1.83 mm<sup>2</sup>. As test time is an important consideration in production test, an extension to this component-invariant VDL technique is provided that reduces test time at the expense of more hardware. Finally, a method for obtaining the frequency domain characteristics of the jitter using the VDL is given.

# Résumé

La caractérisation du phénomène de jitter est devenue de plus en plus importante pour des systèmes électroniques fonctionnant à des vitesses de plusieurs gigahertz. La caractérisation en temps et en fréquence du jitter est donc un élément critique pour le test et l'évaluation des systèmes. La mesure du jitter en temps avec une résolution inférieure au délai d'une porte logique peut être accomplie grâce à deux chaînes de délai qui alimentent les entrées 'horloge' et 'donnée' d'une série de bascules D. Le tout est connu comme la Ligne de Délai Vernier (LDV). Un inconvénient de la structure LDV est que sa précision dépend du matching des éléments de délai, c'est-à-dire du niveau auquel ces éléments sont identiques. Bien que ces différences puissent être minimisées avec un layout aussi précis que possible, elles ne peuvent pas être complètement éliminées. De plus, l'espace de silicium exigé pour le layout de ces éléments de délai est en général relativement grand. Cette thèse propose une nouvelle technique qui réduit la surface de silicium nécessaire de deux ordres de grandeur, et permet de synthétiser le circuit à partir d'une description Register Transfer Level (RTL). Pour démontrer le concept, un circuit intégré (CI) a été conçu et fabriqué dans une technologie CMOS 0.18 µm. Le prototype occupe une surface de 0.12 mm<sup>2</sup> et les résultats de l'expérience montrent une résolution en temps de 19 ps. Le fait que le circuit peut être synthétisé est démontré par une réalisation FPGA. De plus, un autre CI a été conçu et fabriqué dans une technologie CMOS 0.35 µm. Celui-ci permet la caractérisation en fréquence des données obtenues à partir de la LDV. Ce deuxième CI occupe une surface de 1.83 mm<sup>2</sup>. Comme le temps exigé par des tests de production est une considération importante, une extension à cette architecture LDV est présentée afin de réduire le temps de test au prix d'une augmentation de la surface de silicium. Enfin, une méthode est présentée pour obtenir les caractéristiques fréquentielles du jitter en utilisant la LDV.

# **Acknowledgments**

It has been a long process, sometimes tedious and frustrating, but mostly fun, exciting and rewarding. Coming to this point has not been straightforward, and without the support of various people and groups it would have been much harder. I would therefore like to acknowledge them for their support during this process.

First, I would like to acknowledge the following organizations for their support of this research: Natural Sciences and Engineering Research Council of Canada, the Canadian Microelectronics Corporation and Micronet, a Canadian network of centers of excellence dealing with microelectronics devices, circuits, and systems.

I would also like to thank my supervisor, Professor Gordon Roberts, for his patient guidance, both technical and personal, throughout my degree. His enthusiasm and dedication are what make research fun and exciting.

I would like to thank the MACS group and the ECE system administrators for their support in the various aspects of this research, as well as for making this a very fun group to be part of. In particular, I would like to acknowledge Bardia for his help in the French translation of my abstract.

Finally, I would like to thank my family for their personal and emotional support throughout the process. Without their support, things would have fallen apart midway through. The final important person that I would like to thank is someone who has been very patient with me over the past years. I would not have come to this point without her beside me. My special dedication to MMM.

# **Table of Contents**

- 1.1 - Motivation
- 1.2 - Thesis Outline
- 2.1 - Introduction
- 2.2 - Jitter Statistics and Characterization
- 2.3 - Unit Gate Delay Resolution
- 2.4 - Sub-Gate Delay Resolution Circuit
- 2.5 - Drawbacks
- 2.6 - Summary
- 3.1 - Introduction
- 3.2 - A Component-Invariant VDL Structure
- 3.3 - Circuit Implementation
  - 3.3.1 - Edge Detector
  - 3.3.2 - Triggered Ring Oscillator Circuits
  - 3.3.3 - Phase Detector
  - 3.3.4 - Complete Circuit
- 3.4 - Calibration And Measurement Processes
- 3.5 - Test Time Comparison
- 3.6 - Reducing Test Time Using A Spatial Arrangement of Component-Invariant VDLs
- 3.7 - Experimental Results
  - 3.7.1 - CMOS Implementation of A Component-Invariant VDL
  - 3.7.2 - Calibration for systematic error
  - 3.7.3 - A Three-Oscillator Component-Invariant VDL in a FPGA
- 3.8 - Limiting Factors
  - 3.8.1 - Metastability in the D-Latches
  - 3.8.2 - Resolution limited by noise
  - 3.8.3 - Noise free clock
- 3.9 - Summary
- 4.1 - Introduction
- 4.2 - Frequency Characterization Method
- 4.3 - Experimental Result Using VDL Circuit
- 4.4 - Exploiting Hardware Requirement
  - 4.4.1 - Discrete Fourier Transform Algorithm
  - 4.4.2 - Circuit Implementation
  - 4.4.3 - Experimental Result from DFT IC
- 4.5 - Limiting Factors
  - 4.5.1 - Effective Sampling Rate
  - 4.5.2 - Silicon Sizing
- 4.6 - Summary
- 5.1 - Conclusion
- 5.2 - Future Works

# **List of Figures**

- Figure 1.1 SoC Example
- Figure 1.2 SoC with TMU
- Figure 1.3 TMU Architecture
- Figure 2.1 Sampling instances of signal
- Figure 2.2 Jitter CDF and PDF
- Figure 2.3 VDL with unit gate resolution implementation
- Figure 2.4 VDL with sub-gate timing resolution
- Figure 2.5 VDL in 0.35 µm CMOS technology
- Figure 2.6 VDL with mismatched elements
- Figure 3.1 Obtaining a timing measurement directly from a VDL
- Figure 3.2 Replacing VDL by a component-invariant VDL structure
- Figure 3.3 Edge detector
- Figure 3.4 Triggered ring oscillator
- Figure 3.5 Voltage-controlled delay circuit
- Figure 3.6 Phase detector
- Figure 3.7 The circuit implementation of the component invariant VDL structure
- Figure 3.8 Timing diagram for the calibration and measurement modes of operation
- Figure 3.9 An array of component-invariant VDLs
- Figure 3.10 Timing relationship for VDL array
- Figure 3.11 Controller for VDL array structure
- Figure 3.12 A chip micrograph of component-invariant VDL implemented in a 0.18 µm CMOS process
- Figure 3.13 Experimental setup
- Figure 3.14 Teradyne Tester
- Figure 3.15 Four-layer PCB
- Figure 3.16 Phase measurement of VDL set to the 54.5 ps timing resolution
- Figure 3.17 Histograms for Gaussian distributed jitter for 54.5 ps timing resolution
- Figure 3.18 ... timing resolution from ...
- Figure 3.19 Phase measurement of the VDL set to a 18.9 ps timing resolution
- Figure 3.20 Histograms for Gaussian distributed jitter for 18.9 ps timing resolution
- Figure 3.21 Histograms for sinusoidal distributed jitter for 18.9 ps timing resolution
- Figure 3.22 Mean Value
- Figure 3.23 RMS measurement before calibration
- Figure 3.24 RMS measurement after calibration
- Figure 3.25 Difference in RMS measurement before calibration
- Figure 3.26 Difference in RMS measurement after calibration
- Figure 3.27 Peak to Peak measurement before calibration
- Figure 3.28 Peak to Peak measurement before calibration
- Figure 3.29 Difference in Peak to Peak measurement before calibration
- Figure 3.30 Difference in Peak to Peak measurement after calibration
- Figure 3.31 Altera FPGA Board
- Figure 3.32 Test setup
- Figure 3.33 Histograms for 0.566 ns VDL
- Figure 3.34 Histograms for 1.22 ns VDL
- Figure 3.35 Delay versus power supply
- Figure 4.1 Phase modulation by noise
- Figure 4.2 Effect of various sampling rates
- Figure 4.3 A comparison of the frequency spectrum of a sinusoidal distribution before and after the VDL
- Figure 4.4 Main Blocks
- Figure 4.5 RAM Block
- Figure 4.6 Mult/Add Block
- Figure 4.7 FFT Chip Micrograph
- Figure 4.8 DFT algorithm from MATLAB
- Figure 4.9 DFT algorithm from chip

# **Chapter 1 - Introduction**

## 1.1 - Motivation

Timing accuracy is a key requirement for high-speed data communication System-on-Chip (SoC) devices operating at gigahertz data rates. However, signal integrity is often degraded by noise from the system itself and from external sources. This noise introduces uncertainty in the timing of signal edges, which is commonly referred to as timing jitter. Certain measures of jitter are therefore used to characterize the system and ensure reliable performance. Although some existing instrumentation provides such characterization, an on-chip measurement technique is often preferred because the shorter signal path from the Device Under Test (DUT) to the measurement device yields a more accurate measurement. Moreover, an on-chip measurement technique often has an advantage over external instrumentation because of the inherently limited access to the DUT, which is often an integrated part of the SoC. This is very common in the jitter characterization of Phase Locked Loops (PLLs) within an SoC, as shown in Figure 1.1. In order to characterize the PLL within each Intellectual Property (IP) block of the SoC, an integrated test approach is required, as shown in Figure 1.2. Here a timing measurement unit (TMU) consisting of a jitter excitation circuit (JEC), a jitter analyzing circuit (JAC) and a synchronization circuit (SC), as shown in Figure 1.3, is added to each IP block to provide testability for the DUT, which is the PLL in this case. The research focus of this thesis is the design of the JAC.



Figure 1.1 SoC Example



Figure 1.2 SoC with TMU

Figure 1.3 TMU Architecture

It should be noted that the JAC has to fulfil some important criteria:

- (1) Since the DUT runs at high speed, the amount of jitter that it can tolerate is small. To provide an accurate measurement, the timing resolution of the JAC is therefore one of the key requirements. For gigabit data rates, the required timing resolution is on the order of picoseconds.
- (2) The JAC has to occupy a relatively small area of silicon since it is designed to be inserted in each IP block of the SoC.
- (3) Since minimizing test time is an important requirement for cost-effective design, the JAC has to provide the shortest possible test time.
- (4) Given the scalability and turn-over rate of digital circuits, the design should be synthesizable.

In recent years, researchers have devised various schemes for performing on-chip timing measurements. A Time-to-Digital Converter (TDC) using a Delay Locked Loop (DLL) [1], a Vernier Delay Line (VDL) [2] and a multiple-phase ring oscillator [3] [4] are methods used to provide high-resolution on-chip timing measurements. For jitter measurement in particular, an on-chip circuit consisting of a ring oscillator and a calibration circuit was reported to perform jitter measurements with a resolution as fine as a single gate delay [5]. Moreover, the circuit was fully synthesizable from an RTL description, as the design does not depend on matched elements. Given the scalability of digital circuits and the turn-over rate of synthesizable circuits, the design in [5] is preferable in this respect. A significant improvement, a jitter measurement device with sub-gate resolution [6], was also reported using a VDL. In this case, the timing resolution is derived from the difference of two gate delays. Unfortunately, the design depends largely on the matching of pairs of delay elements [7] and, due to the nature of the design, it occupies a relatively large silicon area. To remove the dependency on element matching while maintaining synthesizability, a component-invariant VDL structure based on a single-stage VDL is proposed in this research to meet the design requirements described previously.

## 1.2 - Thesis Outline

The thesis is organized as follows. In Chapter 2, we describe the fundamentals of jitter statistics and trace the progression from the conceptual measurement approach, through existing designs, to the novel design proposed here. In Chapter 3, we describe the theory of operation of the component-invariant VDL, the circuit details used to perform RMS and peak-to-peak jitter measurements, and experimental results obtained from FPGA and IC implementations. In Chapter 4, we describe a technique for frequency characterization of jitter using an additional circuit, together with experimental results from this IC. Finally, we conclude the thesis in Chapter 5.

# Chapter 2 - Background

## 2.1 - Introduction

The principles of a basic jitter measurement are described in this chapter. This includes various statistical concepts applied to jitter, as well as a practical method for characterizing jitter. Various circuits used to measure jitter will be described, highlighting their advantages as well as their disadvantages.

## 2.2 - Jitter Statistics and Characterization

To illustrate the basic principle of a jitter measurement, consider uniformly sampling a noiseless clock signal with, for the sake of explanation, 16 points per period. Next, stack consecutive periods of this sampled clock signal one above the other as shown in Figure 2.1(a), taking care to line up each sampling point relative to the start of each clock cycle. As expected for a noiseless clock signal, in all three cases shown, the transition from logical low to logical high occurs at the same point, somewhere between sampling instances 8 and 9. However, if the clock signal is jittery, as shown in Figure 2.1(b), the position of the rising edge varies from one period to the next. Statistically, we can characterize this variation by computing the cumulative distribution function (CDF) of the jittery clock signal. This is done simply by counting the number of times the clock signal is logically high at each sampling instance. For the 3-period example given in Figure 2.1(b), the count is 0 at the first sampling instance. The count remains at 0 for the next four sampling instances, then increases to 1 at the 6th sampling instance and remains at 1 until the 9th sampling instance, where it increases to 2. The count remains at 2 until the 11th sampling instance and then increases to 3 at the 12th sampling instance. The count remains at 3 for the remaining four sampling instances. Although the granularity of this particular example is rather coarse, a smoother, monotonically non-decreasing function would result if a larger number of clock periods were used in the calculation.



Figure 2.1 Sampling instances of (a) jitter-free signal (b) jittery signal



Figure 2.2 Jitter CDF and PDF

The resulting CDF would then take on a shape similar to that shown in Figure 2.2(a). Interestingly, if the derivative of the CDF is computed, one obtains the probability density function (PDF), or histogram, of the jittery clock signal, as shown in Figure 2.2(b) [8]. Familiar statistical measures of the jitter, such as the RMS and peak-to-peak values, can then be extracted from these results.
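The counting procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sampled periods are hypothetical data (they are not read from Figure 2.1), each period is assumed to contain only a rising edge after which the signal stays high, and the function names `jitter_cdf_pdf` and `jitter_stats` are invented for this example.

```python
import math

def jitter_cdf_pdf(samples):
    """samples: list of periods, each a list of 0/1 logic levels per sampling instance."""
    n_periods = len(samples)
    n_points = len(samples[0])
    # Count how many times the signal is high at each sampling instance.
    counts = [sum(period[i] for period in samples) for i in range(n_points)]
    cdf = [c / n_periods for c in counts]
    # The discrete derivative of the CDF gives the PDF (histogram).
    pdf = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, n_points)]
    return counts, cdf, pdf

def jitter_stats(pdf, step):
    """Return (RMS, peak-to-peak) jitter; step is the spacing between sampling instances."""
    total = sum(pdf)
    mean = sum(i * step * p for i, p in enumerate(pdf)) / total
    var = sum((i * step - mean) ** 2 * p for i, p in enumerate(pdf)) / total
    occupied = [i for i, p in enumerate(pdf) if p > 0]
    return math.sqrt(var), (occupied[-1] - occupied[0]) * step

# Hypothetical data: three periods whose rising edges land on the
# 6th, 9th and 12th sampling instances (zero-based indices 5, 8, 11).
edge_index = [5, 8, 11]
samples = [[1 if i >= e else 0 for i in range(16)] for e in edge_index]

counts, cdf, pdf = jitter_cdf_pdf(samples)
rms, p2p = jitter_stats(pdf, step=1.0)
print(counts)    # count of logical highs per sampling instance
print(rms, p2p)  # in units of the sampling interval
```

With only three periods the histogram has three equal bins; the RMS and peak-to-peak figures become meaningful only as the number of periods grows.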

## 2.3 - Unit Gate Delay Resolution

The algorithm described above can be implemented using the circuit shown in Figure 2.3(a), consisting of a series of N D-type flip-flops or D-latches, counters and a delay line with N taps. The signal under test, herein referred to as the data signal, is applied simultaneously to the input of each D-latch. The clock inputs of the D-latches are slightly delayed with respect to one another so that each corresponds to one of the sampling instances described in the previous section. The clock signals for the various D-latches are generated by passing a master clock signal through a chain of buffers, each delaying the clock by $\tau$ seconds, as shown in Figure 2.3(b). Assuming that the master clock signal is jitter-free, the size of the delay $\tau$ establishes the timing resolution of the measurement. If the data signal leads a particular clock signal, the output of the corresponding flip-flop goes high and causes the counter connected to its output to increment by 1. However, if the data signal lags the clock signal, the flip-flop latches a logic low value and the counter maintains its current value. For example, consider the timing relationship between Data, Clock\_1, Clock\_2 and Clock\_3 shown in Figure 2.3(b). In this case, the data signal is sampled by three delayed versions of the same clock signal. Counter\_1 keeps its current value while Counter\_2 and Counter\_3 increment their contents by 1. This process is repeated for a large number of clock cycles. The values stored in the counters are then processed to obtain a CDF and, in turn, a PDF [8].



Figure 2.3 VDL with unit gate resolution implementation (a) circuit diagram (b) timing diagram
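As a rough behavioural sketch of the circuit in Figure 2.3 (not a gate-level model), the latch-and-counter operation can be simulated as follows. The tap delay of 50 ps, the data edge arrival times and all function names are hypothetical values chosen for illustration; the model assumes latch k is clocked k·τ after the master clock edge and captures a high when the data edge has already arrived.

```python
def sample_once(data_edge, n_taps, tau):
    """Latch outputs for one clock cycle.

    data_edge: arrival time of the data rising edge, measured from the
    master clock edge. Latch k (1-based) is clocked at k * tau and captures
    a high if the data edge has already arrived, i.e. data leads Clock_k.
    """
    return [1 if data_edge < k * tau else 0 for k in range(1, n_taps + 1)]

def accumulate(data_edges, n_taps, tau):
    """Increment one counter per latch over many clock cycles."""
    counters = [0] * n_taps
    for edge in data_edges:
        for k, bit in enumerate(sample_once(edge, n_taps, tau)):
            counters[k] += bit
    return counters

def histogram_from_counters(counters):
    """Difference adjacent counters (CDF) to obtain the histogram (PDF)."""
    return [counters[0]] + [counters[k] - counters[k - 1]
                            for k in range(1, len(counters))]

tau = 50  # ps, hypothetical tap delay

# Data edge between Clock_1 and Clock_2: only Counter_2 and Counter_3 increment.
print(sample_once(75, 3, tau))

edges = [75, 60, 120, 30, 110]  # ps, hypothetical jittery edge positions
counters = accumulate(edges, 4, tau)
print(counters, histogram_from_counters(counters))
```

The counters form an unnormalized CDF in steps of τ, so the resolution of the histogram cannot be finer than one buffer delay, which is the limitation addressed in the next section.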

## 2.4 - Sub-Gate Delay Resolution Circuit

A smaller unit delay in the delay chain results in a finer timing resolution with which to collect the signal statistics. However, the smallest gate delay available is technology dependent, as summarized in Table 1.

Table 1 - Intrinsic buffer delay of different CMOS technologies

| Technology | Buffer Delay |
|------------|--------------|
| 0.35 μm    | 52.3 ps      |
| 0.18 μm    | 25.1 ps      |

To circumvent this limitation, an additional delay chain can be added to the data path, as shown in Figure 2.4(a). The resolution then no longer depends on the intrinsic gate delay, but rather on the delay difference between two gates. If the gate delays are made slightly different, sub-gate timing resolution is possible. Such a structure has come to be known as a VDL. Here it is assumed that the clock signal is jitter-free. The symbols $\tau_f$ and $\tau_s$ are the propagation delays of the buffers interconnecting each stage of the VDL in the clock and data paths, respectively. As the propagation delays of the clock and data paths differ by an amount $\Delta t = \tau_s - \tau_f$, the time difference between the rising edges of the data and clock signals decreases by $\Delta t$ after each stage of the VDL. The phase relationship between these two rising edges after each stage is detected and recorded by a corresponding D-latch. A logical low results when the clock signal leads the data signal, whereas a logical high results when the data signal leads the clock signal. The output of each D-latch is passed to a counter circuit, which simply counts the number of times the data signal leads the clock signal (i.e., the number of logical 1's) with a delay difference set by its position in the VDL.



Figure 2.4 VDL with sub-gate timing resolution (a) circuit diagram (b) timing diagram
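The stage-by-stage behaviour of the VDL can be summarized compactly. As a sketch, let $t_0$ denote the amount by which the data edge leads the clock edge at the VDL input and $t_k$ the lead remaining at the $k$-th D-latch (both symbols are introduced here for illustration only):

$$
t_k = t_0 - k\,\Delta t, \qquad \Delta t = \tau_s - \tau_f .
$$

The $k$-th D-latch registers a logical high only while the data edge still leads, that is, when $t_k > 0$, or equivalently $t_0 > k\,\Delta t$. If exactly $n$ latches register a high (assuming the line has more than $n$ stages), the input lead is therefore bounded by

$$
n\,\Delta t < t_0 \le (n+1)\,\Delta t .
$$

With purely illustrative values $\tau_s = 60$ ps and $\tau_f = 55$ ps, $\Delta t = 5$ ps. A lead of $t_0 = 12$ ps then sets the first two latches high ($12 > 5$ and $12 > 10$) and leaves the third low ($12 < 15$), which locates $t_0$ between 10 ps and 15 ps even though each buffer delay is an order of magnitude larger than the resolution.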

By design, the data signal is made to always lead the clock signal at the input of the VDL by placing an additional delay block in series with the clock input. As the signals propagate along the VDL, a point is reached where the data signal goes from leading to lagging the clock signal, and this transition is detectable. All the D-latches prior to this point register logical high levels, whereas all D-latches after this point register logical low levels. The counter in each stage then registers the corresponding state of each D-latch. As the phase between the data and clock signals at the input of the VDL is a random variable, each time the measurement is performed a different set of D-latches is set to a logical high level and the corresponding counters register different values. The count value of the first counter reflects the number of times the rising edge of the data signal is ahead of the rising edge of the clock signal by more than $\Delta t$. Likewise, the counter in the next stage corresponds to the number of times the rising edge of the data signal leads the rising edge of the clock signal by more than $2\Delta t$. The following stages correspond to the number of times the data signal leads the clock signal by more than $3\Delta t$, $4\Delta t$, and so forth. As an example, the timing relationship between the data and clock signals is shown in Figure 2.4(b). In this case, three delayed versions of the data signal are sampled by three delayed versions of the clock signal, resulting in sub-gate timing resolution. Statistically, the various count values can be collected and used to create a CDF of the jitter riding on the data signal, from which a PDF can then be obtained.
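A minimal behavioural sketch of this measurement is given below. It assumes ideal, perfectly matched stages, so that stage k goes high exactly when the input lead exceeds k·Δt; the buffer delays, the list of leads and the function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def vdl_latches(lead, n_stages, delta_t):
    """Latch outputs for one measurement.

    lead: time by which the data edge leads the clock edge at the VDL input.
    Stage k (1-based) registers a high while the lead still exceeds k * delta_t.
    """
    return [1 if lead > k * delta_t else 0 for k in range(1, n_stages + 1)]

def vdl_counters(leads, n_stages, delta_t):
    """Per-stage counters accumulated over repeated measurements."""
    counters = [0] * n_stages
    for lead in leads:
        for k, bit in enumerate(vdl_latches(lead, n_stages, delta_t)):
            counters[k] += bit
    return counters

def vdl_histogram(counters, n_measurements):
    """Number of measurements whose lead fell in each delta_t-wide bin."""
    bins = [n_measurements - counters[0]]                # lead <= delta_t
    bins += [counters[k - 1] - counters[k] for k in range(1, len(counters))]
    bins.append(counters[-1])                            # lead beyond last stage
    return bins

tau_s, tau_f = 60, 55          # ps, hypothetical buffer delays
delta_t = tau_s - tau_f        # 5 ps resolution from ~60 ps buffers

leads = [12, 7, 18, 3]         # ps, hypothetical input leads
print(vdl_latches(12, 4, delta_t))
counters = vdl_counters(leads, 4, delta_t)
print(counters, vdl_histogram(counters, len(leads)))
```

Each counter holds the number of measurements whose lead exceeded its stage threshold, so the counters decrease along the line and differencing adjacent counters recovers the histogram.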

## 2.5 - Drawbacks

To see the drawbacks of this design, consider the implementation reported in [6]. This design was implemented in a 0.35 µm CMOS technology with 101 buffer stages in the VDL. The chip micrograph taken from [6] is shown in Figure 2.5. The respective sizes of the VDL and the counters are summarized in Table 2. An important drawback of this design is the large number of counters required. Size is not the only drawback: the measurement accuracy of this VDL structure depends on the matching of the delay elements in the various stages, as symbolically illustrated in Figure 2.6. These mismatch errors lead to differential non-linearity timing errors. Although careful layout techniques can help to minimize these mismatches, they cannot eliminate them completely.



Figure 2.5 VDL in 0.35μm CMOS technology

Table 2 - Silicon area for VDL in 0.35 µm CMOS

| Component | Area occupied      |
|-----------|--------------------|
| VDL       | 2.4 mm<sup>2</sup> |
| Counter   | 3.1 mm<sup>2</sup> |



Figure 2.6 VDL with mismatched elements
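To make the effect of mismatch concrete, the sketch below computes the differential non-linearity (DNL) of a short VDL whose data-path buffers deviate slightly from their nominal delay. All delay values are hypothetical, and the function names are invented for this example; the point is only that the resolution is a small difference of two larger delays, so a small absolute mismatch becomes a large relative error.

```python
def stage_thresholds(tau_s_list, tau_f_list):
    """Cumulative data/clock delay difference seen at each latch."""
    thresholds, acc = [], 0
    for ts, tf in zip(tau_s_list, tau_f_list):
        acc += ts - tf
        thresholds.append(acc)
    return thresholds

def dnl(thresholds, ideal_step):
    """DNL of each stage, expressed as a fraction of the ideal step."""
    out, prev = [], 0
    for th in thresholds:
        out.append((th - prev) / ideal_step - 1.0)
        prev = th
    return out

# Hypothetical delays in ps: nominal 60 ps / 55 ps, with +/-1 ps mismatch
# on two of the slow buffers.
tau_s = [60, 61, 59, 60]
tau_f = [55, 55, 55, 55]

thresholds = stage_thresholds(tau_s, tau_f)
errors = dnl(thresholds, ideal_step=5)
print(thresholds)
print([round(e, 3) for e in errors])
```

In this example a 1 ps deviation on a 60 ps buffer, under 2% of its delay, shifts the corresponding 5 ps bin width by 20%.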

## 2.6 - Summary

Basic jitter statistics and a method of characterization, as well as some existing circuits and their drawbacks, have been discussed in this chapter. It is clear that a new technique is required to overcome the two problems identified: the large silicon area and the dependence on delay-element matching. Chapter 3 introduces the concept of a component-invariant VDL circuit that gets around both of these problems.

# Chapter 3 - Jitter Characterization for Time Domain Parameter

## 3.1 - Introduction

In the previous chapter, the problems with the existing state of the art in jitter measurement were described, in particular the component matching requirement in the VDL and the excessive number of counters required. In this chapter we shall outline a new VDL circuit that is component-invariant and requires a single counter. Implementation details in a 0.18 µm CMOS process from TSMC will be described, together with experimental results. An FPGA implementation will also be provided, illustrating the synthesizable nature of the technique. Finally, we will look at some of the design limitations, such as latch metastability and clock and power supply noise.

## 3.2 - A Component-Invariant VDL Structure

If one assumes that the period of the data and clock signals, denoted as T, is larger than the total propagation delay through an M-stage VDL, then the outputs of all the D-latches can be combined into one bit-stream whose total count of logical high levels represents the time difference between the edges of the data and clock signals at a particular instant in time. This is easily achieved by ORing the outputs of all the D-latches together and counting the number of logical high levels over the time period T, as shown in Figure 3.1. Repeating the measurement N times enables a histogram of the jitter signal to be constructed. This eliminates the problem of having to use a large number of counters.



Figure 3.1 Obtaining a timing measurement directly from a VDL

As noted previously in Section 2.5, the original VDL design is sensitive to component mismatches in the delay elements. However, by reusing the same delay elements for every stage, mismatches between stages can be completely eliminated. This is achieved by modifying the circuit in Figure 3.1 to obtain the component-invariant VDL structure shown in Figure 3.2 [9][10]. This novel circuit is the main focus of this research and will be discussed in some detail in the next section. In this circuit, inverters instead of buffers are used to create the delay difference between the data and clock input signals to the D-latch. In addition, the output of each inverter is fed back to its input, depending on the state of the switch in its feedback path. When the switches are closed, the inverters are configured with regenerative feedback and hence oscillate with a period of $2\tau_s$ or $2\tau_f$ seconds, depending on the propagation delay of the inverter in the feedback loop. More importantly, the leading edge of the data signal is delayed with respect to the leading edge of the clock signal by an amount $2(\tau_s - \tau_f)$, or $2\Delta t$, seconds every cycle of oscillation. Alternatively, one can envision this as two ring oscillators running simultaneously at different frequencies, producing a constant delay difference during every cycle of oscillation.

To ensure an accurate time measurement, switch A in the feedback path of the inverter controlling the clock of the D-latch shown in Figure 3.2 must be closed on the rising edge of the clock signal, whereas switch B must be closed on the rising edge of the data signal. Simultaneously, the output of the single D-latch is passed to a counter, which simply determines how many clock cycles the D-latch remains in the high state. Once the edge of the data signal goes from a leading to lagging situation with respect to the clock signal (or vice versa), the counter is stopped. The time difference between the data and clock edges can then be computed. The process is then repeated and a histogram of the jitter is derived.
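The counting principle described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the circuit itself: both oscillators are ideal and jitter-free, the data edge leads the clock edge by `delay_ps`, and the names `count_until_crossover`, `t_slow_ps` and `t_fast_ps` are hypothetical. The returned count is the number of oscillation cycles until the rising edge of the faster, clock-triggered oscillator overtakes that of the slower, data-triggered one.

```python
def count_until_crossover(delay_ps, t_slow_ps, t_fast_ps, max_cycles=100_000):
    """Count oscillation cycles until the fast oscillator overtakes the slow one.

    delay_ps  : initial lead of the data edge over the clock edge (>= 0)
    t_slow_ps : period of the data-triggered (slow) oscillator
    t_fast_ps : period of the clock-triggered (fast) oscillator
    All times are integers in picoseconds so the arithmetic stays exact.
    """
    if t_slow_ps <= t_fast_ps:
        raise ValueError("the data-triggered oscillator must be the slower one")
    for k in range(1, max_cycles + 1):
        data_edge = k * t_slow_ps               # k-th rising edge, data side
        clock_edge = delay_ps + k * t_fast_ps   # k-th rising edge, clock side
        if clock_edge < data_edge:              # data now lags the clock: stop
            return k
    raise RuntimeError("no crossover within max_cycles")


resolution_ps = 210 - 200                        # difference of the two periods
count = count_until_crossover(delay_ps=73, t_slow_ps=210, t_fast_ps=200)
print(count, count * resolution_ps)              # count and quantized delay in ps
```

In this idealized model the initial delay is quantized in steps equal to the difference of the two oscillation periods, which is why that difference sets the timing resolution.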



Figure 3.2 Replacing VDL by a component-invariant VDL structure

# 3.3 - Circuit Implementation

In this section we shall describe the circuit details of the three main circuit components of the component-invariant VDL circuit: the edge detector, the ring oscillator and the phase detector. We shall also outline the calibration and measurement process, and the time it takes to complete a single measurement.

#### 3.3.1 - Edge Detector

The main function of the edge detector is to catch the rising edge of the data or clock signal. This can be implemented using a single D-type flip-flop with the D and reset inputs connected together as shown in Figure 3.3(a). With the enable signal set high, the D-type flip-flop will latch its output high on the subsequent rising edge of the clock or data signal. The flip-flop will be reset when the enable input is set low and another clock or data edge occurs. These two situations are illustrated in Figure 3.3(b).





Figure 3.3 Edge detector (a) circuit diagram (b) timing diagram

#### 3.3.2 - Triggered Ring Oscillator Circuits

At the heart of the component-invariant VDL are the two triggered ring-oscillator circuits highlighted in Figure 3.2. Note that  $\tau_f$  and  $\tau_s$  are the respective propagation delays around the loop of each ring oscillator. In order to maintain a predictable phase relationship for detection,  $\tau_s$  is set to be greater than  $\tau_f$ . (Here the subscript 's' indicates a slow oscillation and 'f' a fast oscillation.) This, in turn, establishes the ring oscillator triggered by the clock signal to run at a higher frequency than the oscillator triggered by the data signal. To trigger each oscillator on a logical high level, the circuit of Figure 3.4 was chosen. The delay in the ring oscillator circuit can be designed to be tunable using the voltage-controlled delay cell [7] shown in Figure 3.5, or to be fixed using several inverter circuits in cascade as a fixed-delay buffer.



Figure 3.4 Triggered ring oscillator (a) circuit diagram (b) timing diagram



Figure 3.5 Voltage-controlled delay circuit

#### 3.3.3 - Phase Detector

The phase detector circuit is used to keep track of the history of the phase difference between the two oscillators, thus providing information on a phase change. As mentioned in the previous section, by design, the edge of the data signal is always set to lead the clock signal at the start of the measurement process. It is when the data signal begins to lag the clock signal that the measurement process is to stop. To accomplish this, the phase detector is implemented using the two D-latches shown in Figure 3.6(a). The output of the AND gate will switch from a logical low level to logical high level when an input sequence of '10' is detected as shown in Figure 3.6(b).
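The '10' detection can be mimicked in software as a two-sample history check. The sketch below is an assumption-laden illustration rather than a description of Figure 3.6: it assumes the AND gate combines the older sample with the inverse of the newer one, and the function name `first_phase_change` is hypothetical.

```python
def first_phase_change(samples):
    """Return the index of the first sample at which a '1' is followed by a '0'.

    `samples` is the sequence of sampled phase values (0 or 1), one per cycle.
    Two variables play the role of the two D-latches: `previous` holds the
    sample from one cycle earlier, `current` holds the newest sample.
    """
    previous = 0
    for index, current in enumerate(samples):
        # Assumed gate: previous AND (NOT current) goes high on the '10' pattern.
        detect = previous == 1 and current == 0
        if detect:
            return index
        previous = current          # shift the history by one cycle
    return None                     # no phase change observed


print(first_phase_change([1, 1, 1, 1, 0, 0]))
```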



Figure 3.6 Phase detector (a) circuit diagram (b) timing diagram

#### 3.3.4 - Complete Circuit

Combining the edge and phase detectors, the two triggered oscillators and the counter circuit yields the complete component-invariant VDL circuit shown in Figure 3.7. It is interesting to note that this circuit can be synthesized using entirely digital logic.



Figure 3.7 The circuit implementation of the component-invariant VDL structure

# 3.4 - Calibration And Measurement Processes

There is an intrinsic delay difference between the signal path of the clock and the data lines, which includes the intentional delay added between the clock-triggered oscillator and the edge detector, the setup time and propagation delay difference between the D-latches in the two edge detectors, as well as that of the XOR gate in the ring oscillators. Since all these delays are process sensitive, the measured delay will be different from the actual delay difference between the clock and the data edges. Also, note that the difference in oscillation frequencies determines the measurement resolution, which also becomes process sensitive due to the unpredictable delay of each loop in the ring oscillator. Therefore, in order to make the design fully synthesizable, i.e. no element matching is required, a calibration sequence is necessary to determine the frequency of oscillations and the difference between the delay path of the data and clock signal.

To achieve a calibration, the clock and data lines are first connected together to determine the delay difference between the two signal paths. Because the two inputs are tied together, jitter on the input calibration signal will have little effect. As the enable signal goes high, the two oscillators start to oscillate and the counter starts to record the number of clock cycles that the rising edge of the clock signal is leading the data signal.

Once the clock signal lags the data signal, the counter will have counted to a value of  $N_o$  corresponding to a time delay of  $T_o$  as shown in Figure 3.8.



Figure 3.8 Timing diagram for the calibration and measurement modes of operation

After a certain period of time  $T_p$ , the rising edge of the clock-triggered oscillator will move across one complete cycle of the data-triggered oscillator, corresponding to  $N_p$  counts as also shown in Figure 3.8. The timing resolution,  $\Delta T$  is defined as

$$\Delta T = T_s - T_f \tag{3.1}$$

where  $T_s$  is the oscillation period of the data-triggered oscillator and  $T_f$  is the oscillation period of the clock-triggered oscillator.  $T_f$  is related to  $T_p$  and  $N_p$  according to

$$T_f = \frac{T_p}{N_p} \tag{3.2}$$

As the clock-triggered oscillator completes  $N_p$  cycles in the time  $T_p$ , the data-triggered oscillator must complete  $N_p$  -1 cycles. Hence, the fundamental relationship that governs the coherence of this design is

$$T_p = N_p \cdot T_f = (N_p - 1) \cdot T_s \tag{3.3}$$

We shall refer to  $T_p$  as the coherency period. By rearranging (3.3), the period of the data-triggered oscillator can be then determined in terms of the measured data as follows

$$T_s = \frac{T_p}{N_p - 1} \tag{3.4}$$

Finally, the timing resolution  $\Delta T$  can be determined by substituting (3.2) and (3.4) into (3.1), to obtain

$$\Delta T = \frac{T_p}{N_p \cdot (N_p - 1)} \tag{3.5}$$

The time value of  $T_p$  is usually very large compared to  $T_f$ , thus, depending on measurement equipment, measuring an accurate  $T_p$  may be difficult, especially in the case of a small time step over a large measurement range. An alternative approach is to measure  $T_f$  indirectly through the counter output. As shown previously in Figure 3.7, the counter is used to count the number of the clock-triggered oscillator cycles during calibration and during a phase measurement. Therefore, when the oscillator is running,  $T_f$  can be obtained by measuring the cycling time of one bit of the counter. Thus,  $T_f$  is related as follows:

$$T_f = \left(\frac{1}{2}\right)^n \cdot T_c \tag{3.6}$$

where n is the bit position with respect to the least significant bit of the counter and  $T_c$  is the cycling time of the nth counter bit. Therefore,  $T_p$  can be determined from (3.3) as follows

$$T_p = T_f \cdot N_p = \left(\frac{1}{2}\right)^n \cdot T_c \cdot N_p \tag{3.7}$$

Of course, once  $T_p$  is known,  $T_s$  can be calculated using (3.4).
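The calibration arithmetic of (3.4) to (3.7) can be collected into a few lines. This is a minimal sketch: the function name `calibrate` and the example readings are hypothetical, and the bit position `bit` is taken exactly as n is defined in (3.6).

```python
def calibrate(n_p, t_c, bit):
    """Derive the calibration factors from Eqs. (3.4)-(3.7).

    n_p : counts in one coherency period (N_p)
    t_c : measured cycling time of the observed counter bit (T_c), in seconds
    bit : bit position n, as defined in Eq. (3.6)
    """
    t_f = (0.5 ** bit) * t_c             # Eq. (3.6)
    t_p = t_f * n_p                      # Eq. (3.7)
    t_s = t_p / (n_p - 1)                # Eq. (3.4)
    delta_t = t_p / (n_p * (n_p - 1))    # Eq. (3.5)
    return {"T_f": t_f, "T_p": t_p, "T_s": t_s, "dT": delta_t}


# Hypothetical readings: T_c = 12.8 ns observed with n = 6, and N_p = 21 counts.
cal = calibrate(n_p=21, t_c=12.8e-9, bit=6)
for name, value in cal.items():
    print(f"{name} = {value * 1e12:.1f} ps")
```

With these illustrative inputs the result is consistent with (3.1), since the computed resolution equals the difference between the two computed oscillation periods.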

Now that the calibration phase is complete, we are ready to perform a time measurement. Assuming that the calibration and measurement modes experience the same intrinsic delay along the clock and data paths, we can compute the measured phase difference by subtracting off the intrinsic delay  $T_o$, which is given by

$$T_o = \Delta T \cdot N_o \tag{3.8}$$

Similarly, denoting the counter output during the measurement mode as N (shown in Figure 3.8), the total measured time  $T_n$  is given by

$$T_n = \Delta T \cdot N \tag{3.9}$$

Since this time includes the intrinsic delay  $T_o$, the actual time difference between the data and clock signal is given by

$$T_m = T_n - T_o \tag{3.10}$$

Substituting (3.8) and (3.9) into (3.10),  $T_{\rm m}$  can be written more directly in terms of the measured parameters as follows

$$T_m = \Delta T \cdot (N - N_o) \tag{3.11}$$
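As an illustration of how (3.11) turns raw counter readings into jitter statistics, the sketch below converts a list of counts into time intervals and summarizes them. The readings are made up, the function names are hypothetical, and the RMS value is assumed here to mean the standard deviation of the intervals about their mean.

```python
import statistics
from collections import Counter


def counts_to_times(counts, n_o, delta_t):
    """Apply Eq. (3.11), T_m = delta_t * (N - N_o), to every counter reading."""
    return [delta_t * (n - n_o) for n in counts]


def jitter_statistics(times):
    """Return (RMS about the mean, peak-to-peak) of the measured intervals."""
    rms = statistics.pstdev(times)
    peak_to_peak = max(times) - min(times)
    return rms, peak_to_peak


counts = [30, 31, 29, 30, 32, 28, 30]       # made-up counter readings
histogram = Counter(counts)                  # occurrences of each count value
times = counts_to_times(counts, n_o=23, delta_t=10e-12)
rms, pk_pk = jitter_statistics(times)
print(sorted(histogram.items()))
print(f"RMS = {rms * 1e12:.2f} ps, peak-to-peak = {pk_pk * 1e12:.1f} ps")
```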

The accuracy of our measured time interval  $T_m$  is directly dependent on the accuracy of the calibration factors,  $T_p$  and the counts,  $N_o$  and  $N_p$ . To a first-order approximation and assuming  $N_p - 1 \approx N_p$ , the relative accuracy of  $T_m$  can be expressed as

$$\frac{\Delta T_m}{T_m} \approx \frac{\Delta T_p}{T_p} - 2\frac{\Delta N_p}{N_p} - \frac{\Delta N_o}{N - N_o} \tag{3.12}$$

With count errors of  $\pm 1$  count, the accuracy of the time measurement becomes

$$\frac{\Delta T_m}{T_m} \approx \frac{\Delta T_p}{T_p} \mp 2\frac{1}{N_p} - \frac{1}{N - N_o} \tag{3.13}$$

Here we clearly see that  $N_p$,  $N-N_o$, and  $T_p$  should be as large as possible in all situations to maximize the relative accuracy of the measurement. Of course,  $N-N_o$  is outside of our direct control, as it varies with each measurement. However, as evident from (3.13), in order for  $T_p$  and  $N_p$  to be maximized, the periods of both the data and clock triggered oscillators (i.e.,  $T_f$  and  $T_s$) must be maximized. In general, increasing the oscillation period of a ring oscillator requires additional delay stages to be added in the feedback loop of the oscillator. However, incorporating additional delay stages increases the noise introduced into the circuit and adds to the randomness in any particular measurement. Nonetheless, measurement averaging can be exploited to reduce this variation at the expense of more test time.
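As a sketch of where the first-order expression (3.12) comes from, one can combine (3.5) with (3.11), apply the approximation $N_p - 1 \approx N_p$, and take the logarithm. This assumes that the measurement count $N$ itself is treated as exact, so that only $T_p$, $N_p$ and $N_o$ contribute errors:

$$T_m = \Delta T \cdot (N - N_o) = \frac{T_p \cdot (N - N_o)}{N_p \cdot (N_p - 1)} \approx \frac{T_p \cdot (N - N_o)}{N_p^2}$$

$$\ln T_m \approx \ln T_p - 2 \ln N_p + \ln (N - N_o)$$

Differentiating each term, and noting that a change $\Delta N_o$ changes $N - N_o$ by $-\Delta N_o$, gives

$$\frac{\Delta T_m}{T_m} \approx \frac{\Delta T_p}{T_p} - 2\frac{\Delta N_p}{N_p} - \frac{\Delta N_o}{N - N_o}$$

which is the form stated in (3.12).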

# 3.5 - Test Time Comparison

Since test time is one of the criteria to quantify the performance of a measurement device, the required test time of the component-invariant VDL is compared with the original VDL circuit shown in Figure 2.4. For the VDL circuit of Figure 2.4, the test time to collect all the CDF data is roughly equal to

$$T_{test} \approx T_{clk} \cdot N_{sample} \tag{3.14}$$

where  $T_{test}$  is the total test time,  $T_{clk}$  is the clock period and  $N_{sample}$  is the total number of samples taken. For a clock period of  $T_{clk} = 1$  ns and 5000 samples collected, the total test time would be approximately  $T_{test} \approx 5 \mu s$.

For the component-invariant VDL, assuming jitter is uncorrelated with the clock signal, the average test time per sample can be estimated by taking the mean of the maximum and minimum test time per sample. Test time per sample is at a maximum when the clock and data signals differ by almost one clock cycle  $T_f$. This is determined during the calibration phase to be equal to  $T_p$. Test time per sample is at a minimum when the data and clock signals are aligned such that only one oscillation cycle is required to obtain a phase change. Hence, the maximum test time per sample is estimated to be

$$T_{max-test-per-sample} \approx T_p \tag{3.15}$$

Therefore, the average test time per sample is

$$T_{test-per-sample} \approx \frac{T_p}{2} \tag{3.16}$$

and the average overall test time becomes

$$T_{test} \approx \frac{T_p}{2} \cdot N_{sample} \tag{3.17}$$

where  $N_{sample}$  is the total number of samples collected.

With 5000 samples collected, a timing resolution of 10 ps, a result of clock and data oscillators having periods of 0.2 ns and 0.21 ns, respectively, and a coherency time  $T_p$  of 4.2 ns, we find from above that the average test time is expected to be  $T_{test} \approx 10 \mu s$ .
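The two test-time estimates can be checked numerically. The sketch below uses the example values quoted above (oscillator periods of 0.2 ns and 0.21 ns, a 1 ns clock and 5000 samples); the function names are hypothetical, and $N_p$ is obtained by solving (3.3) for the given periods.

```python
def coherency_period(t_f, t_s):
    """Solve Eq. (3.3) for N_p, then return T_p = N_p * T_f."""
    n_p = t_s / (t_s - t_f)        # from N_p * T_f = (N_p - 1) * T_s
    return n_p * t_f


def vdl_test_time(t_clk, n_sample):
    """Eq. (3.14): test time of the original VDL."""
    return t_clk * n_sample


def component_invariant_test_time(t_p, n_sample):
    """Eq. (3.17): average test time of the component-invariant VDL."""
    return t_p / 2 * n_sample


t_p = coherency_period(t_f=0.2e-9, t_s=0.21e-9)
original = vdl_test_time(t_clk=1e-9, n_sample=5000)
invariant = component_invariant_test_time(t_p, n_sample=5000)
print(f"T_p = {t_p * 1e9:.2f} ns")
print(f"original VDL: {original * 1e6:.2f} us, component-invariant: {invariant * 1e6:.2f} us")
```

The computed value for the component-invariant VDL is 10.5 µs, which the text rounds to approximately 10 µs.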

The component-invariant approach typically requires longer test times than the original VDL approach of Figure 2.4. One way to reduce the test time is through the application of additional component-invariant VDL stages.

# 3.6 - Reducing Test Time Using A Spatial Arrangement of Component-Invariant VDLs

An array of component-invariant VDLs can be configured as shown in Figure 3.9. Here a single clock-triggered ring oscillator is shown driving the clock input of each D flip-flop. All the data-triggered ring oscillators are designed to have the same nominal oscillation frequency, but each is triggered by a data signal that is delayed by one additional gate delay relative to the previous one. With the data-triggered oscillation frequency set below the clock-triggered oscillation frequency, a time-grid of data-triggered oscillations will result as shown in Figure 3.10. Therefore, as soon as the rising edge of the clock-triggered oscillator passes through any one of the rising edges of the data-triggered oscillators, the result will be known. For jitter measurement applications, this structure has the advantage that the measurement time is significantly reduced. Since jitter is assumed to be random and, hence, does not correlate with the time that the sample is taken, a non-uniform sampling of data will also lead to a good estimation of the jitter statistics.

Note that the phase difference between any data-triggered oscillator does not have to be matched, since calibration can be performed on each component-invariant VDL circuit. For the same reasons, the frequency of oscillations of these data-triggered oscillators does not have to be made exactly equal.

Since there are multiple phase detectors, a controller will be required to select the earliest phase detection. This can be easily implemented using some simple combinational logic such as that shown in Figure 3.11. The calibration process is exactly the same as the single component-invariant structure, provided one calibrates each data-triggered oscillator with respect to the clock-triggered oscillator. During calibration mode, the control signal  $C_i$  will be set to a logical high level to enable the  $i^{th}$  data-triggered oscillator, and all other  $C_j$ 's  $(i \neq j)$  will be set to a logical low level to disable the other data-triggered oscillators. During measurement mode, all C's will be set to a logical high level.



Figure 3.9 An array of component-invariant VDLs



Figure 3.10 Timing relationship for VDL array



Figure 3.11 Controller for VDL array structure

Since the efficiency of the time reduction depends on the time grid location, if P component-invariant VDLs are added to provide the optimal time-grid, the average test time per sample is now reduced to

$$T_{test-per-sample} \approx \frac{T_p}{2 \cdot P} \tag{3.18}$$

Therefore, the total test time becomes

$$T_{test} \approx \frac{T_p}{2 \cdot P} \cdot N_{sample} \tag{3.19}$$

where  $N_{sample}$  is the number of samples collected over the test period.

For a VDL with 4 additional oscillators (each with an oscillation period of  $T_s$  = 0.21 ns), 5000 samples collected over the test time and a timing resolution of  $\Delta T$  = 10 ps, the expected test time becomes 2.5  $\mu s$  (compared to 10  $\mu s$  in the single oscillator VDL case).

Note that an OR gate with a large number of inputs is necessary if a large number of data-triggered oscillators are used. Practically speaking, however, only a few are required to produce a "time grid" fine enough to reduce the test time significantly. In addition, circuitry for decoding which data-triggered oscillator is responsible for the phase detection must also be added. This is necessary to know which calibration factors should be used to compute the appropriate measurement time. Latching the state of the outputs of each phase detector at the time of the first phase detection will enable the user to know which oscillator is taking part in the time measurement.
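The selection and decoding step can be pictured in software as follows. This is a hedged sketch only: the dictionary layout, the function names and all numbers are hypothetical, and the way the array handles wrap-around across the coherency period is not specified in the text and is not modelled here. The sketch simply picks the phase detector that fired first and applies (3.11) with that oscillator's own calibration factors.

```python
def earliest_detection(counts_at_detect):
    """Pick the data-triggered oscillator whose phase detector fired first.

    counts_at_detect maps an oscillator index to the counter value at which
    its phase detector fired, or to None if it has not fired.
    """
    fired = {i: n for i, n in counts_at_detect.items() if n is not None}
    if not fired:
        return None
    index = min(fired, key=fired.get)      # smallest count = earliest detection
    return index, fired[index]


def measured_time(index, count, calibration):
    """Apply Eq. (3.11) using the calibration factors of the selected oscillator."""
    delta_t, n_o = calibration[index]
    return delta_t * (count - n_o)


# Hypothetical per-oscillator calibration: index -> (timing resolution, N_o).
calibration = {0: (10e-12, 5), 1: (10e-12, 4), 2: (11e-12, 7), 3: (10e-12, 6)}
latched = {0: 57, 1: 12, 2: None, 3: 35}   # made-up latched detector states

index, count = earliest_detection(latched)
print(index, count, measured_time(index, count, calibration))
```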

# 3.7 - Experimental Results

#### 3.7.1 - CMOS Implementation of A Component-Invariant VDL

In this section we shall describe the experimental results of a single-stage component-invariant VDL implemented in a 0.18 µm CMOS process. The schematic of the circuit that was used is shown in Figure 3.7, with the addition of two 12-bit shift registers. These two registers, shown in the boxed area outlined by a dotted line, were included for scanning digital data off the chip. The buffers in the feedback path of the two triggered oscillators were implemented using the voltage-controlled delay circuit shown in Figure 3.5, with the DC control node of each delay cell brought off the chip for external control. The layout of the experimental chip is shown in Figure 3.12. In the center of the chip micrograph are four white boxes outlined by solid lines containing four component-invariant VDLs. Each cell occupies an area of 0.12 mm<sup>2</sup>.



Figure 3.12 A chip micrograph of the component-invariant VDL implemented in a 0.18  $\mu$m CMOS process. The region boxed by a solid line illustrates one component-invariant VDL circuit consuming approximately 0.12 mm<sup>2</sup> of silicon area.

Several experiments were run using the experimental setup shown in Figure 3.13. This setup consists of a Teradyne A567 (shown in Figure 3.14) mixed-signal tester configured as a pseudo jittery clock stimulus and a Wavecrest DTS-2770 jitter analyzer for comparing the results of the VDL. The circuit was connected to the test-head through a Device Interface Board (DIB). A user computer was used to control the test sequence and test setup. A software program was written to coordinate these test procedures. Through a mathematical analysis in MATLAB, the edges of the jittery clock signal were assigned a specific statistical distribution and stored in the memory of the Teradyne A567 tester. Each number corresponds to an edge placement of a digital signal. These edge placements have a maximum timing resolution of 78 ps [11]. The jittery clock signal was then applied simultaneously to the VDL and the Wavecrest instrument from which the actual phase error of the test signal relative to a reference clock signal could be measured. The Wavecrest instrument has 0.8 ps timing resolution,  $\pm 25$  ps single-shot accuracy and  $\pm 10$  ps average accuracy [12].



Figure 3.13 Experimental setup

Since high-resolution timing measurement is very sensitive to noise, a four-layer PCB was designed and fabricated to ensure a quality measurement, as shown in Figure 3.15. The four layers were designed to be analog ground, digital ground and two signal layers. The SMA connectors on the PCB were placed very close to the signal to be measured, since a PCB trace can introduce a relatively significant timing delay on the picosecond scale. For the same reason, the IC is placed close to the digital pins as opposed to the analog pins. As well, both analog and digital power supplies were regulated by onboard voltage regulators to provide stable DC voltages. The DC control voltages which control the timing resolution of the two oscillators were filtered by passive RC low-pass filters to reduce high frequency noise as much as possible. All power supplies were also decoupled with capacitors to reduce high frequency noise. These passive filters and decoupling capacitors were located as close as possible to the IC on the PCB to reduce noise picked up by the PCB trace. A clamp was used to keep the IC in place so that, for ease of replacement, the IC did not have to be soldered on the PCB.

With the DC control voltage of the two delay cells set to 0.68 V and 1.8 V, respectively, the VDL was calibrated using 2000 samples of a 1.56 MHz clock. Specifically,  $T_p$  was measured to be 345 ns,  $N_p$  was measured to be 80.1 counts and  $N_o$  was found to be 23.05 counts. Subsequently, according to (3.2) and (3.4), the oscillation period of the clock-triggered oscillator,  $T_f$, was calculated to be 4.307 ns, which corresponds to a frequency of 232 MHz, and  $T_s$  was calculated to be 4.362 ns, which corresponds to a frequency of 229 MHz. Therefore, according to (3.5), a timing resolution of 54.5 ps was achieved.
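For reference, the arithmetic behind these calibration figures can be written out by inserting the measured $T_p$ = 345 ns and $N_p$ = 80.1 counts into (3.2), (3.4) and (3.5). The intrinsic offset $T_o$ is not quoted in the text; it is computed here from (3.8) with $N_o$ = 23.05 counts purely as an illustration:

$$T_f = \frac{T_p}{N_p} = \frac{345\ \text{ns}}{80.1} \approx 4.307\ \text{ns}$$

$$T_s = \frac{T_p}{N_p - 1} = \frac{345\ \text{ns}}{79.1} \approx 4.362\ \text{ns}$$

$$\Delta T = \frac{T_p}{N_p \cdot (N_p - 1)} = \frac{345\ \text{ns}}{80.1 \times 79.1} \approx 54.5\ \text{ps}$$

$$T_o = \Delta T \cdot N_o \approx 54.5\ \text{ps} \times 23.05 \approx 1.26\ \text{ns}$$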



Figure 3.14 Teradyne Tester

The VDL circuit was then characterized by sweeping its input with a 1.56 MHz clock signal whose phase was linearly varied over its clock period. Simultaneously, the same signal was sampled by the Wavecrest DTS-2770 jitter analyzer. Five thousand samples of the input signal were measured. The results are displayed in Figure 3.16(a) with the VDL results plotted as a function of the Wavecrest results. As is clearly evident, there is a strong linear correlation between the two results. In the phase range 0 - 1000 ps, the correlation is nearly perfect as evident from the expanded view shown in Figure 3.16(b). Above this level, the measurement correlation seems to decrease. We believe this deviation is due to excessive noise induced on the power supply by the triggered oscillator pair and the counter. Since the delay cells and the counter share a common power supply, and since the ring oscillator pair draws a different amount of current from the power supply depending on the relative position of its rising edges, the power supply level varies with the input signal conditions. Better layout and improved on-chip power supply decoupling are expected to improve this situation.



Figure 3.15 Four-layer PCB



Figure 3.16 Phase measurement of VDL set to the 54.5 ps timing resolution (a) full view (b) zoom view



Figure 3.17 Histograms for Gaussian distributed jitter for 54.5 ps timing resolution from (a) VDL (b) Wavecrest

The next experiment involved generating a 1.56 MHz clock signal with a jitter component having Gaussian statistics. The jitter variation was made to stay within the linear range of the VDL (i.e., < 1000 ps). The DC control voltages of the VDL circuit were adjusted to 0.68 V and 1.8 V, respectively, whereby a 54.5-ps timing resolution was achieved. Two thousand samples were simultaneously collected by the VDL and Wavecrest instrument. The captured histograms are shown in Figure 3.17(a) and Figure 3.17(b), respectively. The resolution is clearly higher with the Wavecrest instrument (finer line widths); however, the overall shapes are very similar. The RMS and peak-to-peak values of the two distributions are summarized in Table 3. The two sets of statistics show excellent correlation.

Table 3 - Jitter measured using 54.5 ps timing resolution VDL

| Jitter<br>Distribution | Measurement<br>Type | VDL<br>(54.5 ps timing<br>resolution) | Wavecrest<br>(as reference) |
|------------------------|---------------------|---------------------------------------|-----------------------------|
| Gaussian               | RMS                 | 92.8 ps                               | 98.79 ps                    |
| Gaussian               | Peak to peak        | 599 ps                                | 634.8 ps                    |
| Sinusoidal             | RMS                 | 168.3 ps                              | 185.6 ps                    |
| Sinusoidal             | Peak to peak        | 707.9 ps                              | 606.6 ps                    |

Another experiment was run in a manner similar to that just described; this time, however, a 1.56 MHz clock signal with a jitter component having a sinusoidal distribution was created and applied to the VDL and Wavecrest instrument simultaneously. The captured histograms are shown in Figure 3.18(a) and Figure 3.18(b). The RMS and peak-to-peak values of the two distributions are summarized in Table 3. As is evident, the two sets of statistics show very good correlation.

By adjusting the control voltages associated with the two delay blocks inside the VDL circuit, we were able to fine-tune the time resolution of the component-invariant VDL to 18.9 ps. The histogram results for both a Gaussian and sinusoidal jitter distribution for both the VDL and the Wavecrest instrument are shown in Figure 3.19, Figure 3.20 and Figure 3.21. Furthermore, Table 4 summarizes the statistics of each of these experiments. Once again, we conclude that the results of the VDL correlate very well with the Wavecrest instrument.

Table 4 - Jitter measured using 18.9 ps timing resolution VDL

| Jitter<br>Distribution | Measurement<br>Type | VDL<br>(18.9 ps timing<br>resolution) | Wavecrest<br>(as reference) |
|------------------------|---------------------|---------------------------------------|-----------------------------|
| Gaussian               | RMS                 | 21.1 ps                               | 18.68 ps                    |
| Gaussian               | Peak to peak        | 158.2 ps                              | 124.5 ps                    |
| Sinusoidal             | RMS                 | 32.5 ps                               | 41.94 ps                    |
| Sinusoidal             | Peak to peak        | 234.5 ps                              | 197.8 ps                    |



Figure 3.18 Histograms for sinusoidal distributed jitter for 54.5 ps timing resolution from (a) VDL (b) Wavecrest





Figure 3.19 Phase measurement of the VDL set to a 18.9 ps timing resolution (a) full view (b) zoom view





Figure 3.20 Histograms for Gaussian distributed jitter for 18.9 ps timing resolution from (a) VDL (b) Wavecrest



Figure 3.21 Histograms for sinusoidal distributed jitter for 18.9 ps timing resolution from (a) VDL (b) Wavecrest

#### 3.7.2 - Calibration for systematic error

To explore the limitations of the design, a set of pseudo-random Gaussian jitter signals was generated from the tester and the results were compared. The mean values of the measurement obtained by the VDL and the Wavecrest are shown in Figure 3.22. As expected, the relationship is relatively linear. However, the RMS jitter of the same measurement shown in Figure 3.23 shows some systematic deviations. This is due to the change in noise level for different phase inputs as discussed in the previous section. However, this systematic error can be corrected by having a lookup table that maps the RMS values obtained by the VDL to those of the Wavecrest. Therefore, the RMS values obtained by the VDL can be corrected in software to obtain the actual RMS value taken from a particular measurement. The corrected RMS values are shown in Figure 3.24. This lookup table optimizes the error across all the measurements. The differences between the measured values from the VDL and Wavecrest before and after calibration are shown in Figure 3.25 and Figure 3.26. The error after calibration is within ±25 ps, which is within the timing resolution of 54.5 ps. However, better optimization can be obtained by localizing the measurement range and optimizing only the measurement range of interest.
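A software lookup-table correction of this kind might look like the sketch below. It is not the procedure used in this work: the table entries are made-up numbers, the name `build_corrector` is hypothetical, and linear interpolation between table entries (with clamping outside the table) is an assumption introduced for illustration.

```python
from bisect import bisect_right


def build_corrector(vdl_rms, reference_rms):
    """Return a function mapping a raw VDL RMS reading to a corrected value.

    vdl_rms and reference_rms are paired calibration readings; vdl_rms must be
    strictly increasing. Between entries the correction is linearly
    interpolated (an assumption); outside the table the end entry is used.
    """
    if len(vdl_rms) != len(reference_rms) or sorted(set(vdl_rms)) != list(vdl_rms):
        raise ValueError("tables must be paired and strictly increasing")

    def correct(x):
        if x <= vdl_rms[0]:
            return reference_rms[0]
        if x >= vdl_rms[-1]:
            return reference_rms[-1]
        i = bisect_right(vdl_rms, x)                 # first entry greater than x
        x0, x1 = vdl_rms[i - 1], vdl_rms[i]
        y0, y1 = reference_rms[i - 1], reference_rms[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return correct


# Made-up calibration pairs in picoseconds (not data from this work).
table_vdl = [50.0, 100.0, 150.0, 200.0]
table_ref = [55.0, 104.0, 160.0, 215.0]
correct = build_corrector(table_vdl, table_ref)
print(correct(125.0))
```

Restricting the table to a narrower range of readings corresponds to the localized optimization mentioned above.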

The peak-to-peak jitter of the same measurement is shown in Figure 3.27. The corrected peak-to-peak values are shown in Figure 3.28. The differences between the measured values from the VDL and Wavecrest before and after calibration are shown in Figure 3.29 and Figure 3.30. The error after calibration is within  $\pm 150$  ps.





Figure 3.23 RMS measurement before calibration



Figure 3.24 RMS measurement after calibration



Figure 3.25 Difference in RMS measurement before calibration



Figure 3.26 Difference in RMS measurement after calibration



Figure 3.27 Peak to Peak measurement before calibration



Figure 3.28 Peak to Peak measurement after calibration



Figure 3.29 Difference in Peak to Peak measurement before calibration



Figure 3.30 Difference in Peak to Peak measurement after calibration

#### 3.7.3 - A Three-Oscillator Component-Invariant VDL in an FPGA

To demonstrate the synthesizable nature of the proposed design, in this section we shall implement a three-oscillator component-invariant VDL circuit using an FPGA as shown in Figure 3.31. This section will also demonstrate the test time saving of the multiple array approach.

A three-oscillator component-invariant VDL structure was implemented on an Altera FPGA as a first proof of concept of the synthesizable nature of the VDL design. The entire design was capable of fitting into a 128 macrocell FPGA. The circuit was calibrated by collecting 1500 samples, resulting in the following parameters:  $N_{p1}$  = 143.2 counts,  $N_{o1}$  = 57.4 counts,  $N_{p2}$  = 66.1 counts,  $N_{o2}$  = 99.2 counts. In this design, the oscillation period of the clock-triggered oscillator was found to be 81.6 ns, corresponding to a frequency of 12.3 MHz, which is deduced from the cycling time of one bit of the counter. In this situation, the oscillation period of the clock-triggered oscillator is larger than that of the data-triggered oscillators. This is a result of our inability to precisely control the location of our cells. As a result, the clock-triggered oscillator completes  $N_p$  cycles in the time  $T_p$  while the data-triggered oscillator completes  $N_p + 1$  cycles. Hence, the coherency equation becomes

$$T_p = N_p \cdot T_f = (N_p + 1) \cdot T_s \tag{3.20}$$



Figure 3.31 Altera FPGA Board

This is slightly different from what we saw before in (3.3). Consequently, the formulas for  $T_s$  and  $\Delta T$  in (3.4) and (3.5) are slightly modified by replacing  $N_p$  - 1 by  $N_p$  + 1. Subsequently, the oscillation periods of the two data-triggered oscillators were calculated to be 81.03 ns and 80.38 ns, giving rise to timing resolutions of 0.566 ns in one case and 1.22 ns in the other. These particular results are strongly dependent on the physical location of the macrocells in the FPGA. If one were to exercise greater control over the cell placement, then we would expect a higher timing resolution.
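The modified formulas can be checked against the calibration data quoted above. In this short sketch the function name `calibrate_fast_data` is hypothetical; it applies (3.20) together with the $N_p + 1$ variants of (3.4) and (3.5), with all times in nanoseconds.

```python
def calibrate_fast_data(n_p, t_f):
    """Variant of Eqs. (3.4) and (3.5) for T_s < T_f, following Eq. (3.20).

    n_p : counts in one coherency period (N_p)
    t_f : period of the clock-triggered oscillator, in nanoseconds
    Returns (T_s, timing resolution), both in nanoseconds.
    """
    t_p = n_p * t_f                       # Eq. (3.20), left-hand equality
    t_s = t_p / (n_p + 1)                 # Eq. (3.4) with N_p - 1 -> N_p + 1
    delta_t = t_p / (n_p * (n_p + 1))     # Eq. (3.5) with N_p - 1 -> N_p + 1
    return t_s, delta_t


for n_p in (143.2, 66.1):                 # N_p1 and N_p2 from the calibration
    t_s, delta_t = calibrate_fast_data(n_p, t_f=81.6)
    print(f"N_p = {n_p}: T_s = {t_s:.2f} ns, resolution = {delta_t:.3f} ns")
```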

To test this circuit, we made use of a Teradyne A567 mixed-signal tester to generate a 2 MHz repetitive data signal with a jitter component having Gaussian statistics. The experimental setup is shown in Figure 3.32. Subsequently, the RMS and peak-to-peak values of the histogram were derived. The jittery signal was designed to have zero mean, an RMS value of 1.03 ns and an 8 ns peak-to-peak value. The theoretical histogram is shown in Figure 3.33(a). The component-invariant VDL with a 0.566 ns timing resolution was then used to measure the characteristics of this signal with 1500 samples. The resulting histogram is displayed in Figure 3.33(b). Here the RMS value was found to be 1.27 ns and the peak-to-peak value was found to be 9.05 ns. In the case of the RMS value, the experimental error was 0.24 ns, which is within the timing resolution of the VDL, i.e., 0.566 ns.



Figure 3.32 Test setup

A second test was run, but this time we made use of the component-invariant VDL that had a 1.22 ns timing resolution. In this case, the jitter was designed to have an RMS value of 2.06 ns and a 16 ns peak-to-peak value. The histogram of the jitter input is shown in Figure 3.34(a). Fifteen hundred samples of this jitter were then gathered by the VDL and the results are shown in Figure 3.34(b). The measured distribution has an RMS value of 2.64 ns and a 19.8 ns peak-to-peak value. In the case of the RMS value, the experimental error was 0.58 ns, which is again within the timing resolution of the VDL, i.e., 1.22 ns.

Deviations of the experimental results from the theoretical are speculated to be caused by the internal jitter of the tester and by the jitter generated by the VDL. Nonetheless, the measurements correlate very well with each other.

To illustrate the test time reduction that is possible when an array of VDLs is utilized, we summarize in Table 5 the test time required for two VDLs tuned to 0.566 ns and 1.22 ns timing resolutions, and when both are utilized during the time measurement. As is clearly evident in the case cited, when the two VDLs are combined, a significant reduction in test time is achieved. Since the efficiency of the time reduction depends on the time grid location of the VDLs, if one were to exercise greater control over the cell placement, then we would expect an even greater improvement in test time reduction.





Figure 3.33 Histograms for 0.566 ns VDL (a) theoretical (b) experimental



Figure 3.34 Histograms for 1.22 ns VDL (a) theoretical (b) experimental

| VDL Used                | Test Time           |
|-------------------------|---------------------|
| 0.566 ns-resolution VDL | 196635 clock cycles |
| 1.22 ns-resolution VDL  | 96235 clock cycles  |
| Both VDLs               | 81960 clock cycles  |

Table. 5 - Test time reduction (peak-to-peak jitter of 45 ns)
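As a quick worked check of the figures in Table 5, the relative saving of the combined setup can be computed as follows. The helper name `reduction` is hypothetical; the cycle counts are taken directly from the table.

```python
# Test times from Table 5, in clock cycles.
test_time = {"0.566 ns VDL": 196635, "1.22 ns VDL": 96235, "both VDLs": 81960}

def reduction(single, combined):
    """Fractional test-time saving of the combined setup over a single VDL."""
    return 1.0 - combined / single

for name in ("0.566 ns VDL", "1.22 ns VDL"):
    saving = reduction(test_time[name], test_time["both VDLs"])
    print(f"vs {name}: {100 * saving:.1f}% fewer clock cycles")
```

This works out to a saving of roughly 58% relative to the 0.566 ns VDL alone and roughly 15% relative to the 1.22 ns VDL alone.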

# 3.8 - Limiting Factors

Although very high timing resolution can be obtained by tuning the oscillation periods of the clock and data delay lines, the maximum resolution is limited by some basic characteristics of the VDL circuit. In this design, we saw significant detection errors when the timing resolution was driven below 15 ps. Hence, we limited our timing resolution to be no less than 19.5 ps. Below are several reasons for this error.

#### 3.8.1 - Metastability in the D-Latches

As the resolution becomes finer, the probability that the setup time between the clock and data signals in the phase detector is violated becomes higher. This results in an exponentially increasing propagation delay and may drive the circuit into a metastable state. Although careful design and layout of the D-latches can be used to minimize this effect, it cannot be eliminated entirely [13]. The probability of failure is usually characterized by a Mean Time Between Failures (MTBF) measure, which can be approximated in terms of the VDL design parameters as follows:

$$MTBF \approx \frac{T_f \cdot e^{(T_f - t_{su} - t_d)/\tau}}{2 \cdot R \cdot T_w}$$
(3.21)

Here $T_f$ is the period of the clock-triggered oscillator; $\tau$, $t_d$ and $t_{su}$ are the time constant, maximum propagation delay and setup time of the D-latch, respectively; R is the rate at which the data changes; and $T_w$ is the metastability window parameter expressed in seconds [14][15]. Here $T_w$ is defined as the minimum time difference between the rising edges of the clock and data signals such that no metastability effect occurs. In the case of our component-invariant VDL, R can be expressed as $\frac{1}{\Delta T}$ since the metastability event can only occur within the time $\Delta T$. Therefore, (3.21) can be written as

$$MTBF \approx \frac{T_f \cdot \Delta T \cdot e^{(T_f - t_{su} - t_d)/\tau}}{2 \cdot T_w}$$
(3.22)

One can observe from (3.22) that the MTBF will improve with a larger  $T_f$  and smaller metastability window  $T_w$ . Unfortunately, the MTBF decreases with finer timing resolution  $\Delta T$ .

As a first-order approximation, the change in MTBF can be expressed as

$$\left| \frac{\delta MTBF}{MTBF} \right| \approx \frac{\delta(\Delta T)}{\Delta T}$$
 (3.23)

For a change from $\Delta T = 20$ ps to $\Delta T = 15$ ps, we expect to see $\left|\frac{\delta MTBF}{MTBF}\right| \approx 25\%$, i.e. the MTBF drops by about one quarter. This supports, at the very least, what we have experimentally observed.
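The 25% figure can be reproduced numerically from (3.22). The sketch below is only illustrative: the latch and oscillator parameters are hypothetical placeholders (they are not measured values from this design), and they cancel out of the ratio in any case.

```python
import math

def mtbf(t_f, delta_t, t_su, t_d, tau, t_w):
    """MTBF according to (3.22); all arguments are in seconds."""
    return t_f * delta_t * math.exp((t_f - t_su - t_d) / tau) / (2.0 * t_w)

# Hypothetical latch and oscillator parameters, chosen only for illustration.
params = dict(t_f=5e-9, t_su=100e-12, t_d=200e-12, tau=500e-12, t_w=50e-12)

before = mtbf(delta_t=20e-12, **params)
after = mtbf(delta_t=15e-12, **params)
rel_change = abs(after - before) / before
print(f"relative MTBF change: {100 * rel_change:.0f}%")
```

Because (3.22) is linear in $\Delta T$, the relative change depends only on the ratio of the two resolutions, which is why the first-order estimate in (3.23) matches here.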

#### 3.8.2 - Resolution Limited by Noise

Another limiting factor is the internal noise from the power supply and substrate that couples in from the analog and digital grounds of the circuit [16]. The variation in the delay of a single buffer due to a change in power supply is summarized in Figure 3.35. Although the sensitivity of a single delay cell to a power supply variation is quite small, this error accumulates as more stages are incorporated into the design. Clearly, in order to obtain high resolution, the power supply must be relatively clean during the test phase. In practice, this can be achieved by limiting the electronic activity to only those cells that are under test.



Figure 3.35 Delay versus power supply

#### 3.8.3 - Noise-Free Clock

Although this design does not require a jitter-free clock during calibration, it does require a clean clock during its measurement phase. This is a fundamental constraint of this design.

# 3.9 - Summary

In this chapter, a novel circuit technique was described which can overcome the drawbacks of the present state-of-the-art circuit techniques for measuring jitter. As a first proof of concept, the design was implemented and tested using a 0.18  $\mu$ m CMOS IC as well as an Altera MAX7000 FPGA. However, like all circuits, this particular design is limited by latch metastability and clock and power supply noise issues.

# Chapter 4 - Jitter Characterization for Frequency Domain Parameter

#### 4.1 - Introduction

As described in the previous chapter, jitter is typically described in terms of its time domain characteristics. Another way of characterizing jitter is to describe its frequency domain distribution. In this chapter, the method and the algorithm for frequency domain characterization will be described. Using time-domain measurements made with the prototype IC of the previous chapter, an example phase spectrum will be provided. Subsequently, a prototype FFT core was designed and fabricated in a 0.35  $\mu$ m CMOS process to aid on-chip signal processing. Circuit details will be provided in this chapter. Finally, design limitations will be described.

### 4.2 - Frequency Characterization Method

Timing jitter can be modelled as a noise signal phase-modulating a perfect digital signal as shown in Figure 4.1. Here a single sinusoidal signal is seen modulating a jitter-free digital signal. The deviation of the edge position of the digital signal from its ideal position is based on the instantaneous value of the modulating signal. In other words, samples of the modulating waveform give rise to the phase-modulated or jittery signal. Assuming that the average test time per sample is $T_{test-per-sample}$, as described by (3.15), the component-invariant VDL can sample the input signal at a frequency of $F_s \approx \frac{1}{T_{test-per-sample}}$. If we further assume that the highest frequency component in the jitter signal is less than $F_s/2$, then a Discrete Fourier Transform (DFT) or Fast Fourier Transform (FFT) [17] of the jitter signal can be used to approximate the frequency spectrum of the jitter. In Figure 4.2(a), the sampling rate is high enough that no frequency aliasing occurs. However, if this condition is not met, as shown in Figure 4.2(b), aliasing effects will occur and distort the spectral distribution. It is therefore important to have some idea of the spectral distribution of the jitter under investigation in order for this approach to be effective.



Figure 4.1 Phase modulation by noise



Figure 4.2 Effect of (a) higher sampling rate (b) lower sampling rate
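The sampling condition above can be illustrated with a small sketch that reports where a jitter tone would appear after sampling. The function names are hypothetical, the sampling rate is borrowed from the experiment in Section 4.3, and the 10 kHz tone is an invented example of a component above $F_s/2$.

```python
def apparent_frequency(f_hz, fs_hz):
    """Frequency at which a tone at f_hz shows up after sampling at fs_hz."""
    f_mod = f_hz % fs_hz
    return min(f_mod, fs_hz - f_mod)

def is_alias_free(f_max_hz, fs_hz):
    """True when the highest jitter component lies below F_s / 2."""
    return f_max_hz < fs_hz / 2.0

fs = 16082.5   # Hz, effective sampling rate quoted in Section 4.3
for f in (90.3069, 10000.0):   # the second tone is a hypothetical example
    status = "ok" if is_alias_free(f, fs) else "aliased"
    print(f, "Hz ->", apparent_frequency(f, fs), "Hz,", status)
```

A tone below $F_s/2$ is reported at its true frequency, while the 10 kHz tone folds back to 6082.5 Hz and would be indistinguishable from a genuine component there.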

# 4.3 - Experimental Result Using VDL Circuit

To demonstrate that frequency domain characterization can be performed, an FFT is applied to the time domain data obtained from the component-invariant VDL circuit. In Figure 4.3 we compare the spectral distribution of a sinusoidally phase-modulated clock signal, synthesized by a Teradyne A567 mixed-signal tester and sampled with a component-invariant VDL circuit, with its ideal spectral distribution. Specifically, a 1.56 MHz clock signal was generated from the tester with an equivalent phase modulation from a 90.3069 Hz sinusoidal signal having a 1 V amplitude.

The resulting ideal spectrum X(n) is shown with a dotted line in Figure 4.3 as a function of the FFT bin index. This signal was then synthesized using the A567 tester and applied to the input of a component-invariant VDL tuned to a time resolution of 18.9 ps. Subsequently, 4096 samples were captured at an effective sampling rate of 16.0825 kHz and an FFT was performed on the sample set. The resulting spectrum Y(n), represented by a solid line, is also shown in Figure 4.3. As is evident, the results agree reasonably well. The difference between the actual and expected outputs is attributed to the jitter contributed by the tester and the on-chip VDL.



Figure 4.3 A comparison of the frequency spectrum of a sinusoidally distributed jitter before and after the VDL
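With the numbers quoted above, the modulating tone falls almost exactly on an FFT bin, which can be checked with the following sketch. It evaluates only a few DFT bins of an ideal sinusoidal jitter sequence; the unit amplitude and the helper name `dft_bin` are assumptions made for illustration, and no tester or VDL noise is modelled.

```python
import cmath
import math

N = 4096           # number of captured samples (from the text)
FS = 16082.5       # effective sampling rate in Hz (from the text)
F_MOD = 90.3069    # modulating frequency in Hz (from the text)
AMPLITUDE = 1.0    # jitter amplitude in arbitrary units (assumption)

def dft_bin(x, k):
    """Evaluate a single DFT bin of the sequence x by direct summation."""
    n_pts = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * n / n_pts)
               for n, v in enumerate(x))

expected_bin = round(F_MOD * N / FS)
jitter = [AMPLITUDE * math.sin(2 * math.pi * F_MOD * n / FS) for n in range(N)]

# Inspect a few bins around the expected location of the tone.
magnitudes = {k: abs(dft_bin(jitter, k))
              for k in range(expected_bin - 2, expected_bin + 3)}
peak_bin = max(magnitudes, key=magnitudes.get)
print("expected bin:", expected_bin, "peak bin:", peak_bin)
```

Since the tone sits so close to a bin centre, essentially all of its energy lands in a single bin, so any spreading seen in a measured spectrum would have to come from other sources such as added noise.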

# 4.4 - Exploring Hardware Requirements

#### 4.4.1 - Discrete Fourier Transform Algorithm

Since the previous section showed that an FFT can be used to extract the frequency characteristics of a jittery signal, let us explore the hardware requirements of this signal processing by first looking at a simple way of implementing the transform in hardware. For a discrete time signal x(n), X(k), the DFT of x(n), can be described as

$$X(k) = \sum_{n=0}^{N-1} x(n)e^{-j2\pi kn/N}, k = 0, 1, 2, ...., N-1$$
(4.1)

In general, for  $x(n) = x_R(n) + jx_I(n)$ ,  $X(k) = X_R(k) + jX_I(k)$  can be expressed as

$$X_{R}(k) = \sum_{n=0}^{N-1} \left[ x_{R}(n) \cdot \cos \frac{2\pi kn}{N} + x_{I}(n) \cdot \sin \frac{2\pi kn}{N} \right]$$
 (4.2)

$$X_{I}(k) = -\sum_{n=0}^{N-1} \left[ x_{R}(n) \cdot \sin \frac{2\pi kn}{N} - x_{I}(n) \cdot \cos \frac{2\pi kn}{N} \right]$$
 (4.3)

For a real signal $x(n) = x_R(n)$, X(k) can be simplified as

$$X_R(k) = \sum_{n=0}^{N-1} \left[ x(n) \cdot \cos \frac{2\pi kn}{N} \right]$$
 (4.4)

$$X_{I}(k) = -\sum_{n=0}^{N-1} \left[ x(n) \cdot \sin \frac{2\pi kn}{N} \right]$$
 (4.5)

Equations (4.4) and (4.5) show that the DFT algorithm can be easily implemented in hardware. Since $\cos \frac{2\pi kn}{N}$ and $\sin \frac{2\pi kn}{N}$ depend only on k, n and N, they can be pre-computed and stored as a lookup table in memory. The main arithmetic hardware required is then a single adder and a single multiplier.
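A minimal floating-point sketch of this lookup-table approach is given below. It is a software model of (4.4) and (4.5) only, not a description of the fabricated circuit; the function names and the single-tone test signal are assumptions made for illustration.

```python
import math

def build_tables(n_pts):
    """Pre-compute cos and sin of 2*pi*k*n/N for every (k, n) pair."""
    cos_tab = [[math.cos(2 * math.pi * k * n / n_pts) for n in range(n_pts)]
               for k in range(n_pts)]
    sin_tab = [[math.sin(2 * math.pi * k * n / n_pts) for n in range(n_pts)]
               for k in range(n_pts)]
    return cos_tab, sin_tab

def dft_real(x, cos_tab, sin_tab):
    """Evaluate (4.4) and (4.5) using only table lookups, multiplies and adds."""
    n_pts = len(x)
    x_re, x_im = [], []
    for k in range(n_pts):
        acc_re = 0.0
        acc_im = 0.0
        for n in range(n_pts):
            acc_re += x[n] * cos_tab[k][n]   # one term of (4.4)
            acc_im -= x[n] * sin_tab[k][n]   # one term of (4.5), note the minus sign
        x_re.append(acc_re)
        x_im.append(acc_im)
    return x_re, x_im

N = 64
cos_tab, sin_tab = build_tables(N)
signal = [math.cos(2 * math.pi * 5 * n / N) for n in range(N)]  # tone in bin 5
x_re, x_im = dft_real(signal, cos_tab, sin_tab)
print(round(x_re[5], 6), round(x_im[5], 6))
```

The inner loop is a plain multiply-accumulate, which is why one multiplier and one adder suffice if the real and imaginary sums are computed one after the other rather than in parallel as in this model.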

## 4.4.2 - Circuit Implementation

Figure 4.4 shows the block diagram of the circuit implementation of a 64-point DFT algorithm. The circuit can be divided into five blocks: ROM, RAM, Control, Multiplier/Adder and Output.



Figure 4.4 Main Blocks

#### a) ROM Circuit

The sine and cosine coefficients in (4.4) and (4.5) are pre-calculated using MATLAB and stored in a 16 bit x 8192 word ROM, which can be synthesized using standard logic. 'selrom' denotes the address select lines, driven by the controller, that select a particular sine or cosine coefficient (labelled 'romdata'), as shown in Figure 4.4.
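The ROM size follows from storing one cosine and one sine word for each of the 64 x 64 combinations of k and n, giving 8192 words. The sketch below generates such a table in Python rather than MATLAB. The address map in `rom_address`, the rounding, and the saturation of +1.0 to the largest positive word are all assumptions; the actual ROM layout is not specified in the text.

```python
import math

N = 64            # DFT length
FRAC_BITS = 15    # 16-bit word: sign bit plus 15 fractional bits

def to_q15(value):
    """Quantize a value in [-1, 1] to a signed 16-bit word, saturating at the ends."""
    word = round(value * (1 << FRAC_BITS))
    return max(-(1 << FRAC_BITS), min((1 << FRAC_BITS) - 1, word))

def rom_address(k, n, want_sine):
    """Hypothetical address map: all cosine words first, then all sine words."""
    return (N * N if want_sine else 0) + k * N + n

rom = [0] * (2 * N * N)
for k in range(N):
    for n in range(N):
        angle = 2 * math.pi * k * n / N
        rom[rom_address(k, n, False)] = to_q15(math.cos(angle))
        rom[rom_address(k, n, True)] = to_q15(math.sin(angle))

print(len(rom), "words, min", min(rom), "max", max(rom))
```

Many of these words repeat because of the symmetry of the sine and cosine functions, so a smaller table would be possible at the cost of extra address logic; the sketch stores every entry, which matches the ROM size quoted above.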

#### b) RAM Circuit

The details of the RAM circuit are shown in Figure 4.5.



Figure 4.5 RAM Block

The internal RAM is taken from a cell library while the rest of the circuit is synthesized from standard logic. After a reset condition, the input data is first loaded through a 16-bit serial-to-parallel shift register into a 16 bit x 64 word RAM. The serial-to-parallel conversion minimizes the Input/Output (I/O) pin count of the chip. A small controller manages the read and write actions to avoid read-write conflicts. The following table summarizes the read-write action:

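| inmode    | ramfull      | input enable          | output enable          |
|-----------|--------------|-----------------------|------------------------|
| 0 (write) | 0 (not full) | 1 (write data to RAM) | 0                      |
| 1 (read)  | 0 (not full) | 1 (write data to RAM) | 0                      |
| 0 (write) | 1 (full)     | 0                     | 0                      |
| 1 (read)  | 1 (full)     | 0                     | 1 (read data from RAM) |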

Table. 6 - Input Mode Configuration

When 'reset' and 'inmode' are low (i.e. attempting to write data into RAM), the RAM controller checks to see if the RAM is empty. If the RAM is empty (i.e. 'ramfull' is low), 'write\_enable' becomes high (i.e. write action is active) and the serial-to-parallel shift register starts shifting the 16-bit words into the RAM. The address of the RAM ('addr') is incremented automatically from 0 to 63 under the control of the RAM controller. The write action continues until the RAM is full. When the write action is completed, 'ramfull' goes high and 'write\_enable' goes low (i.e. write is deactivated until another reset is triggered).

When 'inmode' is high (i.e. attempting to read data from RAM), the controller checks to see if the RAM is full. If the RAM is full, 'read\_enable' becomes high (i.e. read is activated). The RAM address lines ('addr') now take their value from 'selsig', which is the address of the data to be read. If the RAM is not full and a read is requested, the read action will not start until the write action is completed.
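The read-write arbitration described above can be captured in a short behavioural model. This is a sketch of the control logic as read from Table 6 and the surrounding text, not the synthesized controller; the class `RamModel` and the function `ram_enables` are hypothetical names, and timing details such as the serial shift register are ignored.

```python
DEPTH = 64   # number of 16-bit words in the RAM

def ram_enables(inmode, ramfull):
    """Return (input_enable, output_enable) for the given control inputs."""
    input_enable = not ramfull                      # keep writing until the RAM is full
    output_enable = bool(inmode) and bool(ramfull)  # read only once the RAM is full
    return int(input_enable), int(output_enable)

class RamModel:
    """Behavioural model of the RAM block after a reset."""

    def __init__(self):
        self.mem = [0] * DEPTH
        self.addr = 0       # write address, auto-incremented from 0 to 63
        self.ramfull = 0

    def clock(self, inmode, word=0, selsig=0):
        """Perform one controller step; return the word read, or None."""
        input_enable, output_enable = ram_enables(inmode, self.ramfull)
        if input_enable:
            self.mem[self.addr] = word
            self.addr += 1
            if self.addr == DEPTH:
                self.ramfull = 1    # write is now disabled until the next reset
            return None
        if output_enable:
            return self.mem[selsig]
        return None

ram = RamModel()
for i in range(DEPTH):
    ram.clock(inmode=0, word=i)
print(ram.ramfull, ram.clock(inmode=1, selsig=10))
```

Once 'ramfull' is set, further write requests are ignored in this model, mirroring the statement that writing stays disabled until another reset.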

#### c) Multiplier and Adder Circuit

After the data are loaded into the RAM, the DFT algorithm can be performed through the multiplier and adder. In order to preserve arithmetic precision, the multiplier and adder have a 24-bit output data width instead of 16 bits. The internal data path of the Multiplier/Adder is shown in Figure 4.6. The 'Addbits' block first pads the data with zeros to widen the data widths and so preserve precision during arithmetic operations. The small controller controls the 'reset' signal and the 'clearsum' signal, which clears the content of the adder registers. The calculated result is then latched out as 24-bit-wide data. The data representation is summarized in Table 7.



Figure 4.6 Mult/Add Block

| Data Width | Format                                                                                                  |
|------------|---------------------------------------------------------------------------------------------------------|
| 16 bits    | $-2^0 + 2^{-1} + 2^{-2} + \dots + 2^{-14} + 2^{-15}$                                                    |
| 24 bits    | $-2^0 + 2^{-1} + 2^{-2} + \dots + 2^{-14} + 2^{-15} + \dots + 2^{-30} + 2^{-31}$                        |
| 40 bits    | $-2^8 + 2^7 + \dots + 2^0 + 2^{-1} + 2^{-2} + \dots + 2^{-14} + 2^{-15} + \dots + 2^{-30} + 2^{-31}$    |
| 24 bits    | $-2^8 + 2^7 + \dots + 2^0 + 2^{-1} + 2^{-2} + \dots + 2^{-14} + 2^{-15}$                                |

Table. 7 - Data Representation
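The role of the wider intermediate formats can be illustrated with an integer model of one multiply-accumulate pass. This is a sketch under stated assumptions rather than a bit-accurate model of the chip: plain integer products of two words with 15 fractional bits carry 30 fractional bits, which differs by one bit from the least significant weight listed in Table 7, and the function names are hypothetical.

```python
FRAC_IN = 15     # fractional bits of the 16-bit operands
OUT_BITS = 24    # width of the latched result

def to_q15(value):
    """Quantize a value in [-1, 1] to a signed 16-bit word, saturating at the ends."""
    word = round(value * (1 << FRAC_IN))
    return max(-(1 << FRAC_IN), min((1 << FRAC_IN) - 1, word))

def mac(samples_q15, coeffs_q15):
    """Accumulate full-precision products, then keep 15 fractional bits."""
    acc = 0
    for s, c in zip(samples_q15, coeffs_q15):
        acc += s * c             # each product carries 30 fractional bits
    result = acc >> FRAC_IN      # arithmetic shift: drop the 15 lowest bits
    limit = 1 << (OUT_BITS - 1)
    assert -limit <= result < limit, "result does not fit in 24 bits"
    return result

def to_float(word):
    """Interpret an output word with 15 fractional bits as a real number."""
    return word / (1 << FRAC_IN)

samples = [to_q15(0.5)] * 64
coeffs = [to_q15(0.5)] * 64
print(to_float(mac(samples, coeffs)))
```

Summing 64 products whose magnitudes are below one can reach values up to 64, which is why the accumulator and the output need integer bits above $2^0$ that the 16-bit operands do not have.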

#### d) Main Controller Circuit

A main controller (shown in Figure 4.4) is used to control the addressing of the ROM and RAM during input and output conditions. 'outmode' controls the output mode of the design as summarized in Table 8. 'binnum' specifies the frequency bin to be calculated (when 'outmode' is set to 0). 'selmux' specifies whether the output is the real or imaginary part of the bin: when 'selmux' is low, the output data is the real part of the bin, and when 'selmux' is high, the output data is the imaginary part. 'selsig' is the required address location for the RAM during a read action. 'selrom' is the required address location for the ROM during read and write actions. 'clearsum' clears the content of the multiplier/adder. 'latchout' latches data to the output module.

Table. 8 - Output Mode configuration

| outmode | 0                                                                   | 1                                                                 |
|---------|---------------------------------------------------------------------|-------------------------------------------------------------------|
| Action  | Calculates the real and imaginary values of one bin based on binnum | Calculates the real and imaginary values of all the bins          |

#### e) Output Circuit

The result is presented serially as the real and imaginary parts described in (4.4) and (4.5). As a feature of the design, the output can be produced either for increasing values of 'k' from 1 to 64, as described in (4.4) and (4.5), or for a single selected value of 'k'. The latter is controlled by setting the external input 'binnum'. The data is shifted out serially to either 'realdata' ('selmux' is low) or 'imgdata' ('selmux' is high), while 'synbit' is held low during valid output data.

## 4.4.3 - Experimental Result from DFT IC

Figure 4.7 shows the chip micrograph of the implementation fabricated in a 0.35 $\mu m$ CMOS process. Most of the layout of this design was generated by an automatic place-and-route tool using standard cells. The sizes of the individual components are summarized in Table 9. Most of the design area is occupied by interconnect.



Figure 4.7 FFT Chip Micrograph

Table. 9 - Component Size for DFT Chip

| Component  | Size                 |
|------------|----------------------|
| ROM        | 0.46 mm<sup>2</sup>  |
| Output     | 0.02 mm<sup>2</sup>  |
| Mult/Add   | 0.99 mm<sup>2</sup>  |
| Input      | 0.06 mm<sup>2</sup>  |
| RAM        | 0.17 mm<sup>2</sup>  |
| Controller | 0.13 mm<sup>2</sup>  |
| Total      | 1.83 mm<sup>2</sup>  |


The algorithm was first simulated in MATLAB under limited-precision conditions. A two-tone signal was generated in MATLAB and passed through the algorithm. Figure 4.8 shows the simulated result. The noise floor is set by the limited precision of the arithmetic operations. The circuit was also simulated with the same two-tone signal generated in MATLAB. The result, shown in Figure 4.9, correlates with the MATLAB simulation. The fabricated IC was then tested with the same two-tone signal as the input, and the same plot as shown in Figure 4.9 was obtained.



Figure 4.8 DFT algorithm from MATLAB



Figure 4.9 DFT algorithm from chip

# 4.5 - Limiting Factors

### 4.5.1 - Effective Sampling Rate

As stated in the previous section, the sampling rate of the modulating signal (i.e. the jitter) is set by the rate at which phase samples of the digital signal are taken. The highest sampling rate is obtained by collecting a phase sample on every cycle. However, because each measurement takes a finite test time, this may not be possible. The test time of each sample is

$$T_{sample} = T_f \cdot N \tag{4.6}$$

where $T_f$ is the period of the clock oscillator and N is the number of oscillations of the data oscillator before a phase detection occurs. For the maximum test time,

$$N = \frac{T_s}{\Delta T} \tag{4.7}$$

where $\Delta T$ is the timing resolution. Therefore, the maximum sampling rate depends on N. However, if one can predict or limit the amount of jitter, the maximum number of oscillations N before phase detection occurs can also be bounded, and the effective sampling rate can be increased. Table 10 summarizes the relationship between effective sampling rate and test time.

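| Max. Peak-to-Peak Jitter | Test Time | Equivalent Max. Sampling Rate |
|--------------------------|-----------|-------------------------------|
| 2 ns                     | 145 ns    | 6.9 MHz                       |
| 1 ns                     | 73 ns     | 13.8 MHz                      |
| 500 ps                   | 36 ns     | 28 MHz                        |
| 250 ps                   | 18 ns     | 55 MHz                        |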

Table. 10 - Effective Sampling Rates
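The sampling rates in Table 10 are simply the reciprocals of the per-sample test times, which the following sketch checks. The function name is hypothetical; the small differences from the tabulated rates are assumed to come from rounding of the quoted test times.

```python
def max_sampling_rate_hz(test_time_s):
    """Highest sampling rate when one phase sample is taken per test time."""
    return 1.0 / test_time_s

# (max peak-to-peak jitter, test time in seconds, rate quoted in Table 10 in Hz)
table_10 = [
    ("2 ns", 145e-9, 6.9e6),
    ("1 ns", 73e-9, 13.8e6),
    ("500 ps", 36e-9, 28e6),
    ("250 ps", 18e-9, 55e6),
]

rates = {}
for jitter, test_time, quoted in table_10:
    rates[jitter] = max_sampling_rate_hz(test_time)
    nyquist = rates[jitter] / 2.0   # highest jitter frequency that avoids aliasing
    print(f"{jitter}: {rates[jitter] / 1e6:.1f} MHz sampling, "
          f"{nyquist / 1e6:.1f} MHz usable bandwidth")
```

Halving the allowed peak-to-peak jitter roughly halves the test time and therefore doubles the jitter bandwidth that can be observed without aliasing.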

## 4.5.2 - Silicon Sizing

As the design area is mainly constrained by interconnect, the same design implemented in a 0.18 $\mu m$ standard CMOS process, which has 6 layers of metal, would be much smaller than in the 0.35 $\mu m$ standard CMOS process, which has only 3 layers of metal.

# 4.6 - Summary

In this chapter we developed an algorithm for capturing the frequency-domain description of jitter from a set of time samples obtained from the component-invariant VDL. In addition, another IC was implemented that executed a 64-point DFT algorithm to aid in the post-processing of the time samples.

# **Chapter 5 - Conclusion**

## 5.1 - Conclusion

A high-resolution timing measurement circuit based on a component-invariant or single-stage VDL structure was developed. This circuit is synthesizable and does not require precise element matching. The design was implemented on an FPGA as well as a custom IC as a first proof of concept. Experimental results confirm the validity of this approach. In the case of the FPGA implementation, the design requires about 128 macrocells. In the case of an IC implementation in a 0.18 micron CMOS process, the circuit occupies a total silicon area of 0.12 mm², while achieving an 18.9 ps timing resolution. Although the design is not limited by precise element matching, the frequency characteristics of the D-type flip-flop limit the maximum timing resolution achievable.

A frequency characterization method was also developed, and an additional circuit was implemented in a 0.35 $\mu$m CMOS process, occupying a silicon area of 1.83 mm<sup>2</sup>. If this design were ported to a 0.18 $\mu$m CMOS process, it would occupy a much smaller area.

The two circuits together show the validity of the Jitter Measurement Unit, whereby picosecond timing and frequency measurements can be achieved with a small silicon area using synthesizable logic.

# 5.2 - Future Works

Since both the DFT and component-invariant VDL designs are synthesizable, it would appear that one element of the future work would include integrating the two designs together into a single IC. In addition, it would be useful to have the ability to provide RMS, peak-to-peak and frequency-domain extraction capability all on-chip in order to have a complete Jitter Analyzing Circuit.


# References

- [1] M. Mota, J. Christiansen, "A four-channel self-calibrating high-resolution time to digital converter", *IEEE International Conference on Electronics, Circuits and Systems*, pp. 409-412, 1998
- [2] M.S. Gorbics, J. Kelly, K.M. Roberts, R.L. Sumner, "A high resolution multihit time to digital converter integrated circuit", *IEEE Transactions on Nuclear Science*, vol. 44, no. 3, pp. 379-384, 1997
- [3] Y. Arai, M. Ikeno, M. Sagara and T. Emura, "Time memory cell VLSI for the PHENIX drift chamber", *IEEE Transactions on Nuclear Science*, vol. 45, no. 3, pp. 735-739, 1998
- [4] E. Raisanen-Ruotsalainen, T. Rahkonen, J. Kostamovaara, "Time interval digitization with an integrated 32-phase oscillator", *IEEE International Symposium on Circuits and Systems*, vol. 5, pp. 673-676, 1994
- [5] S. Sunter, A. Roy, "BIST for phase-locked loops in digital applications", *Proc. IEEE International Test Conference*, pp. 532-540, 1999
- [6] N. Abaskharoun and G.W. Roberts, "Circuits for on-chip sub-nanosecond signal capture and characterization", *Proc. IEEE Custom Integrated Circuits Conference*, pp. 251-254, 2001
- [7] P. Dudek, S. Szczepanski, J.V. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a vernier delay line", *IEEE Journal of Solid-State Circuits*, vol. 35, no. 2, pp. 240-247, 2000
- [8] B.D. Kulp, "Testing and characterizing jitter in 100BASE-TX and 155.52 Mbit/s ATM devices with a 1 Gsamples/s AWG in an ATE system", *Proc. IEEE International Test Conference*, pp. 104-111, 1996
- [9] A. Chan and G.W. Roberts, "A synthesizable, fast and high-resolution timing measurement device using a component-invariant vernier delay line", *Proc. International Test Conference*, pp. 858-867, 2001
- [10] A. Chan and G.W. Roberts, "A Deep Sub-Micron Timing Measurement Circuit Using a Single-Stage Vernier Delay Line", *Proc. IEEE Custom Integrated Circuits Conference*, pp. 77-80, 2002
- [11] Teradyne Inc., "High-Speed Digital Programming", Tester Manual, Version 6.2, 1996
- [12] WAVECREST Corporation, "User's Guide and Reference Manual for DTS-2079, DTS-2077, DTS-2075", 207900-01 REV A, 2000
- [13] T.A. Jackson and A. Albicki, "Analysis of metastability operations in D-latches", *IEEE Transactions on Circuits and Systems*, vol. 36, no. 11, pp. 1392-1404, 1989
- [14] T.C. Tang, "Experimental studies of metastability behavior of sub-micron CMOS ASIC flip flops", *Proc. IEEE International ASIC Conference and Exhibit*, pp. P7-4/1-4, 1991
- [15] C. Foley, "Characterizing Metastability", *Proc. IEEE International Symposium on Advanced Research in Asynchronous Circuits and Systems*, pp. 175-184, 1996
- [16] F. Herzel and B. Razavi, "A study of oscillator jitter due to supply and substrate noise", *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 46, no. 1, pp. 56-62, 1999
- [17] J. Proakis and D. Manolakis, "Digital Signal Processing: Principles, Algorithms, and Applications", Prentice Hall, 3<sup>rd</sup> Edition, 1996