EE108A Lecture 13: Metastability and Synchronization Failure (or When Good Flip-Flops go Bad)
11/9/2005, EE108A Lecture 13, (c) 2005 W. J. Dally
What happens when we violate setup and hold time constraints?
Look at structure of CMOS latch
• Storage loop gets initialized with an ‘analog’ value
Storage loop has a metastable state between 0 and 1
Dynamics of ΔV
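A hedged sketch of the dynamics this slide title refers to, assuming the storage loop behaves as a linear regenerative amplifier near its metastable point with time constant τ (the same τ used in the later example). The symbols ΔV(0) for the initial voltage difference and V_s for the full-swing voltage that counts as resolved are labels introduced here for illustration.

```latex
\Delta V(t) = \Delta V(0)\, e^{t/\tau}
\qquad\Longrightarrow\qquad
t_{\text{resolve}} = \tau \ln\frac{V_s}{\lvert \Delta V(0) \rvert}
```

Under this assumption, the time to leave the metastable state grows only with the logarithm of how small the initial difference was.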
Metastability Demonstration Circuit
Metastability Demonstration Circuit - Implementation
Metastable state of FF1 – 4007 NAND RS Latch
Over time the waveform fills in
A Brute-Force Synchronizer
What if AW is still in a metastable state when FF2 is clocked?
Calculating Synchronization Failure (The Big Picture)
P(failure) = P(enter metastable state) × P(still in metastable state after tw)
Probability of Entering a Metastable State
• FF1 may enter the metastable state if the input signal transitions during the setup+hold window of the flip-flop
• Probability of a given transition being in the setup+hold window is the fraction of time that is setup+hold window
Probability of Staying in the Metastable State
• Still in metastable state if initial voltage difference was too small to be exponentially amplified during wait time
• Probability of starting with this voltage is proportion of total voltage range that is ‘too small’
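The two bullets above can be written as a short worked equation. This is a sketch under two assumptions not stated on the slide: the initial difference grows as ΔV(0)·exp(t/τ), and |ΔV(0)| is uniformly distributed over the full swing V_s (a label introduced here).

```latex
P_S = P\!\left(\lvert \Delta V(0) \rvert < V_s\, e^{-t_w/\tau}\right)
    = \frac{V_s\, e^{-t_w/\tau}}{V_s}
    = e^{-t_w/\tau}
```

With these assumptions the result depends only on the ratio of wait time to time constant, which is the form used in the numerical example.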
Failure Probability and Error Rate
• Each event can potentially fail.
• Failure rate = event rate × failure probability
Example
• ts = th = tdCQ = τ = 100 ps
• tcy = 2 ns
• Must sample an fE = 1 MHz asynchronous signal

tw = tcy - ts - tdCQ = 1.8 ns
PE = (0.1 + 0.1)/2 = 0.1
PS = exp(-1.8/0.1) = exp(-18) = 1.5 × 10^-8
PF = PS PE = 1.5 × 10^-9
fF = fE PF = 1.5 × 10^-3 Hz

• 1 failure every 656 seconds ~ every 11 minutes
• This is not adequate. How do we improve it?
• How do we get the failure rate to one every 10 years ~ 3 × 10^8 s (fF < 3 × 10^-9 Hz)?
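The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `failure_rate` and the variable names are hypothetical, and the wait time is assumed to be tw = tcy - ts - tdCQ, which matches the 1.8 ns used in the exponent. The last lines rearrange the same formula to estimate the wait time needed for the 10-year target.

```python
import math

def failure_rate(t_s, t_h, t_cy, t_w, tau, f_e):
    """Return (P_E, P_S, P_F, f_F) for a brute-force synchronizer."""
    p_e = (t_s + t_h) / t_cy      # fraction of the cycle inside the setup+hold window
    p_s = math.exp(-t_w / tau)    # probability of still being metastable after t_w
    p_f = p_e * p_s               # probability that one input event fails
    return p_e, p_s, p_f, f_e * p_f

# Slide values: times in nanoseconds, event rate in Hz.
t_s = t_h = t_dcq = tau = 0.1
t_cy = 2.0
f_e = 1e6
t_w = t_cy - t_s - t_dcq          # assumed wait time, 1.8 ns

p_e, p_s, p_f, f_f = failure_rate(t_s, t_h, t_cy, t_w, tau, f_e)
print(f"P_E={p_e:.2f}  P_S={p_s:.2e}  P_F={p_f:.2e}  f_F={f_f:.2e} Hz")
print(f"mean time between failures: {1 / f_f:.0f} s")

# Wait time needed to reach the target f_F < 3e-9 Hz (about one failure per 10 years).
target_f_f = 3e-9
t_w_needed = -tau * math.log(target_f_f / (f_e * p_e))
print(f"required wait time: {t_w_needed:.2f} ns")
```

Under these assumptions the required wait comes out a little over 3 ns, which is more than one 2 ns clock cycle can provide.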
How much difference does one FF make?
• Previous example: 2 FF brute-force synchronizer
  – 1 failure every 11 minutes (fF = 1.5 × 10^-3 Hz)
• Add a third FF:
  – ts = th = tdCQ = τ = 100 ps (same)
  – tcy = 2 ns (same)
  – must sample an fE = 1 MHz asynchronous signal (same)
  – PE = (0.1 + 0.1)/2 = 0.1 (same)
  – PS = exp(-3.6/0.1) = exp(-36) = 2.3 × 10^-16
  – PF = PS PE = 2.3 × 10^-17
  – fF = fE PF = 2.3 × 10^-11 Hz, (much) less than one failure every 10 years!
• Exponentials grow quickly. Adding one flip-flop took us from 11 minutes to 1,300 years.
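To see how quickly the exponential takes over, the sketch below tabulates mean time between failures against the number of wait cycles. It is an illustration only: the helper `mtbf_seconds` is hypothetical, and it assumes each additional flip-flop adds another 1.8 ns (tcy - ts - tdCQ) of wait, which reproduces the 3.6 ns used on this slide.

```python
import math

TAU_NS = 0.1
T_CY_NS = 2.0
T_S_NS = T_H_NS = T_DCQ_NS = 0.1
F_E_HZ = 1e6
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def mtbf_seconds(wait_cycles):
    """Mean time between synchronization failures for a given number of wait cycles."""
    # Assumption: every wait cycle contributes tcy - ts - tdCQ of settling time.
    t_w = wait_cycles * (T_CY_NS - T_S_NS - T_DCQ_NS)
    p_e = (T_S_NS + T_H_NS) / T_CY_NS
    p_s = math.exp(-t_w / TAU_NS)
    return 1.0 / (F_E_HZ * p_e * p_s)

for n in (1, 2, 3):
    s = mtbf_seconds(n)
    print(f"{n} wait cycle(s): {s:.3g} s = {s / SECONDS_PER_YEAR:.3g} years")
```

One wait cycle corresponds to the 2 FF synchronizer and two wait cycles to the 3 FF version; the third row shows what a fourth flip-flop would buy under the same assumption.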
Synchronizing multi-bit signals
Consider a 4-bit counter running on clk1. You need the value of this counter sampled by clk2. Will the following circuit work? (assume tw >> τ)
This happens, for example, in a FIFO where the head and tail pointers are in different clock domains.
Multi-bit signals (2)
When synchronizing a multi-bit signal, each changing bit is independently synchronized.
Consider what happens on the 0111 to 1000 transition. All bits are changing. Each can independently fall either way.
How do you fix this?
Each bit can fail either way!
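The consequence of per-bit independence can be made concrete by enumerating every word the receiving clock domain might see while a transition is in flight. This is a small illustrative model, not a circuit simulation; the function name `possible_samples` is hypothetical, and it assumes each changing bit resolves independently to either its old or its new value.

```python
from itertools import product

def possible_samples(old, new):
    """All words a per-bit synchronizer could deliver while old -> new is in flight."""
    # A bit that does not change has one outcome; a changing bit may land either way.
    choices = [(o,) if o == n else (o, n) for o, n in zip(old, new)]
    return sorted("".join(bits) for bits in product(*choices))

binary = possible_samples("0111", "1000")
print(len(binary), binary)

single_bit = possible_samples("0111", "0101")
print(len(single_bit), single_bit)
```

In this model the 0111 to 1000 transition can yield any of the sixteen 4-bit values, while a transition that changes one bit can yield only the old or the new word.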
Warning: The Surgeon General has determined that passing binary-coded and one-hot signals through a brute-force synchronizer can be hazardous to your circuits.
Gray code: only one bit changes each time
• How does this help?
• Remember each bit can fail either way.
• If we only change 1 bit each time, then what’s the worst that can happen?
  – 0111 => cycle 1: 0101, cycle 2: 0101 (no failure)
  – 0111 => cycle 1: 0111, cycle 2: 0101 (failure)
• On the second cycle we will have had even more time for our input to stabilize, so we should be fine. By using Gray code, the worst that happens is we see the transition 1 cycle later.
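A minimal sketch of how a Gray-coded counter sequence can be generated and checked. It uses the standard reflected binary Gray code (n XOR n shifted right by one), which is an assumption here since the slide does not say which Gray code is intended; the helper names `to_gray` and `hamming` are hypothetical.

```python
def to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def hamming(a, b):
    """Number of bit positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

# 4-bit Gray sequence for counter values 0..15.
codes = [format(to_gray(i), "04b") for i in range(16)]

# Bits changed on each increment, including the wrap from 15 back to 0.
steps = [hamming(codes[i], codes[(i + 1) % 16]) for i in range(16)]

print(codes)
print(steps)
```

Every increment, including the wrap-around, changes exactly one bit, and the 0111 to 0101 step used in the example above appears in this sequence as the count going from 5 to 6.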
Why do we care?
• Most designs have multiple clock domains
  – E.g., your PCI bus interface runs at 66 MHz, but your image compression engine might run at 200 MHz
  – You need to get data from the PCI bus to the image compression engine
• Example: DVD driver System-on-chip (SoC): Tsai, C., et al., “A CMOS SoC for 56/32/56/16 Combo Driver Applications”, ISSC 2004
Metastability and Synchronization Failure Summary
• Clocking a flip-flop during the “keepout” interval may leave the storage node in an “illegal state”
• Some “illegal states” are metastable
• Time to decay to a legal state depends on log of initial voltage
• Probability of entering metastable state is probability of hitting “keepout” interval
• Probability of staying in metastable state after time T is probability that initial voltage was too small to decay in time T
• Brute-force synchronizer – sample signal and wait for metastable states to decay
• Don’t use on multi-bit signals unless they are Gray coded
The end of EE108A…
• Any questions?