Multiple Clock Domains (MCD)
Arvind with Nirav Dave
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology
March 15, 2010    http://csg.csail.mit.edu/6.375    L12-1

Plan
  • Why Multiple Clock Domains
      • 802.11a as an example
  • How to represent multiple clocks in Bluespec
  • MCD Syntax
  • Revisit 802.11a
  • Synchronizers

Why Multiple Clock Domains
  • Arise naturally in interfacing with the outside world
  • Needed to manage clock skew
  • Allow parts of the design to be isolated to do selective power gating and clock gating
  • Reduce power and energy consumption

Review: 802.11a Transmitter
[Block diagram: headers and data (24 uncoded bits) -> Controller -> Scrambler -> Encoder -> Interleaver -> Mapper -> IFFT -> Cyclic Extend]
Diagram annotations:
  • Operates on 1-4 packets at a time
  • Converts 1-4 packets into a single packet
  • A lot of compute; n cycles/packet
  • n depends on the IFFT implementation; for the superfolded version n ≈ 51

Synthesis results for different microarchitectures

    Design              Area (mm²)   Best CLK Period   Throughput (CLK/symbol)   Latency
    Comb.               1.03         15 ns             1                         15 ns
    Pipelined           1.46         7 ns              1                         21 ns
    Folded              0.83         8 ns              3                         24 ns
    S Folded 1 Radix    0.23         8 ns              48-51                     408 ns

  • For the same throughput SF has to run ~16 times faster than F
  • TSMC 0.13 micron; numbers reported are before place and route
  • The single radix-4 node design is ¼ the size of the combinational design but still meets the throughput requirement easily; its clock can be reduced to 15-20 MHz
Dave, Pellauer, Ng 2005
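
The "~16 times" figure can be read off the CLK/symbol column, as the short calculation below shows. The second line is only a rough sanity check on the 15-20 MHz remark: it assumes the standard 802.11a OFDM symbol period of 4 µs, a number that does not appear in the slides.

```latex
\frac{\text{CLK/symbol (SF)}}{\text{CLK/symbol (F)}}
  = \frac{48}{3} = 16 \quad\text{up to}\quad \frac{51}{3} = 17

f_{\min}(\text{SF}) \approx \frac{51\ \text{cycles}}{4\ \mu\text{s}} = 12.75\ \text{MHz}
```

Under that assumption, a 15-20 MHz clock leaves some margin over the minimum rate.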

Rate Matching
[Block diagram: the transmitter pipeline (headers, data -> Controller -> Scrambler -> Encoder -> Interleaver -> Mapper -> IFFT -> Cyclic Extend) annotated with clock speeds f, f/13 and f/52]
  • After the design you may discover the clocks of many boxes can be lowered without affecting the overall performance

Power-Area tradeoff
  • The power equation: P = ½ C V² f
      • V and f are not independent; one can lower the f by lowering V (linear in some limited range)
  • Typically we run the whole circuit at one voltage but can run different parts at different frequencies
  • We can often increase the area, i.e., exploit more parallelism, and lower the frequency (power) for the same performance
  • One would actually want to explore many relative frequency partitionings to determine the real area/power tradeoff
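
A worked instance of the power equation may make the tradeoff concrete. This is an idealized sketch, not a result from the slides: it assumes that doubling the area doubles the switched capacitance C, that the doubled parallelism keeps throughput constant at half the frequency, and that V scales linearly with f over this range.

```latex
P_0 = \tfrac{1}{2}\, C V^2 f

% doubled area, halved frequency, voltage unchanged: no saving
P_1 = \tfrac{1}{2}\,(2C)\, V^2 \left(\tfrac{f}{2}\right) = P_0

% doubled area, halved frequency, voltage scaled with frequency
P_2 = \tfrac{1}{2}\,(2C) \left(\tfrac{V}{2}\right)^2 \left(\tfrac{f}{2}\right) = \tfrac{1}{4}\, P_0
```

Under these assumptions the saving comes entirely from the lower voltage that the lower frequency permits, which is why V and f have to be considered together.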

Associating circuit parts with a particular clock
  • Two choices to split the design:
      • Partition State: rules must operate in multiple domains
      • Partition Rules: state elements must have methods in different clock domains
  • It is very difficult to maintain rule atomicity with multi-clock rules. Therefore we will not examine the “Partitioned State” approach further

Partitioning Rules
[Diagram; labels:]
  • A method in each domain
  • Methods in red and green domains
  • Only touched by one domain

Handling Module Hierarchy
[Diagram]
  • Methods added to expose needed functionality

We need a primitive
  • MCD synchronizer: for example, a FIFO
      • enq on one clock and deq/first/pop on another
  • full/empty signals are conservative approximations
      • may not be full when the full signal is true
  • We’ll discuss implementations later

Back to the Transmitters
[Block diagram: headers, data -> Controller -> Scrambler -> Encoder -> Interleaver -> Mapper -> IFFT -> Cyclic Extend]

Domains in the Transmitter

    let controller   <- mkController();
    let scrambler    <- mkScrambler_48();
    let conv_encoder <- mkConvEncoder_24_48();
    let interleaver  <- mkInterleaver();
    let mapper       <- mkMapper_48_64();
    let ifft         <- mkIFFT_Pipe();
    let cyc_extender <- mkCyclicExtender();

    rule controller2scrambler(True);
       stitch(controller.getData, scrambler.fromControl);
    endrule
    … many such stitch rules …

    function Action stitch(ActionValue#(a) x, function Action f(a v));
       action
          let v <- x;
          f(v);
       endaction
    endfunction

  • These colors are just to remind us about domains

Coloring the rules?
  • All methods in the same domain:

    rule controller2scrambler(True);
       stitch(controller.getData, scrambler.fromControl);
    endrule

    rule scrambler2convEnc(True);
       stitch(scrambler.getData, conv_encoder.putData);
    endrule

  • Using different domains…

    rule mapper2ifft(True);
       stitch(mapper.toIFFT, ifft.fromMapper);
    endrule

Domain Crossing

    rule mapper2ifft(True);
       stitch(mapper.toIFFT, ifft.fromMapper);
    endrule

  • Inline stitch:

    rule mapper2ifft(True);
       let x <- mapper.toIFFT();
       ifft.fromMapper(x);
    endrule

  • Different methods in an action are on different clocks; we need to change the clock domains

Introduce a domain crossing module

    let m2ifftFF <- mkSyncFIFO(size, clkGreen, clkRed);

  • Many such synchronizers
  • In real syntax, one clock value is passed implicitly
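
To illustrate "one clock value is passed implicitly", here is a minimal sketch using mkSyncFIFOFromCC from the BSV Clocks library, where the source side uses the enclosing module's current clock. The wrapper name mkGreenToRed, the 32-bit payload and the depth of 2 are assumptions chosen for illustration; they are not from the slides.

```bsv
import Clocks :: *;

// Hypothetical wrapper, not from the slides. The module's implicit (current)
// clock plays the role of clkGreen; only clkRed is passed explicitly.
module mkGreenToRed #(Clock clkRed) (SyncFIFOIfc#(Bit#(32)));
   // enq is clocked by the current clock; first/deq are clocked by clkRed.
   SyncFIFOIfc#(Bit#(32)) ff <- mkSyncFIFOFromCC(2, clkRed);
   return ff;
endmodule
```

The Clocks library also provides mkSyncFIFOToCC for the opposite direction and a fully explicit mkSyncFIFO, which takes the source clock, the source reset and the destination clock.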

Fixing the Domain Crossing

    rule mapper2ifft(True);
       let x <- mapper.toIFFT();
       ifft.fromMapper(x);
    endrule

  • Split into two rules around a synchronizer:

    let m2ifftFF <- mkSyncFIFO(size, clkGreen, clkRed);

    rule mapper2fifo(True);
       stitch(mapper.toIFFT, m2ifftFF.enq);
    endrule

    rule fifo2ifft(True);
       stitch(pop(m2ifftFF), ifft.fromMapper);
    endrule

  • The synchronizer syntax is not quite correct
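
The split rules call pop, which the slides never define. One plausible definition, given here as an assumption rather than as the course's actual helper, dequeues the synchronizing FIFO and returns the value at its head:

```bsv
import Clocks :: *;

// Assumed helper, not defined in the slides: dequeue a synchronizing FIFO
// and return the value that was at its head.
function ActionValue#(a) pop(SyncFIFOIfc#(a) ff);
   actionvalue
      ff.deq;
      return ff.first;
   endactionvalue
endfunction
```

Both deq and first belong to the destination side of a SyncFIFOIfc, so a rule that calls this helper stays entirely in the destination clock domain.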

Similarly for IFFT to CyclicExt

    let ifft2ceFF <- mkSyncFIFO(size, clkRed, clkBlue);

    rule ifft2ff(True);
       stitch(ifft.toCyclicExtender, ifft2ceFF.enq);
    endrule

    rule ff2cyclicExtender(True);
       stitch(pop(ifft2ceFF), cyc_extender.fromIFFT);
    endrule

  • Now each rule is associated with exactly one clock domain!

How to introduce clocks

    module mkTransmitter(Transmitter#(24, 81));
       let controller   <- mkController();
       let scrambler    <- mkScrambler_48();
       let conv_encoder <- mkConvEncoder_24_48();
       let interleaver  <- mkInterleaver();
       let mapper       <- mkMapper_48_64();
       let ifft         <- mkIFFT_Pipe();
       let cyc_extender <- mkCyclicExtender();
       // rules to stitch these modules together

How should we
  1. Generate different clocks?
  2. Pass them to modules?
  3. Introduce clock synchronizers and fix the rules?

Instantiating modules with clocks (clock is a type)
  • Synthesized modules have an input port called CLK, which is passed to all interior instantiated modules by default
  • However, any module can be instantiated with an explicit clock:

    Clock c = … ;
    Reg#(Bool) b <- mkReg(True, clocked_by c);

  • Modules can also take clocks as ordinary arguments, to be fed to interior module instantiations
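
A small sketch of the last point, a module that receives a clock as an ordinary argument and forwards it to an interior instantiation. The module name mkFlag and the extra Reset argument are assumptions added for illustration; the reset is passed along so that the register's reset belongs to the same domain as its clock.

```bsv
// Hypothetical module: the clock and reset arrive as ordinary arguments
// and are forwarded to an interior register.
module mkFlag #(Clock c, Reset r) (Reg#(Bool));
   Reg#(Bool) flag <- mkReg(False, clocked_by c, reset_by r);
   return flag;
endmodule
```

A parent module would instantiate it as `Reg#(Bool) b <- mkFlag(c, r);`, and the methods of b would then be clocked by c rather than by the parent's default clock.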

The clockOf() function makes the implicit clock explicit

    Reg#(UInt#(17)) x <- mkReg(0, clocked_by c);
    Clock c0 <- exposeCurrentClock;
    let y = x + 2;
    Clock c1 = clockOf(x);
    Clock c2 = clockOf(y);

  • c, c0, c1 and c2 are all equal (c0 equals the others when c is the current clock)
  • They can be used interchangeably for all purposes
  • If the expression is a constant, the result is the special value noClock
      • noClock values can be used in any domain

Clock Dividers

    interface ClockDividerIfc;
       interface Clock fastClock;   // original clock
       interface Clock slowClock;   // derived clock
       method Bool clockReady;
    endinterface

    module mkClockDivider #( Integer divisor ) ( ClockDividerIfc ifc );

[Waveform for Divisor = 3: Fast CLK, Slow CLK, rdy]

Step 1: Introduce Clocks

    module mkTransmitter(Transmitter#(24, 81));
       ...
       let clockdiv13 <- mkClockDivider(13);
       let clockdiv52 <- mkClockDivider(52);
       let clk13 = clockdiv13.slowClock;
       let clk52 = clockdiv52.slowClock;

       let controller   <- mkController();
       let scrambler    <- mkScrambler_48();
       let conv_encoder <- mkConvEncoder_24_48();
       let interleaver  <- mkInterleaver();
       let mapper       <- mkMapper_48_64();
       let ifft         <- mkIFFT_Pipe();
       let cyc_extender <- mkCyclicExtender();
       // rules to stitch these modules together

Step 2: Pass the Clocks

    module mkTransmitter(Transmitter#(24, 81));
       let clockdiv13 <- mkClockDivider(13);
       let clockdiv52 <- mkClockDivider(52);
       let clk13 = clockdiv13.slowClock;
       let clk52 = clockdiv52.slowClock;

       let controller   <- mkController(clocked_by clk13);
       let scrambler    <- mkScrambler_48(clocked_by clk13);
       let conv_encoder <- mkConvEncoder_24_48(clocked_by clk13);
       let interleaver  <- mkInterleaver(clocked_by clk13);
       let mapper       <- mkMapper_48_64(clocked_by clk13);
       let ifft         <- mkIFFT_Pipe();                      // Default Clock
       let cyc_extender <- mkCyclicExtender(clocked_by clk52);
       // rules to stitch these modules together

  • Now some of the stitch rules have become illegal because they call methods from different clock families
  • Introduce Clock Synchronizers

Step 3: Introduce Clock Synchronizers

    module mkTransmitter(Transmitter#(24, 81));
       let m2ifftFF  <- mkSyncFIFOToFast(2, clockdiv13);
       let ifft2ceFF <- mkSyncFIFOToSlow(2, clockdiv52);
       …
       let mapper       <- mkMapper_48_64(clocked_by clk13);
       let ifft         <- mkIFFT_Pipe();
       let cyc_extender <- mkCyclicExtender(clocked_by clk52);

       rule mapper2fifo(True);   // split mapper2ifft rule
          stitch(mapper.toIFFT, m2ifftFF.enq);
       endrule

       rule fifo2ifft(True);
          stitch(pop(m2ifftFF), ifft.fromMapper);
       endrule

       rule ifft2fifo(True);     // split ifft2ce rule
          stitch(ifft.toCycExtend, ifft2ceFF.enq);
       endrule

       rule fifo2ce(True);
          stitch(pop(ifft2ceFF), cyc_extender.fromIFFT);
       endrule

Did not work...

    stoy@forte:~/examples/80211$ bsc -u -verilog Transmitter.bsv
    Error: "./Interfaces.bi", line 62, column 15: (G0045)
      Method getFromMAC is unusable because it is connected to a clock not
      available at the module boundary.

  • Need to fix the Transmitter’s interface so that the outside world knows about the clocks that the interface methods operate on. (These clocks were defined inside the module.)

The Fix: pass the clocks out

    interface Transmitter#(type inN, type out);
       method Action getFromMAC(TXMAC2ControllerInfo x);
       method Action getDataFromMAC(Data#(inN) x);
       method ActionValue#(MsgComplexFVec#(out)) toAnalogTX();
       interface Clock clkMAC;
       interface Clock clkAnalog;
    endinterface
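
The interface above only declares the clocks; the module still has to provide them. The following self-contained sketch shows the pattern on a much smaller, hypothetical module (SlowCounter, mkSlowCounter and the divisor of 4 are all assumptions for illustration, not the transmitter code). mkRegU is used so that the example does not have to deal with resets.

```bsv
import Clocks :: *;

// Hypothetical interface: one method plus the clock it is clocked by.
interface SlowCounter;
   method Bit#(8) value();
   interface Clock clkSlow;
endinterface

(* synthesize *)
module mkSlowCounter (SlowCounter);
   ClockDividerIfc div <- mkClockDivider(4);
   Clock slow = div.slowClock;

   // Register without reset, clocked by the derived clock.
   Reg#(Bit#(8)) cnt <- mkRegU(clocked_by slow);

   // This rule touches only cnt, so it lives in the slow domain.
   rule tick;
      cnt <= cnt + 1;
   endrule

   method Bit#(8) value();
      return cnt;
   endmethod

   // Export the derived clock so that value() is usable from outside.
   interface Clock clkSlow = slow;
endmodule
```

In the transmitter the analogous step would presumably be to bind clkMAC and clkAnalog to the internally derived clocks, but the slides do not show that binding.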

Clock Summary
  • The Clock type, together with type checking, ensures that all circuits are clocked by actual clocks
  • BSV provides ways to create, derive and manipulate clocks safely
  • BSV clocks are gated, and gating fits into rule-enabling semantics (clock guards)
  • BSV provides a full set of speed-independent data synchronizers, already tested and verified
      • The user can define new synchronizers
  • BSV precludes unsynchronized domain crossings

Clock Synchronizers

Moving Data Across Clock Domains
  • Data moved across clock domains appears asynchronous to the receiving (destination) domain
  • Asynchronous data will cause meta-stability
  • The only safe way: use a synchronizer
[Timing diagram: signals clk, d, q; a setup & hold violation on d produces meta-stable data on q]

Synchronizers
  • Good synchronizer design and use reduces the probability of observing meta-stable data
  • Bluespec delivers conservative (speed-independent) synchronizers
  • User can define and use new synchronizers
  • Bluespec does not allow unsynchronized crossings (compiler static checking error)

2-Flop BIT-Synchronizer
  • Most common type of (bit) synchronizer
  • FF1 will go meta-stable, but FF2 does not look at data until a clock period later, giving FF1 time to stabilize
  • Limitations:
      • When moving from fast to slow clocks data may be overrun
      • Cannot synchronize words since bits may not be seen at same time
[Circuit diagram: sDIN -> FF0 (sClk) -> FF1 -> FF2 (dClk) -> dD_OUT]

Bluespec’s 2-Flop Bit-Synchronizer
[Circuit diagram of mkSyncBit: send() -> FF0 (sClk) -> FF1 -> FF2 (dClk) -> read()]

    interface SyncBitIfc;
       method Action send ( Bit#(1) bitData );
       method Bit#(1) read ();
    endinterface

  • The designer must follow the synchronizer design guidelines:
      • No logic between FF0 and FF1
      • No access to FF1’s output

Use example: MCD Counter
  • Up/down counter: increments when up_down_bit is one; the up_down_bit is set from a different clock domain.
  • Registers:

    Reg#(Bit#(1))  up_down_bit <- mkReg(0, clocked_by(writeClk));
    Reg#(Bit#(32)) cntr        <- mkReg(0);   // Default Clk

  • The Rule (attempt 1):

    rule countup ( up_down_bit == 1 );
       cntr <= cntr + 1;
    endrule

  • Illegal Clock Domain Crossing

Adding the Synchronizer

    SyncBitIfc sync <- mkSyncBit( writeClk, writeRst, currentClk );

  • Split the rule into two rules where each rule operates in one clock domain:

    // clocked by writeClk
    rule transfer ( True );
       sync.send ( up_down_bit );
    endrule

    // clocked by currentClk
    rule countup ( sync.read == 1 );
       cntr <= cntr + 1;
    endrule

MCD Counter

    module mkTopLevel#(Clock writeClk, Reset writeRst) (Top ifc);
       Reg#(Bit#(1))  up_down_bit <- mkReg(0, clocked_by(writeClk), reset_by(writeRst));
       Reg#(Bit#(32)) cntr        <- mkReg(0);   // Default Clocking
       Clock currentClk <- exposeCurrentClock;
       SyncBitIfc sync <- mkSyncBit( writeClk, writeRst, currentClk );

       rule transfer ( True );
          sync.send( up_down_bit );
       endrule

       rule countup ( sync.read == 1 );
          cntr <= cntr + 1;
       endrule
    endmodule

  • We won’t worry about resets for the rest of this lecture

Different Synchronizers
  • Bit Synchronizer
  • FIFO Synchronizer
  • Pulse Synchronizer
  • Word Synchronizer
  • Asynchronous RAM
  • Null Synchronizer
  • Reset Synchronizers
Documented in Reference Guide