Pipelining combinational circuits
Arvind
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
February 20, 2013 http://csg.csail.mit.edu/6.375 L05-1

Combinational IFFT
[Figure: combinational IFFT datapath. Inputs in0 … in63 pass through stages of 16 Bfly4 blocks, each stage followed by a Permute block, producing out0 … out63.]
- Lot of area and long combinational delay
- Folded or multi-cycle version can save area and reduce the combinational delay, but throughput per clock cycle gets worse
- Pipelining: a method to increase the circuit throughput by evaluating multiple IFFTs
- 3 different datasets in the pipeline (IFFTi+1, IFFTi, IFFTi-1)

Inelastic vs Elastic pipeline
[Figure: x -> inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]
Inelastic: all pipeline stages move synchronously
[Figure: x -> inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]
Elastic: a pipeline stage can process data if its input FIFO is not empty and its output FIFO is not full
Most complex processor pipelines are a combination of the two styles

Inelastic vs Elastic Pipelines
Inelastic pipeline:
- typically one rule; the designer controls precisely which activities go on in parallel
- downside: the rule can get too complicated; easy to make mistakes, difficult to make changes
Elastic pipeline:
- several smaller rules, each easy to write, easier to make changes
- downside: sometimes rules do not fire concurrently when they should

Inelastic pipeline
[Figure: x -> inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]
    rule sync_pipeline (True);
        inQ.deq();
        sReg1 <= f0(inQ.first());
        sReg2 <= f1(sReg1);
        outQ.enq(f2(sReg2));
    endrule
This is real IFFT code; just replace f0, f1 and f2 with stage_f code.
This rule can fire only if
- inQ has an element
- outQ has space
Atomicity: either all or none of the state elements inQ, outQ, sReg1 and sReg2 will be updated.
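
The guard and atomicity of the single rule above can be sketched as a small behavioral model. This is Python, not BSV, and only an illustration: the stage functions, the outQ capacity and the register reset values are assumptions, not taken from the slides.

```python
from collections import deque

# Hypothetical stage functions standing in for the slide's f0, f1, f2.
def f0(x): return x + 1
def f1(x): return x * 2
def f2(x): return x - 3

class InelasticPipeline:
    def __init__(self, out_capacity=2):
        self.in_q = deque()
        self.out_q = deque()
        self.out_capacity = out_capacity  # assumed bound on outQ
        self.s_reg1 = 0                   # assumed reset values
        self.s_reg2 = 0

    def can_fire(self):
        # Implicit guards of inQ.first/deq and outQ.enq, conjoined.
        return len(self.in_q) > 0 and len(self.out_q) < self.out_capacity

    def step(self):
        """Fire the rule once if its guard holds; otherwise change nothing."""
        if not self.can_fire():
            return False
        # Compute every update from the old state ...
        new_reg1 = f0(self.in_q[0])
        new_reg2 = f1(self.s_reg1)
        out_token = f2(self.s_reg2)
        # ... then commit all of them together.
        self.in_q.popleft()
        self.s_reg1, self.s_reg2 = new_reg1, new_reg2
        self.out_q.append(out_token)
        return True
```

Note that the very first firing enqueues f2 of the reset value of sReg2: in this model the registers carry no indication of whether they hold real data.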

Inelastic pipeline: making implicit guard conditions explicit
[Figure: x -> inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]
    rule sync_pipeline (!inQ.empty() && !outQ.full());
        inQ.deq();
        sReg1 <= f0(inQ.first());
        sReg2 <= f1(sReg1);
        outQ.enq(f2(sReg2));
    endrule
Suppose sReg1 and sReg2 have data, outQ is not full but inQ is empty. What behavior do you expect? Leave the green and red data (the tokens held in the pipeline registers) in the pipeline?

Pipeline bubbles
[Figure: x -> inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]
    rule sync_pipeline (True);
        inQ.deq();
        sReg1 <= f0(inQ.first());
        sReg2 <= f1(sReg1);
        outQ.enq(f2(sReg2));
    endrule
Red and Green tokens must move even if there is nothing in inQ!
Also, if there is no token in sReg2 then nothing should be enqueued in outQ.
Modify the rule to deal with these conditions: valid bits or the Maybe type.

Explicit encoding of Valid/Invalid data
[Figure: x -> inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ, with a Valid/Invalid flag next to each pipeline register]
    typedef union tagged {void Valid; void Invalid;} Validbit deriving (Eq, Bits);

    rule sync_pipeline (True);
        if (inQ.notEmpty()) begin
            sReg1 <= f0(inQ.first()); inQ.deq();
            sReg1f <= Valid;
        end
        else sReg1f <= Invalid;
        sReg2 <= f1(sReg1);
        sReg2f <= sReg1f;
        if (sReg2f == Valid) outQ.enq(f2(sReg2));
    endrule

When is this rule enabled?
    rule sync_pipeline (True);
        if (inQ.notEmpty()) begin
            sReg1 <= f0(inQ.first()); inQ.deq();
            sReg1f <= Valid;
        end
        else sReg1f <= Invalid;
        sReg2 <= f1(sReg1);
        sReg2f <= sReg1f;
        if (sReg2f == Valid) outQ.enq(f2(sReg2));
    endrule
[Figure: inQ -> f0 -> sReg1 -> f1 -> sReg2 -> f2 -> outQ]
NE = not empty, E = empty, V = valid, I = invalid, NF = not full, F = full

    inQ  sReg1f  sReg2f  outQ | enabled?
    NE   V       V       NF   | Yes
    NE   V       V       F    | No
    NE   V       I       NF   | Yes
    NE   V       I       F    | Yes
    NE   I       V       NF   | Yes
    NE   I       V       F    | No
    NE   I       I       NF   | Yes
    NE   I       I       F    | Yes

    inQ  sReg1f  sReg2f  outQ | enabled?
    E    V       V       NF   | Yes
    E    V       V       F    | No
    E    V       I       NF   | Yes
    E    V       I       F    | Yes
    E    I       V       NF   | Yes
    E    I       V       F    | No
    E    I       I       NF   | Yes (1)
    E    I       I       F    | Yes

(1) = yes, but no change
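
The table can be cross-checked by enumerating all sixteen cases. The sketch below is a Python model, not BSV; it assumes the only implicit guard that can block the rule is the one on outQ.enq, because inQ.first and inQ.deq sit under the inQ.notEmpty() test.

```python
from itertools import product

def rule_enabled(in_q_not_empty, s_reg1_valid, s_reg2_valid, out_q_not_full):
    """Lifted guard of the valid-bit rule.

    inQ.first/deq run only when inQ is not empty, so they never block.
    outQ.enq runs only when sReg2f is Valid, so it blocks only in that case.
    """
    return (not s_reg2_valid) or out_q_not_full

def truth_table():
    rows = []
    for in_ne, v1, v2, out_nf in product([True, False], repeat=4):
        rows.append((in_ne, v1, v2, out_nf,
                     rule_enabled(in_ne, v1, v2, out_nf)))
    return rows

if __name__ == "__main__":
    for in_ne, v1, v2, out_nf, enabled in truth_table():
        print("NE" if in_ne else "E ",
              "V" if v1 else "I",
              "V" if v2 else "I",
              "NF" if out_nf else "F ",
              "Yes" if enabled else "No")
```

The rule is disabled in exactly the cases where sReg2f is Valid and outQ is full, whatever the state of inQ and sReg1f.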

The Maybe type
A useful type to capture valid/invalid data
    typedef union tagged {
        void Invalid;
        data_T Valid;
    } Maybe#(type data_T);
[Figure: a register holding a valid/invalid flag next to its data]
Registers contain Maybe type values
Some useful functions on Maybe type:
- isValid(x) returns true if x is Valid
- fromMaybe(d, x) returns the data value in x if x is Valid, the default value d if x is Invalid
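
For readers more at home in software, the two helper functions can be mimicked in a few lines of Python. This is only an analogy under assumed names (Maybe, Valid, INVALID, is_valid, from_maybe); it is not how BSV defines or implements the type.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Maybe:
    valid: bool        # the valid/invalid tag
    value: Any = None  # payload, meaningful only when valid is True

def Valid(x):
    """Build a Maybe that carries x."""
    return Maybe(True, x)

INVALID = Maybe(False)  # the payload-free Invalid case

def is_valid(m):
    """Counterpart of isValid(x)."""
    return m.valid

def from_maybe(default, m):
    """Counterpart of fromMaybe(d, x): payload if valid, else the default."""
    return m.value if m.valid else default
```

For example, from_maybe(0, Valid(7)) yields the payload 7, while from_maybe(0, INVALID) falls back to the default 0.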

Using the Maybe type
    typedef union tagged {
        void Invalid;
        data_T Valid;
    } Maybe#(type data_T);
[Figure: a register holding a valid/invalid flag next to its data]
Registers contain Maybe type values
    rule sync_pipeline if (True);
        if (inQ.notEmpty()) begin
            sReg1 <= Valid f0(inQ.first()); inQ.deq();
        end
        else sReg1 <= Invalid;
        sReg2 <= isValid(sReg1) ? Valid f1(fromMaybe(d, sReg1)) : Invalid;
        if (isValid(sReg2)) outQ.enq(f2(fromMaybe(d, sReg2)));
    endrule

Pattern-matching: an alternative syntax to extract data structure components
    typedef union tagged {
        void Invalid;
        data_T Valid;
    } Maybe#(type data_T);

    case (m) matches
        tagged Invalid : return 0;
        tagged Valid .x : return x;
    endcase
x will get bound to the appropriate part of m

    if (m matches (Valid .x) &&& (x > 10))
The &&& is a conjunction, and allows pattern-variables to come into scope from left to right

The Maybe type data in the pipeline
    typedef union tagged {
        void Invalid;
        data_T Valid;
    } Maybe#(type data_T);
[Figure: a register holding a valid/invalid flag next to its data]
Registers contain Maybe type values
    rule sync_pipeline if (True);
        if (inQ.notEmpty()) begin
            sReg1 <= Valid (f0(inQ.first())); inQ.deq();
        end
        else sReg1 <= Invalid;
        case (sReg1) matches
            tagged Valid .sx1: sReg2 <= Valid f1(sx1);
            tagged Invalid: sReg2 <= Invalid;
        endcase
        case (sReg2) matches
            tagged Valid .sx2: outQ.enq(f2(sx2));
        endcase
    endrule
sx1 will get bound to the appropriate part of sReg1

Generalization: n-stage pipeline
[Figure: x -> inQ -> f(0) -> sReg[0] -> f(1) -> sReg[1] -> f(2) -> ... -> sReg[n-2] -> f(n-1) -> outQ]
    rule sync_pipeline (True);
        if (inQ.notEmpty()) begin
            sReg[0] <= Valid f(0, inQ.first()); inQ.deq();
        end
        else sReg[0] <= Invalid;
        for (Integer i = 1; i < n-1; i = i+1) begin
            case (sReg[i-1]) matches
                tagged Valid .sx: sReg[i] <= Valid f(i, sx);
                tagged Invalid: sReg[i] <= Invalid;
            endcase
        end
        case (sReg[n-2]) matches
            tagged Valid .sx: outQ.enq(f(n-1, sx));
        endcase
    endrule
Stage i applies f(i): f(0) feeds sReg[0], f(i) moves data from sReg[i-1] to sReg[i], and f(n-1) feeds outQ, as in the figure.
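
The n-stage rule can be exercised with a behavioral model. The sketch below is Python rather than BSV and makes several assumptions that are not in the slides: None stands in for Invalid (so stage data must never be None), outQ has a fixed capacity, and the stage functions are passed in as a list.

```python
from collections import deque

class SyncPipeline:
    """Model of the n-stage rule; regs[i] plays the role of sReg[i]."""

    def __init__(self, stages, out_capacity=4):
        assert len(stages) >= 2
        self.stages = list(stages)              # stages[i] models f(i, .)
        self.regs = [None] * (len(stages) - 1)  # None models Invalid
        self.in_q = deque()
        self.out_q = deque()
        self.out_capacity = out_capacity        # assumed bound on outQ

    def can_fire(self):
        # Only outQ.enq can block, and only when the last register is valid.
        return self.regs[-1] is None or len(self.out_q) < self.out_capacity

    def step(self):
        if not self.can_fire():
            return False
        old = list(self.regs)  # every read sees the pre-rule register values
        n = len(self.stages)
        self.regs[0] = self.stages[0](self.in_q.popleft()) if self.in_q else None
        for i in range(1, n - 1):
            prev = old[i - 1]
            self.regs[i] = None if prev is None else self.stages[i](prev)
        if old[n - 2] is not None:
            self.out_q.append(self.stages[n - 1](old[n - 2]))
        return True
```

With four stages and two input tokens, five firings are enough to drain the pipeline: bubbles follow the last token through the registers, so nothing is left behind and nothing spurious is enqueued.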

Elastic pipeline
Use FIFOs instead of pipeline registers
[Figure: x -> inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]
    rule stage1 if (True);
        fifo1.enq(f1(inQ.first()));
        inQ.deq();
    endrule
    rule stage2 if (True);
        fifo2.enq(f2(fifo1.first()));
        fifo1.deq();
    endrule
    rule stage3 if (True);
        outQ.enq(f3(fifo2.first()));
        fifo2.deq();
    endrule
What is the firing condition for each rule?
Can tokens be left inside the pipeline?
No need for Maybe types
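
The two questions above can be explored with a Python model (not BSV) that fires one enabled rule at a time. The Fifo class, its capacity and the stage functions are assumptions made for the sketch; each rule's guard is the conjunction "source not empty and destination not full".

```python
from collections import deque

class Fifo:
    """Bounded FIFO; callers check not_empty/not_full before first/deq/enq."""
    def __init__(self, capacity):
        self.items = deque()
        self.capacity = capacity
    def not_empty(self): return len(self.items) > 0
    def not_full(self): return len(self.items) < self.capacity
    def first(self): return self.items[0]
    def enq(self, x): self.items.append(x)
    def deq(self): self.items.popleft()

def make_rule(src, dst, f):
    """One pipeline stage as a (guard, body) pair."""
    def guard():
        return src.not_empty() and dst.not_full()
    def body():
        dst.enq(f(src.first()))
        src.deq()
    return guard, body

def run_until_idle(rules):
    """Fire one enabled rule at a time until no rule is enabled."""
    fired = 0
    while True:
        enabled = [body for guard, body in rules if guard()]
        if not enabled:
            return fired
        enabled[0]()
        fired += 1

def build(capacity=2):
    in_q, fifo1, fifo2, out_q = (Fifo(capacity) for _ in range(4))
    rules = [make_rule(in_q, fifo1, lambda x: x + 1),   # hypothetical f1
             make_rule(fifo1, fifo2, lambda x: x * 2),  # hypothetical f2
             make_rule(fifo2, out_q, lambda x: x - 3)]  # hypothetical f3
    return in_q, fifo1, fifo2, out_q, rules
```

As long as outQ has space, every token eventually reaches it: a stage holding a token stays enabled until it fires, so no valid bits are needed to flush the pipeline.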

Firing conditions for each rule
[Figure: x -> inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]
NE = not empty, NF = not full, F = full

    inQ  fifo1   fifo2   outQ | rule1  rule2  rule3
    NE   NE,NF   NE,NF   NF   | Yes    Yes    Yes
    NE   NE,NF   NE,F    F    | Yes    No     No
    ….

Each rule is enabled when its input FIFO is not empty and its output FIFO is not full.
This is the first example we have seen where multiple rules may be ready to execute concurrently.
Can we execute multiple rules together?

Informal analysis
[Figure: x -> inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ]
NE = not empty, NF = not full, F = full

    inQ  fifo1   fifo2   outQ | rule1  rule2  rule3
    NE   NE,NF   NE,NF   NF   | Yes    Yes    Yes
    NE   NE,NF   NE,F    F    | Yes    No     No
    ….

FIFOs must permit concurrent enq and deq for all three rules to fire concurrently.

Concurrency when the FIFOs do not permit concurrent enq and deq
[Figure: x -> inQ -> f1 -> fifo1 -> f2 -> fifo2 -> f3 -> outQ, annotated with the conditions needed for all stages to fire: inQ not empty; fifo1 and fifo2 not empty & not full; outQ not full]
At best, alternate stages in the pipeline will be able to fire concurrently.
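
One concrete way to see the "alternate stages" claim is to assume one-element FIFOs, which are always either empty or full and so can never be "not empty & not full". That assumption is ours, chosen as a simple instance of a FIFO without concurrent enq and deq. The Python sketch below enumerates every occupancy pattern and records which rules are enabled.

```python
from itertools import product

def enabled_rules(in_q, fifo1, fifo2, out_q):
    """Each argument is True if that one-element FIFO currently holds a token.

    Rule i needs its source to hold a token and its destination to be free.
    """
    fifos = [in_q, fifo1, fifo2, out_q]
    return [fifos[i] and not fifos[i + 1] for i in range(3)]

def summarize():
    """Return (max number of enabled rules, whether adjacent rules co-occur)."""
    best = 0
    adjacent_pair_seen = False
    for state in product([False, True], repeat=4):
        en = enabled_rules(*state)
        best = max(best, sum(en))
        if (en[0] and en[1]) or (en[1] and en[2]):
            adjacent_pair_seen = True
    return best, adjacent_pair_seen
```

Under this assumption at most two of the three rules are ever enabled together, and they are never neighbors: if a stage can fire, the FIFO after it is empty, which disables the next stage.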

Pipelined designs expressed using multiple rules
- If rules for different pipeline stages never fire in the same cycle, then the design can hardly be called a pipelined design
- If all the enabled rules fire in parallel every cycle then, in general, wrong results can be produced
We need a clean model for concurrent firing of rules

BSV Rule Execution
- A BSV program consists of state elements and rules, a.k.a. Guarded Atomic Actions (GAA), that operate on the state elements
- Application of a rule modifies some state elements of the system in a deterministic manner
[Figure: a rule as hardware. Labels: current state, guard, AND, next state computation (nextState), reg en's, next state values.]

BSV Execution Model
Repeatedly:
- Select a rule to execute
- Compute the state updates
- Make the state updates
Highly nondeterministic; user annotations can help in rule selection

One-rule-at-a-time semantics
The legal behavior of a BSV program can always be explained by observing the state updates obtained by applying only one rule at a time.
Implementation concern: schedule multiple rules concurrently without violating one-rule-at-a-time semantics.
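
The implementation concern can be made concrete with a small check: a concurrent firing is acceptable only if some one-rule-at-a-time order produces the same state. The Python sketch below is an illustration under assumptions of ours: rules are functions from a state dictionary to a dictionary of updates, all guards are true, and the rules write disjoint registers. It is not the scheduling analysis a BSV compiler performs.

```python
from itertools import permutations

def fire_sequentially(state, rules):
    """Apply the rules one at a time; each sees its predecessors' updates."""
    s = dict(state)
    for rule in rules:
        s.update(rule(s))
    return s

def fire_concurrently(state, rules):
    """Apply all rules in one step; each reads the state at the start."""
    s = dict(state)
    for rule in rules:
        s.update(rule(state))
    return s

def is_legal_concurrent_step(state, rules):
    """True if some sequential order explains the concurrent result."""
    target = fire_concurrently(state, rules)
    return any(fire_sequentially(state, order) == target
               for order in permutations(rules))

# Independent rules: any order gives the concurrent result.
increments = [lambda s: {"x": s["x"] + 1}, lambda s: {"y": s["y"] + 1}]

# One rule reads what the other writes: only the order (reader, writer) matches.
ordered = [lambda s: {"x": s["y"] + 1}, lambda s: {"y": 5}]

# A swap: firing both from the old state matches no sequential order.
swap = [lambda s: {"x": s["y"]}, lambda s: {"y": s["x"]}]
```

The swap pair shows why firing every enabled rule in parallel can be wrong: starting from x = 1, y = 2 the concurrent result is x = 2, y = 1, while either sequential order leaves x and y equal.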

Guard lifting
- For concurrent scheduling of rules, we only need to consider those rules that can be concurrently enabled, i.e., whose guards are true
- In order to understand when a rule can be enabled, we need to understand precisely how implicit guards are lifted to form the rule guard

Making guards explicit
    rule foo if (True);
        if (p) fifo.enq(8);
        r <= 7;
    endrule
becomes
    rule foo if ((p && fifo.notFull) || !p);
        if (p) fifo.enq(8);
        r <= 7;
    endrule
Effectively, all implicit conditions (guards) are lifted and conjoined to the rule guard

Implicit guards (conditions)
    rule <name> if (<guard>); <action>; endrule

    <action> ::= r <= <exp>
               | if (<exp>) <action>
               | <action> ; <action>
               | m.g(<exp>)
               | t = <exp>

Make implicit guards explicit: m.g(<exp>) becomes m.g_B(<exp>) when m.g_G (g_B is the body of method g, g_G its guard)

    <action> ::= r <= <exp>
               | if (<exp>) <action>
               | <action> when (<exp>)
               | <action> ; <action>
               | m.g_B(<exp>)
               | t = <exp>

Guards vs Ifs
A guard on one action of a parallel group of actions affects every action within the group:
    (a1 when p1); a2
    ==> (a1; a2) when p1
A condition of a conditional action only affects the actions within the scope of the conditional action:
    (if (p1) a1); a2
    p1 has no effect on a2 ...
Mixing ifs and whens:
    (if (p) (a1 when q)); a2
    ==> ((if (p) a1); a2) when ((p && q) | !p)
    ==> ((if (p) a1); a2) when (q | !p)

Guard lifting rules
All the guards can be "lifted" to the top of a rule:
- (a1 when p) ; a2      ==> (a1 ; a2) when p
- a1 ; (a2 when p)      ==> (a1 ; a2) when p
- if (p when q) a       ==> (if (p) a) when q
- if (p) (a when q)     ==> (if (p) a) when (q | !p)
- (a when p1) when p2   ==> a when (p1 & p2)
- x <= (e when p)       ==> (x <= e) when p
  similarly for expressions ...
- Rule r (a when p)     ==> Rule r (if (p) a)
Can you prove that using these rules all the guards will be lifted to the top of an action?
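
The action-level lifting rules can be tried out on a toy representation. The Python sketch below is our own illustration, not the BSV compiler's algorithm: actions and predicates are nested tuples with made-up tags ("prim", "when", "if", "par"), and guards inside expressions, such as if (p when q) a, are not handled.

```python
TRUE = ("true",)
def Var(name): return ("var", name)
def And(a, b): return ("and", a, b)
def Or(a, b): return ("or", a, b)
def Not(a): return ("not", a)

def evaluate(pred, env):
    """Evaluate a predicate tuple under a {name: bool} environment."""
    tag = pred[0]
    if tag == "true":
        return True
    if tag == "var":
        return env[pred[1]]
    if tag == "not":
        return not evaluate(pred[1], env)
    if tag == "and":
        return evaluate(pred[1], env) and evaluate(pred[2], env)
    if tag == "or":
        return evaluate(pred[1], env) or evaluate(pred[2], env)
    raise ValueError(tag)

def lift(action):
    """Return (guard-free action, lifted guard) for an action tuple."""
    tag = action[0]
    if tag == "prim":                      # register write or method body
        return action, TRUE
    if tag == "when":                      # ("when", a, p)
        body, g = lift(action[1])
        return body, And(g, action[2])     # (a when p1) when p2 ==> a when (p1 & p2)
    if tag == "if":                        # ("if", p, a)
        body, g = lift(action[2])
        return ("if", action[1], body), Or(g, Not(action[1]))  # q | !p
    if tag == "par":                       # ("par", a1, a2), i.e. a1 ; a2
        b1, g1 = lift(action[1])
        b2, g2 = lift(action[2])
        return ("par", b1, b2), And(g1, g2)
    raise ValueError(tag)

# if (p) (fifo.enq(8) when notFull) ; r <= 7
foo = ("par",
       ("if", Var("p"), ("when", ("prim", "fifo.enq(8)"), Var("notFull"))),
       ("prim", "r <= 7"))
```

Lifting foo leaves the body if (p) fifo.enq(8) ; r <= 7 and a guard that evaluates to notFull | !p for every assignment of p and notFull.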

BSV provides a primitive (impCondOf) to make guards explicit and lift them to the top.
From now on in concurrency discussions we will assume that all guards have been lifted to the top in every rule.