
CS184a: Computer Architecture (Structures and Organization)
Day 11: October 30, 2000
Interconnect Requirements
Caltech CS184a Fall 2000 -- DeHon

Last Time
• Saw various compute blocks
• Role of automated mapping in exploring design space
• To exploit structure in typical designs we need programmable interconnect
• All reasonable, scalable structures:
  – small to moderate sized logic blocks
  – connected via programmable interconnect
• Been saying delay across programmable interconnect is a big factor

Today
• Interconnect Design Space
• Dominance of Interconnect
• Simple things
  – and why they don’t work
• Interconnect Implications
• Characterizing Interconnect Requirements

Dominant Area

Dominant Time

Dominant Time (continued)

Dominant Power
XC4003A data from Eric Kusse (UCB MS 1997)

For Spatial Architectures
• Interconnect dominant
  – area
  – power
  – time
• …so need to understand in order to optimize architectures

Interconnect
• Problem
  – Thousands of independent (bit) operators producing results
    • true of FPGAs today
    • …true for *LIW, multi-µP, etc. in future
  – Each taking as inputs the results of other (bit) processing elements
  – Interconnect is late bound
    • don’t know until after fabrication

Design Issues
• Flexibility -- route “anything”
  – (w/in reason?)
• Area -- wires, switches
• Delay -- switches in path, stubs, wire length
• Power -- switch, wire capacitance
• Routability -- computational difficulty finding routes

(1) Shared Bus
• Familiar case
• Use single interconnect resource
• Reuse in Time
• Consequence?

Shared Bus
• Consider operation: y = Ax^2 + Bx + C
  – 3 mpys
  – 2 adds
  – ~5 values need to be routed from producer to consumer
• Performance lower bound if have design w/:
  – m multipliers
  – u madd units
  – a adders
  – i simultaneous interconnection busses

Viewpoint
• Interconnect is a resource
• Bottleneck for design can be in availability of any resource
• Lower Bound on Delay: Logical Resources / Physical Resources
• May be worse
  – dependencies
  – ability to use resource
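The resource bound above can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: the function name `delay_lower_bound`, the dictionary keys, and the two example designs are assumptions, and the madd units mentioned on the earlier Shared Bus slide are left out to keep the arithmetic simple. The operation counts (3 multiplies, 2 adds, about 5 routed values) come from that slide.

```python
from math import ceil

def delay_lower_bound(required, available):
    """Largest ceil(logical uses / physical units) over all resources.

    This is only a lower bound: dependencies between operations can
    make the real schedule longer.
    """
    return max(ceil(required[r] / available[r]) for r in required)

# y = A*x^2 + B*x + C: 3 multiplies, 2 adds, ~5 values to route.
required = {"multiply": 3, "add": 2, "route": 5}

# Hypothetical design 1: plenty of compute, a single shared bus.
one_bus = {"multiply": 3, "add": 2, "route": 1}
print(delay_lower_bound(required, one_bus))    # 5: the bus is the bottleneck

# Hypothetical design 2: same compute, five busses.
five_bus = {"multiply": 3, "add": 2, "route": 5}
print(delay_lower_bound(required, five_bus))   # 1
```

With one bus the bound is set entirely by interconnect, which is the point of treating interconnect as a resource like any other.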

Shared Bus
• Flexibility (+)
  – routes everything (given enough time)
  – can be tricky to schedule use optimally
• Area (++)
  – kn switches
  – O(n)
• Delay (Power) (--)
  – wire length O(kn)
  – parasitic stubs: kn+n
  – series switch: 1
  – O(kn)
  – sequentialize I/B

Term: Bisection Bandwidth
• Partition design into two equal size halves
• Minimize wires (nets) with ends in both halves
• Number of wires crossing is bisection bandwidth
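The definition above can be checked by brute force on a tiny design. The sketch below is illustrative only: the names `cut_size` and `bisection_bandwidth` and the two toy netlists are assumptions, and exhaustive search is practical only for a handful of nodes (real tools use partitioning heuristics).

```python
from itertools import combinations

def cut_size(nets, half):
    """Count nets that have at least one end in each half."""
    half = set(half)
    return sum(
        1
        for net in nets
        if any(v in half for v in net) and any(v not in half for v in net)
    )

def bisection_bandwidth(nodes, nets):
    """Minimum cut size over all equal-size two-way partitions."""
    nodes = list(nodes)
    if len(nodes) % 2 != 0:
        raise ValueError("need an even number of nodes for equal halves")
    return min(
        cut_size(nets, half)
        for half in combinations(nodes, len(nodes) // 2)
    )

# Toy example 1: a chain 0-1-2-3 has lots of locality.
chain = [(0, 1), (1, 2), (2, 3)]
print(bisection_bandwidth(range(4), chain))      # 1

# Toy example 2: every node connected to every other node.
complete = list(combinations(range(4), 2))
print(bisection_bandwidth(range(4), complete))   # 4
```

The chain can be cut through a single net, while the fully connected design forces four nets across any equal split.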

(2) Crossbar
• Avoid bottleneck
• Every output gets its own interconnect channel

Crossbar

Crossbar (continued)

Crossbar
• Flexibility (++)
  – routes everything (guaranteed)
• Delay (Power) (-)
  – wire length O(kn)
  – parasitic stubs: kn+n
  – series switch: 1
  – O(kn)
• Area (-)
  – Bisection bandwidth n
  – kn^2 switches
  – O(n^2)

Crossbar
• Too expensive
  – Switch Area = k*n^2*2.5Kλ^2
  – Switch Area/LUT = k*n*2.5Kλ^2
  – n=1024, k=4 => 10Mλ^2
• What can we do?
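The arithmetic behind the 10M figure can be reproduced directly. This is a small sketch under two assumptions: "K" is read as 1000, and the function names are hypothetical. Areas are in units of λ^2.

```python
# Area of one switch in lambda^2 (2.5K from the slide, taking K = 1000).
SWITCH_AREA_LAMBDA2 = 2500

def crossbar_switch_area(n, k):
    """Total switch area for n k-input LUTs: k * n^2 switches."""
    return k * n * n * SWITCH_AREA_LAMBDA2

def crossbar_switch_area_per_lut(n, k):
    """Switch area charged to each LUT: total area divided by n."""
    return k * n * SWITCH_AREA_LAMBDA2

print(crossbar_switch_area_per_lut(1024, 4))   # 10240000, i.e. about 10M
print(crossbar_switch_area(1024, 4))           # 10485760000
```

The per-LUT cost grows linearly with n, so doubling the array size doubles the interconnect area paid by every logic block.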

Avoiding Crossbar Costs
• Typical architecture trick:
  – exploit expected problem structure
• We have freedom in operator placement
• Designs have spatial locality
• => place connected components “close” together
  – don’t need full interconnect?

Exploit Locality
• Wires expensive
• Local interconnect cheap
• 1D versions? (explore on homework)

Exploit Locality
• Wires expensive
• Local interconnect cheap
• Use 2D to make more things closer
• Mesh?

Mesh Analysis
• Can we place everything close?

Mesh “Closeness”
• Try placing “everything” close

Mesh Analysis
• Flexibility - ?
  – Ok w/ large w
• Delay (Power)
  – Series switches
    • 1 -- √n
  – Wire length
    • w -- √n
  – Stubs
    • O(w) -- O(w√n)
• Area
  – Bisection BW -- w√n
  – Switches -- O(nw)
  – O(w^2 n) [linear pop]
  – larger on homework

Mesh
• Plausible
• …but what’s w
• …and how does it grow?

Characterize Locality
• Want to exploit locality
• How much locality do we have?
• Impact on resources required?

Bisection Bandwidth
• Bisect design
• Bisection bandwidth of design
  – => lower bound on network bisection bandwidth
• Design with more locality
  – => lower bisection bandwidth
• Enough?
[Figure: design split into two halves of N/2 each, labeled with the cutsize between them]

Characterizing Locality
• Single cut does not capture locality within halves
• Cut again
  – => recursive bisection

Regularizing Growth
• How do bisection bandwidths shrink (grow) at different levels of bisection hierarchy?
• Basic assumption: Geometric
  – 1/ 2

Geometric Growth
• (F, α)-bifurcator
  – F bandwidth at root
  – geometric regression α at each level
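The geometric regression can be tabulated level by level. This sketch assumes the usual reading of an (F, α)-bifurcator, in which the cut at depth i of the bisection hierarchy is at most F/α^i; the function name and the example values F=64, α=2 are illustrative only.

```python
def bifurcator_bandwidths(root_bandwidth, alpha, levels):
    """Bandwidth bound at each level: F at the root, divided by alpha per level."""
    return [root_bandwidth / alpha**i for i in range(levels)]

# Illustrative values: F = 64 at the root, alpha = 2 (no locality at all).
print(bifurcator_bandwidths(64, 2, 4))   # [64.0, 32.0, 16.0, 8.0]
```

A smaller α means bandwidth falls off more slowly toward the leaves, so a larger share of the wiring is needed high in the hierarchy.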

Good Model?
Log-log plot ==> straight lines represent geometric growth

Rent’s Rule
• Long standing empirical relationship
  – IO = C*N^P
  – 0 ≤ P ≤ 1.0
  – compare (F, α)-bifurcator
    • α = 2^P
• Captures notion of locality
  – some signals generated and consumed locally
  – reconvergent fanout
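Because IO = C*N^P is a straight line on a log-log plot, C and P can be estimated with an ordinary least-squares fit of log IO against log N. The sketch below is an illustration under stated assumptions: `rent_io` and `fit_rent` are hypothetical names, and the sample points are synthetic, generated from the rule itself with illustrative parameters rather than measured from any design.

```python
from math import exp, log

def rent_io(c, p, n):
    """Rent's rule: expected IO count for a block of n elements."""
    return c * n**p

def fit_rent(samples):
    """Least-squares line through (log N, log IO); returns (C, P)."""
    xs = [log(n) for n, _ in samples]
    ys = [log(io) for _, io in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return exp(intercept), slope

# Synthetic points from the rule with illustrative C = 5, P = 0.72.
samples = [(n, rent_io(5.0, 0.72, n)) for n in (16, 64, 256, 1024)]
c, p = fit_rent(samples)
print(round(c, 6), round(p, 6))

# Halving N divides IO by 2^P, which is the bifurcator's alpha.
alpha = rent_io(c, p, 1024) / rent_io(c, p, 512)
print(round(alpha, 6), round(2**p, 6))
```

On real partitioning data the points scatter around the line, and the fitted slope is the Rent exponent.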

Monday class stopped here

Rent’s Rule
• Typically consider
  – 0.5 ≤ P ≤ 0.75
• “High-Speed” Logic P=0.67
• Memory (P~0.1-0.2)
• Example (i10)
  – max C=7, P=0.68
  – avg C=5, P=0.72

What does this tell us about design?
• Recursive bandwidth requirements in network
  – lower bound on resource requirements
• N.B. necessary but not sufficient condition on network design
  – i.e., design must also be able to use the wires

What does this tell us about design?
• Interconnect lengths
  – Intuition
    • if P>0.5, everything cannot be nearest neighbor
    • as P grows, so do wire distances

What does this tell us about design?
• Interconnect lengths
  – IO = (n^2)^P cross distance n
  – d(IO)/dn end at exactly distance n
  – E(l) = integral from 0 to n=√N of n*(d(IO)/dn)/n^2
    • assume iid sources
  – E(l) = O(N^(P-0.5))
    • P>0.5
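The integral on this slide can be worked through explicitly. The following is a sketch under the slide's own assumptions (iid sources, IO(n) = (n^2)^P with the Rent constant C dropped), not a rigorous derivation:

```latex
\begin{aligned}
IO(n) &= (n^2)^P = n^{2P}, \qquad \frac{d\,IO}{dn} = 2P\,n^{2P-1} \\
E(l) &= \int_0^{\sqrt{N}} \frac{n \cdot 2P\,n^{2P-1}}{n^2}\,dn
      = 2P \int_0^{\sqrt{N}} n^{2P-2}\,dn \\
     &= \frac{2P}{2P-1}\,\bigl(\sqrt{N}\bigr)^{2P-1}
      = \frac{2P}{2P-1}\,N^{P-0.5}
      = O\!\left(N^{P-0.5}\right), \qquad P > 0.5
\end{aligned}
```

The condition P > 0.5 is what makes the exponent 2P-2 larger than -1, so the integral converges at its lower limit and the N^(P-0.5) term dominates.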

What does this tell us about design?
• IO ∝ N^P
• Bisection BW ∝ N^P
• side length ∝ N^P
  – √N if P<0.5
• Area ∝ N^(2P)
  – P>0.5
• N.B. 2D VLSI world has “natural” Rent of P=0.5 (area vs. perimeter)

Rent’s Rule Caveats
• Modern “systems” on a chip -- likely to contain subcomponents of varying Rent complexity
• Less I/O at certain “natural” boundaries
• System close
  – (Does Rent’s Rule apply to workstation, PC, PDA?)

Area/Wire Length
• Bad news
  – Area ~ O(N^(2P))
    • faster than N
  – Avg. Wire Length ~ O(N^(P-0.5))
    • grows with N
• Can designers/CAD control P (locality) once they appreciate its effects?
• I.e., maybe this cost changes design style/criteria so we mitigate effects?
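To get a feel for these growth rates, the sketch below computes how much area and average wire length grow when N is scaled up, using only the asymptotic forms on this slide (constants ignored). The function names are hypothetical, and the formulas are applied only for Rent exponents above 0.5, where the slide states them.

```python
def area_growth(scale, p):
    """Factor by which area grows when N grows by `scale`, from Area ~ N^(2p)."""
    if p <= 0.5:
        raise ValueError("the N^(2p) area form is stated only for p > 0.5")
    return scale ** (2 * p)

def wire_length_growth(scale, p):
    """Factor by which average wire length grows, from length ~ N^(p-0.5)."""
    if p <= 0.5:
        raise ValueError("the N^(p-0.5) length form is stated only for p > 0.5")
    return scale ** (p - 0.5)

# Illustrative case: 4x more elements at p = 0.75.
print(area_growth(4, 0.75))          # about 8x the area for 4x the logic
print(wire_length_growth(4, 0.75))   # about 1.41x longer average wires
```

At p = 0.75, quadrupling the number of elements roughly doubles the area per element, which is why the exponent is worth controlling.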

What Rent didn’t tell us
• Bisection bandwidth purely geometrical
• No constraint for delay
  – i.e., a partition may leave critical path weaving between halves

Critical Path and Bisection
Minimum cut may cross critical path multiple times.
Minimizing long wires in critical path => increase cut size.

Rent Weakness
• Does not account for path topology
• Can we define a “Temporal” Rent which takes it into consideration?
  – Promising research topic

Finishing Up...

Big Ideas [MSB Ideas]
• Interconnect Dominant
  – power, delay, area
• Can be bottleneck for designs
• Can’t afford full crossbar
• Need to exploit locality
• Can’t have everything close

Big Ideas [MSB Ideas]
• Rent’s rule characterizes locality
• => Area growth O(N^(2P))
• P>0.5 => interconnect growing faster than compute elements
  – expect interconnect to dominate other resources