ESE 680-002 (ESE 534): Computer Organization • Day 19: March 26, 2007 • Retime 1: Transformations • Penn ESE 680-002 Fall 2007 -- DeHon
Previously • Reviewed pipelining – basic assignments on • Saw spatial designs are efficient – when logic is reused at maximum frequency • Interconnect is the dominant delay – and dominant area – heavy call to reuse it in order to use it efficiently
Today • Systematic transformation for retiming – preserve semantics (meaning)
Motivation
Motivation • FPGAs (spatial computing) – run efficiently when all resources are reused rapidly • cycle time minimized • “Everything in the right place at the right time.”
Motivating Questions • Can I build a fixed-frequency (fixed clock) programmable architecture? • Can I always make a design run at maximum clock rate? • How do we systematically transform any computation to – operate on a fixed-frequency array? – coordinate around mandatory registers in the design?
Interconnect Retiming • Long paths are slow • Could limit cycle time • Add registers to long-distance interconnect – At each switch? – In the middle of long wires? • How do we justify these registers?
Day 3: Spatial Quadratic • How do we pipeline a design?
Day 3: Pipelined Spatial Quadratic
How do you use?
How do you use? • To compute A*B+C*D+E
Compute • A*B+C*D+E
How Compute? • Y_i = Y_{i-1} xor X_i • With pipelined nand2 gates?
[figure: "want" vs. "have"]
Retiming Algorithm
Task • Move registers to: – Preserve semantics – Minimize path length between registers – i.e. make path length 1 for maximum throughput or reuse – …while minimizing number of registers required
Simple Example • Path Length (L) = 4 • Can we do better?
Legal Register Moves • Retiming Lag/Lead
Canonical Graph Representation • Separate arc for each path • Weight edges by number of registers (weight nodes by delay through node)
Critical Path Length • Critical Path: length of the longest path of nodes connected by zero-weight (register-free) edges • Compute in O(|E|) time by levelizing the network: topological sort, push path lengths forward until a register is found.
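The levelizing computation can be sketched in Python. This is a minimal sketch, not the lecture's code: the `(tail, head, registers)` edge tuples, the function name `critical_path_length`, and the optional `delay` map are assumptions made for illustration, and every node is given unit delay by default.

```python
from collections import defaultdict, deque

def critical_path_length(nodes, edges, delay=None):
    """Longest total node delay along any register-free path.

    edges: list of (tail, head, registers). Only zero-register edges
    extend a combinational path. Raises if a register-free cycle exists.
    """
    delay = delay or {v: 1 for v in nodes}  # assumed unit delay per node
    succ = defaultdict(list)
    indeg = {v: 0 for v in nodes}
    for tail, head, regs in edges:
        if regs == 0:
            succ[tail].append(head)
            indeg[head] += 1
    # arrival[v] = longest delay of a register-free path ending at v
    arrival = {v: delay[v] for v in nodes}
    ready = deque(v for v in nodes if indeg[v] == 0)
    visited = 0
    while ready:  # topological order over the register-free subgraph
        v = ready.popleft()
        visited += 1
        for w in succ[v]:
            arrival[w] = max(arrival[w], arrival[v] + delay[w])
            indeg[w] -= 1
            if indeg[w] == 0:
                ready.append(w)
    if visited != len(nodes):
        raise ValueError("register-free cycle: not a valid synchronous circuit")
    return max(arrival.values())
```

For a four-node ring with a single register on the closing edge, the function reports a critical path of 4; with a register on every edge it reports 1.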
Retiming Lag/Lead • Retiming: assign a lag to every vertex • weight(e') = weight(e) + lag(head(e)) - lag(tail(e))
Valid Retiming • Retiming is valid as long as: – ∀e in graph • weight(e') = weight(e) + lag(head(e)) - lag(tail(e)) ≥ 0 • Assuming original circuit was a valid synchronous circuit, this guarantees: – non-negative register weights on all edges • no travel backward in time :-) – all cycles have strictly positive register counts – propagation delay on each vertex is non-negative (assumed 1 for today)
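The lag rule and the validity check translate directly into code. The sketch below assumes edges are stored as `(tail, head, weight)` tuples and lags as a dict; the names `retime` and `is_valid_retiming` are illustrative, not from the lecture.

```python
def retime(edges, lag):
    """Apply weight(e') = weight(e) + lag(head(e)) - lag(tail(e)) to every edge."""
    return [(tail, head, w + lag[head] - lag[tail]) for tail, head, w in edges]

def is_valid_retiming(edges, lag):
    """A retiming is valid when no retimed edge has a negative register count."""
    return all(w >= 0 for _, _, w in retime(edges, lag))
```

On a three-node ring `a -> b -> c -> a` with two registers on the closing edge, lags `{a: 0, b: 1, c: 1}` move one register onto `a -> b` while the total around the cycle stays at 2; lags `{a: 0, b: -1, c: 0}` would need a negative register count on `a -> b` and are rejected.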
Retiming Task • Move registers by assigning lags to nodes – lags define all locally legal moves • Preserving non-negative edge weights – (previous slide) – guarantees collection of lags remains consistent globally
Retiming Transformation • N.B.: unchanged by retiming – number of registers around a cycle – delay along a cycle • Cycle of length P must have – at least P/c registers on it to be retimeable to cycle c – Can be computed from invariant above
Optimal Retiming • There is a retiming of – graph G – w/ clock cycle c – iff G-1/c has no cycles with negative total edge weight • G-α ≡ subtract α from each edge weight
1/c Intuition • Want to place a register every c delay units • Each register adds one • Each delay subtracts 1/c • As long as there remain more positives than negatives around all cycles – can move registers to accommodate – Captures the regs ≥ P/c constraints
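The 1/c bookkeeping for a single cycle is small enough to check by hand or in a few lines of Python. This is an illustrative sketch with hypothetical function names, using exact fractions so that 1/c is not rounded.

```python
from fractions import Fraction
from math import ceil

def cycle_weight_shifted(registers, delay, c):
    """Weight of one cycle in G-1/c: +1 per register, -1/c per unit of delay."""
    return Fraction(registers) - Fraction(delay, c)

def min_registers(delay, c):
    """Fewest registers a cycle of total delay `delay` needs to run at cycle c."""
    return ceil(Fraction(delay, c))
```

A cycle with delay P = 3 and 2 registers has weight -1 at c = 1 (not retimeable, since 3 registers would be needed) and weight 1/2 at c = 2 (retimeable, since 2 registers suffice).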
G-1/c
Compute Retiming • Lag(v) = shortest path to I/O in G-1/c • Compute shortest paths in O(|V||E|) – Bellman-Ford – also use to detect negative weight cycles when c too small
Bellman-Ford
• For i ← 0 to N
  – u_i ← ∞ (except u_i = 0 for IO)
• For k ← 0 to N
  – for e_{i,j} ∈ E
    • u_i ← min(u_i, u_j + w(e_{i,j}))
• For e_{i,j} ∈ E  // still update ⇒ negative cycle
  • if u_i > u_j + w(e_{i,j})
    – cycles detected
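The Bellman-Ford pass can be turned into a small runnable sketch that builds G-1/c, finds the shortest path from each node to the I/O node, and takes the ceiling to get integer lags. The assumptions here are not from the slides: edges are `(tail, head, registers)` tuples, every node (including the I/O node) has unit delay, every node can reach the I/O node, and the name `compute_lags` is hypothetical.

```python
from fractions import Fraction
from math import ceil

def compute_lags(nodes, edges, c, io):
    """Integer lags for target cycle c, or None if G-1/c has a negative cycle.

    dist[v] is the shortest-path weight from v to `io` in G-1/c.
    """
    # G-1/c: subtract 1/c from every edge weight (unit node delay assumed)
    shifted = [(t, h, Fraction(w) - Fraction(1, c)) for t, h, w in edges]
    dist = {io: Fraction(0)}
    for _ in range(len(nodes)):
        for t, h, w in shifted:
            # relax: u_t <- min(u_t, u_h + w(e_{t,h}))
            if h in dist and (t not in dist or dist[h] + w < dist[t]):
                dist[t] = dist[h] + w
    missing = [v for v in nodes if v not in dist]
    if missing:
        raise ValueError(f"nodes with no path to I/O: {missing}")
    for t, h, w in shifted:
        if dist[h] + w < dist[t]:  # still improving => negative-weight cycle
            return None
    return {v: ceil(dist[v]) for v in nodes}
```

On a three-node loop `io -> a -> b -> io` with registers on `io -> a` and `b -> io`, the call with `c=1` returns `None` (three unit delays but only two registers), while `c=2` returns lags `{io: 0, a: 0, b: 1}`.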
Apply to Example
Try c=1
Apply: Find Lags • Negative weight cycles? • Shortest paths?
Apply: Lags
Apply: Move Registers [figure: register moves, numbered in animation order] • weight(e') = weight(e) + lag(head(e)) - lag(tail(e))
Apply: Retimed
Apply: Retimed Design
Revise Example (fanout delay)
Revised: Graph
Revised: Graph
Revised: C=1?
Revised: C=2?
Revised: Lag
Revised: Lag • Take ceiling to convert to integer lags [figure labels: 0, -1, 0]
Revised: Apply Lag [figure: lags 0, -1, -1, 0]
Revised: Apply Lag [figure: lags 0, -1, -1, 0 applied to the edge weights, numbered in animation order]
Revised: Retimed [figure: retimed edge weights]
Pipelining • We can use this retiming to pipeline • Assume we have enough (infinite supply) registers at edge of circuit • Retime them into circuit
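One way to picture pipelining by retiming is to add registers on the edges that return to the I/O node and then pull them into the circuit with lags. The sketch below is an illustration under assumed conventions (`(tail, head, registers)` tuples, an explicit `io` node, hand-chosen lags); `add_pipeline_registers` is a hypothetical helper, not a function from the lecture.

```python
def add_pipeline_registers(edges, io, n):
    """Add n registers to every edge that enters the I/O node."""
    return [(t, h, (w + n) if h == io else w) for t, h, w in edges]

def retime(edges, lag):
    """Apply weight(e') = weight(e) + lag(head(e)) - lag(tail(e))."""
    return [(t, h, w + lag[h] - lag[t]) for t, h, w in edges]

# A purely combinational chain io -> a -> b -> c -> io
chain = [("io", "a", 0), ("a", "b", 0), ("b", "c", 0), ("c", "io", 0)]
piped = add_pipeline_registers(chain, "io", 3)
# Lags chosen by hand for this example; lag(io) stays 0
spread = retime(piped, {"io": 0, "a": 0, "b": 1, "c": 2})
```

Here `spread` carries one register after each of `a`, `b`, and `c`, so each gate sits in its own pipeline stage. The three added registers show up as three cycles of extra latency at the circuit boundary.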
C>1 ==> Pipeline
Add Registers [figure: graph G with n registers added at the circuit edge]
Add Registers [figure: G with n added registers, and G-1/1]
Pipeline Retiming: Lag
Pipelined Retimed
Real Cycle
Real Cycle
Cycle C=1?
Cycle C=2?
Cycle: C-slow • Cycle=c • C-slow network has Cycle=1
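C-slowing can be read as replacing every register with C registers, after which the extra registers can be retimed around the cycle. The sketch below assumes `(tail, head, registers)` edge tuples and hand-chosen lags; `c_slow` is a hypothetical helper name.

```python
def c_slow(edges, c):
    """Replace every register with c registers (multiply each edge weight by c)."""
    return [(t, h, w * c) for t, h, w in edges]

def retime(edges, lag):
    """Apply weight(e') = weight(e) + lag(head(e)) - lag(tail(e))."""
    return [(t, h, w + lag[h] - lag[t]) for t, h, w in edges]

# Two unit-delay nodes in a loop with one register: cycle time is stuck at 2
loop = [("a", "b", 0), ("b", "a", 1)]
slowed = c_slow(loop, 2)                   # two registers on the loop
balanced = retime(slowed, {"a": 0, "b": 1})  # one register on each edge
```

After 2-slowing and retiming, every edge in `balanced` holds a register, so no combinational path contains more than one node.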
2-slow Cycle C=1
2-Slow Lags
2-Slow Retime
Retimed 2-Slow Cycle
C-Slow applicable? • Available parallelism – solve C identical, independent problems • Data-level parallelism – e.g. process packets (blocks) separately – e.g. independent regions in images • Commutative operators – e.g. max example
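A C-slowed feedback loop behaves like C independent copies of the original loop, time-multiplexed on the same hardware. The simulation below is a sketch of that behaviour for a running-max loop and may not match the slide's max figure: the function name and the modelling of the feedback path as a queue of C registers are assumptions.

```python
from collections import deque

def c_slow_running_max(samples, c):
    """Simulate a running-max loop whose feedback path holds c registers.

    The value fed back at cycle t was produced at cycle t - c, so the
    hardware computes c interleaved, independent running maxima.
    """
    regs = deque([float("-inf")] * c)  # c registers in the feedback loop
    out = []
    for x in samples:
        y = max(x, regs.popleft())  # oldest register feeds the max operator
        regs.append(y)
        out.append(y)
    return out

# Interleave two independent streams, A = [3, 1, 4] and B = [2, 7, 1]
out = c_slow_running_max([3, 2, 1, 7, 4, 1], 2)
```

The even-indexed outputs are the running max of A and the odd-indexed outputs are the running max of B. Because max is commutative and associative, taking the max of the last C outputs also gives the max of one long stream that was split across the C slots.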
Max Example
Max Example
HSRA Retiming • HSRA – adds mandatory pipelining to interconnect • One additional twist – long, pipelined interconnect • need more than one register on paths
Accommodating HSRA Interconnect Delays • Add buffers to LUT path to match interconnect register requirements • Retime to C=1 as before • Buffer chains force enough registers to cover interconnect delays
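The buffer-insertion step can be sketched as a graph rewrite done before retiming. This is one possible reading of the slide, not HSRA's actual tool flow: the `required` map (interconnect registers needed per edge index), the `buf{k}_{i}` node names, and the choice to leave the original registers on the last segment are all assumptions.

```python
def insert_buffers(edges, required):
    """Split edges[k] with required[k] unit-delay buffer nodes.

    edges: list of (tail, head, registers).
    required: dict mapping edge index -> interconnect registers needed
              (hypothetical input format).
    """
    out = []
    for k, (tail, head, regs) in enumerate(edges):
        prev = tail
        for i in range(required.get(k, 0)):
            buf = f"buf{k}_{i}"          # hypothetical buffer node name
            out.append((prev, buf, 0))   # retiming will place registers here
            prev = buf
        out.append((prev, head, regs))   # original registers stay on the last segment
    return out
```

With unit-delay nodes, retiming to C=1 puts at least one register on every edge, so a path split by k buffers ends up with at least k+1 registers, k of which can be assigned to the pipelined interconnect.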
Accommodating HSRA Interconnect Delays
Admin • Retiming Assignment due Wed. • Reading for today includes retiming algorithm – (handed out last week) • Retiming Structures on Wed. – (swap from original syllabus)
Big Ideas [MSB Ideas] • Retiming transformations important to – minimize cycles – efficiently utilize spatial architectures • Optimally solvable in O(|V||E|) time • Tells us – pipelining required – C-slow – where to move registers • Can accommodate mandatory delays