ESE 532: System-on-a-Chip Architecture
Day 3: January 23, 2017
Parallelism Overview
Penn ESE 532 Spring 2017 -- DeHon

Today
• Parallelism in Tasks
• Types of Parallelism
• Compute Models
• System Architectures

Message
• Many useful models for parallelism
  – Help conceptualize
• One size does not fit all
  – But maybe 6-10 do?
  – Match to problem

Preclass 1
• How do 6 people collaborate on sphere building?

Preclass 2
• How do 12 people collaborate on sphere building?

Preclass 3
• How do 6 people collaborate on building 3 spheres?
• (alternate solution?)

In Class Exercise
• Distribute 24 piece sets for building Red and Yellow Sphere
  – [if there are more than 24 people, have pairs build a different model]
• Follow instructions from slides to come

Step 1: Build half of L1
Step 2: Build half of L2
Step 3: Pass half to builder with 2x2 plate
Step 4: Build L3
Step 5: Build L5 (ends) (if have pieces)
Step 6: Pass both "L5: ends" to builder with side
Step 7: Half of L7; install one side
Step 8: Pass assembly to builder with unused side
Step 9: Finish L7
Step 10: Pass assembly to builder with unused side
Step 11: Add 3rd side
Step 12: Pass assembly to builder with unused side
Step 13: Add final side

Finish
• Check status of all builds

Types of Parallelism

Types of Parallelism
• What kind of parallelism did we see for steps 1-3?
• What parallelism when some folks built a different model?
• What could we build independently here?
  – Kind of parallelism?

Type of Parallelism
• Latency multiply = 1
• Latency add = 1
• (different from Day 2)

  cycle  mpy       add
  1      B, x
  2      x, x      (Bx)+C
  3      A, x^2
  4                Ax^2+(Bx+C)

• Kind of Parallelism?

Types of Parallelism
• Data Level
  – Perform same computation on different data items
• Thread or Task Level
  – Perform separable (perhaps heterogeneous) tasks independently
• Instruction Level
  – Within a single sequential thread, perform multiple operations on each cycle

Parallel Compute Models

Sequential Control Flow
Control flow
• Program is a sequence of operations
• Operation reads inputs and writes outputs into common store
• One operation runs at a time
  – defines successor
Model of correctness is sequential execution
Examples
• C (Java, …)
• FSM / FA
Penn ESE 535 Spring 2015 -- DeHon

Parallelism can be explicit
• Sphere Build example (Step 2)
• Coordinate data parallel operations
• Multiply, add for quadratic equation

  cycle  mpy       add
  1      B, x
  2      x, x      (Bx)+C
  3      A, x^2
  4                Ax^2+(Bx+C)

• Coordinate ILP

Parallelism can be implicit
• Sequential expression
• Infer data dependencies
    T1=x*x
    T2=A*T1
    T3=B*x
    T4=T2+T3
    Y=C+T4
• Or
    Y=A*x*x+B*x+C
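
To make "infer data dependencies" concrete, here is a minimal sketch in C that derives the earliest cycle for each operation in the T1..Y listing. It assumes unit latency and unlimited functional units, which is looser than the one-multiplier, one-adder schedule shown on the earlier slides. The names `struct op`, `quadratic_ops`, and `asap_levels` are illustrative and do not come from the course material.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 2

/* One operation in the sequential listing. deps index earlier operations;
   -1 marks an unused slot (the operand is a primary input such as x). */
struct op {
    const char *name;
    int deps[MAX_DEPS];
};

/* T1=x*x; T2=A*T1; T3=B*x; T4=T2+T3; Y=C+T4 */
static const struct op quadratic_ops[] = {
    { "T1", { -1, -1 } },
    { "T2", {  0, -1 } },
    { "T3", { -1, -1 } },
    { "T4", {  1,  2 } },
    { "Y",  {  3, -1 } },
};

/* Fill level[i] with the earliest cycle in which op i can run (1-based)
   and return the length of the longest dependency chain. */
int asap_levels(const struct op *ops, size_t n, int *level)
{
    int longest = 0;
    for (size_t i = 0; i < n; i++) {
        int ready = 0;                       /* latest cycle among producers */
        for (int d = 0; d < MAX_DEPS; d++) {
            int p = ops[i].deps[d];
            if (p < 0)
                continue;
            assert((size_t)p < i);           /* listing must be in program order */
            if (level[p] > ready)
                ready = level[p];
        }
        level[i] = ready + 1;
        if (level[i] > longest)
            longest = level[i];
    }
    return longest;
}
```

Under these assumptions T1 and T3 land in the same cycle because neither depends on the other; that is the parallelism the sequential listing hides.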

Implicit Parallelism
• d=(x1-x2)*(x1-x2) + (y1-y2)*(y1-y2)
• What parallelism exists here?

Parallelism can be implicit
• Sequential expression
• Infer data dependencies
    for (i=0; i<100; i++)
      y[i]=A*x[i]+B*x[i]+C
• Why can these operations be performed in parallel?
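
One way to see why the loop iterations are independent is to split them into chunks and run the chunks in an arbitrary order. The sketch below does this sequentially in C; `eval_range`, `eval_chunked`, and the `workers` parameter are hypothetical names, and `int` data is an assumption since the slide gives no types.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate the slide's loop body for indices [lo, hi). Each iteration reads
   only x[i] and writes only y[i], so no iteration depends on another. */
void eval_range(const int *x, int *y, size_t lo, size_t hi, int A, int B, int C)
{
    for (size_t i = lo; i < hi; i++)
        y[i] = A * x[i] + B * x[i] + C;
}

/* Split n iterations across `workers` hypothetical workers. The calls run one
   after another here, but the chunks touch disjoint elements, so they could
   be issued in any order or at the same time. */
void eval_chunked(const int *x, int *y, size_t n, size_t workers,
                  int A, int B, int C)
{
    assert(workers > 0);
    size_t chunk = (n + workers - 1) / workers;    /* ceiling division */
    for (size_t w = workers; w-- > 0; ) {          /* deliberately reversed */
        size_t lo = w * chunk;
        size_t hi = lo + chunk;
        if (lo > n) lo = n;
        if (hi > n) hi = n;
        eval_range(x, y, lo, hi, A, B, C);
    }
}
```

Calling `eval_chunked` with any worker count should produce the same `y` as a single `eval_range(x, y, 0, n, ...)` call, which is the property a data-parallel implementation relies on.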

Term: Operation
• Operation – logic computation to be performed

Dataflow / Control Flow
Control flow (e.g. C)
• Program is a sequence of operations
• Operation reads inputs and writes outputs into common store
• One operation runs at a time
  – defines successor
Dataflow
• Program is a graph of operations
• Operation consumes tokens and produces tokens
• All operations run concurrently

Token
• Data value with presence indication
  – May be conceptual
    • Only exist in high-level model
    • Not kept around at runtime
  – Or may be physically represented
    • One bit represents presence/absence of data

Token Examples?
• What are familiar cases where data may come with presence tokens?
  – Network packets
  – Memory references from processor
    • Variable latency depending on cache presence
  – Start bit on serial communication

Operation
• Takes in one or more inputs
• Computes on the inputs
• Produces results
• Logically self-timed
  – "Fires" only when input set present
  – Signals availability of output
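
The firing rule can be sketched in a few lines of C. This is a minimal illustration under the "physically represented" reading of a token (one presence bit per value); `struct token` and `add_fire` are hypothetical names, and the check that the output slot is free is an added assumption, not something the slide states.

```c
#include <assert.h>
#include <stdbool.h>

/* A token: a data value plus one bit saying whether the value is present. */
struct token {
    bool present;
    int value;
};

/* Dataflow add operator. It fires only when both input tokens are present
   and the output slot is free; firing consumes the inputs and marks the
   output present. Returns true if it fired. */
bool add_fire(struct token *a, struct token *b, struct token *out)
{
    assert(a && b && out);
    if (!a->present || !b->present || out->present)
        return false;            /* input set incomplete, or output not consumed */
    out->value = a->value + b->value;
    out->present = true;         /* signal availability of the output */
    a->present = false;          /* consume the input tokens */
    b->present = false;
    return true;
}
```

A scheduler can call `add_fire` repeatedly in any order relative to other operators; the presence bits, not the call order, decide when the addition actually happens.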

Dataflow Graph
• Represents
  – computation sub-blocks
  – linkage
• Abstractly
  – controlled by data presence

Dataflow Graph Example

Sequential / FSM
• FSM is degenerate dataflow graph where there is exactly one token

  cycle  mpy       add            next
  S1     B, x                     x-->S2, else S1
  S2     x, x      (Bx)+C         S3
  S3     A, x^2                   S4
  S4               Ax^2+(Bx+C)    S1

• x not present?
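
The four-state schedule can be written as a small state machine in C. This is a sketch, assuming the add of (Bx)+C is issued in S2 and that the caller holds x stable while the machine is in S1 and S2; the register names in `struct quad_fsm` are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

enum state { S1, S2, S3, S4 };

/* Datapath registers for the four-state schedule. */
struct quad_fsm {
    enum state st;
    int bx, x2, bxc, ax2, y;
    bool done;
};

/* Advance one cycle. Each state issues at most one multiply and one add;
   S1 repeats until x is present. */
void quad_step(struct quad_fsm *m, bool x_present, int x, int A, int B, int C)
{
    assert(m);
    m->done = false;
    switch (m->st) {
    case S1:
        if (!x_present)
            break;                  /* else S1 */
        m->bx = B * x;              /* mpy B, x */
        m->st = S2;
        break;
    case S2:
        m->x2 = x * x;              /* mpy x, x */
        m->bxc = m->bx + C;         /* add (Bx)+C */
        m->st = S3;
        break;
    case S3:
        m->ax2 = A * m->x2;         /* mpy A, x^2 */
        m->st = S4;
        break;
    case S4:
        m->y = m->ax2 + m->bxc;     /* add Ax^2+(Bx+C) */
        m->done = true;
        m->st = S1;
        break;
    }
}
```

The single `st` field plays the role of the one token: exactly one row of the schedule is active per cycle, and the "x not present" case is the machine waiting in S1.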

Communicating Threads
• Computation is a collection of sequential/control-flow "threads"
• Threads may communicate
  – Through dataflow I/O
  – (Through shared variables)
• View as hybrid or generalization
• CSP (Communicating Sequential Processes) is the canonical model example

[Figure: block diagram; labels: Video Decode, Audio, Sync to HDMI, Parse, Video]
• Why might we need to synchronize to send to HDMI?

Compute Models

System Architectures

System Architecture Hypothesis
• There are a small number of useful system architectures
• These architectures
  – Give guidance for organizing resources
  – Make manageable
  – Allow sharing lessons between applications
  – Provide basis for scalability
  – Point toward efficient solutions
FPT Tutorial: DeHon 2005

Unconstrained Model
• Multithreaded programming (equivalently Communicating Sequential Processes)
  – Application is collection of threads
  – Communicate with each other
  – May or may not have shared memory
  – Programmer responsible for
    • Synchronization
    • Parallelism
    • Data layout
    • Communications…

Architectural Restrictions
• Sequential Control
  – Data Parallel: all parallel processing does the same thing
  – Lock-Step: all parallel processing does different things at synchronized time (e.g. VLIW)
  – Bulk Synchronous: periodic barrier synchronization
  – Instruction Augmentation: control accelerators from seq. instruction stream

Very Long Instruction Word (VLIW)

Very Long Instruction Word (VLIW)

  cycle  mpy       add
  1      B, x
  2      x, x      (Bx)+C
  3      A, x^2
  4                Ax^2+(Bx+C)

Instruction Augmentation Co-Processor

Architectural Restrictions (2)
• Dataflow interactions
  – Allow multithreaded operation
  – Use data presence for synchronization
• E.g.
  – Pipe-and-filter / Streaming Dataflow
  – Synchronous Dataflow (SDF)

Producer-Consumer Parallelism
[Figure: producer "Stock predictions" feeding consumer "encrypt"]
• Can run concurrently
• Just let consumer know when producer sending data
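
A bounded FIFO is one common way to let the consumer know when the producer has sent data. The C sketch below is single-threaded and not thread-safe; it only shows the handshake. `struct fifo`, `fifo_push`, `fifo_pop`, and the capacity of 4 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_CAP 4

/* Bounded FIFO linking one producer to one consumer. count doubles as the
   presence indication: the consumer may run only when count > 0. */
struct fifo {
    int data[FIFO_CAP];
    size_t head, count;
};

/* Producer side: returns false (producer must stall) when the queue is full. */
bool fifo_push(struct fifo *q, int v)
{
    assert(q);
    if (q->count == FIFO_CAP)
        return false;
    q->data[(q->head + q->count) % FIFO_CAP] = v;
    q->count++;
    return true;
}

/* Consumer side: returns false (consumer must wait) when no data is present. */
bool fifo_pop(struct fifo *q, int *v)
{
    assert(q && v);
    if (q->count == 0)
        return false;
    *v = q->data[q->head];
    q->head = (q->head + 1) % FIFO_CAP;
    q->count--;
    return true;
}
```

With real concurrency the producer and consumer would each loop on their own side of the queue, and the two `false` returns become the points where one side waits for the other.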

Pipeline Parallelism
[Figure: pipeline stages ME, DCT, VQ, code]
• Can potentially all run in parallel
• Like physical pipeline
• Useful to think about stream of data between operators

Architectural Restrictions (3)
• Regular Communication Patterns
  – Systolic
  – Cellular Automata: regular grid of homogeneous FSMs
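
As a small illustration of "regular grid of homogeneous FSMs", the C sketch below updates a one-dimensional cellular automaton in which every cell applies the same rule to its two neighbors. The XOR rule (rule 90), the zero boundary, and the name `ca_step` are choices made here for the example only.

```c
#include <assert.h>
#include <stddef.h>

/* One synchronous update of a 1-D cellular automaton. Every cell is the same
   tiny FSM: its next state is the XOR of its two neighbors. Cells beyond the
   edges are treated as 0. */
void ca_step(const unsigned char *cur, unsigned char *next, size_t n)
{
    assert(cur != next);        /* double buffering: all cells read the old state */
    for (size_t i = 0; i < n; i++) {
        unsigned char left  = (i > 0)     ? cur[i - 1] : 0;
        unsigned char right = (i + 1 < n) ? cur[i + 1] : 0;
        next[i] = (unsigned char)(left ^ right);
    }
}
```

Because each cell reads only its immediate neighbors in the old buffer, all n updates in a step are independent and communication stays nearest-neighbor, which is what makes the pattern easy to map onto hardware.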

Architectural Restrictions (4)
• Memory/Data Centric
  – Computation is collection of objects in memory
  – Each object triggered by input changes
  – Compute and potentially trigger other objects
• E.g.
  – Repository models
  – GraphStep
  – App: network flow, routing…

Work Farm
• Central controller farms out work
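
A work farm can be simulated sequentially to see how a central controller balances load. In this C sketch the controller hands each task to whichever worker becomes free earliest; that dispatch policy, the `cost` array of task run times, and the name `farm_out` are assumptions for illustration, since the slide does not specify a policy.

```c
#include <assert.h>
#include <stddef.h>

/* Central controller for a work farm. cost[t] is the run time of task t and
   busy_until[w] is when worker w finishes its assigned work. Records the
   chosen worker for each task in assigned_to and returns the time at which
   the last worker finishes. */
unsigned farm_out(const unsigned *cost, size_t ntasks,
                  unsigned *busy_until, size_t nworkers, size_t *assigned_to)
{
    assert(nworkers > 0);
    for (size_t w = 0; w < nworkers; w++)
        busy_until[w] = 0;

    for (size_t t = 0; t < ntasks; t++) {
        size_t best = 0;                        /* earliest-free worker so far */
        for (size_t w = 1; w < nworkers; w++)
            if (busy_until[w] < busy_until[best])
                best = w;
        assigned_to[t] = best;
        busy_until[best] += cost[t];
    }

    unsigned finish = 0;
    for (size_t w = 0; w < nworkers; w++)
        if (busy_until[w] > finish)
            finish = busy_until[w];
    return finish;
}
```

Because assignment happens as workers free up, one long task does not hold back the short ones queued behind it; they flow to the other workers instead.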

System Architecture Taxonomy
• Further down the hierarchy
  – More restricted the model
  + More guidance provided
  + More efficient potential implementation
  + More amenable to analysis
    • tools and optimizations
• Restrictions provide power

Value of Multiple Architectures
• When you have a big enough hammer, everything looks like a nail.
• Many stuck on single model
  – Try to make all problems look like their nail
• Value to diversity / heterogeneity
  – One size does not fit all

System Architectures

Model / Architecture not 1:1

Big Ideas
• Many parallel compute models
  – Sequential, Dataflow, CSP
• Useful System Architectures
  – Streaming Dataflow, VLIW, co-processor, work farm, SIMD, Vector, CA, FSMD, …
• Find natural parallelism in problem
• Mix-and-match

Admin
• HW1 FAQ
  – roundup of problems and solutions
• Reading for Day 4 on web
• Talk on Thursday by Ed Lee (UCB)
  – 3 pm in Wu and Chen
• HW2 due Friday