Taxonomy

MIMD Multiprocessor (shared memory)
[Figure: processors P1, P2, …, Pn connected through an interconnection network (IN) to memory modules M1, M2, …, Mn]
(Tightly Coupled Architecture)

Shared Memory
  • Uniform Memory Access (UMA)
      • Tightly coupled system
  • Non-Uniform Memory Access (NUMA)
      • Loosely coupled system
      • Cedar from the University of Illinois
      • BBN Butterfly
  • Cache Only Memory Access (COMA)
      • Using global distributed caches
      • Kendall Square Research KSR-1

MIMD (cont.)
[Figure: global memory modules GM1, GM2, …, GMn attached to a global interconnection network (Global IN); each cluster groups processors P1, P2, …, Pn and cluster memories CM1, CM2, CM3 around its own network (CIN)]
(Loosely Coupled Architecture) - Cedar

MIMD (cont.)
[Figure: processor-memory pairs (P1, M1), (P2, M2), …, (Pn, Mn) attached to an interconnection network (IN)]
(Loosely Coupled Architecture) - BBN Butterfly

MIMD (cont.)
[Figure: processors P1, P2, …, Pn, each with a cache (C1, C2, …, Cn) and a directory (D1, D2, …, Dn), attached to an interconnection network (IN)]
(COMA Architecture)

MIMD (cont.)
  • Multicomputer (message passing)
[Figure: processors P1, P2, …, Pn, each with its own memory (M1, M2, …, Mn), connected by an interconnection network (IN)]

MIMD (cont.)
  • Data flow machine
      • An instruction is ready for execution when data for its operands have been made available
      • Purely self-contained
      • No program counter
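The firing rule above can be illustrated with a minimal sketch. The function name `run_dataflow` and the tiny three-instruction program are hypothetical and only show the idea: there is no program counter, and an instruction runs as soon as all of its operands exist.

```python
def run_dataflow(instructions, inputs):
    """Fire each instruction as soon as all of its operands are available.

    instructions: dict mapping result name -> (function, [operand names]).
    inputs: dict of values that are available from the start.
    Returns (values, firing_order).
    """
    values = dict(inputs)
    pending = dict(instructions)
    order = []
    progress = True
    while pending and progress:
        progress = False
        for name, (fn, operands) in list(pending.items()):
            # The only scheduling rule: are all operands present?
            if all(op in values for op in operands):
                values[name] = fn(*(values[op] for op in operands))
                order.append(name)
                del pending[name]
                progress = True
    return values, order


# "result" is listed first but fires last, because it waits for its operands.
program = {
    "result": (lambda s, d: s * d, ["total", "diff"]),
    "total": (lambda a, b: a + b, ["a", "b"]),
    "diff": (lambda a, b: a - b, ["a", "b"]),
}
values, order = run_dataflow(program, {"a": 5, "b": 3})
print(order, values["result"])
```

The order in which the instructions are written does not matter; only data availability decides when each one runs.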

SIMD
  • Array processor
      • Centralized control unit

MISD
  • Pipelined vector processor

MISD (cont.)
  • Systolic array

Hybrid Architecture
  • Combines features of different architectures to provide better performance for parallel computations.
  • Two types of parallelism
      • Control parallelism (MIMD)
      • Data parallelism (SIMD)

Special Purpose Devices
  • Artificial Neural Networks (ANN)
  • Fuzzy logic

Neural Networks (Definition)
  • A large number of processing elements (PEs)
  • Connected in parallel
  • Capable of learning
  • Adaptive to change
  • Able to cope with serious disruptions
Power of connectivity vs. power of processors

Fuzzy logic (Definition)
  • Approximate reasoning
  • Formal principles of reasoning

Interconnection Network (IN)
  • The measure of an IN is "how quickly it can deliver how much of what's needed to the right place, reliably and at good cost and value".

Performance Criteria for IN
  • Latency
      • Transit time for a single message
  • Bandwidth
      • How much message traffic the IN can handle, e.g., Mbytes/s
  • Connectivity
      • How many immediate neighbors each node has, and how often each neighbor can be reached
  • Hardware cost
      • What fraction of the total hardware cost the IN represents
      • E.g., wires, switches, connectors, arbitration logic, …

Performance Criteria for IN (cont.)
  • Reliability
      • Redundant paths
  • Functionality
      • Additional functions performed by the IN, such as message combining and fault tolerance
      • E.g., data routing, interrupt handling, request/message combining, coherence
  • Scalability
      • The ability to be expanded

Definitions
  • Node degree
      • The number of links (edges) connected to the node.
  • Diameter
      • The largest minimum distance between any pair of nodes. The minimum distance between a pair of nodes is the minimum number of communication links (hops) that data from one of the nodes must traverse in order to reach the other node.
  • Network size
      • The number of nodes in the IN.
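The definitions above can be checked mechanically on small networks. The following is a minimal sketch, assuming the network is connected and is stored as an adjacency-list dictionary; the helper names (`node_degrees`, `hop_distances`, `diameter`, `linear_array`, `ring`) are hypothetical.

```python
from collections import deque


def node_degrees(adj):
    """Degree of a node = number of links attached to it."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def hop_distances(adj, start):
    """Minimum hop count from start to every reachable node (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Largest minimum distance over all pairs of nodes (connected network assumed)."""
    return max(max(hop_distances(adj, s).values()) for s in adj)


def linear_array(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n):
    # Bidirectional ring; assumes n >= 3 so the two neighbors are distinct.
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


print(diameter(linear_array(8)), diameter(ring(8)), len(ring(8)))
```

Here `len(adj)` plays the role of the network size.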

Data Routing
  • Functions in data routing
      • Shifting
      • Rotation
      • Permutation (one-to-one)
      • Broadcast (one-to-all)
      • Multicast (many-to-many)
      • Personalized communication (one-to-many)
      • Shuffle / Exchange

Types of IN
  • Static networks
  • Dynamic networks

Static Networks
  • Shared bus
      • Degree = 1
      • Diameter = 1

Static Networks (cont.)
  • Linear array
      • Degree = 2
      • Diameter = n-1

Static Networks (cont.)
  • Ring
      • Degree = 2
      • Diameter:
          • unidirectional: n-1
          • bidirectional: Ceil((n-1)/2)

Static Networks (cont.)
  • Binary tree
      • Degree: leaf = 1, root = 2, others = 3
      • Diameter: 2(h-1), for a tree with h levels

Static Networks (cont.)
  • Fat tree
      • Degree and diameter are the same as for a binary tree.
      • Because traffic is heavier toward the root, the number of links increases gradually toward the root (e.g., CM-5).

Static Networks (cont.)
  • Star
      • Degree: central = n-1, others = 1
      • Diameter = 2

Static Networks (cont.)
  Source   Destination
  000      000
  001      010
  010      100
  111      111
  100      001
  101      011
  110      101
  011      110
Shuffle(s_{n-1} s_{n-2} ... s_0) = s_{n-2} s_{n-3} ... s_0 s_{n-1}
Exchange(s_{n-1} s_{n-2} ... s_1 s_0) = s_{n-1} s_{n-2} ... s_1 s_0', where s_0' is the complement of s_0
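The two functions can be sketched on n-bit integer addresses. This assumes the usual definitions: shuffle rotates the address left by one bit, and exchange complements the least significant bit. The names `shuffle` and `exchange` are chosen here for illustration.

```python
def shuffle(addr, n):
    """Perfect shuffle: rotate the n-bit address left by one position."""
    msb = (addr >> (n - 1)) & 1
    return ((addr << 1) & ((1 << n) - 1)) | msb


def exchange(addr):
    """Exchange: complement the least significant bit of the address."""
    return addr ^ 1


n = 3
for s in range(1 << n):
    print(f"{s:0{n}b} -> shuffle {shuffle(s, n):0{n}b}, exchange {exchange(s):0{n}b}")
```

For n = 3 the shuffle column reproduces the source/destination pairs listed above, for example 001 to 010 and 100 to 001.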

Shuffle-Exchange Network
  • For N = 8
  • Applications:
      • The shuffle-exchange network provides suitable interconnection patterns for implementing certain parallel algorithms, such as polynomial evaluation, Fast Fourier Transform (FFT), sorting, and matrix transposition.

Static Networks (cont.)
  • Mesh
      • Degree: corner = 2, sides = 3, middle = 4
      • Diameter = 2(n-1)

Mesh Routing Algorithm
  • A simple routing algorithm routes a packet from source S to destination D in a mesh with n^2 nodes.
      1. Compute the row distance R.
      2. Compute the column distance C.
      3. Add the values R and C to the packet header at the source node.
      4. Starting from the source, send the packet for R rows and then for C columns.
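The four steps can be sketched as follows. The expressions for R and C are not given in the text above, so this sketch makes two assumptions: nodes are numbered 0 to n^2-1 in row-major order, and R = floor(D/n) - floor(S/n), C = (D mod n) - (S mod n), with the sign giving the direction of travel. The function name `mesh_route` is hypothetical.

```python
def mesh_route(s, d, n):
    """Route from node s to node d in an n-by-n mesh: rows first, then columns.

    Assumes row-major node numbering starting at 0.
    Returns (R, C, path), where path lists every node visited.
    """
    r = d // n - s // n      # signed row distance (assumed formula)
    c = d % n - s % n        # signed column distance (assumed formula)

    path = [s]
    node = s
    row_step = n if r > 0 else -n       # moving one row changes the index by n
    for _ in range(abs(r)):
        node += row_step
        path.append(node)
    col_step = 1 if c > 0 else -1       # moving one column changes the index by 1
    for _ in range(abs(c)):
        node += col_step
        path.append(node)
    return r, c, path


print(mesh_route(6, 12, 4))
```

Under these assumptions, routing from node 6 to node 12 in a 4-by-4 mesh gives R = 2 and C = -2, so the packet moves two rows in one direction and then two columns in the other.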

Example (Mesh)
  • To route a packet from node 6 (i.e., S = 6) to node 12 (i.e., D = 12),
  • the packet goes through two paths, as shown in the figure.

Static Networks (cont.)
  • Illiac
      • Degree = 4
      • Diameter = n-1
[Figure: chordal ring]

Static Networks (cont.)
  • Torus
      • Degree = 4
      • Diameter = 2(Floor(n/2))

Static Networks (cont.)
  • Hypercube
      • Degree = n
      • Diameter = n
      • Address bits = n
      • Dimensions = n
      • Neighbors = n
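These properties follow from the addressing: two nodes are neighbors exactly when their n-bit addresses differ in one bit. A minimal sketch, with hypothetical function names, that lists neighbors and routes by correcting differing bits from least to most significant:

```python
def hypercube_neighbors(node, n):
    """The n neighbors of a node: flip each of its n address bits in turn."""
    return [node ^ (1 << bit) for bit in range(n)]


def hypercube_route(src, dst, n):
    """Walk from src to dst, fixing one differing address bit per hop."""
    path = [src]
    node = src
    for bit in range(n):
        if (node ^ dst) & (1 << bit):
            node ^= 1 << bit
            path.append(node)
    return path


print(hypercube_neighbors(0b0000, 4))
print(hypercube_route(0b0000, 0b1111, 4))
```

The number of hops equals the number of differing bits, which is at most n. That is why the diameter is n.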

Example
Embedding a 4-by-4 mesh in a 4-cube
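The figure for this example is not reproduced here, so the mapping below is an assumption: it uses the common reflected Gray code construction, in which row and column indices are each Gray-coded and concatenated into a 4-bit cube address. The names `gray`, `embed_mesh_in_cube` and `hamming` are hypothetical.

```python
def gray(i):
    """Reflected binary Gray code of i."""
    return i ^ (i >> 1)


def embed_mesh_in_cube(row_bits=2, col_bits=2):
    """Map mesh node (r, c) to a cube address: Gray(r) followed by Gray(c)."""
    rows, cols = 1 << row_bits, 1 << col_bits
    return {(r, c): (gray(r) << col_bits) | gray(c)
            for r in range(rows) for c in range(cols)}


def hamming(a, b):
    """Number of bit positions in which two addresses differ."""
    return bin(a ^ b).count("1")


mapping = embed_mesh_in_cube()
for r in range(4):
    print(" ".join(f"{mapping[(r, c)]:04b}" for c in range(4)))
```

Because consecutive Gray codes differ in one bit, every mesh link lands on a single hypercube link.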

Static Networks (cont.)
  • n-Mesh
      • Degree: corner = n, internal = 2n, n < others < 2n
      • Diameter = [expression missing from the source]

Static Networks (cont.)
  • k-ary n-cube
      • Degree: if k = 2 then degree = n; if k > 2 then degree = 2n
      • Diameter = [expression missing from the source]
[Figure: (a) 4-ary 2-cube network; (b) 3-ary 3-cube network]
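The degree rule can be checked by building small k-ary n-cubes and measuring them. This is a minimal sketch, assuming wraparound links in every dimension and nodes labeled by n-digit base-k tuples; `kary_ncube` and `diameter` are hypothetical helper names. The measured diameters can be compared with n * floor(k/2), the value usually quoted for this topology, which is not stated in the text above.

```python
from collections import deque
from itertools import product


def kary_ncube(k, n):
    """Adjacency lists for a k-ary n-cube with wraparound links in each dimension."""
    adj = {}
    for node in product(range(k), repeat=n):
        neighbors = set()
        for dim in range(n):
            for step in (-1, 1):
                other = list(node)
                other[dim] = (node[dim] + step) % k
                neighbors.add(tuple(other))   # for k = 2 both steps give the same node
        adj[node] = sorted(neighbors)
    return adj


def diameter(adj):
    """Largest breadth-first-search distance over all start nodes."""
    best = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


for k, n in [(2, 3), (4, 2), (3, 3)]:
    net = kary_ncube(k, n)
    degree = len(next(iter(net.values())))
    print(f"{k}-ary {n}-cube: nodes={len(net)}, degree={degree}, diameter={diameter(net)}")
```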

Cache Coherence
  • Multiprocessor environment
  • Cache dedicated to each processor
  • Cache coherence problem
      • How to keep multiple copies of the data consistent during execution?

Cache Coherence Mechanisms
  1. Hardware-based schemes
      • Snoopy cache protocols
          • If INs have broadcast features
      • Directory cache protocols
          • No broadcast features in INs
  2. Software-based schemes
  3. Combination

Cache Coherence Mechanisms (cont.)
  • Action taken on
      • Read miss
      • Write hit
      • Write miss

Snoopy Cache Protocol
A two-processor configuration with copies of data block x
  • write-through
  • write-back
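As an illustration of the two-processor configuration, here is a toy model of one possible policy: write-through combined with write-invalidate. The choice of write-invalidate, the class name `SnoopyBus` and its methods are assumptions made for this sketch. Write-back is not modeled.

```python
class SnoopyBus:
    """Toy write-through, write-invalidate protocol on a shared broadcast bus."""

    def __init__(self, num_caches, memory):
        self.memory = dict(memory)
        self.caches = [{} for _ in range(num_caches)]

    def read(self, cpu, block):
        cache = self.caches[cpu]
        if block not in cache:                 # read miss: fetch from memory
            cache[block] = self.memory[block]
        return cache[block]

    def write(self, cpu, block, value):
        self.caches[cpu][block] = value        # update the writer's own copy
        self.memory[block] = value             # write-through: memory stays current
        for other, cache in enumerate(self.caches):
            if other != cpu:                   # other caches snoop the bus write
                cache.pop(block, None)         # and invalidate their stale copy


bus = SnoopyBus(num_caches=2, memory={"x": 1})
bus.read(0, "x")
bus.read(1, "x")          # both caches now hold a copy of block x
bus.write(0, "x", 2)      # processor 0 writes; cache 1's copy is invalidated
print(bus.read(1, "x"))   # read miss in cache 1 fetches the new value
```

Because every write is broadcast, no cache can keep serving a stale value. This relies on the IN having broadcast capability.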

Centralized Directory Protocols
  • Full-map protocol directory
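A full-map directory is commonly described as keeping, for each memory block, one presence bit per processor plus a dirty bit. The sketch below models only that bookkeeping, under this assumption. The class `FullMapDirectory` and its methods are hypothetical, and data transfer and write-back are left out.

```python
class FullMapDirectory:
    """Toy full-map directory: one presence bit per processor, one dirty bit per block."""

    def __init__(self, num_procs):
        self.num_procs = num_procs
        self.entries = {}   # block -> {"present": [bool, ...], "dirty": bool}

    def _entry(self, block):
        return self.entries.setdefault(
            block, {"present": [False] * self.num_procs, "dirty": False})

    def read_miss(self, proc, block):
        entry = self._entry(block)
        entry["dirty"] = False            # assumes any dirty copy is written back first
        entry["present"][proc] = True     # record the new sharer

    def write(self, proc, block):
        """Return the processors whose copies must be invalidated."""
        entry = self._entry(block)
        invalidated = [p for p, bit in enumerate(entry["present"])
                       if bit and p != proc]
        entry["present"] = [p == proc for p in range(self.num_procs)]
        entry["dirty"] = True             # the writer now holds the only valid copy
        return invalidated


directory = FullMapDirectory(num_procs=4)
directory.read_miss(0, "x")
directory.read_miss(2, "x")
print(directory.write(1, "x"))
```

The directory sends invalidations only to the processors whose presence bits are set, so it does not depend on broadcast in the IN.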

Scalable Cache Coherency

Classification of Dynamic Networks

Dynamic Networks (Crossbar)

Dynamic Networks (Single-Stage)
In a single-stage network, any permutation can be reached in at most 3(log_2 N) - 1 passes.

Multi Stages - Blocking
  • Examples: multistage cube, Omega
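Blocking can be seen in a small model of an Omega network. The sketch assumes the usual construction: log_2 N stages of 2-by-2 switches, a perfect shuffle in front of each stage, and destination-tag routing in which stage i uses bit i of the destination address, most significant bit first. The function names `omega_path` and `blocks` are hypothetical.

```python
def omega_path(src, dst, n):
    """Line occupied after each of the n stages when routing src -> dst (N = 2**n)."""
    mask = (1 << n) - 1
    pos = src
    path = []
    for stage in range(n):
        pos = ((pos << 1) | (pos >> (n - 1))) & mask   # perfect shuffle of the line number
        bit = (dst >> (n - 1 - stage)) & 1             # destination tag bit, MSB first
        pos = (pos & ~1) | bit                         # take upper (0) or lower (1) switch output
        path.append(pos)
    return path


def blocks(pairs, n):
    """True if two of the requested connections need the same switch output."""
    paths = [omega_path(s, d, n) for s, d in pairs]
    for stage in range(n):
        outputs = [p[stage] for p in paths]
        if len(set(outputs)) < len(outputs):
            return True
    return False


print(omega_path(5, 2, 3))
print(blocks([(i, i) for i in range(8)], 3))   # identity permutation
print(blocks([(0, 0), (4, 1)], 3))             # two requests sharing a first-stage output
```

Inputs 0 and 4 enter the same first-stage switch. If both destinations start with bit 0, both requests ask for its upper output and one of them must wait.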

Multi Stages - Nonblocking
  • Example: three-stage Clos

Dynamic Networks (Clos)

Multi Stages - Rearrangeable
  • Example: 8-to-8 (Benes)

Interconnection Design Decisions
  • Considerations in selecting the architecture of an interconnection network
      • Operation mode
      • Control strategy
      • Network topology
      • Switching methodology
      • Functional characteristics of the switch

Interconnection Design Decisions
  • Operation mode:
      • Synchronous
      • Asynchronous
      • Combined
  • Control strategy:
      • Centralized control
      • Distributed control
  • Switching methodology:
      • Circuit switching
      • Packet switching
      • Integrated switching