Introduction to Scalable Interconnection Networks COE 502 – Parallel Processing Architectures Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals
Scalable Interconnection Network
- At the core of parallel computer architecture
- Transfers data from any source to any destination
- Composed of links and switches
  - Elegant mathematical structure (highly regular)
  - Electrical / optical link properties
  - Managing many traffic flows
- Performance goals
  - Bandwidth: as many concurrent transfers as possible
  - Latency: as small as possible
  - Cost: as low as possible
[Figure: nodes (P, Mem, CA) attached through network interfaces to the scalable interconnection network]
Interconnection Networks COE 502 KFUPM, Muhamed Mudawar
Formalism
- An interconnection network is a graph
- Vertices V = {nodes, switches}
- Connected by communication channels C ⊆ V × V
- A channel is a physical link
  - Includes buffers to hold data as it is being transferred
  - Phit (physical unit): amount of data transferred per cycle
  - t is the channel cycle: time to transmit one phit
  - Channel has signaling rate f = 1/t
  - Channel has width w and bandwidth b = w × f
- Switch degree: number of input (output) channels
- Path or route: sequence of switches and links
  - Followed by a message from its source to its destination
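The relation b = w × f with f = 1/t can be made concrete with a short sketch. The function name and the 16-bit, 10 ns channel below are illustrative assumptions, not values from the slides.

```python
def channel_bandwidth(width_bits: int, cycle_time_ns: float) -> float:
    """Return channel bandwidth b = w * f in bits per second, with f = 1/t."""
    signaling_rate_hz = 1.0 / (cycle_time_ns * 1e-9)  # f = 1/t
    return width_bits * signaling_rate_hz             # b = w * f


# Hypothetical channel: w = 16 bits (one phit per cycle), t = 10 ns.
b = channel_bandwidth(16, 10.0)
print(f"{b / 8 / 1e6:.0f} MB/s")
```

Doubling the width or halving the cycle time doubles the bandwidth, which is why the later slides treat w and t as the two knobs of a physical channel.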
Network Characterization
- Topology (what structure)
  - Physical interconnection structure of the network graph
  - Direct: a switch is associated with each node
  - Indirect: can have extra switches not connected to nodes
  - Regular versus irregular
  - Most parallel machines employ highly regular topologies
- Routing algorithm (which routes)
  - Restricts the set of paths that messages may follow between pairs of source and destination nodes
  - Deterministic versus adaptive: one or multiple routes for each source/destination pair
  - Many algorithms with different properties
Network Characterization (2)
- Switching strategy (how)
  - How data in a message traverses a route
  - Circuit switching versus packet switching
    - In circuit switching, a path is established and reserved until the message has traversed the circuit
    - In packet switching, a message is broken into packets that contain routing/sequencing information and data
- Flow control mechanism (when)
  - When a message or portions of it traverse a route
  - What happens when messages compete for a channel?
    - Blocked in place, buffered, detoured, or dropped
  - Flow control unit (flit): unit of transfer across a link
    - Can be as small as a phit or as large as a packet
Typical Packet Format
[Figure: packet made of header, data payload, and trailer, subdivided into flits and phits]
- Header: front end of the packet
  - Routing and control info
  - Used by switches to route the packet in the network
- Data payload: data transmitted across the network
- Trailer: end of the packet
  - Typically contains error-checking code
- A packet is further divided into flits and phits
- Example: Cray T3E
  - Packet is 1 to 10 flits, and each flit is 5 phits
  - Flit size = 70 bits = 64-bit data + 6-bit control
Basic Switch Organization
- A switch consists of:
  - Set of input ports and output ports
  - Internal crossbar connecting each input to every output
  - Internal buffering
  - Control logic for routing and scheduling
Switch Components
- Output ports
  - Transmitter: typically drives clock and data
- Input ports
  - Receiver aligns the data signal with the local clock
  - Essentially a FIFO buffer
- Buffering at input and/or output ports
- Crossbar
  - Connects each input to any output
  - Switch degree is limited by the number of I/O pins
- Control logic
  - Complexity depends on the routing and scheduling algorithm
  - Determines the output port for each incoming packet
  - Arbitrates among inputs directed to the same output
Physical Channel Flow Control
- Asynchronous physical channel flow control
  [Figure: two buffers joined by a physical channel carrying Req, Ack, and Data signals]
- Synchronous full-duplex channel flow control
  [Figure: two pairs of buffers exchanging Clock and Data/Cmd signals, one pair per direction]
  - Command is used for buffer management and flow control
Topological Properties
- Routing distance: number of links on the route between a pair of nodes
- Network diameter: maximum shortest path between any two nodes
- Average distance: average of the routing distance over all pairs of nodes
- Channel bisection width: minimum number of channels cut when the network is cut into two equal halves
- Wire bisection width
  - Channel bisection width × channel width
  - Reflects the wiring density of the network
Interconnection Topologies
- Each topology is a class of networks
  - Scaling with the number of nodes N
- Completely connected network
  - Each node has a switch
  - Directly connected to all other nodes
  - Node degree = N - 1
  - Diameter = 1 link
  - Links = N (N - 1) / 2
  - Bisection width = $(N/2)^2$
    - Each of the N/2 nodes in the first half is connected to all N/2 nodes in the second half
Linear Array
- Switch associated with each node
- Connected by bidirectional links
- Number of links = N - 1
- Diameter = N - 1
- Average distance = (N + 1) / 3
- Node degree = 2
- Bisection width = 1 link
  - Removal of a single link partitions the network
- One route between a pair of nodes
  - Route A → B is given by the relative address R = B - A
Ring
- Symmetric, number of links = N
- Bidirectional links
  - Diameter = N / 2
  - Node degree = 2
  - Average distance = $N^2 / (4(N - 1))$
  - Bisection width = 2 links
  - Two routes between a pair of nodes
- Unidirectional links
  - Diameter = N - 1
  - Node degree = 1
  - Average distance = N / 2
  - Bisection width = 1 link
  - One route between a pair of nodes
[Figure: ring with a switch associated with each node, bidirectional links]
[Figure: ring arranged to use short wires, unidirectional links]
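The diameter and average-distance figures for the linear array and the bidirectional ring can be checked by brute force. The sketch below is an illustration with hypothetical helper names; it averages over all ordered pairs of distinct nodes and assumes a connected network (the ring formula is checked for even N).

```python
from collections import deque


def distances_from(adj, src):
    """Breadth-first search: hop count from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_average(adj):
    """Return (diameter, average distance over ordered pairs of distinct nodes)."""
    n = len(adj)
    longest, total = 0, 0
    for src in adj:
        dist = distances_from(adj, src)
        longest = max(longest, max(dist.values()))
        total += sum(dist.values())
    return longest, total / (n * (n - 1))


def linear_array(n):
    """Adjacency lists of an n-node linear array with bidirectional links."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n):
    """Adjacency lists of an n-node ring with bidirectional links."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


N = 8
print(diameter_and_average(linear_array(N)), (N - 1, (N + 1) / 3))
print(diameter_and_average(ring(N)), (N // 2, N * N / (4 * (N - 1))))
```

Each printed line shows the measured pair next to the closed-form pair, so a mismatch would be visible at a glance.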
Multidimensional Meshes
- d-dimensional array
  - $N = k_0 \times \dots \times k_{d-1}$ nodes
  - $k_i$ nodes in dimension i
  - Node degree is between d and 2d
  - Each node is identified by a d-vector of coordinates $(x_0, \dots, x_{d-1})$
    - Where $0 \le x_i \le k_i - 1$ for $0 \le i \le d - 1$
- If the number of nodes is the same (k) in all dimensions
  - Then it is a d-dimensional k-ary mesh
  - $N = k^d$
  - Network diameter = $d(k - 1)$
  - Bisection width = $k^{d-1}$
Multidimensional Tori
- Symmetric, with wrap-around edges
- Node degree = 2d
- $N = k_0 \times \dots \times k_{d-1}$ nodes
- $k_i$ nodes in dimension i
  - Each node is identified by a d-vector of coordinates $(x_0, \dots, x_{d-1})$
    - Where $0 \le x_i \le k_i - 1$ for $0 \le i \le d - 1$
- If the number of nodes is the same (k) in all dimensions
  - Then it is a d-dimensional k-ary torus
  - $N = k^d$
  - Network diameter = $d \lfloor k/2 \rfloor$
  - Number of links = $dN$
  - Bisection width = $2k^{d-1}$
Hypercube
- Special case of the d-dimensional k-ary mesh with k = 2
- Also called a d-cube
- d dimensions
- Two nodes along each dimension
- Node degree = d
- $N = 2^d$ nodes
- Network diameter = d
- Number of links = dN / 2
- Bisection width = N / 2
[Figure: 4-cube]
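The diameters quoted for k-ary d-dimensional meshes, tori, and the hypercube can be checked by enumerating coordinate vectors. This is a sketch with hypothetical function names; it relies on the fact that the shortest distance in a mesh is the sum of per-dimension offsets, and in a torus the sum of per-dimension offsets taken the short way around.

```python
from itertools import product


def mesh_distance(a, b):
    """Shortest distance in a mesh: sum of per-dimension offsets."""
    return sum(abs(x - y) for x, y in zip(a, b))


def torus_distance(a, b, k):
    """Shortest distance in a k-ary torus: each offset may wrap around."""
    return sum(min(abs(x - y), k - abs(x - y)) for x, y in zip(a, b))


def diameter(k, d, dist):
    """Largest distance between any two nodes of a k-ary d-dimensional network."""
    nodes = list(product(range(k), repeat=d))
    return max(dist(a, b) for a in nodes for b in nodes)


k, d = 4, 3
print(diameter(k, d, mesh_distance), d * (k - 1))                         # mesh
print(diameter(k, d, lambda a, b: torus_distance(a, b, k)), d * (k // 2))  # torus
print(diameter(2, 4, mesh_distance), 4)                                    # 4-cube as a 2-ary 4-mesh
```

Treating the 4-cube as a 2-ary 4-dimensional mesh reproduces its diameter of d = 4, which is the special-case relationship stated on the hypercube slide.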
Latency
- Time to transfer n bytes from source to destination
  - Latency = Overhead + Unloaded Network Latency + Contention Delay
- Overhead
  - Time to get a message into and out of the network
  - Node-to-network interface
- Unloaded network latency
  - Time to transfer a packet through the network
  - Assuming no contention in the network
  - Further divided into: channel occupancy + routing delay
- Contention delay
  - Contention adds queuing delays (waiting time in buffers)
Store-and-Forward Routing
- The entire packet is received at a switch and then forwarded on the next link along the path
- Packet = n bytes
- Routing distance = h
- Additional switch delay = Δ
- Link bandwidth = b bytes/sec
- Unloaded network latency: $T_{SF}(n, h) = h\,(n/b + \Delta)$
[Figure: a 4-flit packet traverses 3 hops from source to destination]
Cut-through Routing
- Transmission of a single packet is pipelined
- A switch makes its decision after examining the header flit
  - Advances the header before receiving the remaining flits
- The header establishes the route from source to destination
  - A single packet may occupy the entire route
  - The tail (last) flit clears the route as it moves through
- Packet = n bytes
- Routing distance = h
- Routing delay per hop = Δ
- Link bandwidth = b bytes/s
- Unloaded network latency: $T_{CT}(n, h) = n/b + h\Delta$
[Figure: a 4-flit packet pipelined across the hops from source to destination]
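The two unloaded-latency formulas, h (n/b + Δ) for store-and-forward and n/b + hΔ for cut-through, can be compared numerically. The function names and the packet size, bandwidth, and delay values below are illustrative assumptions.

```python
def t_store_and_forward(n, h, b, delta):
    """Unloaded latency when every switch receives the whole packet first."""
    return h * (n / b + delta)


def t_cut_through(n, h, b, delta):
    """Unloaded latency when the packet is pipelined behind its header flit."""
    return n / b + h * delta


# Hypothetical values: 1024-byte packet, 1024 bytes/us links, 0.25 us per hop.
n, b, delta = 1024, 1024.0, 0.25
for h in (1, 4, 16):
    print(h, t_store_and_forward(n, h, b, delta), t_cut_through(n, h, b, delta))
```

With a single hop the two schemes agree; as h grows, the store-and-forward term n/b is paid once per hop while cut-through pays it only once.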
Channel Occupancy
- Time for a packet to cross a channel
- Channel occupancy = $n/b = (n_D + n_E)/b$
  - Packet = n bytes = $n_D + n_E$ (data + envelope)
  - The packet envelope includes the header and trailer flits
    - Typically discarded when a packet reaches its destination
    - Counted as overhead (routing info, error codes, etc.)
  - Packet efficiency = $n_D / (n_D + n_E)$
  - Channel bandwidth b = w f = w / t
- Channel occupancy for store-and-forward = h × n / b
  - Not overlapped along the route, so multiplied by the distance h
- Channel occupancy for cut-through routing = n / b
  - Overlapped in time, so it does not depend on the distance h
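Packet efficiency and channel occupancy follow directly from the split of a packet into data and envelope bytes. The sketch below uses hypothetical names and sizes to show how the envelope costs both bandwidth and time.

```python
def packet_efficiency(n_data, n_envelope):
    """Fraction of the transmitted bytes that is useful data."""
    return n_data / (n_data + n_envelope)


def channel_occupancy(n_data, n_envelope, b):
    """Time for one packet (data + envelope) to cross a channel of bandwidth b."""
    return (n_data + n_envelope) / b


# Hypothetical packet: 64 data bytes, 16 envelope bytes, b = 80 bytes/us.
print(packet_efficiency(64, 16))      # 0.8
print(channel_occupancy(64, 16, 80))  # 1.0 us
```

Under store-and-forward this occupancy would be multiplied by the routing distance h; under cut-through it is paid once.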
Routing Delay
- Time to route the header flit from source to destination
- Is a function of
  - Routing distance h and
  - Routing delay Δ incurred at each hop along the path
- Routing delay = h Δ
  - For both store-and-forward and cut-through routing
- Δ is the routing delay per hop, which includes
  - Routing logic delay to determine the output port for a header flit
  - Crossbar delay to advance the header flit from input to output
  - Once a path has been established for a header flit, all remaining flits simply follow with no additional delay
Real Machines
Columns: Machine, Topology, Cycle Time (ns), Channel Width (bits), Routing Delay (cycles), Flit (bits)
- Machine: nCube/2, TMC CM-5, IBM SP-2, Intel Paragon, Meiko CS-2, CRAY T3D, DASH, J-Machine, Monsoon, SGI Origin, Myricom
- Topology: Hypercube, Fat-Tree, Banyan, 2D Mesh, Fat-Tree, 3D Torus, 3D Mesh, Butterfly, Hypercube, Arbitrary
- Cycle Time (ns): 25, 25, 25, 11.5, 20, 6.67, 30, 31, 20, 2.5, 6.25
- Channel Width (bits): 1, 4, 8, 16, 16, 8, 16, 20, 16
- Routing Delay (cycles): 40, 10, 5, 2, 7, 2, 2, 16, 50
- Flit (bits): 32, 4, 16, 16, 8, 16, 160, 16
Contention
- Two packets trying to use the same link at the same time
  - Depends on topology, destination, and routing algorithm
- Contention adds queuing delay to the basic routing delay
- Mechanisms for dealing with contention
  - Means of buffering
    - Buffer the entire packet
    - Buffer a few flits of a packet
  - What happens when the buffer is full?
    - Discard the packet
    - Back pressure toward the source
  - Means of arbitration for the output channels
Mechanisms for Contention
- Store-and-forward
  - The entire packet is blocked in a buffer until the arbiter selects it
  - What happens to incoming packets when the buffer is full?
    - Handshake between output and input port across a link
    - A packet heading to a full buffer is blocked in place
    - Discarded in traditional networks because of long links
- Cut-through: two mechanisms exist for contention
  - Virtual cut-through
    - Buffer space is large enough to store the entire blocked packet
    - Frees previous buffers along the route
  - Wormhole
    - Buffer space can hold one or a few flits of a packet
    - The packet is blocked in all buffers along its route
  - Eventually the source experiences back pressure
Routing
- The routing algorithm determines
  - Which of the possible paths are used as routes
  - The routing algorithm is a function R : V × V → C
  - At each switch V, the routing function maps
    - Destination node V to the next channel C on the route
- Routing mechanisms
  - Simple arithmetic: minimal computation in a few cycles
    - Works in most regular topologies
  - Source-based routing
    - The source builds a header consisting of the output port numbers
    - Each switch simply removes one port number from the header flit
  - Routing table R
    - The header contains a routing field i
    - The table entry gives the output port o and the routing field j for the next step: (o, j) = R[i]
Routing Mechanisms (cont'd)
- Source-based
  - Routing algorithm is applied at the source node, not in the switches
  - Source node computes a series of output port selects
  - Ports are carried in the message header (P3 P2 P1 P0)
  - Used by switches and stripped en route
  - Very simple switch design, but the header tends to be large
  - Examples: CS-2, Myrinet, MIT Arctic
- Table-driven
  - Message header carries the routing index for the next switch
  - Routing table is indexed to obtain the output port and the next index
    - (o, j) = R[i], where o = output port and j = next index
  - Example: ATM (not common in interconnection networks)
  - Fairly large tables even for simple routing algorithms
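The source-based and table-driven mechanisms can be contrasted with a small sketch. The function names, port numbers, and table contents are hypothetical; only the two rules themselves (strip one port select per switch, and (o, j) = R[i]) come from the slides.

```python
def source_route(header_ports):
    """Source-based: each switch reads and strips the leading port select.

    Returns, per hop, the output port used and the header that travels on.
    """
    header = list(header_ports)
    hops = []
    while header:
        port = header.pop(0)               # this switch's port select
        hops.append((port, tuple(header)))  # remaining header after stripping
    return hops


def table_route(tables, first_index):
    """Table-driven: at each switch, (o, j) = R[i]."""
    i = first_index
    ports = []
    for R in tables:   # one routing table per switch along the path
        o, i = R[i]    # output port and routing index for the next switch
        ports.append(o)
    return ports


# Hypothetical 3-hop route using output ports 2, 0, 3.
print(source_route([2, 0, 3]))
print(table_route([{5: (2, 7)}, {7: (0, 1)}, {1: (3, None)}], 5))
```

The source-based header shrinks by one entry per hop, while the table-driven header stays one index wide at the cost of a table in every switch.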
Deterministic Routing
- Unique path between every source and destination
- Dimension-Order Routing (DOR) in a 2D mesh
  - Each packet carries a signed distance [Δx, Δy] in its header
  - Route along the X dimension first, then along the Y dimension

    Condition         Direction (output port)   Action
    Δx < 0            West (-X)                 Increment Δx
    Δx > 0            East (+X)                 Decrement Δx
    Δx = 0, Δy < 0    South (-Y)                Increment Δy
    Δx = 0, Δy > 0    North (+Y)                Decrement Δy
    Δx = 0, Δy = 0    Processor

  - Can be generalized to k-ary d-dimensional meshes and tori
- Similar e-cube routing in a d-dimensional hypercube
  - One routing bit per dimension
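The dimension-order rules for a 2D mesh translate almost line for line into code. This is a minimal sketch with a hypothetical function name; it follows a single packet and lists the output port chosen at each switch until the packet is delivered to the processor.

```python
def dor_route(dx, dy):
    """Output ports taken by a packet whose header holds the signed offset [dx, dy]."""
    ports = []
    while True:
        if dx < 0:
            ports.append("West")    # -X, increment dx
            dx += 1
        elif dx > 0:
            ports.append("East")    # +X, decrement dx
            dx -= 1
        elif dy < 0:
            ports.append("South")   # -Y, increment dy
            dy += 1
        elif dy > 0:
            ports.append("North")   # +Y, decrement dy
            dy -= 1
        else:
            ports.append("Processor")  # both offsets are zero: deliver
            return ports


print(dor_route(2, -1))  # ['East', 'East', 'South', 'Processor']
```

Because the Y offset is only examined once the X offset reaches zero, every packet turns at most once, from X to Y.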
DOR and E-Cube Routing Examples
[Figure: examples of Dimension-Order Routing (DOR)]
[Figure: examples of e-cube routing]
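E-cube routing in a d-dimensional hypercube can be sketched with one routing bit per dimension, obtained as the XOR of the source and destination addresses. The function name is hypothetical, and correcting the bits from dimension 0 upward is an assumption of this sketch; any fixed dimension order gives a deterministic route.

```python
def ecube_route(src, dst, d):
    """Nodes visited from src to dst in a d-cube, fixing bits in ascending dimension order."""
    path = [src]
    node = src
    diff = src ^ dst               # routing bits: 1 where the addresses differ
    for dim in range(d):
        if diff & (1 << dim):
            node ^= 1 << dim       # cross the link in this dimension
            path.append(node)
    return path


# 4-cube example: 0101 -> 1110 differs in dimensions 0, 1, and 3.
print([format(v, "04b") for v in ecube_route(0b0101, 0b1110, 4)])
# ['0101', '0100', '0110', '1110']
```

The number of hops equals the number of 1 bits in the XOR, which never exceeds the hypercube diameter d.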
Adaptive Routing
- Multiple paths may exist between source and destination
- The routing algorithm determines multiple output ports
  - For an incoming packet, based on its destination address
- A selection function is used to select an output port
  - Based on traffic and contention at the output ports
- Minimal adaptive routing
  - Minimal paths are chosen between source and destination
[Figure: example showing 5 minimal paths between 2 nodes]
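Minimal adaptive routing in a 2D mesh can be sketched as two steps: the routing algorithm lists the output ports that move the packet closer to its destination, and a selection function picks one of them. The names below are hypothetical, and choosing the port with the shortest queue is only one possible selection policy, assumed here for illustration.

```python
from math import comb


def productive_ports(dx, dy):
    """Output ports that reduce the remaining signed offset [dx, dy]."""
    ports = []
    if dx > 0:
        ports.append("East")
    if dx < 0:
        ports.append("West")
    if dy > 0:
        ports.append("North")
    if dy < 0:
        ports.append("South")
    return ports


def select_port(candidates, queue_length):
    """Selection function (assumed policy): least-loaded candidate port."""
    return min(candidates, key=lambda p: queue_length[p])


def minimal_path_count(dx, dy):
    """Number of distinct minimal paths for offset [dx, dy] in a 2D mesh."""
    return comb(abs(dx) + abs(dy), abs(dx))


candidates = productive_ports(2, 1)
print(candidates)                                       # ['East', 'North']
print(select_port(candidates, {"East": 3, "North": 1}))  # North
print(minimal_path_count(2, 1))                          # 3
```

Restricting the candidates to productive ports keeps every route minimal while still letting the switch steer around a congested output.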
Deadlock
- How can it arise?
  - Necessary conditions:
    - Shared resources: channels and buffers
    - Incrementally allocated: when the header flit arrives
    - No preemption: resources remain allocated until the last flit
    - Cyclic dependencies: messages are waiting on each other in a cyclic manner
- How to prevent deadlock?
  - Break cyclic dependencies by constraining resource allocation
[Figure: 4 messages waiting on each other in a cyclic manner]
Deadlock-Free Routing
- Deadlocks are a disaster for a parallel machine
  - Once a deadlock happens, no progress can take place
  - Until the machine is restarted and buffers are reset and cleared
- Packets introduce dependences between channels
  - As they move forward between source and destination
- Channel dependence graph
  - Describes dependences between channels for a given topology and routing algorithm
  - Has a node for every unidirectional link in the network
  - Has an arc from node a to node b if it is possible for a packet to traverse from channel a to channel b
  - No cycles in the graph ⇒ deadlock-free routing
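Checking a channel dependence graph for cycles is a standard depth-first search. The sketch below uses a hypothetical representation: a dictionary that maps each channel to the channels a packet may traverse to next.

```python
def has_cycle(dependences):
    """Return True if the channel dependence graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    color = {}

    def visit(channel):
        color[channel] = GREY
        for nxt in dependences.get(channel, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:      # back edge: nxt is already on the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[channel] = BLACK
        return False

    return any(color.get(c, WHITE) == WHITE and visit(c) for c in list(dependences))


# Four channels of a unidirectional ring: each waits on the next one.
print(has_cycle({0: [1], 1: [2], 2: [3], 3: [0]}))  # True
# The same channels with the wrap-around dependence removed.
print(has_cycle({0: [1], 1: [2], 2: [3], 3: []}))   # False
```

The first graph mirrors the four-message cyclic wait described under the deadlock conditions; removing a single dependence is enough to make it acyclic.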
DOR in 2D Mesh
- To prove: DOR in a 2D mesh is deadlock free
- Assign channel numbers
  - Such that every legal route follows an ordered sequence
  - Either monotonically increasing or decreasing
- In this example, k = 4 and N = 16
  - Channel numbering
    - +X : (x, y) → (x + 1, y) gets 2 k y + x
    - -X : (x, y) → (x - 1, y) gets 2 k (y + 1) - x
    - +Y : (x, y) → (x, y + 1) gets 2 (N + k x) + y
    - -Y : (x, y) → (x, y - 1) gets 2 (N + k x + k) - y
  - Any routing sequence (X, turn, Y) is always increasing
[Figure: 4 × 4 mesh with +X and +Y axes, nodes labeled (x, y), and each channel labeled with its channel number]
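The claim that every X-then-Y route uses strictly increasing channel numbers can be checked exhaustively for the k = 4 example. The sketch below uses hypothetical function names and applies the four numbering rules exactly as stated above.

```python
def channel_number(x, y, direction, k):
    """Number of the channel leaving node (x, y) in the given direction."""
    n = k * k
    if direction == "+X":
        return 2 * k * y + x
    if direction == "-X":
        return 2 * k * (y + 1) - x
    if direction == "+Y":
        return 2 * (n + k * x) + y
    if direction == "-Y":
        return 2 * (n + k * x + k) - y
    raise ValueError(direction)


def dor_channels(src, dst, k):
    """Channel numbers used by the X-then-Y route from src to dst."""
    (x, y), (tx, ty) = src, dst
    used = []
    while x != tx:
        step = 1 if tx > x else -1
        used.append(channel_number(x, y, "+X" if step > 0 else "-X", k))
        x += step
    while y != ty:
        step = 1 if ty > y else -1
        used.append(channel_number(x, y, "+Y" if step > 0 else "-Y", k))
        y += step
    return used


def all_routes_increasing(k):
    """True if every DOR route in the k x k mesh uses strictly increasing numbers."""
    nodes = [(x, y) for x in range(k) for y in range(k)]
    for s in nodes:
        for t in nodes:
            chans = dor_channels(s, t, k)
            if any(a >= b for a, b in zip(chans, chans[1:])):
                return False
    return True


print(dor_channels((3, 0), (0, 3), 4))  # [5, 6, 7, 32, 33, 34]
print(all_routes_increasing(4))         # True
```

Since a packet only ever waits for a channel with a higher number than the one it holds, no cycle of waiting packets can form.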
Channel Dependence Graph
- The channel dependence graph shows all possible routes for DOR in a 2D mesh network
- No cycles ⇒ DOR in a 2D mesh is deadlock free
[Figure: the numbered 4 × 4 mesh next to its channel dependence graph, whose vertices are the channel numbers]