Interconnect Networks Basics
Generic parallel/distributed system architecture • On-chip interconnects (manycore processor) • Off-chip interconnects (clusters of servers)
Interconnection network performance • Latency: how much time elapses between the moment a send of 1 byte is issued and the moment the receive of the data completes? – Signal propagation delay + router queuing delay • Bandwidth: how fast can a large amount of data (e.g., 1 MB) be sent? • Examples: – Ethernet: • Bandwidth: 100 Mbps, 1 Gbps, 100 Gbps • Latency: 25 us - 100 us (user level, single hop; try ping between linprog’s) – InfiniBand: • Bandwidth: 20 Gbps, 40 Gbps, 54 Gbps, 80 Gbps, … • Latency: 1 - 3 us (user level, single hop)
Interconnection network performance • Latency and bandwidth – Different levels • User level: the performance that users see • System level, device level • Which level has the highest bandwidth? – Example: 1 Gbps Ethernet delivers 800 Mbps at the system level and 650 Mbps at the user level. • 1 Gbps Ethernet: which level? • 0.115 ms ping latency: which level? – A measurement trap: single pair vs. multiple pairs.
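The two metrics above are often combined into a first-order cost model: transfer time = latency + size / bandwidth. The sketch below is only an illustration under stated assumptions: the function name `transfer_time` is hypothetical, 1 MB is taken as 10^6 bytes, and the 50 us Ethernet and 2 us InfiniBand latencies are picked from within the ranges quoted on the slide.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """First-order model: fixed latency plus serialization time (size / bandwidth)."""
    return latency_s + size_bytes * 8 / bandwidth_bps


if __name__ == "__main__":
    MB = 10**6  # assumption: 1 MB = 10^6 bytes

    # Assumed sample points taken from the ranges on the slide.
    networks = {
        "1 Gbps Ethernet (50 us)": (50e-6, 1e9),
        "40 Gbps InfiniBand (2 us)": (2e-6, 40e9),
    }
    for name, (latency, bandwidth) in networks.items():
        small = transfer_time(1, latency, bandwidth)
        large = transfer_time(MB, latency, bandwidth)
        print(f"{name}: 1 byte = {small * 1e6:.3f} us, 1 MB = {large * 1e3:.3f} ms")
```

For a 1-byte message the latency term dominates; for a 1 MB message the bandwidth term dominates, which is why the two are measured separately.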
Network components • Network interface (card) • Communication between a node and the network • Link • Bundle of wires and fibers that carry signals • Switches • Connects a fixed number of input channels to a fixed number of output channels. • In this community, switches may also have the router functions.
Switch • The cross-bar can realize a communication from any input port to any output port. • The simplest form is a dedicated computer with memory (e.g., a Linux router).
Most expensive form: cross-bar functionality – all permutations can be realized simultaneously. [Figure: a 4 x 4 cross-bar with inputs 1 to 4 and outputs 1 to 4. The pattern (1, 2, 3, 4) -> (3, 1, 2, 4) can be realized. The pattern (1, 2, 3, 4) -> (4, 3, 2, 2) cannot; only (1, 2, 3, 4) -> (4, 3, 2, -) can.] • Permutation, e.g., (1, 2, 3, 4) -> (3, 1, 2, 4): a communication pattern in which each source appears once and each destination appears once. • The input registers send control signals indicating the pattern to the control, routing, and scheduling module; the control module computes the settings and sets the dots (cross-points).
Switch example: 24-port 1 Gbps Ethernet switch • 24 input ports and 24 output ports – each Ethernet jack has one input port and one output port. • All 24 machines can send and receive simultaneously. [Figure: machines, each with an Ethernet card, connected to the switch.]
Alternatives to cross-bars • A question: why buffers when we can always do permutation? • An N x N cross-bar has O(N^2) cross-points (on/off switches). – Not scalable, expensive • An alternative for low-end switches: bus and memory – When the bus and memory are fast enough, moving data between input and output ports is like a memory copy in a typical computer.
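The two patterns in the cross-bar figure can be checked mechanically. The sketch below is a minimal illustration, not the switch's actual scheduler: the helper names `is_permutation` and `grant` are hypothetical, and the arbitration rule (the lowest-numbered input wins a contested output) is an assumption chosen so that the result matches the (4, 3, 2, -) outcome on the slide.

```python
def is_permutation(dests):
    """True if every output port 1..N is requested exactly once."""
    return sorted(dests) == list(range(1, len(dests) + 1))


def grant(dests):
    """Grant each requested output to at most one input.

    Assumption: inputs are served in port order, so the lowest-numbered
    input wins a contested output and later requesters get None.
    """
    taken = set()
    granted = []
    for d in dests:
        if d in taken:
            granted.append(None)  # output already in use: this input must wait
        else:
            taken.add(d)
            granted.append(d)
    return granted


if __name__ == "__main__":
    print(is_permutation([3, 1, 2, 4]))  # inputs 1..4 -> outputs 3, 1, 2, 4
    print(is_permutation([4, 3, 2, 2]))  # output 2 is requested twice
    print(grant([4, 3, 2, 2]))
```

An input that is not granted has to hold its data somewhere until a later round, which is one reason switches carry buffers even when the fabric itself can realize any permutation.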
Bus and memory alternative to crossbar • Realizing (1, 2, 3, 4) -> (4, 3, 2, 1) – Read from input port 1 to memory A – Read from input port 2 to memory B – Read from input port 3 to memory C – Read from input port 4 to memory D – Run forwarding logic (find out the output ports) – Write A to output port 4 – Write B to output port 3 – Write C to output port 2 – Write D to output port 1
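The read, forward, write sequence above can be sketched as a small simulation. This is an illustrative model only: the function name `bus_memory_forward`, the dictionary-based "memory", and the forwarding table are all hypothetical stand-ins for the hardware.

```python
def bus_memory_forward(inputs, forwarding_table):
    """Simulate one round of bus-and-memory switching.

    inputs: {input_port: packet}
    forwarding_table: {input_port: output_port} (stands in for the forwarding logic)
    Returns ({output_port: packet}, number_of_bus_transfers).
    """
    memory = {}
    transfers = 0

    # Phase 1: copy every packet from its input port into memory over the bus.
    for in_port, packet in inputs.items():
        memory[in_port] = packet
        transfers += 1

    # Phase 2: look up the output port, then copy from memory to that port.
    outputs = {}
    for in_port, packet in memory.items():
        outputs[forwarding_table[in_port]] = packet
        transfers += 1

    return outputs, transfers


if __name__ == "__main__":
    packets = {1: "A", 2: "B", 3: "C", 4: "D"}
    reverse = {1: 4, 2: 3, 3: 2, 4: 1}  # the (1, 2, 3, 4) -> (4, 3, 2, 1) pattern
    out, n = bus_memory_forward(packets, reverse)
    print(out, n)
```

Each packet crosses the shared bus twice (once in, once out), so the bus, not the ports, becomes the bottleneck as the port count grows.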
Bus and memory alternative to crossbar • A typical northbridge bandwidth is a few GBps. Let us assume the bandwidth is 4 GBps. How many ports can the northbridge support in a 100 Mbps Ethernet switch?
Another alternative: multistage interconnection network • Realize all permutations without controlling O(N^2) cross-points. – Clos networks, Benes networks • Each dot is a 2 x 2 switch, controlled by two states (0 and 1). • How to realize 0000 -> 0000, 0001 -> 0001, 0010 -> 1011?
Switch • All approximate crossbars – High-end ones are equivalent to or close to crossbars: all permutations can happen simultaneously. – Low-end ones have limited total bandwidth (aggregate bandwidth). • Example: high-end and low-end 24-port 1 Gbps switches connecting 24 computers. – With one source/destination pair, the throughput will be about 800 Mbps for both (no difference). – When 24 pairs send/receive at the same time • The high-end one will get 24*800 Mbps • The low-end one will get a total of X Mbps, X < 24*800 Mbps (X can sometimes be about 5*800 Mbps) – Different pairs may also have different throughput depending on the scheduling algorithm.
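One way to work through the question is a back-of-the-envelope calculation. The answer depends on assumptions the slide does not state, so the sketch below labels them: 1 GBps is taken as 10^9 bytes per second, every port receives at full line rate, and each packet crosses the bus twice (input port to memory, memory to output port). The function name `max_ports` is hypothetical.

```python
def max_ports(bus_gbytes_per_s, port_mbps, bus_crossings_per_packet=2):
    """Upper bound on ports a shared bus can feed at full line rate.

    Assumptions: 1 GBps = 10^9 bytes/s; each packet crosses the bus
    `bus_crossings_per_packet` times (2 = once into memory, once out).
    """
    bus_mbps = bus_gbytes_per_s * 8 * 1000  # GBps -> Mbps
    return bus_mbps // (port_mbps * bus_crossings_per_packet)


if __name__ == "__main__":
    print(max_ports(4, 100))     # 32000 Mbps / (100 Mbps * 2) -> 160
    print(max_ports(4, 100, 1))  # if each packet is counted once -> 320
```

Under these assumptions the bound is 160 ports; real switches would land lower once forwarding logic and memory contention are included.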
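To see why a multistage network scales better than a cross-bar, the component counts can be compared. The sketch below uses the standard Benes construction for N = 2^k ports, which has 2k - 1 stages of N/2 two-by-two switches; whether the network in the figure is exactly this construction is an assumption, and the function names are hypothetical.

```python
def crossbar_crosspoints(n):
    """An N x N cross-bar needs one cross-point per (input, output) pair."""
    return n * n


def benes_switches(n):
    """Number of 2 x 2 switches in a Benes network with n = 2^k ports.

    The network has 2k - 1 stages, each with n / 2 switches.
    """
    k = n.bit_length() - 1
    if n < 2 or (1 << k) != n:
        raise ValueError("n must be a power of two, at least 2")
    return (2 * k - 1) * (n // 2)


if __name__ == "__main__":
    for n in (4, 16, 64, 1024):
        print(n, crossbar_crosspoints(n), benes_switches(n))
```

The cross-bar grows as N^2 while the Benes count grows as N log N, at the cost of a harder control problem: the switch states for a given permutation have to be computed rather than set one cross-point per connection.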
Network level components • Topology (what) – Physical interconnection structure of the network graph. – Physically limits the performance of the networks. • Routing algorithm (which) – Restricts the set of paths that messages can follow. • Switching strategy (how) – How data in a message traverses a route (passing routers) • Flow control mechanism (when) – When a message or portions of it traverse a route – What happens when traffic collides
Topology • How the components are connected. • Important properties • Diameter: maximum distance between any two nodes in the network (hop count, or # of links). • Nodal degree: how many links connect to each node. • Bisection bandwidth: the smallest bandwidth between one half of the nodes and the other half. • A good topology: small diameter, small nodal degree, large bisection bandwidth.
Topology • Regular topologies – Nodes are connected in some kind of pattern. • The graph has a structure. – Nodes are identified by coordinates. – Routing can usually be predetermined by the coordinates of the nodes. • Irregular topologies – Nodes are connected arbitrarily. • The graph does not have a structure, e.g., the Internet. • More extensible than regular topologies. – Usually use variations of shortest-path routing.
Example regular topology: complete binary tree • Nodal degree = ? • Diameter = ? • Bisection bandwidth = ?
Example regular topology: ring topology [Figure: a ring of five nodes, 0 to 4] • Nodal degree = ? • Diameter = ? • Bisection bandwidth = ?
Routing: deciding which path to take from a source to a destination [Figure: a ring of five nodes, 0 to 4] • 0 to 1: 0 -> 1 or 0 -> 4 -> 3 -> 2 -> 1 • Which path to use? This is a routing issue. • Routing objectives: – Minimize the resources used • Shortest-path routing – Keep the load on all links as balanced as possible (load balancing). • ???
Classification of routing schemes [Figure: a ring of five nodes, 0 to 4] • 0 to 1: 0 -> 1 or 0 -> 4 -> 3 -> 2 -> 1 • Deterministic vs. adaptive – Deterministic: always the same route – Adaptive: choose the route depending on traffic conditions • Minimal routing: always use the shortest path • Source routing: the source node supplies the path • Destination routing: routing based on the destination ID
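The three topology properties can be computed by brute force for small graphs, which is a convenient way to check answers to the questions on the tree and ring slides. The sketch below is a minimal illustration under stated assumptions: every link has unit bandwidth (so bisection bandwidth is a link count), an odd node count is split as evenly as possible, and all function names are hypothetical. The exhaustive bisection search is only practical for a handful of nodes.

```python
from collections import deque
from itertools import combinations


def ring(n):
    """Edges of an n-node ring (n >= 3): node i links to (i + 1) mod n."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}


def complete_binary_tree(levels):
    """Edges of a complete binary tree; node c's parent is (c - 1) // 2."""
    n = 2 ** levels - 1
    return {frozenset((c, (c - 1) // 2)) for c in range(1, n)}


def neighbors(edges):
    adj = {}
    for e in edges:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def nodal_degree(edges):
    """Largest number of links attached to any single node."""
    return max(len(v) for v in neighbors(edges).values())


def diameter(edges):
    """Longest shortest path (in hops), found with a BFS from every node."""
    adj = neighbors(edges)
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


def bisection_width(edges):
    """Fewest links cut by any split into halves of size n // 2 and n - n // 2."""
    nodes = sorted(neighbors(edges))
    best = len(edges)
    for part in combinations(nodes, len(nodes) // 2):
        part = set(part)
        cut = sum(1 for e in edges if len(e & part) == 1)
        best = min(best, cut)
    return best


if __name__ == "__main__":
    for name, g in (("ring(5)", ring(5)), ("tree(3 levels)", complete_binary_tree(3))):
        print(name, nodal_degree(g), diameter(g), bisection_width(g))
```

For the five-node ring this gives degree 2, diameter 2, and a bisection of 2 links; for a three-level complete binary tree it gives degree 3, diameter 4, and a bisection of 1 link, which shows the tree's weakness near the root.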
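The two candidate routes on the ring, and the shortest-path choice between them, can be sketched as follows. This is an illustration of minimal routing on a ring only; the function names are hypothetical, and breaking ties in favor of the clockwise direction is an assumption.

```python
def ring_paths(src, dst, n):
    """Return the clockwise and counterclockwise paths on an n-node ring."""
    clockwise = [(src + i) % n for i in range((dst - src) % n + 1)]
    counterclockwise = [(src - i) % n for i in range((src - dst) % n + 1)]
    return clockwise, counterclockwise


def shortest_ring_path(src, dst, n):
    """Minimal routing: take the direction with fewer hops (ties go clockwise)."""
    cw, ccw = ring_paths(src, dst, n)
    return cw if len(cw) <= len(ccw) else ccw


if __name__ == "__main__":
    print(ring_paths(0, 1, 5))          # the two routes from the slide
    print(shortest_ring_path(0, 1, 5))  # one hop
    print(shortest_ring_path(0, 3, 5))  # goes the other way around
```

Because the choice depends only on the source and destination, this scheme always yields the same route for a given pair, regardless of how loaded the links are.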
Switching • Communication data units: – Message – Packet – Flit • How a packet passes a switch. • Circuit switching – circuit setup, all data pass through • Packet switching: the whole packet stored in a switch, and then forwarded to the next hop
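Storing the whole packet at every switch has a direct cost in time, which a simple model makes visible. The sketch below is a first-order estimate under stated assumptions: all links have the same bandwidth, and propagation delay, queuing delay, and header processing are ignored. The function name `store_and_forward_time` is hypothetical.

```python
def store_and_forward_time(packet_bytes, links, bandwidth_bps):
    """Time for one packet to cross `links` links when each switch must
    receive the entire packet before forwarding it.

    Assumptions: equal bandwidth on every link; no propagation,
    queuing, or processing delay.
    """
    per_link = packet_bytes * 8 / bandwidth_bps  # time to push the packet onto one link
    return links * per_link


if __name__ == "__main__":
    # A 1500-byte packet over 1 Gbps links, crossing 1, 2, and 4 links.
    for links in (1, 2, 4):
        t = store_and_forward_time(1500, links, 1e9)
        print(f"{links} link(s): {t * 1e6:.1f} us")
```

In this model the delay grows linearly with the number of links on the path, because the packet is fully serialized once per link.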
Flow-control • Used between hops to make sure that when data is sent, there is available buffer for the data. • Built into switching mechanism sometimes.
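One common way to guarantee that a buffer is available before data is sent is credit-based flow control, sketched below. The slide does not name a specific mechanism, so this choice is an assumption made for illustration; the class `CreditLink` and its methods are hypothetical.

```python
class CreditLink:
    """One hop with credit-based flow control.

    The sender holds one credit per free buffer slot at the receiver.
    It may send only while it has credits, and a credit is returned
    each time the receiver drains a slot.
    """

    def __init__(self, buffer_slots):
        self.credits = buffer_slots  # sender's view of free receiver slots
        self.buffer = []             # receiver's buffer

    def send(self, flit):
        """Try to send; return False if the receiver has no free slot."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.buffer.append(flit)
        return True

    def drain(self):
        """Receiver forwards the oldest flit and returns one credit."""
        flit = self.buffer.pop(0)
        self.credits += 1
        return flit


if __name__ == "__main__":
    link = CreditLink(buffer_slots=2)
    print(link.send("a"), link.send("b"), link.send("c"))  # third send is refused
    print(link.drain())                                    # frees one slot
    print(link.send("c"))                                  # now accepted
```

The sender never transmits into a full buffer, so no data is dropped at this hop; the price is that the sender stalls when credits run out.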