Indirect Adaptive Routing on Large Scale Interconnection Networks
- Slides: 27
Nan Jiang, William J. Dally (Computer Systems Laboratory, Stanford University)
John Kim (Korea Advanced Institute of Science and Technology)
Overview
• Indirect adaptive routing (IAR)
– Allows adaptive routing decisions to be based on local and remote congestion information
• Main contributions
– Three new IAR algorithms for large scale networks
– Steady state and transient performance evaluations
– Impact of network configurations
– Cost of implementation
Presentation Outline
• Background
– The dragonfly network
– Adaptive routing
• Indirect adaptive routing algorithms
• Performance results
• Implementation considerations
The Dragonfly Network
• High radix network
– High radix routers
– Small network diameter
• Each router
– Three types of channels
– Directly connected to a few other groups
• Each group
– Organized by a local network
– Large number of global channels (GC)
• Large network with a global diameter of one
[Figure: groups 0, 1, 2, … joined by the global network; within each group, routers 0, 1, 2, … joined by a local network, with terminals p0, p1, …]
Routing on the Dragonfly
• Minimal routing (MIN)
1. Source local network
2. Global network
3. Destination local network
• Some adversarial traffic congests the global channels
– Each group i sends all packets to group i+1
• Oblivious solution: Valiant's algorithm (VAL)
– Poor performance on benign traffic
[Figure: groups 0, 1, 2, … with congestion on the global channel between adjacent groups]
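
The contrast between MIN and VAL under the adversarial pattern can be sketched at the group level. This is only an illustration, not the simulator behind the slides: it assumes one global channel per ordered pair of groups, ignores local hops, and has VAL pick an intermediate group uniformly at random from the groups other than source and destination. The function names and the packet count are hypothetical.

```python
import random

def min_groups(src, dst):
    """Group-level MIN route: at most one global hop."""
    return [src] if src == dst else [src, dst]

def val_groups(src, dst, num_groups, rng):
    """Group-level VAL route: detour through a random intermediate group."""
    if src == dst:
        return [src]
    candidates = [g for g in range(num_groups) if g not in (src, dst)]
    return [src, rng.choice(candidates), dst]

def global_channel_load(routes):
    """Count packets crossing each (from_group, to_group) global channel."""
    load = {}
    for route in routes:
        for a, b in zip(route, route[1:]):
            load[(a, b)] = load.get((a, b), 0) + 1
    return load

NUM_GROUPS = 33          # from the experimental setup
PACKETS_PER_GROUP = 320  # assumed value, for illustration only
rng = random.Random(0)

# Adversarial pattern: every packet from group i goes to group i+1.
pairs = [(g, (g + 1) % NUM_GROUPS)
         for g in range(NUM_GROUPS)
         for _ in range(PACKETS_PER_GROUP)]

min_load = global_channel_load(min_groups(s, d) for s, d in pairs)
val_load = global_channel_load(val_groups(s, d, NUM_GROUPS, rng) for s, d in pairs)

print("MIN: busiest channel carries", max(min_load.values()), "packets")
print("VAL: busiest channel carries", max(val_load.values()), "packets")
```

Under MIN, each group funnels all of its packets into a single global channel. VAL spreads them over many channels, at the price of two global hops per packet, which is why it does poorly on benign traffic.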
Adaptive Routing
• Choose between the MIN path and a VAL path at the packet source [Singh'05]
– Decision metric: path delay
– Delay: product of path distance and path queue depth
• Measuring path queue length is unrealistic
• Use local queue lengths to approximate path queue length
– Requires stiff backpressure
[Figure: source router with local queues q0 to q3 feeding a congested MIN GC and a VAL GC]
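
The decision metric above can be written as a small comparison. This is a minimal sketch: the slide gives only the product rule, so the hop counts in the usage lines and the tie-break in favor of MIN are assumptions, and `choose_path` is a hypothetical name.

```python
def estimated_delay(hops, queue_depth):
    """Path delay estimate: path distance times (local) queue depth."""
    return hops * queue_depth

def choose_path(q_min, hops_min, q_val, hops_val):
    """Return 'MIN' or 'VAL', whichever has the lower estimated delay.

    Ties go to MIN (an assumption; the slide does not specify a tie-break).
    """
    if estimated_delay(hops_min, q_min) <= estimated_delay(hops_val, q_val):
        return "MIN"
    return "VAL"

# Illustrative hop counts (assumed): 3 hops minimal, 5 hops non-minimal.
print(choose_path(q_min=2, hops_min=3, q_val=2, hops_val=5))   # equal queues
print(choose_path(q_min=10, hops_min=3, q_val=2, hops_val=5))  # MIN queue backed up
```

Because the queue depths are local, the MIN queue only grows once congestion at the remote global channel has propagated back to the source router, which is why the scheme depends on stiff backpressure.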
Adaptive Routing: Worst Case Traffic
[Plot: packet latency (simulation cycles) vs. throughput (flit injection rate) for Valiant's, Minimal, and Adaptive routing]
Indirect Adaptive Routing
• Improve routing decisions through remote congestion information
• Previous method:
– Credit round trip [Kim et al., ISCA'08]
• Three new methods:
– Reservation
– Piggyback
– Progressive
Credit Round Trip (CRT) [Kim et al., ISCA'08]
• Delay the return of local credits to the congested router
• Creates the illusion of stiffer backpressure
• Drawbacks
– Remote congestion is still inferred through local queues
– Information not up to date
[Figure: source router, congested MIN GC, and VAL GC, with credits and delayed credits returning to the source router]
Reservation (RES)
• Each global channel tracks the number of incoming MIN packets
• Each injected packet creates a reservation flit
• Routing decision is based on the reservation outcome
• Drawbacks
– Reservation flit flooding
– Reservation delay
[Figure: source router, RES flit, congested MIN GC marked "RES Failed", and VAL GC]
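
The reservation mechanism can be pictured as a per-channel counter with an admission limit. The sketch below is an assumption-heavy illustration: the slide does not give the threshold, the release rule, or any names, so `GlobalChannelReservations`, `threshold`, and `release` are all hypothetical.

```python
class GlobalChannelReservations:
    """Counts MIN packets that have reserved this global channel."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed admission limit
        self.pending = 0            # reserved packets not yet through the channel

    def reserve(self):
        """Handle a reservation flit; True means the reservation succeeded."""
        if self.pending < self.threshold:
            self.pending += 1
            return True
        return False

    def release(self):
        """Called when a reserved packet has crossed the channel (assumed rule)."""
        if self.pending == 0:
            raise ValueError("no outstanding reservation")
        self.pending -= 1


def route_with_reservation(channel):
    """Route MIN if the reservation succeeds, otherwise fall back to VAL."""
    return "MIN" if channel.reserve() else "VAL"


gc = GlobalChannelReservations(threshold=2)
print([route_with_reservation(gc) for _ in range(3)])  # third reservation fails
gc.release()
print(route_with_reservation(gc))                      # room again after a release
```

The packet has to wait at the injection port until its reservation flit returns, which is the reservation delay listed under drawbacks.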
Piggyback (PB)
• Local congestion broadcast
– Piggybacking on each packet
– Send on idle channels
• Congestion data compression
• Drawbacks
– Consumes extra bandwidth
– Congestion information not up to date (broadcast delay)
[Figure: source router receiving "GC Free" and "GC Busy" states for the VAL GC and the congested MIN GC]
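
One way to picture the broadcast and its compression is a single busy/free bit per global channel, stored in a per-router lookup table. The sketch below makes several assumptions not stated on the slide: a channel counts as busy when its queue depth exceeds a fixed `threshold`, bits are packed into an integer, and a router with no entry yet is treated as free. All names are hypothetical.

```python
def compress(queue_depths, threshold):
    """Pack one busy/free bit per global channel into an integer bitmask."""
    bits = 0
    for index, depth in enumerate(queue_depths):
        if depth > threshold:
            bits |= 1 << index
    return bits


class PiggybackTable:
    """Per-router view of the global channels owned by routers in its group."""

    def __init__(self):
        self.busy_bits = {}  # router id -> bitmask of busy global channels

    def update(self, router_id, bits):
        """Record the bitmask that arrived piggybacked on a packet."""
        self.busy_bits[router_id] = bits

    def is_busy(self, router_id, gc_index):
        # Unknown routers default to "free" (an assumption).
        return bool((self.busy_bits.get(router_id, 0) >> gc_index) & 1)


def choose_path(table, router_id, gc_index):
    """Route VAL if the MIN global channel is marked busy, else MIN."""
    return "VAL" if table.is_busy(router_id, gc_index) else "MIN"


table = PiggybackTable()
table.update(router_id=5, bits=compress([0, 9, 3, 12], threshold=8))
print(choose_path(table, router_id=5, gc_index=1))  # channel 1 is busy
print(choose_path(table, router_id=5, gc_index=0))  # channel 0 is free
```

The table holds whatever the last broadcast said, so a decision can lag the real channel state by the broadcast delay.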
Progressive (PAR)
• MIN routing decisions at the source are not final
• VAL decisions are final
• Switch to VAL when encountering congestion
• Drawbacks
– Needs an additional virtual channel to avoid deadlock
– Adds extra hops
[Figure: source router, congested MIN GC, and VAL GC]
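
The two rules above (MIN is provisional, VAL is final) amount to a one-way state change along the packet's route. The following sketch assumes the packet observes one congestion flag per router at which it may re-evaluate; where those re-evaluation points are and how congestion is detected are not specified on the slide, and the extra virtual channel is not modelled. `progressive_decisions` is a hypothetical name.

```python
def progressive_decisions(min_path_congested):
    """Return the routing decision in force after each re-evaluation point.

    min_path_congested: one boolean per router where the packet may
    re-evaluate, True if the MIN path looks congested there (assumed input).
    """
    decision = "MIN"
    history = []
    for congested in min_path_congested:
        if decision == "MIN" and congested:
            decision = "VAL"  # final: the packet never reverts to MIN
        history.append(decision)
    return history


print(progressive_decisions([False, True, False]))
print(progressive_decisions([False, False]))
```

A packet that switches after its first hop has already spent that hop on the MIN path, which is where the extra hops in the drawbacks list come from.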
Experimental Setup
• Fully connected local and global networks
– 33 groups
– 1,056 nodes
• 10-cycle local channel latency
• 100-cycle global channel latency
• 10-flit packets
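
A few quantities follow directly from these parameters. The arithmetic below is a sketch: the group and node counts and the channel latencies come from the slide, while the hop counts used for the MIN and VAL paths (2 local + 1 global, and 3 local + 2 global) are assumed upper bounds, and router and serialization delays are ignored.

```python
GROUPS = 33
NODES = 1056
LOCAL_LATENCY = 10    # cycles per local channel
GLOBAL_LATENCY = 100  # cycles per global channel

# Nodes in each group, and the global channels a group needs so that the
# global network is fully connected (one channel to every other group).
nodes_per_group = NODES // GROUPS
global_channels_per_group = GROUPS - 1

def channel_latency(local_hops, global_hops):
    """Total channel traversal latency in cycles (no router delay modelled)."""
    return local_hops * LOCAL_LATENCY + global_hops * GLOBAL_LATENCY

min_latency = channel_latency(local_hops=2, global_hops=1)  # assumed MIN hop counts
val_latency = channel_latency(local_hops=3, global_hops=2)  # assumed VAL hop counts

print(nodes_per_group, global_channels_per_group, min_latency, val_latency)
```

Because a global hop costs ten times a local hop, the second global hop dominates the latency penalty of routing non-minimally.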
Steady State Traffic: Uniform Random
[Plot: packet latency (simulation cycles) vs. throughput (flit injection rate) for Piggyback, Credit Round Trip, Progressive, Reservation, and Minimal routing]
Steady State Traffic: Worst Case
[Plot: packet latency (simulation cycles) vs. throughput (flit injection rate) for Piggyback, Credit Round Trip, Progressive, Reservation, and Valiant's routing]
Transient Traffic: Uniform Random to Worst Case
[Plot, top panel: average packet latency per cycle vs. cycles after the transition (UR to WC) for Progressive and Piggyback]
[Plot, bottom panel: % of packets routing non-minimally per cycle vs. cycles after the transition (UR to WC) for Progressive and Piggyback]
Network Configuration Considerations
• Packet size
– RES requires long packets to amortize reservation flit cost
– Routing decision is made on a per-packet basis
• Channel latency
– Affects information delay (CRT, PB)
– Affects packet delay (PAR, RES)
• Network size
– Affects information bandwidth overhead (RES, PB)
• Global diameter greater than one
– Need to exchange congestion information on the global network
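
The packet-size point for RES can be made concrete with a rough overhead estimate. This sketch assumes every packet sends exactly one single-flit reservation and that the overhead is simply the reservation's share of all flits sent; the slides do not state either, and `reservation_overhead` is a hypothetical name.

```python
from fractions import Fraction

def reservation_overhead(packet_flits, reservation_flits=1):
    """Fraction of all flits sent that are reservation flits (assumed model)."""
    return Fraction(reservation_flits, packet_flits + reservation_flits)

for size in (1, 2, 4, 8, 10):
    overhead = reservation_overhead(size)
    print(f"{size:>2}-flit packets: {float(overhead):.1%} reservation overhead")
```

Under this model the overhead drops from half of all flits for single-flit packets to about a tenth for 10-flit packets.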
Cost Considerations
• Credit round trip
– Credit delay tracker for every local channel
• Reservation
– Reservation counter for every global channel
– Additional buffering at the injection port to store packets waiting for reservation
• Piggyback
– Global channel lookup table for every router
– Increase in packet size
• Progressive
– Extra virtual channel for deadlock avoidance
Conclusion
• Three new indirect adaptive routing algorithms for large scale networks
• Performance and design evaluation of the algorithms
• Best algorithm?
– Piggyback performed the best under steady state traffic
– Progressive responded fastest to transient changes
– Network configuration affects the performance of some algorithms
– Cost of implementation
Thank You!
• Questions?
Adaptive Routing: Uniform Traffic
[Plot: packet latency (simulation cycles) vs. throughput (flit injection rate) for VAL, MIN, and Adaptive routing]
Transient Traffic: Worst Case to Uniform Random
Transient Traffic: Worst Case 1 to Worst Case 10
1000 Random Permutation Traffic
[Histograms: % of 1K permutations vs. packet latency, one panel each for CRT, PB, PAR, RES, and VAL]
Effect of Packet Size on RES: Worst Case Traffic
[Plot: latency (simulation cycles) vs. throughput (flit injection rate) for 1-, 2-, 4-, and 8-flit packets]
Large Local Network: Uniform Random
[Plot: packet latency (simulation cycles) vs. throughput (flit injection rate) for PB, CRT, MIN, PAR, and RES]
Large Local Network: Worst Case
[Plot: packet latency (simulation cycles) vs. throughput (flit injection rate) for PB, CRT, PAR, RES, and VAL]