Contents
• Learn about interconnection networks in distributed and parallel computers, and the message-routing techniques used on them.
• Materials:
  – http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html
  – http://www.gap.upv.es/slides/Appendix.E.ppt (253 slides, 13 MB)
• TAs: Guan Yicheng (guan@comp.is.uec.ac.jp), 邱平 (pingqui@comp.is.uec.ac.jp)
NC論2

References
• T. M. Pinkston and J. Duato: Interconnection Networks, Appendix E in Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann Publishers (2006).
• J. Duato, S. Yalamanchili, L. Ni: Interconnection Networks: An Engineering Approach, IEEE CS Press (1997).
• 2nd edition of the above, Morgan Kaufmann Publishers (2003).
• S. Tomita (富田眞治): Parallel Computers (並列コンピュータ), Shokodo (昭晃堂) (1996).
• W. D. Dally, B. Towles: Principles and Practices of Interconnection Networks, Morgan Kaufmann Publishers (2003).

What is an Interconnection Network?
• It is a programmable system that transports data between terminals, such as processors and memory.
• It is programmable in the sense that it makes different connections at different points in time.
• It is a system because it is composed of many components (buffers, channels, switches, and controls) that work together to deliver data.

Interconnection Network (1/2)
[Figure: multicomputer. Nodes labeled P (processor) and M (memory) attached to an interconnection network.]

Interconnection Network (2/2)
[Figure: UMA-type shared-memory multiprocessor. Processors (P) on one side of the interconnection network, memory modules (M) on the other.]
• This organization is also called a dance-hall architecture.

Trend
• Interconnection network performance is increasing along with processor performance, at a rate of 50% per year.
• Communication is a limiting factor in the performance of many modern systems.
• Buses have been unable to keep up with the bandwidth demand, and point-to-point interconnection networks are rapidly taking over.

Computer Classifications (%)
            2011/06   2010/06   2009/06
MPP           17.4      14.8      17.6
Cluster       82.2      84.8      82.0
Others         0.4       0.4       0.4
Share of the TOP500 list, June 2009 to June 2011. Source: http://www.top500.org/

Examples of MPPs
• K computer @RIKEN (Fujitsu, 2011)
  – Processor: SPARC64 VIIIfx, 2 GHz (16 GFlops × 8 cores)
  – Topology: 6D mesh / 3D torus (Tofu interconnect)
  – #proc.: 80K nodes × 8 cores = 640K cores
• Jaguar @ORNL (Cray XT5-HE, 2009)
  – Processor: AMD x86_64 Opteron, 2.6 GHz (10.4 GFlops × 6 cores)
  – Topology: 3D torus (SeaStar interconnect)
  – #proc.: 18,688 nodes × 2-way × 6 cores = 37,376 × 6 cores = 224,256 cores

Examples of clusters
• Tianhe-1A (天河一号A), China, 2010
  – Processors: Intel EM64T Xeon X5670, 2.93 GHz (11.72 GFlops) × 14,336
  – GPU: NVIDIA Tesla M2050 (515 GFlops) × 7,168
  – Interconnect: Galaxy, 160 Gbps/link (proprietary), fat tree
• Tsubame 2.0, Tokyo Tech., 2010
  – Processors: Xeon X5670, 2.93 GHz × 1,408 × 2 + Xeon E7520, 2 GHz × 34
  – GPU: Tesla M2050 (515 GFlops) × 1,048 × 3
  – Interconnect: InfiniBand QDR (40 Gbps), fat tree

Other Networks of Supercomputers
• Cray XE6 (2011): 3D torus (proprietary Gemini link)
• Pleiades / NASA (2011): partial 11D hypercube topology with IB QDR/DDR
• Red Sky / Sandia National Lab. (2010): 3D torus (12 bristled nodes) with IB QDR switches
• IBM Roadrunner (2009): fat tree with IB DDR
• Earth Simulator 2 / NEC SX-9/E (2009): fat tree (64 GB/s per CPU, 8 CPUs/node, 160 nodes)
• IBM Blue Gene/L (2004): 3D torus, proprietary (64 × 32 × 32 = 64K nodes)

Architecture vs. software
              Memory                     Programming
UMA (SMP)     shared                     OpenMP
NUMA (MPP)    distributed (not shared)   MPI (Message Passing Interface)

Network Design (1/3)
• Performance: latency and throughput (bandwidth)
• Scalability: number of processors vs. network, memory, and I/O bandwidth
• Incremental expandability: from a small configuration up to the maximum size
• Partitionability: the network may be partitioned among several users

Network Design (2/3)
• Simplicity: simple design, higher clock frequency, easy to use
• Distance span: a physically smaller system is preferred, to limit noise, cable delay, etc.
• Physical constraints: packaging (pin count), wiring (wire length), and maintenance (power consumption) must stay within physical limits.

Network Design (3/3)
• Reliability: fault tolerance, reliable communication, hot swap
• Expected workload: robust performance over a wide range of traffic conditions
• Cost: trade-offs between cost and performance

Classification of Interconnection Networks
• Shared-Medium Networks
  – Local area networks (Ethernet, token ring)
  – Backplane bus (e.g. Sun Gigaplane)
• Direct Networks (router-based)
  – mesh, torus, hypercube, tree, etc.
• Indirect Networks (switch-based)
• Hybrid Networks

Shared-Medium Networks (LAN)
• Arbitration is needed to decide which node becomes master of the shared medium and so resolve conflicting access requests.
• The best-known protocol is carrier-sense multiple access with collision detection (CSMA/CD).
• Token bus and token ring pass a token from node to node; only the current token owner has the right to access the bus/ring, which removes the nondeterministic waiting time.

Shared-Medium Networks (Backplane bus)
• It is commonly used to interconnect processor(s) and memory modules to provide an SMP (symmetric multiprocessor) architecture.
• It is realized by printed lines on a circuit board or by discrete wiring.
• Gigaplane in the Sun Enterprise x000 server (1996): 2.6 GB/s, 256-bit data, 42-bit address, 83.8 MHz clock.

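The Gigaplane bandwidth figure can be related to its data width and clock. A minimal sketch of the arithmetic, assuming one full-width data transfer per clock cycle (an assumption; the slide does not state the transfer rate per cycle, and the function name is illustrative):

```python
def peak_bandwidth_bytes_per_s(width_bits: int, clock_hz: float) -> float:
    """Peak bus bandwidth, assuming one full-width transfer per clock cycle."""
    return width_bits / 8 * clock_hz


# Gigaplane figures from the slide: 256-bit data path, 83.8 MHz clock.
gigaplane = peak_bandwidth_bytes_per_s(256, 83.8e6)
print(f"{gigaplane / 1e9:.2f} GB/s")
```

Under that assumption the product is about 2.68 × 10^9 bytes/s, in line with the 2.6 GB/s quoted above.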
Direct (static) Networks
• A direct network consists of a set of nodes.
• Each node is directly connected to a subset of the other nodes in the network.
• Examples:
  – 2D mesh (Intel Paragon), 3D mesh (MIT J-Machine)
  – 2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E)
  – Hypercube (CM-1, CM-2, nCUBE)

Mesh topology
[Figure: nodes arranged as a 2D mesh and as a 3D mesh.]

Torus topology
[Figure: 2D torus (4-ary 2-cube) and 3D torus (3-ary 3-cube).]

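The wraparound links are what distinguish a torus from a mesh. A minimal sketch, assuming nodes are addressed by n coordinates in the range 0..k-1 (the function name `neighbors` and the `wrap` flag are illustrative, not taken from the slides):

```python
def neighbors(node, k, wrap):
    """Neighbors of `node`, a tuple of n coordinates, each in 0..k-1.

    wrap=True models a torus (k-ary n-cube); wrap=False models a mesh.
    """
    result = set()
    for dim, coord in enumerate(node):
        for step in (-1, 1):
            c = coord + step
            if wrap:
                c %= k              # wraparound link of the torus
            elif not 0 <= c < k:
                continue            # a mesh has no link past the edge
            if c != coord:
                result.add(node[:dim] + (c,) + node[dim + 1:])
    return sorted(result)


corner = (0, 0)
print(neighbors(corner, 4, wrap=True))   # corner of a 4-ary 2-cube
print(neighbors(corner, 4, wrap=False))  # same corner in a 4 x 4 mesh
```

In the torus every node has 2n neighbors (for k > 2), while a corner node of the mesh has only n.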
Hypercube (binary n-cube)
[Figure: 4D hypercube (2-ary 4-cube).]

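In a binary n-cube, two nodes are linked exactly when their n-bit addresses differ in one bit. A minimal sketch, assuming nodes are numbered 0..2^n - 1 (the function names are illustrative):

```python
def hypercube_neighbors(node: int, n: int) -> list:
    """Neighbors of `node` in a binary n-cube: flip each of the n address bits."""
    return [node ^ (1 << i) for i in range(n)]


def hypercube_distance(a: int, b: int) -> int:
    """Minimum hop count between a and b: the Hamming distance of their addresses."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b0000, 4))
print(hypercube_distance(0b0000, 0b1111))
```

Each node therefore has n neighbors, and the farthest node is the one whose address is the bitwise complement, n hops away.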
Tree
[Figure: binary tree, fat tree, and X-tree.]

Hierarchical topology (1/2)
[Figure: pyramid (hierarchical 2D mesh) and hierarchical ring.]

Hierarchical topology (2/2)
[Figure: cube-connected cycles (CCC) and RDT (Recursive Diagonal Torus).]

Hypermesh (spanning-bus hypercube)
[Figure: nodes in each dimension connected by single or multiple buses.]

Base-m n-cube (hyper-crossbar)
[Figure: base-8 3-cube (Toshiba Prodigy). Nodes carry three-digit octal addresses (000, 007, 070, 077, 707, 770, and 777 are labeled) and are connected through 8 × 8 crossbars.]

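The three-digit labels suggest how distance works in this topology. A minimal sketch, assuming that each node has an n-digit base-m address and that one crossbar traversal can change exactly one digit to any other value (this reading of the figure, and the function name `hops`, are assumptions rather than statements from the slide):

```python
def hops(src: str, dst: str) -> int:
    """Crossbar traversals needed between two equal-length digit-string addresses.

    Assumes one traversal rewrites one address digit, so the hop count is
    the number of digit positions in which the addresses differ.
    """
    if len(src) != len(dst):
        raise ValueError("addresses must have the same number of digits")
    return sum(s != d for s, d in zip(src, dst))


print(hops("000", "777"))  # every digit differs
print(hops("070", "077"))  # only the last digit differs
```

Under this assumption no pair of nodes in a base-m n-cube is more than n traversals apart.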
Diameter and degrees (1/2)
            2D mesh   2D torus   3D torus     binary n-cube
#node       N         N          N            N = 2^n
Diameter    2√N       √N         (3/2)·∛N     log₂ N
Degree      4         4          6            log₂ N

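Diameters like these can be checked by brute force on small instances. A minimal sketch, assuming a breadth-first search from every node and N = 64 for each topology (all function names are illustrative, not from the slides):

```python
from collections import deque
from itertools import product


def grid_neighbors(node, k, wrap):
    """Yield neighbors of `node` in a k-ary n-dimensional mesh or torus."""
    for dim, coord in enumerate(node):
        for step in (-1, 1):
            c = coord + step
            if wrap:
                c %= k              # torus wraparound
            elif not 0 <= c < k:
                continue            # mesh edge: no link
            yield node[:dim] + (c,) + node[dim + 1:]


def diameter(nodes, neighbors):
    """Longest shortest path, found by a BFS started from every node."""
    best = 0
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors(u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


def grid_diameter(k, n, wrap):
    nodes = list(product(range(k), repeat=n))
    return diameter(nodes, lambda u: grid_neighbors(u, k, wrap))


def hypercube_diameter(n):
    return diameter(range(2 ** n), lambda u: (u ^ (1 << i) for i in range(n)))


print(grid_diameter(8, 2, wrap=False))  # 8 x 8 mesh
print(grid_diameter(8, 2, wrap=True))   # 8 x 8 torus
print(grid_diameter(4, 3, wrap=True))   # 4 x 4 x 4 torus
print(hypercube_diameter(6))            # binary 6-cube
```

For N = 64 this gives 14 for the mesh (exactly 2(√N - 1), which the table rounds to 2√N), 8 for the 2D torus, 6 for the 4 × 4 × 4 torus (three dimensions, each contributing half of a 4-node ring), and 6 for the binary 6-cube.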
Diameter and degrees (2/2)
            Base-m n-cube   Binary tree   Ring   CCC
#node       N = m^n         N             N      N = n·2^n
Diameter    log_m N         2 log₂ N      N/2    3n/2
Degree      log_m N         3             2      3

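The ring and binary-tree entries can be checked on small instances. A minimal sketch, assuming a bidirectional ring and a complete binary tree numbered in heap order (both constructions and all names are illustrative assumptions):

```python
from collections import deque


def eccentricity(adj, start):
    """Distance from `start` to the farthest node, by breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    return max(eccentricity(adj, s) for s in adj)


def max_degree(adj):
    return max(len(links) for links in adj.values())


def ring(n):
    """Bidirectional ring of n nodes."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def complete_binary_tree(levels):
    """Nodes 1..2**levels - 1 in heap order: node i has children 2i and 2i + 1."""
    n = 2 ** levels - 1
    adj = {i: [] for i in range(1, n + 1)}
    for i in range(2, n + 1):
        adj[i].append(i // 2)   # link to parent
        adj[i // 2].append(i)   # link from parent
    return adj


print(diameter(ring(16)), max_degree(ring(16)))
tree = complete_binary_tree(4)
print(diameter(tree), max_degree(tree))
```

A 16-node ring has diameter 8 = N/2 and degree 2. A 4-level tree (N = 15) has diameter 6, leaf to root to leaf, which 2 log₂ N slightly overestimates, and its internal nodes have degree 3.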