Networking 9/18/2020 Supercomputing 99 1

Conference facilities
• State-of-the-art show floor network:
  – multiple OC-192 (10 Gbit/s) rings
  – Dense Wavelength Division Multiplexing (DWDM)
• External network at OC-48 (2.5 Gbit/s)
• In-house Gigabit Ethernet
• Demo of 10-Gigabit Ethernet (formerly Xnet)
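
The OC-n rates quoted on this slide are rounded. As a minimal sketch, assuming the standard SONET base rate of 51.84 Mbit/s for OC-1 (a value that is not on the slide) and a function name chosen only for illustration, the exact line rates can be worked out as follows:

```python
# Line rate of a SONET OC-n carrier: n times the OC-1 base rate.
# Assumption: OC-1 = 51.84 Mbit/s (standard SONET value, not from the slides).
OC1_MBIT_PER_S = 51.84


def oc_rate_gbit(n: int) -> float:
    """Return the line rate of OC-n in Gbit/s."""
    return n * OC1_MBIT_PER_S / 1000.0


if __name__ == "__main__":
    for n in (48, 192):
        print(f"OC-{n}: {oc_rate_gbit(n):.3f} Gbit/s")
```

Under that assumption OC-48 comes out just under 2.5 Gbit/s and OC-192 just under 10 Gbit/s, which matches the rounded figures on the slide.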

ASCI Networking plans

Internet development forecast
• Talk by Vinton Cerf (excerpts)

Internet Hosts (000s) 1989-2006

Observations
• 75% of traffic on the Internet is WWW
• Data domination (20% voice, 80% data)
• Traffic growth of 100-1000%/year reported
• 300 M - 1000 M users by Dec 2000
• Internet growing MUCH faster than CPU power
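
To put the reported growth range in perspective, here is a small sketch that converts an annual growth percentage into a doubling time. It assumes that "100%/year" means the traffic doubles within a year and that growth is steady; the function name is illustrative.

```python
import math


def doubling_time_months(growth_percent_per_year: float) -> float:
    """Months needed to double at a constant annual growth rate."""
    factor = 1.0 + growth_percent_per_year / 100.0  # 100 %/year -> x2 per year
    return 12.0 * math.log(2.0) / math.log(factor)


if __name__ == "__main__":
    for pct in (100, 1000):
        months = doubling_time_months(pct)
        print(f"{pct} %/year -> doubles every {months:.1f} months")
```

Read this way, the reported range corresponds to traffic doubling somewhere between about every three and a half months and every twelve months.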

Internet-enabled Devices
• Information appliances
  – 1997: 3 M, 1998: 6 M, 2002: 56 M (IDC)
• WebTV, Palm-Pilot, Nokia 9000, Sony, Nintendo, Sega games
• Refrigerator (and the bathroom scales)
• Automobiles, household appliances (turning a box of soap into a service)
• Web server on a chip (see next slide)

UMASS Web server on a chip
born 10 AM, 14 July 1999
• TCP/IP code itself fits in about 256 bytes (12-bit)
• PIC 12C509A, running at 4 MHz
• 24LC256 I2C EEPROM
• HTTP 1.0 and RFC 1122 compliant
• eternity.cs.umass.edu:9080/index0.html

Space: the final frontier
Our 25 year mission: to go where no network has gone before!

• End-to-end information flow across the solar system
• Layered architecture for evolvability and interoperability
• IP-like protocol suite tailored to operate over long round trip light times
• Integrated communications and navigation services
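
To give a feel for the "long round trip light times" mentioned on this slide, the sketch below estimates the Earth-Mars round trip at light speed. The distance range is a rough assumption (about 55 to 400 million km) and is not taken from the slides; the names are illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # exact, by definition of the metre

# Assumption: rough Earth-Mars distance range in km (not from the slides).
EARTH_MARS_MIN_KM = 55e6
EARTH_MARS_MAX_KM = 400e6


def round_trip_minutes(distance_km: float) -> float:
    """Round trip light time in minutes for a given one-way distance."""
    return 2.0 * distance_km / SPEED_OF_LIGHT_KM_S / 60.0


if __name__ == "__main__":
    for label, km in (("closest", EARTH_MARS_MIN_KM), ("farthest", EARTH_MARS_MAX_KM)):
        print(f"{label}: {round_trip_minutes(km):.1f} min round trip")
```

With round trips measured in minutes rather than milliseconds, any exchange that waits for a reply before continuing becomes very slow, which is presumably why the slide calls for a tailored protocol suite.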

Interplanetary Internet Status
• Part of the Mars Mission Plan
• Possible Earth/Moon mission 2001
• Low Mars Orbit and Areosynchronous satellites by 2008
• Mars Outposts by 2010
• Possible Orbiting manned mission 2018
• Possible Manned Mars station 2030??
• Stable Interplanetary backbone 2040?

Grid Computing / Batch

Grid computing
• geographically distributed computing
• similar to Metacomputing in Europe
• several toolkits to enable GRID computing
  – (compare with UNICORE in Europe)
• GLOBUS
• LEGION
• others

GLOBUS projects
http://www.globus.org
• GUSTO
  – Testbed (as shown at SC 98)
• CACTUS
  – parallel finite difference simulation codes
• CMT
  – Microtomography
• Flash
  – Seamless access to remote computing

LEGION
• Worldwide virtual computer
• Middleware that connects computer resources
• http://legion.virginia.edu
• Used e.g. at DoD, NASA, NPACI
  – (NPACI: National Partnership for Advanced Computational Infrastructure)

Legion status monitor (Java)

Batch systems
• PBS (Portable Batch System)
  – developed at NASA/Ames
  – mature public domain batch system focused on parallel computing (sophisticated job scheduling)
  – no AFS support
• LSF (Platform Computing)
  – the market leader in the US
• Codine (Gridware, formerly Genias, Chord)

Batch system genealogy

Linux Clusters

Linux and SGI
• Demo of an Itanium cluster running Linux
• Statement to release software in the public domain
• Committed to Linux
• SGI Linux high performance Clusters
  – Beowulf style
  – Advanced Cluster Environment
  – new product line SGI 1400 (PIII, Redhat 6)

Linux and IBM
• Committed to Linux as well
• Hardware cluster activities (see e.g. next slide)
  – web servers (Netfinity servers)
• Involvement in Software
  – mainly focused on desktop and web (Lotus Domino, WebSphere, DB2, ViaVoice)

Large Clusters
• Chiba City
  – built by ANL, IBM and VA Linux
  – 256 Dual Pentium Beowulf system
• Product “Cluster City” from VA Linux
  – comes with VACM management software (GPL)
  – allows complete remote access to all resources

Other topics

Top 500 Supercomputers
  1. Sandia National Labs (ASCI Red), 9632 Intel processors: 2.3 TFlops
  2. Lawrence Livermore (ASCI Blue), 5808 IBM 604e: 2.1 TFlops
  3. Los Alamos (ASCI Blue Mountain), 48 Origin 2000/128: 1.6 TFlops
  5. University of Tokyo, Hitachi SR8000, 128 processors: 873 GFlops
  9. Deutscher Wetterdienst Offenbach, 812-processor T3E: 671 GFlops
 20. FZ Juelich, 540-processor T3E 1200: 448 GFlops
500. USA (Banking), Sun HPC 10000, 48 processors: 33 GFlops
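
The list mixes very different machine types, which shows up when the quoted performance is divided by the quoted processor count. The sketch below does that arithmetic using only the numbers on the slide. It assumes that "48 Origin 2000/128" means 48 systems with 128 processors each, and it takes the processor counts at face value; the names are illustrative.

```python
# (rank, site, processors, GFlops) as listed on the slide.
# Assumption: "48 Origin 2000/128" is read as 48 systems x 128 processors.
TOP500_SC99 = [
    (1, "Sandia National Labs (ASCI Red)", 9632, 2300.0),
    (2, "Lawrence Livermore (ASCI Blue)", 5808, 2100.0),
    (3, "Los Alamos (ASCI Blue Mountain)", 48 * 128, 1600.0),
    (5, "University of Tokyo", 128, 873.0),
    (9, "Deutscher Wetterdienst Offenbach", 812, 671.0),
    (20, "FZ Juelich", 540, 448.0),
    (500, "USA (Banking)", 48, 33.0),
]


def gflops_per_processor(processors: int, gflops: float) -> float:
    """Quoted performance divided by the quoted processor count."""
    return gflops / processors


if __name__ == "__main__":
    for rank, site, procs, gflops in TOP500_SC99:
        per_proc = gflops_per_processor(procs, gflops)
        print(f"{rank:>3}. {site:<34} {per_proc:6.2f} GFlops/processor")
```

On these figures the Hitachi entry stands far above the others per listed processor, while the ASCI machines reach their totals through sheer processor count.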

Supercomputer Trends
• Strong influence of ASCI
• US : Europe : Japan ~ 4 : 2 : 1
• Doubled speed every 1.2 years (other: 1.6 years)
• Increasing number of commercial installations
• Increasing role of cluster solutions
• Constant number of vector computers
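
The difference between a 1.2-year and a 1.6-year doubling period looks small but compounds quickly. The following sketch compares the two periods quoted on the slide over a few years, assuming steady exponential growth; the six-year horizon is an arbitrary choice for illustration.

```python
def speedup(years: float, doubling_period_years: float) -> float:
    """Performance multiple after `years` when speed doubles every period."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    for period in (1.2, 1.6):
        print(
            f"doubling every {period} years: "
            f"x{speedup(1, period):.2f} per year, "
            f"x{speedup(6, period):.1f} after 6 years"
        )
```

Over six years the faster rate gives five doublings and the slower one fewer than four, so the gap between the two curves is already more than a factor of two.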

Zero administration terminal
• Login facility at conference provided by Sun
• Equipped with Sun Ray 1
  – Terminal is basically a remote framebuffer
  – Virtual framebuffer on server is sent to terminal
  – operated on separate Ethernet
  – smartcard with “hot desking”, “plug and work”
• Cost ~1000 DM + Monitor + 1/25 Sun server
• Problem: Security and maintenance on server
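
The cost line on this slide can be read as a per-seat formula: terminal plus monitor plus one twenty-fifth of a server. A minimal sketch of that reading follows. The monitor and server prices in the example are purely hypothetical placeholders, since the slide gives only the terminal price and the 1/25 server share.

```python
def cost_per_seat_dm(
    terminal_dm: float,
    monitor_dm: float,
    server_dm: float,
    seats_per_server: int = 25,
) -> float:
    """Per-seat cost in DM: terminal + monitor + share of one server."""
    return terminal_dm + monitor_dm + server_dm / seats_per_server


if __name__ == "__main__":
    # Hypothetical prices for monitor and server, for illustration only.
    print(cost_per_seat_dm(terminal_dm=1000, monitor_dm=500, server_dm=50_000))
```

The server share shrinks as more terminals are attached to one server, but so does the per-user capacity, and the slide's concern about security and maintenance stays concentrated on that one machine.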

Education Program, Keynote Address, State of the Field Talks, Invited Talks, Technical Papers, Tutorials, Awards, Panels, Compilers, Grid Computing, High-Performance Networking, Industrial and Commercial Applications, I/O, Low-Level Architecture, MPI, Non-Numerical Algorithms, Ocean and Climate, Performance Profiling, Scheduling, Scientific Applications, Special Purpose Systems, Visualization, Wide Area Applications, Fernbach Award and Gordon Bell Finalists, Research Exhibits, Industry Exhibits, Exhibitor Forum, HPC Games, Posters, SCinet 99, Webcasts, Birds-of-a-Feather (BOFS)

Processor Architectures

COMPAQ ALPHA Alpha Roadmap
[Figure: roadmap chart for 1995 to 2001, with axes "Lower Cost" and "Higher Performance"]
• Chips shown: EV5/333 (21164), EV56/600 (21164), PCA56/533 (21164PC), PCA57/600 (21164PC), EV6/575 (21264), EV67/750 (21264), EV68/1000 (21264), EV7/1000 (21364), EV8
• Process sizes shown: 0.5 µm, 0.35 µm, 0.28 µm, 0.18 µm, 0.13 µm

COMPAQ ALPHA 21364 Core
[Figure: pipeline diagram, stages 0 to 6, with stage labels FETCH, MAP, QUEUE, REG, EXEC, DCACHE]
• 4 instructions / cycle
• 80 in-flight instructions plus 32 loads and 32 stores
• Branch predictors, next-line address
• L1 instruction cache: 64 KB, 2-set
• L1 data cache: 64 KB, 2-set
• L2 cache: 1.5 MB, 6-set
• Integer: Int Reg Map, Int Issue Queue (20), two Reg Files (80), Exec and Addr units
• Floating point: FP Reg Map, FP Issue Queue (15), Reg File (72), FP ADD, Div/Sqrt, FP MUL
• Victim buffer, miss address

COMPAQ ALPHA Integrated Memory Controller
• Direct RAMbus
  – High data capacity per pin
  – 800 MHz operation
  – 30 ns CAS latency pin to pin
• 6 GB/sec read or write bandwidth
• 100s of open pages
• Directory-based cache coherence
• ECC SECDED

COMPAQ ALPHA Integrated Network Interface
• Direct processor-to-processor interconnect
• 10 GB/second per processor
• 15 ns processor-to-processor latency
• Out-of-order network with adaptive routing
• Asynchronous clocking between processors
• 3 GB/second I/O interface per processor

COMPAQ ALPHA 21364 System Block Diagram
[Figure: several 21364 processors ("364"), each shown with a memory block (M) and an I/O block (IO)]

IBM Power 4: three figure slides

Intel Itanium (IA-64): two figure slides

Special Purpose Systems

Data-IntensiVe Architecture DIVA
Mary Hall, Peter Kogge*, Jeff Koller, Pedro Diniz, Jacqueline Chame, Jeff Draper, Jeff LaCoss, John Granacki, Jay Brockman*, Apoorv Srivastava, William Athas, Vincent Freeh*, Jaewook Shin, Joonseok Park
USC Information Sciences Institute, Marina del Rey, CA 90292
* University of Notre Dame, Notre Dame, IN 46556
Processing-in-memory (PIM) chips that integrate processor logic into memory devices offer a new opportunity for bridging the growing gap between processor and memory speeds, especially for applications with high memory-bandwidth requirements.

MOE Molecular Orbital calculation Engine
Koji Hashimoto et al.
Kyushu University, Fuji Xerox Co., Ltd, Pharmaceutical Co., Ltd, Hokkaido University of Education, Shimane University, National Institute for Advanced Interdisciplinary Research

GRAvity PipE GRAPE
GORDON BELL PRIZE winner: $7.0/Mflops
Astrophysical N-Body Simulation with Treecode on GRAPE-5
Atsushi Kawai, Toshiyuki Fukushige and Junichiro Makino, University of Tokyo

High-speed Interconnect

Gigabyte System Network GSN
Physical Layer (HIPPI-6400-PH)
  – 6400 Mbit/s (800 Mbyte/s) bandwidth
  – Independent full speed, half duplex channels
  – 4 Virtual Circuits (multiplexing facility)
  – Small (32 byte) fixed size micropacket
  – Credit-based flow control
  – End-to-end & Link-to-link checksums
  – Automatic retransmit to correct flawed data
  – Support for legacy HIPPI-800 traffic
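
Credit-based flow control, listed above, can be illustrated with a small generic model. This is a sketch of the general technique only and not the HIPPI-6400 mechanism itself: the sender starts with one credit per receiver buffer, spends a credit for every micropacket it sends, and gets credits back when the receiver frees a buffer. The class names and the buffer count are assumptions; only the 32-byte micropacket size comes from the slide.

```python
from collections import deque

MICROPACKET_BYTES = 32  # fixed micropacket size quoted on the slide


class Receiver:
    """Holds a fixed number of micropacket buffers."""

    def __init__(self, slots: int) -> None:
        self.slots = slots
        self.buffer = deque()

    def accept(self, packet: bytes) -> None:
        # A sender that respects its credits can never overrun the buffer.
        assert len(self.buffer) < self.slots, "buffer overrun"
        self.buffer.append(packet)

    def drain_one(self) -> int:
        """Consume one buffered packet; return the number of credits freed."""
        if not self.buffer:
            return 0
        self.buffer.popleft()
        return 1


class Sender:
    """Sends only while it holds credits granted by the receiver."""

    def __init__(self, receiver: Receiver) -> None:
        self.receiver = receiver
        self.credits = receiver.slots  # initial grant: one credit per buffer

    def try_send(self, packet: bytes) -> bool:
        if len(packet) != MICROPACKET_BYTES:
            raise ValueError("micropackets have a fixed size")
        if self.credits == 0:
            return False  # out of credits: wait until some are returned
        self.credits -= 1
        self.receiver.accept(packet)
        return True

    def add_credits(self, n: int) -> None:
        self.credits += n


if __name__ == "__main__":
    rx = Receiver(slots=2)
    tx = Sender(rx)
    pkt = bytes(MICROPACKET_BYTES)
    print(tx.try_send(pkt), tx.try_send(pkt), tx.try_send(pkt))
    tx.add_credits(rx.drain_one())
    print(tx.try_send(pkt))
```

The point of the scheme is that data is never dropped for lack of buffer space: the sender stalls instead, which fits a link that also promises automatic retransmission only for flawed data.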

GSN SUMAC Chip
Silicon Graphics SUMAC(TM) ASIC
[Figure: chip diagram with labels AC, IC, GSN, SRC, DST]
  – 32.5 x 32.5 mm 624 pin ceramic CGA
  – AC GSN port
  – IC 2 x 64 bit 100 MHz host interface
  – 1.25 million cells
  – 17 watts
  – Available now

Infiniband: Future I/O + Next Generation I/O (NGIO)

Myrinet: five figure slides

ATOLL: three figure slides
