OU Supercomputing Center for Education & Research
Henry Neeman, Director
OU Supercomputing Center for Education & Research
OU Information Technology
University of Oklahoma

People
OU Supercomputing Center for Education & Research, OneNet CIO Meeting, Thursday August 11 2005

Things

What is Supercomputing?
Supercomputing is the biggest, fastest computing right this minute.
Likewise, a supercomputer is one of the biggest, fastest computers right this minute.
So, the definition of supercomputing is constantly changing.
Rule of Thumb: a supercomputer is typically at least 100 times as powerful as a PC.
Jargon: supercomputing is also called High Performance Computing (HPC).

What is Supercomputing About?
  • Size
  • Speed

What is Supercomputing About?
  • Size: many problems that are interesting to scientists and engineers can’t fit on a PC – usually because they need more than a few GB of RAM, or more than a few hundred GB of disk.
  • Speed: many problems that are interesting to scientists and engineers would take a very long time to run on a PC: months or even years. But a problem that would take a month on a PC might take only a few hours on a supercomputer.
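
The "month on a PC, a few hours on a supercomputer" figure can be checked against the rule of thumb given earlier in the talk (a supercomputer is at least 100 times as powerful as a PC). The sketch below is only back-of-the-envelope arithmetic: it assumes the full 100x factor is realized and a 30-day month, neither of which the talk states.

```python
# Rule of thumb from the talk: a supercomputer is at least ~100x a PC.
# Assumption: the job actually achieves that full factor (idealized).
ASSUMED_SPEEDUP = 100

def supercomputer_hours(pc_days, speedup=ASSUMED_SPEEDUP):
    """Convert a PC runtime in days to an idealized supercomputer runtime in hours."""
    return pc_days * 24 / speedup

hours = supercomputer_hours(30)  # assumption: a "month" is 30 days
print(f"{hours:.1f} hours")      # prints "7.2 hours"
```

Real speedups depend on how well a problem parallelizes, so this is an upper bound on the benefit at that factor, not a prediction.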

What is HPC Used For?
  • Simulation of physical phenomena, such as
      • Weather forecasting [1]
      • Galaxy formation
      • Oil reservoir management
  • Data mining: finding needles of information in a haystack of data, such as
      • Gene sequencing
      • Signal processing
      • Detecting storms that could produce tornados
  • Visualization: turning a vast sea of data into pictures that a scientist can understand [3]
[Image: Moore, OK Tornadic Storm, May 3 1999 [2]]

What is OSCER?
  • Multidisciplinary center
  • Division of OU Information Technology
  • Provides:
      • Supercomputing education
      • Supercomputing expertise
      • Supercomputing resources: hardware, storage, software
  • For:
      • Undergrad students
      • Grad students
      • Staff
      • Faculty
      • Their collaborators (including off campus)

Who is OSCER? Academic Depts
  • Aerospace & Mechanical Engr
  • Biochemistry & Molecular Biology
  • Biological Survey
  • Botany & Microbiology
  • Chemical, Biological & Materials Engr
  • Chemistry & Biochemistry
  • Civil Engr & Environmental Science
  • Computer Science
  • Economics
  • Electrical & Computer Engr
  • Finance
  • History of Science
  • Industrial Engr
  • Geography
  • Geology & Geophysics
  • Library & Information Studies
  • Mathematics
  • Meteorology
  • Petroleum & Geological Engr
  • Physics & Astronomy
  • Radiological Sciences
  • Surgery
  • Zoology
More than 150 faculty & staff in 23 depts in Colleges of Arts & Sciences, Business, Engineering, Geosciences and Medicine – with more to come!

Who is OSCER? Organizations
  • Advanced Center for Genome Technology
  • Center for Analysis & Prediction of Storms
  • Center for Aircraft & Systems/Support Infrastructure
  • Cooperative Institute for Mesoscale Meteorological Studies
  • Center for Engineering Optimization
  • OU Information Technology
  • Fears Structural Engineering Laboratory
  • Geosciences Computing Network
  • Great Plains Network
  • Human Technology Interaction Center
  • Institute of Exploration & Development Geosciences
  • Instructional Development Program
  • Laboratory for Robotic Intelligence and Machine Learning
  • Langston University Mathematics Dept
  • Microarray Core Facility
  • National Severe Storms Laboratory
  • NOAA Storm Prediction Center
  • OU Office of the VP for Research
  • Oklahoma Climatological Survey
  • Oklahoma EPSCoR
  • Oklahoma Medical Research Foundation
  • Oklahoma School of Science & Math
  • St. Gregory’s University Physics Dept
  • Sarkeys Energy Center
  • Sasaki Applied Meteorology Research Institute

Biggest Consumers
  • Center for Analysis & Prediction of Storms: daily real-time weather forecasting
  • Oklahoma Center for High Energy Physics: particle physics simulation and data analysis using Grid computing

Who Are the Users?
Over 225 users so far:
  • over 50 OU faculty
  • over 50 OU staff
  • over 100 students
  • about 20 off campus users
  • … more being added every month.
Comparison: National Center for Supercomputing Applications (NCSA), after 20 years of history and hundreds of millions in expenditures, has about 2100 users.*
* Unique usernames on cu.ncsa.uiuc.edu and tungsten.ncsa.uiuc.edu

What Does OSCER Do? Teaching
Science and engineering faculty from all over America learn supercomputing at OU by playing with a jigsaw puzzle (NCSI @ OU 2004).

What Does OSCER Do? Rounds
OU undergrads, grad students, staff and faculty learn how to use supercomputing in their specific research.

Current OSCER Hardware
  • Aspen Systems Pentium 4 Xeon 32-bit Linux Cluster
      • 270 Pentium 4 Xeon CPUs, 270 GB RAM, 1.08 TFLOPs
  • Aspen Systems Itanium 2 cluster
      • 66 Itanium 2 CPUs, 132 GB RAM, 264 GFLOPs
  • IBM Regatta p690 Symmetric Multiprocessor
      • 32 POWER4 CPUs, 32 GB RAM, 140.8 GFLOPs
  • IBM FAStT500 FiberChannel-1 Disk Server
  • Qualstar TLS-412300 Tape Library

Coming OSCER Hardware (2005)
  • NEW! Dell Pentium 4 Xeon 64-bit Linux Cluster
      • 1024 Pentium 4 Xeon CPUs, 2240 GB RAM, 6.55 TFLOPs
  • Aspen Systems Itanium 2 cluster
      • 66 Itanium 2 CPUs, 132 GB RAM, 264 GFLOPs
  • COMING! 2 x 16-way Opteron Cluster
      • 16 AMD Opteron CPUs, 96 GB RAM, 128 GFLOPs
  • COMING! Condor Pool: 750 student lab PCs
  • COMING! National Lambda Rail
  • Qualstar TLS-412300 Tape Library

Hardware: IBM p690 Regatta
32 POWER4 CPUs (1.1 GHz)
32 GB RAM
218 GB internal disk
OS: AIX 5.1
Peak speed: 140.8 GFLOP/s*
Programming model: shared memory multithreading (OpenMP) (also supports MPI)
*GFLOP/s: billion floating point operations per second
sooner.oscer.ou.edu

Hardware: Pentium 4 Xeon Cluster
270 Pentium 4 XeonDP CPUs
270 GB RAM
~10,000 GB disk
OS: Red Hat Linux Enterprise 3
Peak speed: 1.08 TFLOP/s*
Programming model: distributed multiprocessing (MPI)
*TFLOP/s: trillion floating point operations per second
boomer.oscer.ou.edu

Hardware: Itanium 2 Cluster
56 Itanium 2 1.0 GHz CPUs
112 GB RAM
5,774 GB disk
OS: Red Hat Linux Enterprise 3
Peak speed: 224 GFLOP/s*
Programming model: distributed multiprocessing (MPI)
*GFLOP/s: billion floating point operations per second
schooner.oscer.ou.edu
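
The peak speeds on the p690 and Itanium 2 slides are consistent with the usual peak-rate formula: number of CPUs times clock rate times floating point operations per clock cycle. The sketch below assumes 4 floating point operations per cycle per CPU for both the POWER4 and the Itanium 2; that per-cycle figure is an assumption that reproduces the slide numbers, not something stated in the talk.

```python
# Assumption (not from the slides): each CPU can retire 4 floating point
# operations per clock cycle at peak.
ASSUMED_FLOPS_PER_CYCLE = 4

def peak_gflops(cpus, clock_ghz, flops_per_cycle=ASSUMED_FLOPS_PER_CYCLE):
    """Theoretical peak in GFLOP/s: CPUs x clock (GHz) x flops per cycle."""
    return cpus * clock_ghz * flops_per_cycle

p690_peak = peak_gflops(32, 1.1)     # IBM p690: 32 POWER4 CPUs at 1.1 GHz
itanium_peak = peak_gflops(56, 1.0)  # Itanium 2 cluster: 56 CPUs at 1.0 GHz

print(f"p690:      {p690_peak:.1f} GFLOP/s")     # slide quotes 140.8
print(f"Itanium 2: {itanium_peak:.1f} GFLOP/s")  # slide quotes 224
```

Peak speed is a theoretical ceiling; sustained performance on real applications is typically well below it.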

New! Pentium 4 Xeon Cluster
1,024 Pentium 4 Xeon CPUs
2,240 GB RAM
20,000 GB disk
Infiniband & Gigabit Ethernet
OS: Red Hat Linux Enterprise 3
Peak speed: 6.5 TFLOPs*
Programming model: distributed multiprocessing (MPI)
*TFLOPs: trillion calculations per sec
topdawg.oscer.ou.edu
DEBUTED AT #54 WORLDWIDE, #9 AMONG US UNIVERSITIES, #4 EXCLUDING BIG 3 NSF CENTERS (www.top500.org)

Coming! National Lambda Rail
The National Lambda Rail (NLR) is the next generation of high performance networking.

Coming! Condor Pool
Condor is a software package that allows number crunching jobs to run on idle desktop PCs.
OU IT is deploying a large Condor pool (750 desktop PCs) over the course of Spring 2005.
When deployed, it’ll provide a huge amount of additional computing power – more than is currently available in all of OSCER today.
And, the cost is very low.
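
To put the "more than is currently available in all of OSCER" claim in perspective, the peak figures from the "Current OSCER Hardware" slide can be summed and divided across the 750 PCs. This is only arithmetic on the slide numbers; the resulting per-PC figure is derived here for illustration and is not quoted in the talk.

```python
# Peak speeds in GFLOP/s, taken from the "Current OSCER Hardware" slide.
current_gflops = {
    "Pentium 4 Xeon cluster": 1080.0,  # 1.08 TFLOPs
    "Itanium 2 cluster": 264.0,
    "IBM p690 Regatta": 140.8,
}
POOL_PCS = 750  # size of the planned Condor pool

total_gflops = sum(current_gflops.values())
# Per-PC peak the pool would need just to match the current total.
breakeven_per_pc = total_gflops / POOL_PCS

print(f"Current OSCER peak: {total_gflops:.1f} GFLOP/s")
print(f"Break-even per lab PC: {breakeven_per_pc:.2f} GFLOP/s")
```

In other words, the pool matches the existing systems in aggregate peak once each lab PC contributes roughly 2 GFLOP/s, though only for workloads that suit idle desktops.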

What is Condor?
Condor is grid computing technology:
  • it steals compute cycles from existing desktop PCs;
  • it runs in background when no one is logged in.
Condor is like SETI@home, but better:
  • it’s general purpose and can work for any “loosely coupled” application;
  • it can do all of its I/O over the network, not using the desktop PC’s disk;
  • it can use the academic research community’s Grid middleware such as Globus, but it doesn’t have to.

Supercomputing at Night
Desktop PCs tend to be very active during the workday.
But at night, during most of the year, they’re idle. So they can be used for number crunching.

Condor is like SETI@home:
  • it steals compute cycles from existing desktop PCs;
  • it runs in background when no one is sitting at the desk.
Condor is better than SETI@home:
  • it’s general purpose and can work for any loosely coupled application;
  • it can do all of its I/O over the network, not using the desktop PC’s local disk.

How is Condor Helpful?
  • Low cost: either
      • nothing (if you have many Linux PCs), OR
      • very little (if you have many Windows PCs).
  • Repurpose idle time on existing desktop PCs – you’ve already paid for the hardware, and most of the software is FREE!
  • Enable research that involves lots of computing.
  • Share multiple independent Condor pools among various institutions (flocking).
  • It provides a quick Grid computing resource, to get regional campuses used to Grids.

Windows vs. Linux
  • Condor runs best on Unix flavors (e.g., Linux). The Windows version is clipped: it doesn’t support automatic checkpointing or automatic job migration.
  • However, desktop PC users demand a pure Windows experience.
  • So, to use Condor in OU IT student PC labs, we need Windows and Linux at the same time.
Solution: VMware

VMware
  • VMware is virtual machine software that lets a host OS run one or more guest OSes.
  • VMware is commercial software, not free.
  • For Condor, OU runs Linux as the host OS and Windows as the guest OS.
  • This way, Condor can run directly on Linux and therefore have the maximal set of features, but desktop users can have a pure Windows experience.
  • The lab PCs are configured to boot directly to the Windows login page, so desktop users never know.

Linux/VMware/Windows/Condor
[Diagram of the software stack; labels: Science Applications, Desktop Applications, Windows, VMware, Condor, Linux]

But I Don’t Want to Manage Linux!
[Diagram of the software stack; labels: Science Applications, Desktop Applications, Condor, VMware, Windows]

Condor Setup at OU
[Diagram; labels: Condor Users, Login Nodes, Firewall Device, Mgmt Nodes, Lab PCs]

Current Status at OU
  • Pool of test machines in dorm PC lab
  • Submit/management from Neeman’s desktop PC
  • Rollout to multiple labs during summer
  • Total rollout to 750 PCs by end of 2005

What Does OU Plan to Do w/Condor?
Loosely coupled problems: many small, independent jobs
  • High Energy Physics: D-Zero experiment
  • Nanotechnology: Monte Carlo simulation
  • Computational Chemistry: molecular dynamics
  • Aerospace Engineering: parameter space searches
  • ... and many others.
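
A loosely coupled workload of this kind is usually handed to Condor as one submit description that queues many independent copies of the same program, each distinguished only by its job index. The following is a minimal sketch that generates such a description; the executable name `simulate`, the file names, and the job count are hypothetical placeholders, not details of OU's setup.

```python
def make_submit_description(executable, n_jobs, log_name="sweep.log"):
    """Build a Condor submit description that queues n_jobs independent jobs.

    $(Process) is Condor's per-job index macro (0 .. n_jobs-1); here it is
    passed to the program and used to keep each job's output files separate.
    """
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",    # hypothetical program name
        "arguments  = $(Process)",       # job index selects the parameter set
        "output     = out.$(Process)",
        "error      = err.$(Process)",
        f"log        = {log_name}",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"

submit_text = make_submit_description("simulate", 100)
print(submit_text)
```

Saved to a file, text like this would be passed to `condor_submit`. Because the jobs never talk to each other, each one can land on whichever lab PC happens to be idle.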

NSF CI-TEAM Program
  • The NSF Cyberinfrastructure TEAM program is a brand new program.
  • It is providing grants of up to $250,000 for up to 2 years.
  • One of CI-TEAM’s goals is to expand Cyberinfrastructure – for example, supercomputing – to institutions and people that traditionally haven’t had much access.

Our NSF CI-TEAM Project
The University of Oklahoma (OU) is leading an NSF CI-TEAM project, submitted May 27 2005.
The focus is on setting up Condor pools across the Great Plains region, and beyond.
The kickstart application will be BLAST (bioinformatics), but these Condor pools will be available for any appropriate application.
Most of the money in OU’s CI-TEAM proposal will go to institutions other than OU, for VMware licenses and PCs to manage the Condor pools.

CI-TEAM Participants So Far
  • At OU
      • OSCER/IT
      • Arts & Sciences: Botany & Microbiology; Chemistry & Biochemistry; Mathematics; Physics & Astronomy; Zoology
      • Engineering: Aerospace & Mechanical Engineering; Civil Engineering & Environmental Science; Chemical, Biological & Materials Engineering; Computer Science; Electrical & Computer Engineering; Industrial Engineering
      • Medicine: Surgery; Radiological Sciences
  • Other Academic Institutions in Oklahoma: Langston U. (minority serving), Oklahoma Baptist U. (4 year), Oklahoma School of Science & Mathematics (high school), St. Gregory’s U. (4 year), U. Central Oklahoma (Masters-granting)
  • Academic Institutions outside Oklahoma: Contra Costa College of CA (2 year), Kansas State U. (PhD), U. Arkansas Fayetteville (PhD), U. Arkansas Little Rock (PhD), U. Kansas (PhD), U. Nebraska (PhD), U. Northern Iowa (Masters)

An Added Bonus
OSCER’s user eligibility policy is that anyone can have access to OSCER’s supercomputers, if they are on a project that has an OU faculty or staff member as the Principal or Co-Principal Investigator.
So, as a bonus, everyone at participating institutions who is on the CI-TEAM project – and their students – will get FREE access to OSCER’s supercomputers!

Join our CI-TEAM!
  • If OU’s CI-TEAM proposal is fully funded, it will create one of the largest Condor flocks in the world: almost 4,000 PCs.
  • YOU CAN JOIN US! All you need is to commit some PCs – you create your own Condor pool with OU’s help, and then let the rest of the CI-TEAM institutions use it.
  • CI-TEAM will pay for the VMware licenses and, for small schools, also for a PC to serve as Condor manager.

QUESTIONS?
Thank you for your attention.