“Cerberus”: Mass. Bay Supercomputer Beowulf Cluster Project
Giuseppe Sena (Tony), Mass. Bay Community College
Content
• Background Information
• Justification
• Work done: Summer & Fall 2012, Spring 2013
• Clusters
• CERBERUS: Supercomputer Cluster
• Future Work: 2013-2014
CERBERUS
• In Greek and Roman mythology, Cerberus is a multi-headed (usually three-headed) dog, or "hellhound", which guards the gates of the Underworld
Professional Development Grant, Summer 2012
“Developing Labs and Microlabs for Adding Parallelism and Distributed Computing into Computer Science Curricula”
Giuseppe Sena
Justification
• For the past 5 years, terms like “cloud computing”, “parallel programming”, “virtual clusters”, and “distributed systems” have become mainstream in industry
• There is demand to teach more parallelism and distributed systems in the Computer Science (CS) curriculum
Justification (cont.)
• Universities and 4-year colleges are adding parallelism to first-year and second-year CS courses
• Many community college students transfer to 4-year colleges after graduation, and they are at a disadvantage compared with students who started in a 4-year college program
Parallelism: A simplistic understanding
[Figure: sequential execution compared with data parallelism]
Parallelism: A simplistic understanding (cont.)
Why Parallel Processing?
Computation requirements are ever increasing:
• Simulations, scientific prediction (earthquakes), distributed databases, weather forecasting, search engines, e-commerce
• Internet service applications, data center applications, finance (investment risk analysis), oil exploration, mining, etc.
Applications of Parallel Processing
Source: Fundamentals of Parallel Processing, Ashish Agrawal, IIT Kanpur
Summer 2012
• This project is a first step in the right direction for adding parallelism and distributed computing into CS curricula
• Surveyed the literature to find out what other higher education institutions are doing to inject parallelism into their programs
• Created one (1) lab and two (2) microlabs for introducing parallel computing and distributed systems concepts to CS and non-CS students
Fall 2012
• Building a simple but powerful parallel and distributed machine using commodity parts
• Determining the software to be installed on the machine
• Choosing computational science applications as case studies to prove that the machine actually works
• Selecting group members for the project
Spring 2013
• Student training in the area of parallelism and distributed systems
• Developing computational science applications
• Student recruitment for the project
• Organizing workshops
• Giving talks
Computer Cluster
• A cluster is a collection of machines interconnected by some type of network architecture
• Main feature: the machines behave like a single entity
• A strategy for implementing parallel processing applications
• Easy to scale: just add new machines to the network
Beowulf Cluster
• A Beowulf cluster is a type of cluster built from commodity off-the-shelf hardware components
• Runs an open source operating system such as Linux
• Interconnected using a private high-speed network
• Dedicated to running high-performance computing
First Beowulf Cluster: 1994
• Developed at NASA Goddard Space Flight Center by Thomas Sterling and Donald Becker
• Used an early release of Linux and PVM
• Made up of 16 computers
• Intel 100 MHz 80486-based
• Connected by dual 10 Mbps Ethernet LANs
• The project demonstrated the performance/cost advantage that a Beowulf cluster had for real-world scientific applications
Simple Beowulf Cluster
[Figure: master node and slave nodes]
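
The master/slave layout named on this slide can be sketched on a single machine. This is a minimal illustration, not the CERBERUS software: Python threads stand in for the slave nodes, a queue stands in for the network, and the function name `run_master_worker` and the toy computation are assumptions made for the example.

```python
import queue
import threading


def run_master_worker(tasks, num_workers=3):
    """Master enqueues tasks; each worker pulls tasks until it sees a stop signal.

    Assumption: threads play the role of slave nodes, and summing
    0..n is a stand-in for a real computation.
    """
    task_queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = task_queue.get()
            if item is None:  # stop signal sent by the master
                break
            task_id, n = item
            value = sum(range(n + 1))  # placeholder workload
            with lock:
                results[task_id] = value

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # Master: hand out the work, then one stop signal per worker.
    for task_id, n in enumerate(tasks):
        task_queue.put((task_id, n))
    for _ in threads:
        task_queue.put(None)
    for t in threads:
        t.join()

    # Master: gather results back in task order.
    return [results[i] for i in range(len(tasks))]


if __name__ == "__main__":
    print(run_master_worker([10, 100, 3]))
```

Because workers pull from a shared queue, a fast worker simply takes more tasks, which is one common reason to choose this layout.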
Data Parallelism Functional Parallelism
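
A small sketch may help contrast the two terms on this slide. It assumes a thread pool as a stand-in for cluster nodes, and the function names are illustrative only. In data parallelism the same operation runs on different pieces of data; in functional parallelism different operations run at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """The single operation applied everywhere in the data-parallel case."""
    return x * x


def data_parallel(values, workers=4):
    """Data parallelism: the same function, applied to different data items."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))


def functional_parallel(values):
    """Functional parallelism: different functions, run concurrently on the same data."""
    tasks = {"min": min, "max": max, "sum": sum}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, values) for name, fn in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(data_parallel([1, 2, 3, 4]))
    print(functional_parallel([3, 1, 2]))
```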
Domain Decomposition
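
One common form of domain decomposition splits the rows of a grid into contiguous blocks, one block per process. The helper below is a sketch under that assumption; the name `block_range` is hypothetical, and the slide does not say which decomposition the project uses.

```python
def block_range(n_items, n_parts, part):
    """Return (start, stop) of the contiguous block owned by `part`.

    Blocks differ in size by at most one item: the first
    `n_items % n_parts` parts each receive one extra item.
    """
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


if __name__ == "__main__":
    # Example: 10 grid rows shared among 4 processes.
    for rank in range(4):
        print(rank, block_range(10, 4, rank))
```

Keeping block sizes within one item of each other helps balance the load, since the slowest process determines when a step finishes.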
Large Beowulf Cluster
Small Beowulf Cluster
Cluster Applications
Cluster Applications (cont.)
Cluster Management Tools
CERBERUS Supercomputer Cluster
Compute Node Configuration
ZOTAC ZBOX ID41 mini-PC
• Eco-friendly (energy-efficient)
• Intel® Atom™ architecture
• Next-generation NVIDIA® ION™ graphics processor (CUDA™)
• 64 GB SSD
• 4 GB RAM
CERBERUS: Supercomputer Cluster
CERBERUS: Supercomputer Cluster (cont.)
CERBERUS: Supercomputer Cluster
• Video: Game of Life
• Pandemic application
• 32 MPI processes
• 8 hosts
• Open MPI (show it!)
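
Spread evenly, 32 MPI processes on 8 hosts comes to 4 processes per host. How the demo divides the Game of Life grid among those processes is not stated in the slides. The sketch below assumes a row-strip decomposition with one ghost row on each side, and runs the strips one after another in plain Python to show that the strip-wise result matches the whole-grid update. The function names are illustrative.

```python
def life_step(grid):
    """One Game of Life generation; cells outside the grid count as dead."""
    rows, cols = len(grid), len(grid[0])

    def alive_neighbors(r, c):
        return sum(
            grid[rr][cc]
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))
            if (rr, cc) != (r, c)
        )

    new_grid = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            n = alive_neighbors(r, c)
            # Birth on exactly 3 neighbors; survival on 2 or 3.
            new_row.append(1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0)
        new_grid.append(new_row)
    return new_grid


def life_step_by_strips(grid, n_strips):
    """Same update computed strip by strip, as separate processes might do it.

    Assumption: each strip also sees one ghost row above and below,
    which an MPI version would receive from its neighboring ranks.
    """
    rows = len(grid)
    result = []
    for s in range(n_strips):
        start = s * rows // n_strips
        stop = (s + 1) * rows // n_strips
        if start == stop:
            continue  # more strips than rows: this strip owns nothing
        lo = max(0, start - 1)    # ghost row above, if any
        hi = min(rows, stop + 1)  # ghost row below, if any
        local = life_step(grid[lo:hi])
        # Keep only the rows this strip owns; discard the ghost rows.
        result.extend(local[start - lo:start - lo + (stop - start)])
    return result


if __name__ == "__main__":
    blinker = [[0] * 5 for _ in range(5)]
    blinker[2][1] = blinker[2][2] = blinker[2][3] = 1
    for row in life_step_by_strips(blinker, 4):
        print(row)
```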
Technologies Used
Building, Configuring, and Testing the CERBERUS Beowulf Cluster
• Acquired all commodity parts
• Built the cluster case
• Installed and assembled all parts
• Installed and tested the Beowulf cluster software:
• Linux distribution (OS)
• Open MPI (Message Passing Interface)
Survey of Computational Science Applications
• Encryption, etc. (Computer Science)
• Galaxies colliding, analyzing genomes, etc. (Biology, Physics, Math)
• Using an NVIDIA CUDA-capable (Compute Unified Device Architecture) GPU: computer graphics, etc.
Just Kidding! That’s It!
Future Work
• Develop additional microlabs to introduce parallelism and distributed systems concepts into CS curricula
• Prepare workshops for STEM faculty and students on the use and development of distributed applications using the CERBERUS Beowulf cluster
Future Work (cont.)
• Train project members and other STEM users (faculty and students) on the use of the CERBERUS cluster
• Form a multidisciplinary group (CS and non-CS) of students and faculty to do research and develop computational science applications
Future Work (cont.)
• Disseminate CERBERUS cluster information to MA community colleges
• Help other MA community colleges develop cluster technology
Future Work (cont.)
• Write an internal report (MBCC report)
• Paper and/or poster to be submitted to conferences in April 2013 and November 2013
Future Work (cont.)
• Design and develop new parallel applications (computational science)
• Do performance analysis and comparison with other architectures
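
Performance analysis of a cluster usually starts from speedup and efficiency, with Amdahl's law as an upper bound. The sketch below shows these standard formulas. The timings in the usage example are made-up placeholders, not CERBERUS measurements.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Efficiency E = S / p, the fraction of ideal linear speedup achieved."""
    return speedup(t_serial, t_parallel) / n_procs


def amdahl_speedup(parallel_fraction, n_procs):
    """Amdahl's law: upper bound on speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


if __name__ == "__main__":
    # Hypothetical timings (seconds), for illustration only.
    t1, t8 = 120.0, 30.0
    print("speedup:", speedup(t1, t8))
    print("efficiency on 8 processes:", efficiency(t1, t8, 8))
    print("Amdahl bound, 90% parallel, 32 processes:", amdahl_speedup(0.9, 32))
```

Running the same application with 1, 2, 4, 8, ... processes and tabulating these values is a simple way to compare the cluster against other architectures.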
Contact Information
Tony Sena
Mass. Bay Community College, Department of Computer Science
gsena@massbay.edu