Introduction to Boston University’s Shared Computing Cluster (SCC)
Aaron D. Fuegi
aarondf@bu.edu
Research Computing Services
Information Services & Technology
Boston University

Information Services & Technology 9/12/2021
Outline
§ What is the Shared Computing Cluster (SCC)?
§ Getting an Account on the SCC
§ Connecting to the SCC
§ Using the SCC (Hands-On)
§ Questions?

Motivations For Using The SCC
§ Researchers need to:
  § Collaborate with colleagues on shared data.
  § Run code that exceeds workstation capability (RAM, Network, Disk).
  § Run code that runs for long periods of time (hours, days, weeks).
  § Run code in highly parallelized formats (use 100 machines simultaneously).
  § Might want to do all of those things 1000 times.
  § Access specialized software packages.

What Is The SCC?
§ A Linux cluster with over 690 nodes, 14,000 processors, and 324 GPUs. Currently over 3.5 Petabytes of disk.
§ Located in Holyoke, MA at the Massachusetts Green High Performance Computing Center (MGHPCC), a collaboration between 5 major universities and the Commonwealth of Massachusetts.
§ Went into production in June 2013 for Research Computing. Continues to be updated/expanded.
§ http://www.bu.edu/tech/support/research/computing-resources/scc/

Why Holyoke? – MGHPCC Benefits
§ Green, environmentally friendly design.
§ Low cost, clean and renewable energy source.
§ Space on-site for building expansion (years 10-20).
§ Opportunities for shared facilities and services.
§ Opportunities for collaboration with other institutions.
§ BU “Far West” – Two 10 Gigabit/second Ethernet connections from BU to the MGHPCC.
§ http://www.bu.edu/tech/support/research/rcs/mghpcc/

MGHPCC - Photo

Service Models – Shared and Buy-In
§ Many of the elements of the SCC are paid for by BU and university-wide grants and are free to the entire BU Research Computing community.
§ Other elements (over 60% of the processors currently) are purchased by individual faculty or research groups through the Buy-In program, with priority access for the purchaser.
§ http://www.bu.edu/tech/support/research/computing-resources/service-models/

SCC Architecture
[Diagram labels: Public Network; Login Nodes SCC1, SCC2, GEO, SCC4; VPN Only; Private Network; >3.5 PB File Storage; Compute Nodes (more than 690 nodes with ~14,000 CPUs and 324 GPUs)]

Storage
§ Research projects are by default granted 50 GB of backed-up space (/project/PROJNAME) and 50 GB of not-backed-up space (/projectnb/PROJNAME). These quotas can be increased for free to 200 GB/800 GB.
§ Project groups can either purchase or “rent” additional storage.
§ All users have a Home Directory with a 10 GB quota.
§ http://www.bu.edu/tech/support/research/computing-resources/file-storage/

Storage Space (in GBs)
[Bar chart: Home Directory, Projectnb, Project, and Stash; horizontal axis from 0 to 1000 GB; legend: Default Size, Maximum (free) Size, Expansion ($$)]

Storage – What files should go where?
§ Home Directory – Personal files, custom scripts.
§ /project – Source code, files you can’t replace.
§ /projectnb – Output files, downloaded data sets. Large quantities of data that you could recreate in the incredibly unlikely event of a disastrous data loss.
§ /stash – Manual backup of vital /projectnb data.

Storage - Restricted (dbGaP) Data
§ Some projects, mostly those on the BU Medical Campus, require dbGaP security measures:
  § /restricted/project/PROJNAME – backed-up space for dbGaP data
  § /restricted/projectnb/PROJNAME – not-backed-up space for dbGaP data
§ Only accessible through scc4.bu.edu and compute nodes

Storage – Scratch Space
§ Each node (login or compute) has a directory called /scratch stored on a local hard drive. This can be used by batch jobs to quickly write temporary files.
§ If you wish to keep these files, you should copy them to your own space when the job completes.
§ Scratch files are kept for 30 days, with no guarantees.
§ http://www.bu.edu/tech/support/research/system-usage/running-jobs/resources-jobs/local_scratch/

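The scratch workflow described on this slide (write temporary files locally, copy what you want to keep, clean up) might look roughly like the sketch below. It is only an illustration: the variable names, the file name result.txt, and the fallback to /tmp are assumptions, and the destination directory stands in for your own project space.

```shell
#!/bin/bash
# Hypothetical sketch: use node-local scratch for temporary files, then
# copy the files worth keeping to permanent space.

# Assumption: /scratch exists and is writable on the node; otherwise use /tmp.
SCRATCH_BASE=/scratch
[ -d "$SCRATCH_BASE" ] && [ -w "$SCRATCH_BASE" ] || SCRATCH_BASE=/tmp

# Create a private working directory so that jobs do not collide.
workdir=$(mktemp -d "$SCRATCH_BASE/${USER:-user}_job_XXXXXX")

# Stand-in for the real computation: write a temporary result file.
echo "intermediate result" > "$workdir/result.txt"

# Copy the result to a directory you own (a temporary directory here; on
# the SCC this might be somewhere under /projectnb/PROJNAME).
keep_dir=$(mktemp -d)
cp "$workdir/result.txt" "$keep_dir/"

# Remove the scratch copy yourself rather than waiting for it to be purged.
rm -rf "$workdir"
echo "kept: $keep_dir/result.txt"
```

Creating a uniquely named subdirectory avoids clashes when several jobs share the same node.
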
Snapshots – Recovering lost files
§ Available for Home Directories, all Project Disk Space, and STASH. Backups are made daily at midnight.
    [adftest2@scc1 ~]$ cd .snapshots
    [adftest2@scc1 ~]$ ls
    140613/ 140624/
    [adftest2@scc1 ~]$ cd 140613
    [adftest2@scc1 ~]$ ls -l
    -rw-r--r-- 1 adftest2 scv 71 May 29 19:41 myfile
    [adftest2@scc1 ~]$ cp myfile ../
§ http://www.bu.edu/tech/support/research/computing-resources/file-storage/#Snapshots

Accounting – CPU Hours/SUs
§ No monetary charges for CPU use on the SCC.
§ Usage is tracked in Service Units (SUs). If a project exceeds its allocation, the project leader (LPI) must submit a request for additional resources. Reports on usage are mailed out monthly to all users and project leaders. Large requests require approval by the Large Allocation Review Committee (LARC).
§ http://www.bu.edu/tech/support/research/account-management/manage-project/#SUS

Software (Tutorial this semester)
§ Programming Languages: C, C++, Python, CUDA, Perl, FORTRAN
§ Math, Data Analysis, and Plotting: MATLAB, Mathematica, IDL, MAPLE
§ Statistics: R, RStudio, SAS, Stata
§ Visualization: ImageJ, VTK, ParaView, VMD, Maya
§ Domain Specific Packages: Bioinformatics, Engineering, Geographic Information Systems (GIS)
§ Parallel: MPI, MATLAB PCT, OpenMP, OpenACC
§ http://rcs.bu.edu/software/

GPU Computing
§ Fast computation using GPUs (graphics processing units). 100x speedups possible for some codes.
§ 324 GPUs available.
§ Programming: C++ and FORTRAN - CUDA, OpenACC
§ Software Packages: MATLAB PCT, R
§ Machine Learning & Chemistry: Some applications in these areas can quite easily take advantage of GPUs.
§ If interested, take one or more of our GPU tutorials.
§ http://www.bu.edu/tech/support/research/software-and-programming/multiprocessor/gpu-computing/

Getting an Account on the SCC
§ Using tutorial accounts today. These should not be used after today.
§ All users of the SCC must be on a Research Project headed up by a full-time BU Faculty member.
§ Exception: 3-month trial accounts for students/tutorial attendees. Email help@scc.bu.edu if interested.
§ http://www.bu.edu/tech/support/research/account-management/

Alternative: Linux Virtual Lab
§ Available to any BU community member who needs access to a Linux system. Send email to ithelp@bu.edu to get access.
§ Advantages:
  § Permanent account
  § Full access to SCC software via scc-lite.bu.edu
§ Disadvantages:
  § No batch system access
  § Limited disk space
§ http://www.bu.edu/tech/services/support/desktop/computer-labs/unix/

Connecting to the SCC
§ Windows – MobaXterm
  http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#windows
§ Macintosh – Built-in Terminal application
  http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#apple
§ Linux – Terminal application
  http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#linux

Connecting - Details
§ Software you need:
  § SSH Client – To log in to the SCC machines, such as scc1.bu.edu, and then run commands.
  § X Forwarding – Displays graphics for programs that have a GUI (such as MATLAB) or that otherwise display images.
  § File Transfer – Transferring files between the SCC and your local machine using SFTP.
  § VNC – Advanced users only. Faster graphics: http://www.bu.edu/tech/support/research/system-usage/getting-started/remote-desktop-vnc/

Questions so Far
§ Questions on the Shared Computing Cluster so far?
§ The remainder of the tutorial will be hands-on, getting a feel for using Linux and the SCC.
§ If you are already familiar with Linux, this section may be slow for you.

Using the SCC (Hands-On)
§ Linux “Command Line” Environment – No menus or graphics unless in specific software packages.
§ Login Nodes – Interactive use, code development.
  General: scc1.bu.edu, scc2.bu.edu
  Earth & Environment Dept. Users: geo.bu.edu
  BUMC and Restricted Data Users: scc4.bu.edu
§ Compute Nodes – Run “Batch Jobs” on these, both single and multi-processor. Names like scc-bc5.bu.edu

Using the SCC - Basics
§ This tutorial covers only the very basics of Linux on the SCC. Please consider taking a fuller Linux tutorial from us or online if you end up using the SCC significantly.
§ Our web site has some material for new users of Linux and the SCC at: http://www.bu.edu/tech/support/research/system-usage/getting-started/commands/

Using the SCC – ssh
§ From your ssh/terminal application on your tutorial workstation or your laptop or on a machine at home:
    ssh -l adftest2 scc1.bu.edu
§ “ssh” is the command you are issuing
§ “-l adftest2” is a “command line option” to specify your login name on the SCC
§ “scc1.bu.edu” is a “parameter” of the command
§ Make sure to hit the “Enter” key after every command

Using the SCC - Logging In
Windows/MobaXterm
    local_prompt% ssh adftest2@scc1.bu.edu
Mac
    local_prompt% ssh -Y adftest2@scc1.bu.edu
Linux
    local_prompt% ssh -X adftest2@scc1.bu.edu

SFTP File Transfer to/from the SCC
§ Graphical Applications
  § Windows – MobaXterm (Free), WinSCP (Free)
  § Mac – FileZilla (Free), Fetch (BU site license)
§ Command Line Applications
  § rsync
  § scp
§ http://www.bu.edu/tech/support/research/system-usage/getting-started/get-started-file-transfer/

File Transfer Issues – dos2unix
§ Windows, Mac, and Linux systems mark the “end of line” in text files differently. To solve this issue, there is a utility called dos2unix. This is not an issue with binary files.
§ Transfer the text file “example.txt” from Windows to Linux, then rewrite “example.txt” as a Linux-style file:
    [adftest2@scc1 ~]$ dos2unix example.txt
§ https://www.computerhope.com/unix/dos2unix.htm

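To make the line-ending issue concrete, the sketch below creates a small file with Windows-style (CRLF) endings, counts the lines that contain a carriage return, and converts the file. The file and variable names are made up for illustration, and the tr fallback is only an assumption for machines where dos2unix is not installed.

```shell
#!/bin/bash
# Create a small text file with Windows-style (CRLF) line endings.
tmpdir=$(mktemp -d)
printf 'first line\r\nsecond line\r\n' > "$tmpdir/example.txt"

# A literal carriage-return character to search for.
cr=$(printf '\r')

# Count the lines that contain a carriage return before conversion.
cr_before=$(grep -c "$cr" "$tmpdir/example.txt")

# Convert in place with dos2unix if it is installed; otherwise strip the
# carriage returns with tr as a fallback.
if command -v dos2unix >/dev/null 2>&1; then
    dos2unix "$tmpdir/example.txt"
else
    tr -d '\r' < "$tmpdir/example.txt" > "$tmpdir/example.unix" &&
        mv "$tmpdir/example.unix" "$tmpdir/example.txt"
fi

# Count again; grep -c prints 0 when nothing matches.
cr_after=$(grep -c "$cr" "$tmpdir/example.txt")
echo "lines with CR before: $cr_before, after: $cr_after"
```
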
Using the SCC – the “prompt”
§ You should now see something like:
    [adftest2@scc1 ~]$
§ This is what is called the “prompt” and indicates the system (the bash “shell” in particular) is ready to accept commands from you. “adftest2” is your login name. “scc1” is the machine you are on. “~” is the directory you are in – in Linux “~” is a shorthand for a person’s home directory.

Using the SCC - X-Forwarding (Graphics)
§ Run the command xclock to see if graphics are working for you.
    [adftest2@scc1 ~]$ xclock
§ A window similar to the image on the right should come up.
§ Click the X in the upper right to close this window.
§ http://www.bu.edu/tech/support/research/system-usage/getting-started/x-forwarding/

Using the SCC – pwd
§ Show the current “full path”, the directory you are in with its parent and all levels of grandparents up to the root directory (/). Items you type will be shown in bold:
    [adftest2@scc1 ~]$ pwd
    /usr2/collab/adftest2
§ Here the command pwd “returns” (prints to your screen) the result “/usr2/collab/adftest2”

Using the SCC – man
§ The man (short for “manual”) command is used to look up information about a Linux command.
    [adftest2@scc1 ~]$ man pwd
    PWD(1)                User Commands                PWD(1)
    NAME
        pwd - print name of current/working…
    SYNOPSIS
        pwd [OPTION]...
    …

Using the SCC – man cont.
§ For some commands, such as cd, running man (for example, man cd) brings up a general manual page for the bash shell rather than a page specific to that command, as you get for pwd.
§ You can page through the manual page for a command a screenful at a time using the “spacebar”, a line at a time using the “Enter” key, and quit out of the page by typing q.

Using the SCC – mkdir
§ Create a new directory:
    [adftest2@scc1 ~]$ mkdir newdir
§ This creates a new directory (folder) within your home directory in which to store files.

Using the SCC – ls
§ List the contents of a directory:
    [adftest2@scc1 ~]$ ls
    newdir
§ Or with a command line option, asking for more details:
    [adftest2@scc1 ~]$ ls -l
    total 0
    drwxr-xr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir

Using the SCC – Users, Groups, and File Permissions
§ The SCC is a Multiple User System:
  § Many users.
  § Many groups/projects.
  § Users can belong to multiple groups.
§ File Access Control:
  § Every file has an owner.
  § Every file belongs to a group.
  § Every file has “permissions” controlling access to it.

Using the SCC – File Permissions
§ From the previous slide:
    drwxr-xr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir
§ “drwxr-xr-x” gives the “permissions” for this directory (or file). The “d” indicates this is a directory. There are then three sets of three characters for “user” (u), “group” (g), and “other” (o) access levels. “r” indicates a file/directory is readable, “w” writable, and “x” executable. A “-” indicates no such permission exists.

Using the SCC - chmod
§ Change the permissions on the directory “newdir” so that members of your group can write to it:
    [adftest2@scc1 ~]$ chmod g+w newdir
§ and note the difference:
    [adftest2@scc1 ~]$ ls -l
    total 0
    drwxrwxr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir

Using the SCC – chmod cont.
§ The chmod command also works with the following mappings, readable=4, writable=2, executable=1, which are combined like so:
    [adftest2@scc1 ~]$ ls -ld newdir
    drwxrwxr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir
    [adftest2@scc1 ~]$ chmod 750 newdir
    [adftest2@scc1 ~]$ ls -ld newdir
    drwxr-x--- 3 adftest2 adftest 512 …
    rwx (4+2+1=7)  r-x (4+0+1=5)  --- (0+0+0=0)

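The symbolic and numeric forms of chmod can be compared side by side on a throwaway directory, as in the sketch below. It assumes GNU stat (the -c '%a' option, as found on Linux systems) to print the mode as octal digits; the directory and variable names are illustrative only.

```shell
#!/bin/bash
# Work in a temporary directory so that nothing real is changed.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/newdir"

# Symbolic form: start from rwxr-xr-x, then add group write permission.
chmod 755 "$tmpdir/newdir"
chmod g+w "$tmpdir/newdir"
mode_symbolic=$(stat -c '%a' "$tmpdir/newdir")   # assumes GNU stat

# Numeric form: user 4+2+1=7, group 4+0+1=5, other 0+0+0=0.
chmod 750 "$tmpdir/newdir"
mode_numeric=$(stat -c '%a' "$tmpdir/newdir")

echo "after g+w: $mode_symbolic, after chmod 750: $mode_numeric"
```

The symbolic form changes one permission and leaves the rest alone, while the numeric form sets all nine permission bits at once.
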
Using the SCC - cd
§ Change directory to “newdir”:
    [adftest2@scc1 ~]$ cd newdir
§ You can also move to other directories by giving a “full path” (a path starting with the / character) such as:
    [adftest2@scc1 newdir]$ cd /usr/local/bin/
§ Type just cd anytime to go back to your home directory.

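The difference between full and relative paths can be tried safely in a temporary directory tree. The sketch below is illustrative only; the directory name sub and the variable names are assumptions, and pwd -P is used so that the printed path is the physical one even if the temporary directory sits behind a symbolic link.

```shell
#!/bin/bash
# Build a small directory tree in a temporary location.
base=$(mktemp -d)
mkdir -p "$base/newdir/sub"

# Full path: starts with / and works from anywhere.
cd "$base/newdir"
abs_result=$(pwd -P)

# Relative path: interpreted from the current directory.
cd sub
rel_result=$(pwd -P)

# ".." names the parent of the current directory.
cd ..
parent_result=$(pwd -P)

# Plain "cd" goes back to the home directory.
cd
echo "full: $abs_result"
echo "relative: $rel_result"
echo "parent: $parent_result"
```
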
Using the SCC – cp (Start C Example)
§ We will now begin a sequence of commands to compile and run a very simple C code. We start by copying the code from our “examples” directory into the current directory, which can be abbreviated by the . (period) character:
    [adftest2@scc1 newdir]$ cp /project/scv/examples/c/examples/ex01-helloworld/helloWorld.c .

Using the SCC - more
§ Look at the contents of the C source code file we just copied using the more command:
    [adftest2@scc1 newdir]$ more helloWorld.c
    #include <stdio.h>
    int main(int argc, char *argv[])
    {
        /* print message */
        printf("Hello, World!\n");
        return (0);
    }

Using the SCC - gcc
§ Compile the source code file we just copied into the binary file hello using the GNU C compiler gcc:
    [adftest2@scc1 newdir]$ gcc -o hello helloWorld.c
§ The “-o hello” option causes the output file to be named “hello”. Without this, it would be named “a.out” regardless of the name of your source code file.

Using the SCC – File Execution
§ Note that the compiled file is automatically made “executable”:
    [adftest2@scc1 newdir]$ ls -l hello
    -rwxr-xr-x 1 adftest2 adftest 6430 Oct 28 15:49 hello
§ Now we run the command from the current directory:
    [adftest2@scc1 newdir]$ hello
    Hello, World!

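The copy, compile, and run steps can also be collected into one script. The sketch below is a minimal illustration that writes the example source to a temporary directory instead of copying it from /project, and that skips compilation when gcc is not installed; the variable names are assumptions. It runs the program by explicit path, which works whether or not the current directory is on your PATH.

```shell
#!/bin/bash
# Write the example source file to a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/helloWorld.c" <<'EOF'
#include <stdio.h>

int main(int argc, char *argv[])
{
    /* print message */
    printf("Hello, World!\n");
    return 0;
}
EOF

# Compile only if gcc is available on this machine.
if command -v gcc >/dev/null 2>&1; then
    # -o names the output file; without it gcc writes a.out.
    gcc -o "$tmpdir/hello" "$tmpdir/helloWorld.c"
    # Run the program by explicit path and capture what it prints.
    hello_output=$("$tmpdir/hello")
else
    hello_output="gcc not found"
fi
echo "$hello_output"
```
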
Using the SCC – qsub and qstat
§ Use the Open Grid Scheduler (OGS) command qsub to submit our compiled program to the batch system:
    [adftest2@scc1 newdir]$ qsub -b y hello
    Your job 1041461 ("hello") has been submitted
§ If you are quick, you can monitor this job using qstat:
    [adftest2@scc1 newdir]$ qstat -u adftest2
    job-ID   prior    name   user      state  submit/start at      queue …
    ------------------------------------ …
    1041461  0.00000  hello  adftest2  qw     09/02/2014 11:44:28  …

Using the SCC – qsub output
§ The job should run soon and produce an output file:
    [adftest2@scc1 newdir]$ cat hello.o1041461
    Hello, World!
§ There will also be an error file, which should be empty:
    [adftest2@scc1 newdir]$ cat hello.e1041461

Using the SCC – qsub Details
§ Submit non-interactive batch jobs using:
    qsub [options] command [arguments]
§ Setting default qsub options using a .sge_request file: http://www.bu.edu/tech/support/research/system-usage/running-jobs/advanced-batch/#sge_request
§ http://www.bu.edu/tech/support/research/system-usage/running-jobs/submitting-jobs/

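Options can also be embedded in a job script instead of being typed on the qsub command line. The sketch below writes such a script and submits it only if qsub is present, otherwise it runs the script directly so that it can be tried on any machine. The script name hello_job.sh and the job name are hypothetical; -N, -j, -l h_rt, and -cwd are standard Grid Engine qsub options, but which options and defaults apply on the SCC is not stated in these slides.

```shell
#!/bin/bash
# Hypothetical sketch: write a small batch script with embedded options.
tmpdir=$(mktemp -d)
cat > "$tmpdir/hello_job.sh" <<'EOF'
#!/bin/bash
# Lines that start with "#$" are read by the scheduler as qsub options.
# Job name:
#$ -N hello_job
# Merge the error stream into the output file:
#$ -j y
# Hard run-time limit of ten minutes:
#$ -l h_rt=00:10:00
# Run in the directory the job was submitted from:
#$ -cwd

echo "Running on $(uname -n)"
EOF

# Submit if the scheduler is available; otherwise run the script directly
# (the "#$" lines are then ordinary comments).
if command -v qsub >/dev/null 2>&1; then
    (cd "$tmpdir" && qsub hello_job.sh)
    job_mode="submitted"
else
    job_output=$(bash "$tmpdir/hello_job.sh")
    job_mode="ran locally"
    echo "$job_output"
fi
```

Keeping the options in the script records how a job was run and avoids retyping them for each submission.
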
Using the SCC – qsub options

Using the SCC – qsub options cont.

Interactive Batch Jobs
§ Used for doing interactive work, such as in MATLAB, that takes more than 15 minutes of CPU time.
    [adftest2@scc1 newdir]$ qsh -l h_rt=24:00
    Your job 5274760 ("INTERACTIVE") has been submitted
    waiting for interactive job to be scheduled ...
    Your interactive job 5274760 has been successfully scheduled.
§ A new window comes up after a little while:
    [adftest2@scc-pi4 newdir]$ matlab -singleCompThread
§ http://www.bu.edu/tech/support/research/system-usage/running-jobs/interactive-jobs/

Using the SCC - .bashrc file
§ You have a .bashrc (.cshrc for tcsh users) in your home directory. Commands in this file are automatically executed every time you log in. Do not put commands like “echo” in this file.
§ Modify this file to change your default system behaviors or automatically run certain commands when you log in.
§ Be careful when modifying this file, or you could make it impossible to log in to the system; contact us if that happens.
§ http://www.bu.edu/tech/support/research/system-usage/using-scc/environment/

Using the SCC – gedit GUI Editor
§ Try launching a graphical application, such as gedit:
    [adftest2@scc1 newdir]$ gedit ~/.bashrc &
§ Assuming you have X Forwarding set up, this should bring up the simple editor gedit in a separate window so that you can edit the file. Other editors such as emacs and vi are also available.

Using the SCC – Editing your .bashrc
§ Add the following line to your .bashrc file:
    alias dir='ls -al'
§ Save the new .bashrc file and run in your shell window:
    [adftest2@scc1 newdir]$ source ~/.bashrc
§ Test out the new command:
    [adftest2@scc1 newdir]$ dir hello
    -rwxr-xr-x 1 adftest2 adftest 6494 Jan 22 11:22 hello

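To experiment with an alias without touching your real .bashrc, the same line can be placed in a temporary file and loaded from there. This is only a sketch; the temporary file stands in for ~/.bashrc, and the "." command used here is the portable spelling of bash's source.

```shell
#!/bin/bash
# Hypothetical sketch: a temporary file stands in for ~/.bashrc.
rcfile=$(mktemp)
cat > "$rcfile" <<'EOF'
# Shorthand for a long listing that includes hidden files.
alias dir='ls -al'
EOF

# Load the definitions into the current shell ("." is the same as "source").
. "$rcfile"

# Print the alias definition to confirm that it was loaded. Inside a
# script bash does not expand aliases by default; at an interactive
# prompt, typing "dir" would now run "ls -al".
alias dir
```
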
Using the SCC – Modules (1)
§ Modules – Used to load applications not automatically loaded by the system, including alternative versions of applications.
§ See if I have access to the envi program:
    [adftest2@scc1 newdir]$ which envi
    /usr/bin/which: no envi in (/usr/local/apps/pgi13.5/bin:/usr/java/default/jre/bin:/usr/java/default/bin:/usr/lib64/qt3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr2/collab/adftest2/bin)
§ http://www.bu.edu/tech/support/research/software-and-programming/software-and-applications/modules/

Using the SCC – Modules cont. (2)
§ See what other modules are available to me:
    [adftest2@scc1 newdir]$ module avail | grep envi
    envi/4.8
    envi/5.0_sp3
    envi/5.4

Using the SCC – Modules cont. (3)
§ Load the module I need from the earlier long list:
    [adftest2@scc1 newdir]$ module load envi/5.4
§ See if I now have access to the envi program:
    [adftest2@scc1 newdir]$ which envi
    envi: aliased to /share/pkg/envi/5.4/install/envi54//bin/envi

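The three module steps (search, load, verify) can be combined into a short script. The sketch below is an illustration under two assumptions: that module is a shell function that exists only where the Modules system has been initialized, and that the avail listing may be printed on the error stream, as some Modules versions do. The variable name module_status is made up.

```shell
#!/bin/bash
# Check that the "module" command exists before calling it.
if type module >/dev/null 2>&1; then
    # Merge stderr into stdout so that grep sees the listing either way.
    module avail 2>&1 | grep -i envi
    # Load one version, then check whether the program can now be found.
    module load envi/5.4
    command -v envi
    module_status="checked"
else
    module_status="unavailable"
    echo "module command not found on this machine"
fi
```
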
Using the SCC - grep
§ grep is a useful command for searching for a text string in a file, such as:
    [adftest2@scc1 newdir]$ grep -i hello *
    Binary file hello matches
    hello.o1041461:Hello, World!
    helloWorld.c:  printf("Hello, World!\n");
§ * is a special “wildcard” character that matches all filenames. You can also limit the search to certain files by using, for example, *.c

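The effect of narrowing the wildcard can be seen with a few small files. The sketch below creates them in a temporary directory; the file other.c and the variable names are invented for illustration, and -l (list only the names of matching files) is a standard grep option that the slide does not mention.

```shell
#!/bin/bash
# Create three small files to search.
tmpdir=$(mktemp -d)
printf 'Hello, World!\n' > "$tmpdir/hello.o1041461"
printf 'int main(void) { return 0; }\n' > "$tmpdir/other.c"
printf 'printf("Hello, World!\\n");\n' > "$tmpdir/helloWorld.c"
cd "$tmpdir"

# -i ignores case; * expands to every file name in the directory.
all_matches=$(grep -i hello *)

# *.c restricts the search to files whose names end in .c.
c_matches=$(grep -i hello *.c)

# -l prints only the names of the files that contain a match.
c_files=$(grep -il hello *.c)

echo "$all_matches"
echo "matching C files: $c_files"
```
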
Using the SCC - Pipes
§ You can also chain commands together with “pipes”, so that the output of one command becomes the input of the next:
    [adftest2@scc1 newdir]$ cat helloWorld.c | grep print
    /* print message */
    printf( "Hello, World!\n" );
§ You can also redirect the output of a command to a file using > myfilename

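Pipes and redirection can be combined freely. The sketch below is a small illustration with invented file names: it counts matching lines through a two-stage pipe, then writes matches to a file with > and appends to it with >>, a standard shell operator the slide does not cover.

```shell
#!/bin/bash
# Create a small input file.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\nalpha beta\ngamma\n' > "$tmpdir/words.txt"

# Pipe: the output of cat is filtered by grep, and wc -l counts the lines.
alpha_count=$(cat "$tmpdir/words.txt" | grep alpha | wc -l)

# Redirect: > creates or overwrites a file, >> appends to it.
grep beta "$tmpdir/words.txt" > "$tmpdir/matches.txt"
echo "done" >> "$tmpdir/matches.txt"

echo "lines containing alpha: $alpha_count"
cat "$tmpdir/matches.txt"
```
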
Additional Web Resources
§ Research Computing Support Pages
  http://www.bu.edu/tech/support/research/
§ Technical Summary of SCC Resources
  http://www.bu.edu/tech/support/research/computing-resources/tech-summary/
§ SCC Updates – Latest SCC News
  http://www.bu.edu/tech/support/research/whats-happening/updates/
§ Code Examples for Popular Software Packages
  http://scv.bu.edu/examples/

Questions
§ All SCC questions welcome. For those I can’t answer now, I will make every effort to get you an answer later.
§ Email Addresses:
  § Aaron Fuegi – aarondf@bu.edu
  § General help using the SCC – help@scc.bu.edu

Tutorial Survey
§ Please open a web browser and go to http://scv.bu.edu/survey/tutorial_evaluation.html to fill out the tutorial survey. Thanks for coming.
§ Tutorial slides are available on the web from: http://www.bu.edu/tech/support/research/training-consulting/live-tutorials/
§ My Contact Information: aarondf@bu.edu or (617) 353-8255