Sameer Shende and John Linford
ParaTools, Inc.
{sameer, jlinford}@paratools.com
Aug 1, 2012, ITL Building 8000, Salon V, ERDC, Vicksburg, MS
For Official Use Only – not for secondary distribution

Outline
Part I: PTools.RTE Introduction
Part II: Parallel Programming Examples
Part III: Interactive Computing with IPython Notebook
Part IV: Callstack Debugging with TAU

Part I: PTools.RTE Introduction

PTools.RTE Goal
Provide a rich and consistent environment for parallel Python program development, deployment, and debugging on all DSRC systems.

PTools.RTE will make your life easier
Your Application: Python >= 2.6, NumPy, SWIG, mpi4py
Installing missing requirements can be a pain!
Platform and compiler combinations: AIX + XL, Cray + PGI, SGI + GNU

A consistent Python environment + extras
Your Application: Python >= 2.6, NumPy, SWIG, mpi4py
Provided layer: NumPy, TAU, SWIG, IPython 2.7.2, Missing Libraries
Platform and compiler combinations: AIX + XL, Cray + PGI, SGI + GNU

PTools.RTE is installed on all DSRC systems

How do I use PTools.RTE?
Just load the module:
• module use $PET_HOME/pkgs/ptoolsrte/etc
• module load ptoolsrte
Or source a shell script:
    Bash:    source $PET_HOME/pkgs/ptoolsrte/etc/ptoolsrte.bashrc
    C Shell: source $PET_HOME/pkgs/ptoolsrte/etc/ptoolsrte.cshrc
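
Once the module is loaded, a quick sanity check is to confirm that the expected packages import. The following is a minimal sketch, assuming the package names mentioned on the earlier slides; the helper name check_packages is hypothetical and is not part of PTools.RTE.

```python
import importlib


def check_packages(names):
    """Return {name: version string, or None if the import fails}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, version in sorted(check_packages(["numpy", "mpi4py"]).items()):
        print("%-8s %s" % (name, version if version else "NOT FOUND"))
```

A package reported as NOT FOUND usually means the module was not loaded in the current shell.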

Part II: Parallel Programming Examples

Write a parallel program with mpi4py
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    print 'Hello from rank %d' % rank
    if rank == 0:
        data = {'a': 7, 'b': 3.14}
        comm.send(data, dest=1, tag=11)
    elif rank == 1:
        data = comm.recv(source=0, tag=11)
        print 'Rank %d received %s' % (rank, str(data))
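
The lowercase comm.send and comm.recv methods accept arbitrary Python objects because mpi4py serializes them with pickle before transmission. The sketch below reproduces that round trip without MPI, as an illustration of the mechanism only; fake_send and fake_recv are hypothetical stand-ins, not mpi4py functions.

```python
import pickle


def fake_send(obj):
    """Serialize a Python object to bytes, standing in for comm.send."""
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def fake_recv(payload):
    """Rebuild the object from bytes, standing in for comm.recv."""
    return pickle.loads(payload)


data = {'a': 7, 'b': 3.14}
received = fake_recv(fake_send(data))
print('received %s' % str(received))
```

The receiver gets an equal copy of the dictionary, not the same object, which is why changes made on one rank are never visible on another.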

How to run an mpi4py application
1. qsub -I ...
2. module use $PET_HOME/pkgs/ptoolsrte/etc
3. module load ptoolsrte
4. mpirun -np {n} python mpi4py_p2p.py

Use NumPy for computation and mpi4py for communication
    import numpy as np
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    rank = comm.rank
    # size (global vector length) and iter (iteration count) must be set before this point
    my_size = size // comm.size
    size = comm.size*my_size      # round size down to a multiple of the rank count
    my_offset = rank*my_size
    vec = np.zeros(size)
    # ... Initialize vec ...
    my_M = np.zeros((my_size, size))
    # ... Initialize my_M ...
    for t in xrange(iter):
        my_new_vec = np.inner(my_M, vec)
        comm.Allgather([my_new_vec, MPI.DOUBLE], [vec, MPI.DOUBLE])
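
In this pattern each rank owns my_size rows of the matrix, computes its slice of the product, and Allgather concatenates the slices in rank order so that every rank ends up with the full vector. The sketch below simulates that decomposition serially, using plain lists so that it runs without MPI or NumPy. The matrix values, nprocs, and both helper functions are made up for illustration.

```python
def matvec_rows(rows, vec):
    """Multiply a block of matrix rows by a vector (like np.inner(my_M, vec))."""
    return [sum(m * v for m, v in zip(row, vec)) for row in rows]


def simulated_allgather(blocks):
    """Concatenate per-rank blocks in rank order, as Allgather would."""
    gathered = []
    for block in blocks:
        gathered.extend(block)
    return gathered


nprocs = 2                  # pretend number of MPI ranks (assumption)
size = 4                    # global vector length, divisible by nprocs
my_size = size // nprocs    # rows owned by each rank

# A small illustrative matrix and vector (made-up values).
M = [[1, 0, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 3, 0],
     [1, 1, 1, 1]]
vec = [1, 1, 1, 1]

# Each simulated rank multiplies only its own rows.
blocks = []
for rank in range(nprocs):
    my_offset = rank * my_size
    my_M = M[my_offset:my_offset + my_size]
    blocks.append(matvec_rows(my_M, vec))

new_vec = simulated_allgather(blocks)
print(new_vec)
```

The gathered result equals the product computed over the whole matrix at once, which is the property the parallel version relies on.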

Write a parallel program that uses pyMPI
    import mpi
    rank = mpi.rank
    size = mpi.size
    if rank == 0:
        n = int(100)
        mpi.bcast(n)
    else:
        n = mpi.bcast()
    h = 1.0/n
    local_sum = 0.0
    for i in range(rank+1, n+1, size):
        x = h*(i-0.5)
        y = 4.0/(1.0+x*x)
        local_sum += y
    global_sum = mpi.reduce(local_sum, mpi.SUM)
    if mpi.rank == 0:
        print "PI = %f " % (h*global_sum)
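
The program approximates the integral of 4/(1+x*x) over [0, 1] with the midpoint rule, dealing the n intervals out to the ranks in round-robin order and summing the partial results in the reduce step. The sketch below simulates the ranks in a serial loop to show that the estimate does not depend on how many ranks share the work; the function names are hypothetical and no MPI library is needed.

```python
import math


def local_pi_sum(rank, size, n):
    """Midpoint-rule partial sum for the intervals assigned to one rank."""
    h = 1.0 / n
    local_sum = 0.0
    # Cyclic distribution: rank r handles intervals r+1, r+1+size, ...
    for i in range(rank + 1, n + 1, size):
        x = h * (i - 0.5)
        local_sum += 4.0 / (1.0 + x * x)
    return local_sum


def estimate_pi(size, n=100):
    """Sum the partial results of all simulated ranks (the reduce step)."""
    h = 1.0 / n
    return h * sum(local_pi_sum(rank, size, n) for rank in range(size))


if __name__ == "__main__":
    for size in (1, 4, 8):
        estimate = estimate_pi(size)
        print("ranks=%d  PI = %f  error = %.2e"
              % (size, estimate, abs(estimate - math.pi)))
```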

How to run a pyMPI application
1. qsub -I ...
2. module use $PET_HOME/pkgs/ptoolsrte/etc
3. module load ptoolsrte
4. mpirun -np {n} pyMPI pympi_pi.py

Part III: Interactive Computing with IPython Notebook

IPython Notebook
• An open-source, Python-based toolkit.
• Interactive scientific and parallel computing.
• Mathematica-like, MATLAB-like.
• Symbolic computation.
• Renders in your web browser.

Launch IPython Notebook remotely
Terminal 1: Start the notebook
    kinit
    ssh diamond03.erdc.hpc.mil
    module use $PET_HOME/pkgs/ptoolsrte/etc
    module load ptoolsrte
    ipython notebook --pylab=inline --no-browser
    ...
    The IPython Notebook is running at: http://127.0.0.1:8888
    ...
Terminal 2: Use ssh to forward the notebook port
    kinit
    ssh -gN -L 8888:localhost:8888 diamond03.erdc.hpc.mil
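
Before opening a browser, it can help to confirm that something is listening on the forwarded port on the local machine. A minimal sketch using only the standard library follows; the helper port_is_open is hypothetical, and port 8888 is taken from the ssh command above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


if __name__ == "__main__":
    if port_is_open("127.0.0.1", 8888):
        print("Port 8888 is open; the tunnel appears to be up.")
    else:
        print("Nothing is listening on port 8888; check Terminal 2.")
```

An open port only shows that the tunnel endpoint exists; the notebook in Terminal 1 must still be running for pages to load.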

Use IPython Notebook locally
Open the forwarded port (http://localhost:8888) in your local web browser.
Standards-compliant browser required! Firefox, Chrome, and Safari are OK.

2D matplotlib graphics

3D matplotlib graphics

mpl_toolkits.basemap by popular demand

Parallel computing with IPython
(Diagram labels: ipcontroller, ipengine)

PBS scripts to start ipcontroller and ipengine on Diamond
ipcontroller.pbs:
    #PBS -A ARLAP96070PET
    #PBS -q debug
    #PBS -l select=1:ncpus=8:mpiprocs=8
    #PBS -l place=scatter:excl
    #PBS -l walltime=00:30:00
    #PBS -N ipcontroller
    #PBS -j oe
    #PBS -S /bin/bash
    cd ${WORKDIR}
    source $PET_HOME/pkgs/ptoolsrte/etc/ptoolsrte.bashrc
    ipcontroller --ip=*
ipengine.pbs:
    #PBS -A ARLAP96070PET
    #PBS -q debug
    #PBS -l select=8:ncpus=8:mpiprocs=8
    #PBS -l place=scatter:excl
    #PBS -l walltime=00:30:00
    #PBS -N ipcontroller
    #PBS -j oe
    #PBS -S /bin/bash
    cd ${WORKDIR}
    source $PET_HOME/pkgs/ptoolsrte/etc/ptoolsrte.bashrc
    mpiexec_mpt -np 64 ipengine --mpi=mpi4py
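
The -np 64 given to mpiexec_mpt matches the resources requested by select=8:ncpus=8:mpiprocs=8, that is, 8 chunks with 8 MPI processes each. A small sketch of that arithmetic follows. The helper total_mpi_ranks is hypothetical and handles only the single-chunk-type form shown on the slide.

```python
def total_mpi_ranks(select_spec):
    """Return chunks * mpiprocs for a spec like 'select=8:ncpus=8:mpiprocs=8'."""
    fields = dict(item.split("=", 1) for item in select_spec.split(":"))
    chunks = int(fields["select"])
    # Assumption for this sketch: one rank per chunk if mpiprocs is omitted.
    mpiprocs = int(fields.get("mpiprocs", 1))
    return chunks * mpiprocs


print(total_mpi_ranks("select=8:ncpus=8:mpiprocs=8"))  # 64
print(total_mpi_ranks("select=1:ncpus=8:mpiprocs=8"))  # 8
```

If the select line changes, the -np value in the script needs to change with it.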

Start your engines
1. qsub ipcontroller.pbs
2. qsub ipengine.pbs

Connect to the controller and execute
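
The slide itself is a screenshot, so the code it showed is not recoverable. The following sketch shows one common way to connect from a notebook cell and run a function on every engine. It assumes the controller's connection file is readable from where the notebook runs. The function names square and run_on_engines are hypothetical. The import is done lazily because the module name depends on the IPython release.

```python
def square(x):
    """Work function shipped to the engines."""
    return x * x


def run_on_engines(values):
    """Map square over values using all engines of a running cluster."""
    # IPython.parallel is the module name in 2012-era IPython releases;
    # newer installations ship it as the separate ipyparallel package.
    try:
        from IPython.parallel import Client
    except ImportError:
        from ipyparallel import Client
    rc = Client()        # reads the connection file written by ipcontroller
    view = rc[:]         # DirectView over all connected engines
    print("Connected to %d engines" % len(rc.ids))
    return view.map_sync(square, values)
```

With the two PBS jobs running, calling run_on_engines(range(16)) from the notebook splits the input across the engines and returns the results in input order.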

Part IV: Callstack Debugging with TAU

Segfault! What do you do?
Need to debug multi-language codes, even if the input files can’t be shared.
Core files aren’t much help here. Most tools are mono-lingual.
(Diagram labels: Python, Callpath, C++, Fortran)
Use TAU!
    export TAU_TRACK_SIGNALS=1

Real-world example: CREATE-AV Helios

Helios: a multi-language program

Rebuild with ‘-g’ to get debugging symbols
Note: you can skip this step, but your backtrace will be less informative.

Create a wrapper file
wrapper.py:
    import tau
    def OurMain():
        import samarcrun    # Your application name here
    tau.run('OurMain()')

Run with TAU
1. export TAU_TRACK_SIGNALS=1
2. export TAU_CALLPATH_DEPTH=100
3. mpirun -np {n} tau_exec -T python pyMPI ./wrapper.py
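
The same three steps can be scripted so that the environment variables and the command line always travel together. A minimal sketch, assuming the command shown above; the helper tau_launch is hypothetical and only builds the environment and argument list, leaving the launch itself to the caller.

```python
import os


def tau_launch(nprocs, script="./wrapper.py", callpath_depth=100):
    """Return (env, argv) for a TAU-instrumented pyMPI run."""
    env = dict(os.environ)
    env["TAU_TRACK_SIGNALS"] = "1"                    # step 1 on the slide
    env["TAU_CALLPATH_DEPTH"] = str(callpath_depth)   # step 2 on the slide
    argv = ["mpirun", "-np", str(nprocs),
            "tau_exec", "-T", "python", "pyMPI", script]
    return env, argv


env, argv = tau_launch(4)
print(" ".join(argv))
# To actually launch: import subprocess; subprocess.call(argv, env=env)
```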

TAU generates profile data when the error occurs

Use ParaProf to explore profile data

Right-click the thread you want to explore

Use the Metadata window to locate the line of code that caused the error

ParaProf highlights the erroneous line

Summary
• Use PTools.RTE for a rich and consistent Python environment on all DSRC systems
• mpi4py, numpy, pyMPI, matplotlib, and many more packages are included
• Use IPython Notebook to interact with all the features of PTools.RTE, including large-scale parallel deployment
• Use TAU callstack debugging to find and fix errors

Acknowledgments
This work was supported by the DoD High Performance Computing Modernization Program (HPCMP) User Productivity Enhancement, Technology Transfer and Training (PETTT) program and through support provided by the DoD HPCMO to the HIARMS Institute and the CREATE program. Special thanks to Dr. Andrew Wissink and Stephen Adamec.