Parallel HDF5 Introductory Tutorial

HDF5 Tutorial, June 12, 2006
Albert Cheng
hdfhelp@ncsa.uiuc.edu

Outline
• Overview of Parallel HDF5 design
• Setting up parallel environment
• Programming model for
  – Creating and accessing a File
  – Creating and accessing a Dataset
  – Writing and reading Hyperslabs
• Parallel tutorial available at
  – http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor

Overview of Parallel HDF5 Design

PHDF5 Requirements
• Support MPI programming
• PHDF5 files compatible with serial HDF5 files
  – Shareable between different serial or parallel platforms
• Single file image to all processes
  – One file per process design is undesirable
    • Expensive post-processing
    • Not usable by a different number of processes
• Standard parallel I/O interface
  – Must be portable to different platforms

PHDF5 Implementation Layers
[Figure: layer diagram. Labels: User Applications (Parallel Application); HDF library (Parallel HDF5 + MPI-IO); Parallel I/O layer; Parallel file systems (local file system "NFS", GPFS/PVFS, Lustre/PFS…)]

Parallel Environment Requirements
• MPI with MPI-IO
  – MPICH ROMIO
  – Vendor’s MPI-IO
• Parallel file system
  – GPFS
  – Lustre
  – PVFS
  – Specially configured NFS

How to Compile PHDF5 Applications
• h5pcc – HDF5 C compiler command
  – Similar to mpicc
• h5pfc – HDF5 F90 compiler command
  – Similar to mpif90
• To compile:
    % h5pcc h5prog.c
    % h5pfc h5prog.f90

h5pcc/h5pfc -show option
• -show displays the compiler commands and options without executing them, i.e., a dry run
• % h5pcc -show Sample_mpio.c
  – mpicc -I/afs/ncsa/projects/hdf/tst/pre-release/vdev/heping-pp/include -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_POSIX_SOURCE -D_BSD_SOURCE -std=c99 -c Sample_mpio.c
  – mpicc -std=c99 Sample_mpio.o -L/afs/ncsa/projects/hdf/tst/pre-release/vdev/heping-pp/libhdf5_hl.a /afs/ncsa/projects/hdf/tst/pre-release/vdev/heping-pp/libhdf5.a -lz -lm -Wl,-rpath -Wl,/afs/ncsa/projects/hdf/tst/pre-release/vdev/heping-pp/lib

Collective vs. Independent Calls
• MPI definition of collective call
  – All processes of the communicator must participate in the right order
• Independent means not collective
• Collective is not necessarily synchronous

Programming Restrictions
• Most PHDF5 APIs are collective
• PHDF5 opens a parallel file with a communicator
  – Returns a file handle
  – Future access to the file is via the file handle
  – All processes must participate in collective PHDF5 APIs
  – Different files can be opened via different communicators

Examples of PHDF5 API
• Examples of PHDF5 collective API
  – File operations: H5Fcreate, H5Fopen, H5Fclose
  – Object creation: H5Dcreate, H5Dopen, H5Dclose
  – Object structure: H5Dextend (increase dimension sizes)
• Array data transfer can be collective or independent
  – Dataset operations: H5Dwrite, H5Dread

What Does PHDF5 Support?
• After a file is opened by the processes of a communicator
  – All parts of the file are accessible by all processes
  – All objects in the file are accessible by all processes
  – Multiple processes may write to the same data array
  – Each process may write to an individual data array

PHDF5 API Languages
• C and F90 language interfaces
• Platforms supported:
  – Most platforms with MPI-IO supported
    • IBM SP, Linux clusters, HP Alpha Clusters, SGI IRIX64/Altix, …
  – Work in progress
    • Red Storm (Cray XT3), BlueGene/L

Creating and Accessing a File
Programming model
• HDF5 uses an access template object (property list) to control the file access mechanism
• General model to access an HDF5 file in parallel:
  – Set up MPI-IO access template (access property list)
  – Open File
  – Access Data
  – Close File

Setup access template
Each process of the MPI communicator creates an access template and sets it up with MPI parallel access information
C:
    herr_t H5Pset_fapl_mpio(hid_t plist_id, MPI_Comm comm, MPI_Info info);
F90:
    h5pset_fapl_mpio_f(plist_id, comm, info, hdferr)
      integer(hid_t) :: plist_id
      integer        :: comm, info, hdferr
plist_id is a file access property list identifier

C Example: Parallel File Create
    comm = MPI_COMM_WORLD;
    info = MPI_INFO_NULL;

    /*
     * Initialize MPI
     */
    MPI_Init(&argc, &argv);

    /*
     * Set up file access property list for MPI-IO access
     */
    plist_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(plist_id, comm, info);

    file_id = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id);

    /*
     * Close the file.
     */
    H5Fclose(file_id);

    MPI_Finalize();

F90 Example: Parallel File Create
    comm = MPI_COMM_WORLD
    info = MPI_INFO_NULL
    CALL MPI_INIT(mpierror)
    !
    ! Initialize FORTRAN predefined datatypes
    CALL h5open_f(error)
    !
    ! Setup file access property list for MPI-IO access.
    CALL h5pcreate_f(H5P_FILE_ACCESS_F, plist_id, error)
    CALL h5pset_fapl_mpio_f(plist_id, comm, info, error)
    !
    ! Create the file collectively.
    CALL h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error, access_prp = plist_id)
    !
    ! Close the file.
    CALL h5fclose_f(file_id, error)
    !
    ! Close FORTRAN interface
    CALL h5close_f(error)
    CALL MPI_FINALIZE(mpierror)

Creating and Opening Dataset
• All processes of the communicator open/close a dataset by a collective call
  – C: H5Dcreate or H5Dopen; H5Dclose
  – F90: h5dcreate_f or h5dopen_f; h5dclose_f
• All processes of the communicator must extend an unlimited dimension dataset before writing to it
  – C: H5Dextend
  – F90: h5dextend_f

C Example: Parallel Dataset Create
    file_id = H5Fcreate(…);
    /*
     * Create the dataspace for the dataset.
     */
    dimsf[0] = NX;
    dimsf[1] = NY;
    filespace = H5Screate_simple(RANK, dimsf, NULL);

    /*
     * Create the dataset with default properties collectively.
     */
    dset_id = H5Dcreate(file_id, "dataset1", H5T_NATIVE_INT,
                        filespace, H5P_DEFAULT);

    H5Dclose(dset_id);
    /*
     * Close the file.
     */
    H5Fclose(file_id);

F90 Example: Parallel Dataset Create
    CALL h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error, access_prp = plist_id)
    CALL h5screate_simple_f(rank, dimsf, filespace, error)
    !
    ! Create the dataset with default properties.
    !
    CALL h5dcreate_f(file_id, "dataset1", H5T_NATIVE_INTEGER, filespace, dset_id, error)
    !
    ! Close the dataset.
    CALL h5dclose_f(dset_id, error)
    !
    ! Close the file.
    CALL h5fclose_f(file_id, error)

Accessing a Dataset
• All processes that have opened the dataset may do collective I/O
• Each process may do an arbitrary number of independent data I/O access calls
  – C: H5Dwrite and H5Dread
  – F90: h5dwrite_f and h5dread_f

Accessing a Dataset
Programming model
• Create and set dataset transfer property
  – C: H5Pset_dxpl_mpio
    – H5FD_MPIO_COLLECTIVE
    – H5FD_MPIO_INDEPENDENT (default)
  – F90: h5pset_dxpl_mpio_f
    – H5FD_MPIO_COLLECTIVE_F
    – H5FD_MPIO_INDEPENDENT_F (default)
• Access dataset with the defined transfer property

C Example: Collective write
    /*
     * Create property list for collective dataset write.
     */
    plist_id = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);

    status = H5Dwrite(dset_id, H5T_NATIVE_INT,
                      memspace, filespace, plist_id, data);

F90 Example: Collective write
    ! Create property list for collective dataset write
    !
    CALL h5pcreate_f(H5P_DATASET_XFER_F, plist_id, error)
    CALL h5pset_dxpl_mpio_f(plist_id, &
                            H5FD_MPIO_COLLECTIVE_F, error)
    !
    ! Write the dataset collectively.
    !
    CALL h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, data, &
                    error, &
                    file_space_id = filespace, &
                    mem_space_id = memspace, &
                    xfer_prp = plist_id)

Writing and Reading Hyperslabs
Programming model
• Distributed memory model: data is split among processes
• PHDF5 uses hyperslab model
• Each process defines memory and file hyperslabs
• Each process executes partial write/read call
  – Collective calls
  – Independent calls

Hyperslab Example 1
Writing dataset by rows
[Figure: rows of the dataset in the file assigned to processes P0, P1, P2, P3]

Writing by rows
Output of h5dump utility
    HDF5 "SDS_row.h5" {
    GROUP "/" {
       DATASET "IntArray" {
          DATATYPE  H5T_STD_I32BE
          DATASPACE  SIMPLE { ( 8, 5 ) / ( 8, 5 ) }
          DATA {
             10, 10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 13, 13
          }
       }
    }
    }

Example 1
Writing dataset by rows
[Figure: memory space of P1 mapped to its rows in the file, annotated with offset[0], offset[1], count[0]]
    count[0]  = dimsf[0]/mpi_size;
    count[1]  = dimsf[1];
    offset[0] = mpi_rank * count[0];    /* = 2 */
    offset[1] = 0;

C Example 1
    /*
     * Each process defines dataset in memory and
     * writes it to the hyperslab
     * in the file.
     */
    count[0] = dimsf[0]/mpi_size;
    count[1] = dimsf[1];
    offset[0] = mpi_rank * count[0];
    offset[1] = 0;
    memspace = H5Screate_simple(RANK, count, NULL);

    /*
     * Select hyperslab in the file.
     */
    filespace = H5Dget_space(dset_id);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, count, NULL);

Hyperslab Example 2
Writing dataset by columns
[Figure: columns of the dataset in the file assigned alternately to processes P0 and P1]

Writing by columns
Output of h5dump utility
    HDF5 "SDS_col.h5" {
    GROUP "/" {
       DATASET "IntArray" {
          DATATYPE  H5T_STD_I32BE
          DATASPACE  SIMPLE { ( 8, 6 ) / ( 8, 6 ) }
          DATA {
             1, 2, 10, 20, 100, 200, 1, 2, 10, 20, 100, 200, 1, 2, 10, 20, 100, 200
          }
       }
    }
    }

Example 2
Writing Dataset by Column
[Figure: memory buffers of P0 and P1 (dimsm[0] x dimsm[1]) mapped to interleaved columns in the file, annotated with offset[1], block[0], block[1], stride[1]]

C Example 2
    /*
     * Each process defines hyperslab in
     * the file
     */
    count[0] = 1;
    count[1] = dimsm[1];
    offset[0] = 0;
    offset[1] = mpi_rank;
    stride[0] = 1;
    stride[1] = 2;
    block[0] = dimsf[0];
    block[1] = 1;

    /*
     * Each process selects hyperslab.
     */
    filespace = H5Dget_space(dset_id);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, stride, count, block);

Hyperslab Example 3
Writing dataset by pattern
[Figure: elements of the dataset in the file assigned in a repeating pattern to processes P0, P1, P2, P3]

Writing by Pattern
Output of h5dump utility
    HDF5 "SDS_pat.h5" {
    GROUP "/" {
       DATASET "IntArray" {
          DATATYPE  H5T_STD_I32BE
          DATASPACE  SIMPLE { ( 8, 4 ) / ( 8, 4 ) }
          DATA {
             1, 3, 2, 4, 1, 3, 2, 4, 2, 4
          }
       }
    }
    }

Example 3
Writing dataset by pattern
[Figure: memory buffer of P2 mapped to strided elements in the file, annotated with offset[1], stride[0], stride[1], count[1]]
    offset[0] = 0;
    offset[1] = 1;
    count[0]  = 4;
    count[1]  = 2;
    stride[0] = 2;
    stride[1] = 2;

C Example 3: Writing by pattern
    /* Each process defines dataset in memory and
     * writes it to the hyperslab
     * in the file.
     */
    count[0] = 4;
    count[1] = 2;
    stride[0] = 2;
    stride[1] = 2;
    if(mpi_rank == 0) {
        offset[0] = 0;
        offset[1] = 0;
    }
    if(mpi_rank == 1) {
        offset[0] = 1;
        offset[1] = 0;
    }
    if(mpi_rank == 2) {
        offset[0] = 0;
        offset[1] = 1;
    }
    if(mpi_rank == 3) {
        offset[0] = 1;
        offset[1] = 1;
    }

Hyperslab Example 4
Writing dataset by chunks
[Figure: the dataset in the file divided into four rectangular chunks, one each for processes P0, P1, P2, P3]

Writing by Chunks
Output of h5dump utility
    HDF5 "SDS_chnk.h5" {
    GROUP "/" {
       DATASET "IntArray" {
          DATATYPE  H5T_STD_I32BE
          DATASPACE  SIMPLE { ( 8, 4 ) / ( 8, 4 ) }
          DATA {
             1, 1, 2, 2, 3, 3, 4, 4, 3, 3, 4, 4
          }
       }
    }
    }

Example 4
Writing dataset by chunks
[Figure: memory buffer of P2 mapped to its chunk in the file, annotated with offset[0], offset[1], chunk_dims[0], chunk_dims[1]]
    block[0]  = chunk_dims[0];
    block[1]  = chunk_dims[1];
    offset[0] = chunk_dims[0];
    offset[1] = 0;

C Example 4: Writing by chunks
    count[0] = 1;
    count[1] = 1;
    stride[0] = 1;
    stride[1] = 1;
    block[0] = chunk_dims[0];
    block[1] = chunk_dims[1];
    if(mpi_rank == 0) {
        offset[0] = 0;
        offset[1] = 0;
    }
    if(mpi_rank == 1) {
        offset[0] = 0;
        offset[1] = chunk_dims[1];
    }
    if(mpi_rank == 2) {
        offset[0] = chunk_dims[0];
        offset[1] = 0;
    }
    if(mpi_rank == 3) {
        offset[0] = chunk_dims[0];
        offset[1] = chunk_dims[1];
    }

Useful Parallel HDF Links
• Parallel HDF information site
  – http://hdf.ncsa.uiuc.edu/Parallel_HDF/
• Parallel HDF mailing list
  – hdfparallel@ncsa.uiuc.edu
• Parallel HDF5 tutorial available at
  – http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor

Thank you
This presentation is based upon work supported in part by TeraGrid. Other support provided by NCSA and other sponsors and agencies (http://hdf.ncsa.uiuc.edu/acknowledge.html).