The HDF Group: HDF5 Datasets and I/O

The HDF Group
HDF5 Datasets and I/O
Dataset storage and its effect on performance
May 30-31, 2012, HDF5 Workshop at PSI
www.hdfgroup.org

Outline
• Dataset metadata and array data storage layouts
• Types of dataset storage layouts
• Factors affecting I/O performance
• I/O with compact datasets
• I/O with contiguous datasets
• I/O with chunked datasets
• Variable length data and I/O

HDF5 Layers
[Figure: data path from the application to the file]
• Application buffer
• HDF5 Object Layer (API): H5Dwrite is called
• HDF5 Internals: data is prepared for I/O
• VFD Layer: SEC2 driver performs I/O
• HDF5 file

Goal of this talk
• Present what happens to data inside the HDF5 library
• Show how an application can control the HDF5 library's behavior
• Specifically:
  - Describe some basic operations and data structures and explain how they affect performance and storage sizes
  - Give some “recipes” for improving performance

HDF5 DATASET METADATA

HDF5 Dataset
• Data array
  - Also called raw data
• Metadata
  - Dataspace: rank and dimensions of the dataset array
  - Datatype: information on how to interpret the data
  - Storage properties: how the array is organized on disk
  - Attributes: user-defined metadata (optional)

HDF5 dataset components
[Figure: a dataset consists of a dataset header (metadata) and a dataset data array (raw data)]
• Dataset header (metadata)
  - Dataspace: rank 3; dimensions Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
  - Datatype: IEEE 32-bit float
  - Storage info: chunked, compressed
  - Attributes: Time = 32.4, Pressure = 987, Temp = 56
• Dataset data array (raw data)
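As a small worked example, the dataspace and datatype in the header above are enough to compute the size of the raw data array. The sketch below is plain Python, not an HDF5 call; the helper name `raw_data_size` is made up for illustration, and the result ignores any chunking or compression overhead.

```python
from math import prod

def raw_data_size(dims, element_size):
    """Bytes needed for the uncompressed raw data array of a dataset."""
    return prod(dims) * element_size

# Values from the slide: rank 3, dimensions 4 x 5 x 7, 32-bit (4-byte) float.
dims = (4, 5, 7)
print("rank:", len(dims))
print("raw data bytes:", raw_data_size(dims, 4))
```

Here 4 x 5 x 7 = 140 elements of 4 bytes each give 560 bytes of raw data, which is tiny next to the MB to GB sizes where layout choices start to matter.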

HDF5 metadata
• Information about HDF5 objects used by the HDF5 library
• Examples: object headers, B-tree nodes for groups, B-tree nodes for chunks, heaps, superblock, etc.
• Usually small compared to raw data sizes (KB vs. MB-GB)

HDF5 metadata cache
[Figure: in application memory, the dataset header resides in the metadata cache (MDC) and is handled by the HDF5 library, next to the dataset array data; in the HDF5 file, metadata is mixed with raw data]

HDF5 metadata cache
• Metadata cache
  - Space allocated to handle pieces of the HDF5 metadata
  - Allocated by the HDF5 library in the application's memory space
  - Allocated per file; released when the file is closed
• Metadata cache behavior affects overall performance
• The metadata cache implementation prior to HDF5 1.6.5 could cause performance degradation for some applications

HDF5 DATASET STORAGE LAYOUTS

HDF5 dataset storage layouts
• Contiguous
• External
• Chunked
• Compact

Contiguous storage layout
• Contiguous storage is the default storage layout for an HDF5 dataset
• Dataset raw data is stored in one contiguous block in the HDF5 file

Contiguous storage layout
[Figure: the dataset header is held in the metadata cache (MDC) in application memory; in the HDF5 file, the raw data is stored in one contiguous block, separate from the dataset header]

External storage layout
• Dataset raw data is stored in one or more external files that should be kept together with the HDF5 file
• Layout in the external file is specified by the application
• An easy way to make legacy data available to the HDF5 library

External storage layout
[Figure: the dataset header is stored in the HDF5 file; the raw data is stored in a separate Unix/Windows file as specified by the application]

Chunked storage layout
• Chunking: a storage layout where a dataset is partitioned into fixed-size multi-dimensional tiles, or chunks
• Each chunk is stored as a contiguous block
• The HDF5 library treats each chunk as an atomic object for I/O
• Greatly affects performance and file sizes
• Use for extendible datasets and datasets with filters applied (checksum, compression)
• Use for sub-setting of big datasets
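To make the partitioning concrete, here is a minimal sketch in plain Python (not the HDF5 API) of how an element's coordinates map to a chunk and to a position inside that chunk. The function names and the 10 x 10 dataset with 4 x 4 chunks are assumptions chosen for illustration; in a real application the chunk dimensions would be set through a dataset creation property such as H5Pset_chunk.

```python
def locate(coords, chunk_dims):
    """Return (chunk index per dimension, offset inside that chunk)."""
    chunk = tuple(c // n for c, n in zip(coords, chunk_dims))
    offset = tuple(c % n for c, n in zip(coords, chunk_dims))
    return chunk, offset

def chunk_count(dims, chunk_dims):
    """Number of fixed-size chunks needed to cover the whole dataset."""
    count = 1
    for d, n in zip(dims, chunk_dims):
        count *= -(-d // n)  # ceiling division: partial edge chunks still count
    return count

# Hypothetical 10 x 10 dataset stored in 4 x 4 chunks.
print(locate((5, 7), (4, 4)))
print(chunk_count((10, 10), (4, 4)))
```

Because chunks have a fixed size, a dataset whose dimensions are not multiples of the chunk dimensions still needs whole chunks along its edges, which is one way chunking can add to file size.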

Chunked storage layout
[Figure: the dataset is split into chunks A, B, C, D; the dataset header and the chunk index are held in the metadata cache (MDC); in the HDF5 file, the raw data is stored as separate chunks, not necessarily in order, located through the chunk index]

Compact storage layout
• Raw data is stored in the dataset object header
• Raw data is read/written together with the header
• Use for small datasets (a few KB) to minimize small I/O operations

Compact storage layout
[Figure: in both the metadata cache (MDC) and the HDF5 file, the dataset array data is stored inside the dataset object header]

FACTORS AFFECTING I/O PERFORMANCE

HDF5 data structures
• Data structures used by the HDF5 library
  - B-trees (groups, dataset chunks)
  - Hash tables
  - Local and global heaps (variable-length data: link names, strings, etc.)
• Other concepts
  - HDF5 metadata cache
  - HDF5 chunk cache
  - Free space management data structure
  - Etc.

Operations on data inside the HDF5 library
• Copying to/from internal buffers
• Datatype conversion, e.g.,
  - Float to integer
  - Little-endian to big-endian
  - 64-bit integer to 16-bit integer
  - Variable-length data conversion from memory to file
• Scattering/gathering
  - Data is scattered/gathered from/to application buffers into internal buffers for datatype conversion and partial I/O
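The cost of datatype conversion is easier to see with a concrete case. The sketch below uses Python's `struct` module, not HDF5, to turn little-endian 64-bit integers (a possible memory type) into big-endian 16-bit integers (a possible file type). The function name is hypothetical, and the overflow behavior is an assumption of this sketch: `struct` raises an error for values that do not fit in 16 bits, which is not necessarily what the HDF5 library does.

```python
import struct

def convert_i64le_to_i16be(buf):
    """Convert a buffer of little-endian int64 values to big-endian int16."""
    count = len(buf) // 8
    values = struct.unpack(f"<{count}q", buf)   # decode the memory datatype
    return struct.pack(f">{count}h", *values)   # encode the file datatype

memory_buf = struct.pack("<3q", 1, 2, 300)      # 3 values, 8 bytes each
file_buf = convert_i64le_to_i16be(memory_buf)   # 3 values, 2 bytes each
print(len(memory_buf), len(file_buf))
```

Every element is decoded and re-encoded, so the work grows with the number of elements transferred; this is why matching the memory type to the file type avoids the extra pass.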

Operations on data inside the HDF5 library
• Data transformation (filters, compression)
  - Checksum on raw data and metadata
  - Algebraic transform
  - GZIP and SZIP compression
  - HDF5 and user-defined data transformations
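A filter pipeline can be pictured as a chain of reversible transforms applied to a block of raw data. The sketch below is an illustration only: it uses Python's `zlib` for the compression step and `zlib.adler32` as a stand-in checksum, and the function names and the filter order are assumptions of this sketch rather than a description of the library's internals.

```python
import zlib

def apply_filters(raw: bytes):
    """Write path: compress the block, then checksum the compressed bytes."""
    compressed = zlib.compress(raw, 6)
    return compressed, zlib.adler32(compressed)

def reverse_filters(compressed: bytes, checksum: int) -> bytes:
    """Read path: verify the checksum, then decompress."""
    if zlib.adler32(compressed) != checksum:
        raise ValueError("checksum mismatch: stored block is corrupted")
    return zlib.decompress(compressed)

raw = bytes(4096)                       # a highly compressible block of zeros
stored, check = apply_filters(raw)
print(len(raw), len(stored))
```

Each filter adds CPU work on both the write and the read path, which is the trade-off against the smaller size on disk.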

I/O performance
• I/O performance depends on many factors
  - Storage layouts
  - Dataset storage properties
  - Chunking strategy
  - Metadata cache performance
  - Datatype conversion performance
  - Other filters, such as compression
  - Access patterns

I/O WITH DIFFERENT STORAGE LAYOUTS

WRITING COMPACT DATASET

Writing compact dataset
[Figure: the dataset array data travels with the dataset header from the metadata cache (MDC) to the HDF5 file]
• Raw data is written when the object header is written

WRITING CONTIGUOUS DATASET

Writing contiguous dataset
[Figure: the dataset array data goes from application memory directly to the HDF5 file; the dataset header goes through the metadata cache (MDC)]
• Raw data is written first
• The header is written when it is flushed to the file (H5Dclose, H5Fflush, or an MDC flush done by the HDF5 library)

Writing contiguous dataset with conversion
[Figure: the dataset array data passes through a 1 MB conversion buffer on its way from application memory to the HDF5 file]
• Raw data goes through the conversion buffer
• The header is written when it is flushed to the file (H5Dclose, H5Fflush, or an MDC flush done by the HDF5 library)

PARTIAL I/O FOR CONTIGUOUS DATASET

Sub-setting of contiguous dataset: series of adjacent rows
[Figure: M full rows of N elements in application memory map to M adjacent rows in the HDF5 file]
• The subset is contiguous in the file
• One I/O operation

Sub-setting of contiguous dataset: adjacent, partial rows
[Figure: M partial rows of N elements in application memory map to M separate row segments in the HDF5 file]
• The subset is in M contiguous blocks in the file
• Several I/O operations

Sub-setting of contiguous dataset: extreme case, writing a column
[Figure: a column of M rows in application memory maps to M single elements spread through the HDF5 file]
• Subset data is scattered in the file in M different locations
• Several small I/O operations, one element each
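The three sub-setting cases above follow from row-major storage, and the number of contiguous file blocks can be counted with a few lines of Python. This is a simplified model for a two-dimensional dataset, not library code; the function name and the 100-column dataset are assumptions, and it assumes a non-empty rectangular selection.

```python
def contiguous_blocks(sel_rows, sel_cols, n_cols):
    """Contiguous file blocks covered by a rectangular selection in a
    row-major, contiguous 2-D dataset with n_cols elements per row."""
    if sel_cols == n_cols:
        return 1          # full rows sit back to back: one block
    return sel_rows       # partial rows: one separate block per row

N_COLS = 100  # hypothetical row length
print(contiguous_blocks(10, 100, N_COLS))  # adjacent full rows
print(contiguous_blocks(10, 50, N_COLS))   # adjacent partial rows
print(contiguous_blocks(10, 1, N_COLS))    # a single column
```

Without any buffering, each block corresponds to a separate I/O operation, so the column case issues as many tiny operations as there are rows.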

Sub-setting of contiguous dataset: data sieve buffer
[Figure: M single elements in application memory are copied into a sieve buffer, which is then written to the HDF5 file]
• Data is copied (memcopy) to a sieve buffer in memory (64 KB)
• One write operation instead of M small ones
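A rough way to see the benefit of the sieve buffer is to count file writes with and without it. The sketch below is a deliberately simplified model in Python: it treats the sieve buffer as a window of contiguous file bytes that absorbs every small write falling inside it, and it ignores that the real library also has to read the covered region before modifying it. The function name and the 100 x 100 dataset of 4-byte elements are assumptions.

```python
def count_writes(offsets, window_size):
    """Count file writes when small writes at the given byte offsets are
    collected in a buffer covering window_size contiguous file bytes."""
    writes = 0
    window_start = None
    for off in sorted(offsets):
        if window_start is None or off >= window_start + window_size:
            writes += 1          # start a new window: one more file write
            window_start = off
    return writes

ELEMENT = 4                      # bytes per element (assumed)
ROW_BYTES = 100 * ELEMENT        # hypothetical 100-element rows
column = [row * ROW_BYTES for row in range(100)]  # one element per row

print(count_writes(column, ELEMENT))      # no sieving: one write per element
print(count_writes(column, 64 * 1024))    # 64 KB sieve buffer
```

In this model the whole column spans less than 64 KB of file space, so all 100 element writes collapse into a single write.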

Performance tuning for contiguous dataset
• Datatype conversion
  - Avoid for better performance
  - Use the H5Pset_buffer function to customize the conversion buffer size
• Partial I/O
  - Write/read in big contiguous blocks
  - Use H5Pset_sieve_buf_size to improve performance for complex sub-setting
• Caution:
  - The sieve buffer is allocated when the first write occurs and is released when the dataset is closed
  - Memory will grow if there are many open datasets

I/O FOR CHUNKED DATASET

Recall: Chunked storage layout
[Figure: the dataset is split into chunks A, B, C, D; the dataset header and the chunk index are held in the metadata cache (MDC); in the HDF5 file, the raw data is stored as separate chunks located through the chunk index]

HDF5 chunking
• The HDF5 library treats each chunk as an atomic object
  - Compression is applied to each chunk
  - Datatype conversion and other filters are applied per chunk
• Chunk size greatly affects performance
  - Chunk overhead adds to file size
  - Chunk processing involves many steps

HDF5 chunk cache
• Chunk cache (general points, details later)
  - Caches chunks for better performance; remains allocated across multiple calls
  - Created for each chunked dataset
  - The size of the chunk cache is set for the file (default size 1 MB)
  - Each chunked dataset has its own chunk cache
  - A chunk may be too big to fit into the cache
  - Memory may grow if the application keeps opening datasets
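Two of the points above are simple arithmetic: whether a chunk fits into the cache, and how much memory many open datasets can consume. The sketch below is plain Python with hypothetical helper names; it uses the 1 MB default quoted on the slide and treats the per-dataset cache as an upper bound that is only reached if the cache is filled.

```python
DEFAULT_CACHE_BYTES = 1024 * 1024   # 1 MB default per chunked dataset (from the slide)

def chunk_bytes(chunk_dims, element_size):
    """Uncompressed size of one chunk in bytes."""
    size = element_size
    for d in chunk_dims:
        size *= d
    return size

def fits_in_cache(chunk_dims, element_size, cache_bytes=DEFAULT_CACHE_BYTES):
    """True if a single chunk can be held in the chunk cache."""
    return chunk_bytes(chunk_dims, element_size) <= cache_bytes

def max_cache_memory(open_datasets, cache_bytes=DEFAULT_CACHE_BYTES):
    """Upper bound on chunk cache memory if every open dataset fills its cache."""
    return open_datasets * cache_bytes

print(fits_in_cache((256, 256), 4))     # 256 KB chunk of 4-byte elements
print(fits_in_cache((1024, 1024), 4))   # 4 MB chunk of 4-byte elements
print(max_cache_memory(100))            # 100 open chunked datasets
```

A chunk that does not fit cannot benefit from caching, so either the chunk should shrink or the cache should grow; the library provides properties for changing the cache size, which are not shown here.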

HDF5 chunk cache
[Figure: application memory holds the metadata cache (MDC), containing the dataset header and the chunking B-tree nodes, and one chunk cache per dataset; the default chunk cache size is 1 MB]

Writing chunked dataset
[Figure: chunks A, B, C of a chunked dataset pass from application memory through the conversion buffer into the chunk cache, then through the filter pipeline into the HDF5 file]
• Datatype conversion is performed before the chunk is placed in the cache
• A chunk is written when it is evicted from the cache
• Compression and other filters are applied on eviction

PARTIAL I/O FOR CHUNKED DATASET

Partial I/O for chunked dataset
[Figure: a dataset of six chunks; the green subset overlaps the four chunks numbered 1-4]
• Example: write the green subset of the dataset, converting the data
• The dataset is stored as six chunks in the file
• The subset spans four chunks, numbered 1-4 in the figure
• Hence four chunks must be written to the file
• But first, the four chunks must be read from the file, to preserve those parts of each chunk that are not to be overwritten
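Which chunks a selection touches can be worked out from its start, its extent, and the chunk dimensions. The sketch below is plain Python rather than library code; the 20 x 30 dataset with 10 x 10 chunks and the selection coordinates are assumptions chosen so that the layout has six chunks and the selection overlaps four of them, as in the example above.

```python
from itertools import product

def chunks_touched(start, count, chunk_dims):
    """Chunk indices overlapped by a rectangular selection.
    start/count give the first element and the extent in each dimension."""
    ranges = []
    for s, c, n in zip(start, count, chunk_dims):
        first = s // n              # chunk holding the first selected element
        last = (s + c - 1) // n     # chunk holding the last selected element
        ranges.append(range(first, last + 1))
    return list(product(*ranges))

# Hypothetical 20 x 30 dataset in 10 x 10 chunks (2 x 3 = six chunks).
touched = chunks_touched(start=(5, 5), count=(10, 10), chunk_dims=(10, 10))
print(touched)
```

Only 100 elements are selected, yet four whole chunks of 100 elements each have to be read and written back, which is why aligning selections with chunk boundaries pays off.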

Partial I/O for chunked dataset
• For each of the four chunks:
  - Read the chunk from the file into the chunk cache, unless it is already there
  - Determine which part of the chunk will be replaced by the selection
  - Move those elements to the conversion buffer and perform the conversion
  - Move the data elements to be written from the application buffer to the conversion buffer
  - Move those elements back from the conversion buffer to the chunk cache
  - Apply filters (compression) when the chunk is flushed from the chunk cache
• For each element, three memory copies are performed
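The read-modify-write cycle described above can be imitated with a compressed byte string standing in for a stored chunk. This is a sketch under simplifying assumptions: it uses Python's `zlib` as the only filter, skips datatype conversion and the cache itself, and the function name is hypothetical.

```python
import zlib

def partial_write(stored_chunk: bytes, offset: int, new_bytes: bytes) -> bytes:
    """Overwrite part of a compressed chunk and return the new stored form."""
    chunk = bytearray(zlib.decompress(stored_chunk))    # read and un-filter the whole chunk
    if offset + len(new_bytes) > len(chunk):
        raise ValueError("selection extends past the end of the chunk")
    chunk[offset:offset + len(new_bytes)] = new_bytes   # replace only the selected part
    return zlib.compress(bytes(chunk))                  # re-apply the filter on flush

stored = zlib.compress(bytes(16))          # a 16-byte chunk of zeros on "disk"
updated = partial_write(stored, 4, b"\x01\x02")
print(zlib.decompress(updated))
```

Even though only two bytes change, the entire chunk is decompressed and recompressed, so the cost of a small update scales with the chunk size rather than with the amount of data written.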

Partial I/O for chunked dataset
[Figure: data moves between application memory, the conversion buffer, and the chunk cache with three memory copies per element; the chunk is then compressed and written to the HDF5 file]

I/O FOR VARIABLE-LENGTH DATASET

Examples of variable length data
• String
    A[0] “the first string we want to write”
    ...
    A[N-1] “the N-th string we want to write”
• Each element is a record of variable length
    A[0] (1, 1, 0, 0, 0, 5, 6, 7, 8, 9) [length = 10]
    A[1] (0, 0, 110, 2005) [length = 4]
    ...
    A[N] (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ..., M) [length = M]

Variable length data in HDF5
• Variable length description in an HDF5 application:
    typedef struct {
        size_t len;   /* number of elements of the base type */
        void *p;      /* pointer to the element data */
    } hvl_t;
• The base type can be any HDF5 type: H5Tvlen_create(base_type)
• About 20 bytes of overhead for each element
• Data cannot be compressed
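The per-element overhead adds up quickly for short records. The sketch below is a back-of-the-envelope estimate in Python: it takes the roughly 20 bytes per element quoted above at face value, assumes a 4-byte base type, and uses a hypothetical function name. It does not account for unused space inside global heaps.

```python
PER_ELEMENT_OVERHEAD = 20   # approximate bytes per element, as quoted on the slide

def vl_storage_estimate(lengths, base_size, overhead=PER_ELEMENT_OVERHEAD):
    """Rough bytes needed for variable-length elements of the given lengths."""
    return sum(n * base_size + overhead for n in lengths)

lengths = [10, 4]            # the two example records shown earlier
payload = sum(lengths) * 4   # data bytes alone, assuming 4-byte integers
total = vl_storage_estimate(lengths, 4)
print(payload, total)
```

For these two records the payload is 56 bytes and the estimate is 96 bytes, so the overhead is a large share of the total whenever records are short.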

How variable length data is stored in HDF5
[Figure: in the HDF5 file, the dataset with variable length elements stores a pointer into a global heap for each element; the actual variable length data is stored in the global heap, separate from the dataset header]

Variable length datasets and I/O
• Elements from the application buffer are “transferred” to/from heaps in the metadata cache during I/O
[Figure: the application buffer holds pointers to the raw data; the raw data is placed in a global heap inside the metadata cache]

There may be more than one global heap
[Figure: pointers in the application buffer refer to raw data that is spread over two global heaps]

VL dataset and I/O
[Figure: in memory, data moves between the application buffer, the conversion buffers, and the global heap; the global heap is then written to the HDF5 file]

Hints for variable length data I/O
• Avoid closing/opening a file while writing VL datasets
  - Global heap information is lost
  - Global heaps may have unused space
• Avoid alternately writing different VL datasets
  - Data from different datasets will go into the same heap
• If the maximum length of the record is known, consider using fixed-length records and compression
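The last hint can be illustrated by padding records to a known maximum length and compressing the result. This Python sketch is only an analogy for what a fixed-length, chunked, compressed dataset does: the function name, the 4-byte little-endian integer encoding, the zero fill value, and the use of `zlib` are all assumptions.

```python
import struct
import zlib

def pack_fixed(records, max_len, fill=0):
    """Pad each record to max_len 32-bit integers and pack them back to back."""
    out = bytearray()
    for rec in records:
        if len(rec) > max_len:
            raise ValueError("record longer than max_len")
        padded = list(rec) + [fill] * (max_len - len(rec))
        out += struct.pack(f"<{max_len}i", *padded)
    return bytes(out)

# The two example records from the earlier slide, padded to length 10.
records = [(1, 1, 0, 0, 0, 5, 6, 7, 8, 9), (0, 0, 110, 2005)]
raw = pack_fixed(records, 10)
compressed = zlib.compress(raw)
print(len(raw), len(compressed))
```

The padding wastes space before compression, but runs of fill values compress well, and fixed-length storage avoids the global heap and its per-element overhead. How much is gained depends on the data, so it is worth measuring on a representative sample.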

The HDF Group
Thank You! Questions?