Profile of HDF-EOS5 Files
Abe Taaheri, Raytheon IIS
HDF & HDF-EOS Workshop XII, October 2008

General HDF-EOS5 File Structure
• HDF-EOS5 file: any valid HDF5 file that contains a family of global attributes called coremetadata.X
• Optional data objects:
  – a family of global attributes called archivemetadata.X
  – any number of Swath, Grid, Point, ZA, and Profile data structures
• Another family of global attributes: StructMetadata.X

General HDF-EOS5 File Structure
• Global attributes provide:
  – information on the structure of the HDF-EOS5 file
  – information on the data granule that the file contains
• Other optional user-added global attributes ("PGEVersion", "OrbitNumber", etc.) are written as HDF5 attributes into a group called "FILE_ATTRIBUTES"

General HDF-EOS5 File Structure
• coremetadata.X
  – Used to populate searchable database tables within the ECS archives.
  – Data users use this information to locate particular HDF-EOS5 data granules.
• archivemetadata.X
  – Not searchable.
  – Contains whatever information the file creator considers useful to keep in the file, but which will not be directly accessible by ECS databases.
• StructMetadata.X
  – Describes the contents and structure of the HDF-EOS5 file, e.g. dimensions, compression methods, geolocation, and projection information associated with the data itself.

General HDF-EOS5 File Structure
• An HDF-EOS5 file
  – can contain any number of Grid, Point, Swath, Zonal Average, and Profile data structures
  – has no size limits
    § A file containing thousands of objects could slow down program execution
  – can be hybrid, containing plain HDF5 objects for special purposes
    § Plain HDF5 objects must be accessed through the HDF5 library, not through the HDF-EOS5 extensions
    § A hybrid file requires more knowledge of the file contents on the part of an applications developer or data user

Swath Structure
• Data organized by time or another track parameter
• Spacing can be irregular
• Structure
  – Geolocation information stored explicitly in a Geolocation Field (2-D array)
  – Data stored in 2-D or 3-D arrays
  – Time stored in a 1-D or 2-D array
  – Geolocation and science data connected by structural metadata

Swath Structure
• For a typical satellite swath, an instrument takes a series of scans perpendicular to the ground track of the satellite as it moves along that ground track
• Alternatively, a sensor measures a vertical profile instead of scanning across the ground track

Swath Structure
• Swath_X groups are created when swaths are created.
• The parent groups of the data and geolocation fields are created when the fields are defined.
• Swath attributes are set as Object Attributes.
• Attributes for the Data Fields, Profile Fields, or Geolocation Fields groups are set as Group Attributes.
• Dataset-related attributes set for each data field or geolocation field are called Local Attributes. They may contain attributes such as fill value, units, etc.
• Each Data Field object can have Attributes and/or Dimension Scales.

Attribute naming:
  Object Attribute   <SwathName>:<AttrName>
  Group Attribute    <DataFields>:<AttrName>
  Local Attribute    <FieldName>:<AttrName>

[Diagram: layout of the "SWATHS" group; the elements are HDF5 groups, HDF5 attributes, and HDF5 datasets]
  "SWATHS" group
    "Swath_1" ... "Swath_N"
      Data Fields:         Data Field.1 ... Data Field.n
      Profile Fields:      Profile Field.1 ... Profile Field.n
      Geolocation Fields:  Longitude, Latitude, Colatitude, Time

Swath Structure
• Geolocation Fields
  – Geolocation fields allow the Swath to be accurately tied to particular points on the Earth's surface.
  – A swath needs at least a time field ("Time") or a latitude/longitude field pair ("Latitude" and "Longitude"). "Colatitude" may be substituted for "Latitude".
  – Other geolocation fields such as "Altitude" can be defined and mapped onto a data dimension.
  – Fields must be either one- or two-dimensional.
  – The "Time" field is always in TAI format (International Atomic Time).

  Field Name   Data Type            Format
  Longitude    float32 or float64   DD*, range [-180.0, 180.0]
  Latitude     float32 or float64   DD*, range [-90.0, 90.0]
  Colatitude   float32 or float64   DD*, range [0.0, 180.0]
  Time         float64              TAI93 [seconds until (-) / since (+) midnight, 1/1/93]
  * DD = decimal degrees
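
The field formats in the table above can be illustrated with a short Python sketch. It is not part of the HDF-EOS5 library: the function names are hypothetical, and the time conversion simply offsets the 1 January 1993 epoch without applying leap seconds, so the result stays on the TAI scale rather than UTC.

```python
from datetime import datetime, timedelta

# Epoch of the TAI93 format named on the slide: midnight, 1 January 1993.
TAI93_EPOCH = datetime(1993, 1, 1)


def colatitude_from_latitude(lat_deg: float) -> float:
    """Map a latitude in [-90, 90] to a colatitude in [0, 180] (decimal degrees)."""
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError(f"latitude out of range: {lat_deg}")
    return 90.0 - lat_deg


def longitude_in_range(lon_deg: float) -> bool:
    """Check the longitude range given on the slide: [-180, 180] decimal degrees."""
    return -180.0 <= lon_deg <= 180.0


def tai93_to_datetime(seconds: float) -> datetime:
    """Offset the TAI93 epoch by a (possibly negative) number of seconds.

    Assumption: leap seconds are NOT applied, so this is not a UTC conversion.
    """
    return TAI93_EPOCH + timedelta(seconds=seconds)
```

Negative time values address instants before the epoch, matching the "until (-) / since (+)" wording of the table.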

Swath Structure
• Data Fields
  – Fields may have up to 8 dimensions.
  – For multi-dimensional fields, the dimension representing the "along track" direction must precede the dimension representing the scan or profile (in C order), e.g. "Bands, DataTrack, DataXtrack".

Swath Structure
  – Compression is selectable at the field level.
    ▪ All compression methods supported by HDF5 are available through the HDF-EOS5 library.
    ▪ The compression method is stored within the file.
    ▪ Subsequent use of the library will uncompress the data.
    ▪ As in HDF5, the data needs to be chunked before compression is applied.
  – Field names:
    * may be up to 64 characters in length
    * may contain any character except ",", ";", and "/"
    * are case sensitive
    * must be unique within a particular Swath structure
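
The field-name rules listed above can be expressed as a small check. This is a minimal Python sketch, not a library routine: `check_field_name` and its arguments are hypothetical names, and the forbidden characters are taken to be exactly the comma, semicolon, and slash named on the slide.

```python
from typing import List, Set

FORBIDDEN_CHARS = {",", ";", "/"}   # characters excluded by the slide
MAX_NAME_LENGTH = 64                # maximum field-name length from the slide


def check_field_name(name: str, existing: Set[str]) -> List[str]:
    """Return a list of rule violations for a proposed field name (empty if valid)."""
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name is longer than {MAX_NAME_LENGTH} characters")
    bad = sorted(c for c in set(name) if c in FORBIDDEN_CHARS)
    if bad:
        problems.append("name contains forbidden characters: " + " ".join(bad))
    # Names are case sensitive, so the comparison is an exact string match.
    if name in existing:
        problems.append("name is already used in this structure")
    return problems
```

Because names are case sensitive, "voltage" and "Voltage" count as two different fields in this check.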

Compression Codes

  Compression Code         Value  Explanation
  HDFE_COMP_NONE           0      No compression
  HDFE_COMP_RLE            1      Run-length encoding compression (not supported)
  HDFE_COMP_NBIT           2      NBIT compression
  HDFE_COMP_SKPHUFF        3      Skipping Huffman (not supported)
  HDFE_COMP_DEFLATE        4      gzip compression
  HDFE_COMP_SZIP_CHIP      5      szip compression, compression exactly as in hardware
  HDFE_COMP_SZIP_K13       6      szip compression, allowing k split = 13 compression
  HDFE_COMP_SZIP_EC        7      szip compression, entropy coding method
  HDFE_COMP_SZIP_NN        8      szip compression, nearest neighbor coding method
  HDFE_COMP_SZIP_K13orEC   9      szip compression, allowing k split = 13 compression, or entropy coding method

For compression, the data storage must be CHUNKED first.

Compression Codes

  Compression Code              Value  Explanation
  HDFE_COMP_SZIP_K13orNN        10     szip compression, allowing k split = 13 compression, or nearest neighbor coding method
  HDFE_COMP_SHUF_DEFLATE        11     shuffling + deflate (gzip) compression
  HDFE_COMP_SHUF_SZIP_CHIP      12     shuffling + compression exactly as in hardware
  HDFE_COMP_SHUF_SZIP_K13       13     shuffling + allowing k split = 13 compression
  HDFE_COMP_SHUF_SZIP_EC        14     shuffling + entropy coding method
  HDFE_COMP_SHUF_SZIP_NN        15     shuffling + nearest neighbor coding method
  HDFE_COMP_SHUF_SZIP_K13orEC   16     shuffling + allowing k split = 13 compression, or entropy coding method
  HDFE_COMP_SHUF_SZIP_K13orNN   17     shuffling + allowing k split = 13 compression, or nearest neighbor coding method

For compression, the data storage must be CHUNKED first.
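
The code values from the two compression-code slides can be mirrored in a Python enumeration, which makes the "not supported" and "must be chunked" rules easy to check before a field is defined. This is an illustrative sketch only: the `Compression` class and `check_compression` helper are hypothetical and do not call the HDF-EOS5 library.

```python
from enum import IntEnum


class Compression(IntEnum):
    """Compression codes as listed on the slides (HDFE_COMP_ prefix dropped)."""
    NONE = 0
    RLE = 1
    NBIT = 2
    SKPHUFF = 3
    DEFLATE = 4
    SZIP_CHIP = 5
    SZIP_K13 = 6
    SZIP_EC = 7
    SZIP_NN = 8
    SZIP_K13orEC = 9
    SZIP_K13orNN = 10
    SHUF_DEFLATE = 11
    SHUF_SZIP_CHIP = 12
    SHUF_SZIP_K13 = 13
    SHUF_SZIP_EC = 14
    SHUF_SZIP_NN = 15
    SHUF_SZIP_K13orEC = 16
    SHUF_SZIP_K13orNN = 17


# Codes the slides mark as "not supported".
UNSUPPORTED = {Compression.RLE, Compression.SKPHUFF}


def check_compression(code: int, chunked: bool) -> Compression:
    """Validate a requested code; raise ValueError if it cannot be used."""
    comp = Compression(code)  # raises ValueError for values outside 0..17
    if comp in UNSUPPORTED:
        raise ValueError(f"{comp.name} is listed as not supported")
    if comp is not Compression.NONE and not chunked:
        raise ValueError("data storage must be chunked before compression")
    return comp
```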

Swath Structure
• Dimension maps:
  – are the glue that holds the Swath together
  – define the relationship between the dimensions of data fields and geolocation fields
  – can use normal or indexed mapping

[Figure: A "Normal" Dimension Map]
[Figure: A "Backwards" Dimension Map]
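
A normal dimension map can be sketched in Python as a linear index relation. The slide does not name the parameters; the `offset` and `increment` arguments below are an assumption based on how HDF-EOS dimension maps are usually described (geolocation element g corresponds to data element offset + increment * g). The sketch covers only the normal case with a positive increment, not backwards or indexed maps.

```python
from typing import List


def geo_to_data_index(geo_index: int, offset: int, increment: int) -> int:
    """Data-dimension index that a geolocation element refers to (normal map)."""
    if increment <= 0:
        raise ValueError("this sketch covers only normal maps (increment > 0)")
    return offset + increment * geo_index


def data_indices_with_geolocation(geo_size: int, data_size: int,
                                  offset: int, increment: int) -> List[int]:
    """List the data indices that have an explicit geolocation value."""
    indices = (geo_to_data_index(g, offset, increment) for g in range(geo_size))
    return [i for i in indices if i < data_size]
```

With offset 1 and increment 2, every second data element starting at index 1 has its own geolocation value; the elements in between would have to be interpolated by the reader.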

Grid Structure
• Usage
  – Data organized by regular geographic spacing, specified by projection parameters.
• Structure
  – Any number of 2-D to 8-D data arrays per structure
  – Geolocation information contained in the projection formula, coupled by structural metadata
  – Any number of Grid structures per file allowed

Grid Structure
• A grid contains:
  – grid corner locations
  – a set of projection equations (or references to them) along with their relevant parameters
• The equations and parameters are used to compute the lon/lat for any point in the grid.
• Important features of a Grid data set:
  – the data fields
  – the dimensions
  – the projection

[Figure: A Data Field in a Mercator-Projected Grid]
[Figure: A Data Field in an Interrupted Goode's Homolosine-Projected Grid]
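
As an illustration of how corner locations and grid dimensions locate a cell, the following Python sketch computes the projected (X, Y) coordinates of a cell center by linear interpolation between the corners. The function name is hypothetical, cell-centered registration is an assumption, and converting the result to lon/lat would still require the inverse projection equations, which this sketch does not attempt.

```python
from typing import Tuple


def cell_center(col: int, row: int, xdim: int, ydim: int,
                upper_left: Tuple[float, float],
                lower_right: Tuple[float, float]) -> Tuple[float, float]:
    """Projected coordinates of the center of cell (col, row).

    Assumes cells are evenly spaced between the upper-left and lower-right
    corners and that values refer to cell centers.
    """
    if not (0 <= col < xdim and 0 <= row < ydim):
        raise IndexError("cell index outside the grid")
    ulx, uly = upper_left
    lrx, lry = lower_right
    dx = (lrx - ulx) / xdim   # cell width (positive when X increases eastward)
    dy = (lry - uly) / ydim   # cell height (negative when Y decreases downward)
    return ulx + (col + 0.5) * dx, uly + (row + 0.5) * dy
```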

Grid Structure
Data Field characteristics:
  – Fields may have up to 8 dimensions.
  – Dimension order in field definitions:
      C:       "Band, YDim, XDim"
      Fortran: "XDim, YDim, Band"
  – Compression is selectable at the field level within a Grid. Subsequent use of the library will uncompress the data. Data needs to be tiled before compression is applied.
  – Field names must be unique within a particular Grid structure and are case sensitive. They may be up to 64 characters in length.
  – Any character can be used in a field name with the exception of ",", ";", and "/".
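
The C and Fortran dimension orders shown above are simply reversals of each other, which a few lines of Python can make concrete. The helper name is hypothetical and the sketch only manipulates the dimension-list string; it does not reorder any data.

```python
def reverse_dimlist(dimlist: str) -> str:
    """Convert a comma-separated dimension list between C and Fortran order."""
    names = [name.strip() for name in dimlist.split(",")]
    return ", ".join(reversed(names))
```

Applying the function to "Band, YDim, XDim" gives "XDim, YDim, Band", and applying it twice returns the original list.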

Grid Structure
Dimensions:
• Two predefined dimensions for data fields, "XDim" and "YDim":
  – are defined when the grid is created
  – are stored in the structure metadata
  – relate data fields to each other and to the geolocation information
• Fields are two- to eight-dimensional
  – Many fields will need no more than three dimensions: the predefined dimensions "XDim" and "YDim" and a third dimension for depth, height, or band.

Grid Structure
• Projection:
  – Is the heart of the Grid structure.
  – Provides a convenient way to encode geolocation information as a set of mathematical equations, capable of transforming Earth coordinates (lat/long) to X-Y coordinates on a sheet of paper.
  – The General Coordinate Transformation Package (GCTP) library contains all projection-related conversions and calculations.
  – Supported projections:
      Geographic
      Mercator
      Transverse Mercator
      Universal Transverse Mercator
      Hotine Oblique Mercator
      Space Oblique Mercator
      Cylindrical Equal Area
      Lambert Azimuthal Equal Area
      Lambert Conformal Conic
      Albers Conical Equal Area
      Polar Stereographic
      Polyconic
      Sinusoidal*
      Integerized Sinusoidal
      Interrupted Goode's Homolosine
  * Sinusoidal is pseudocylindrical
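
To make "a set of mathematical equations" concrete, here is a Python sketch of the textbook spherical Mercator projection and its inverse. It is an illustration only and is not the GCTP implementation: GCTP works with ellipsoids and projection parameter arrays, while the sphere radius and the function names below are assumptions chosen for the example.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6371000.0  # assumed spherical Earth radius in meters


def mercator_forward(lon_deg: float, lat_deg: float, lon0_deg: float = 0.0,
                     radius: float = EARTH_RADIUS_M) -> Tuple[float, float]:
    """Spherical Mercator: (lon, lat) in degrees to (x, y) in meters."""
    lam = math.radians(lon_deg - lon0_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4.0 + phi / 2.0))
    return x, y


def mercator_inverse(x: float, y: float, lon0_deg: float = 0.0,
                     radius: float = EARTH_RADIUS_M) -> Tuple[float, float]:
    """Spherical Mercator inverse: (x, y) in meters to (lon, lat) in degrees."""
    lon = math.degrees(x / radius) + lon0_deg
    lat = math.degrees(2.0 * math.atan(math.exp(y / radius)) - math.pi / 2.0)
    return lon, lat
```

The inverse function is the step a grid reader needs: it turns the projected coordinates of a grid cell back into longitude and latitude.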

Point Structure
• Data is specified temporally and/or spatially, but with no particular organization
• Structure
  – Tables used to store science data at a particular Lat/Long/Height
  – Up to eight levels of data allowed; structural metadata specifies the relationship between levels

Point Structure
• Made up of a series of data records taken at (possibly) irregular time intervals and at scattered geographic locations
• A loosely organized form of geolocated data supported by HDF-EOS
• Levels are linked by a common field name called LinkField
• Shared information is usually stored in the Parent level, while data values are stored in the Child level
• The values of the LinkField in the Parent level must be unique

Point Structure
• Point structure groups are created when the user creates "Point_1", ...
• Data and Linkage groups are created automatically when a level is defined
• The order in which the levels are defined determines the (0-based) level index
• The FWDPOINTER linkage will not be set (actually, the first one is set to (-1, -1)) if the records in the Child level are not monotonic in LinkField
• A level can contain any number of fields and records

Attribute naming (as labeled on the slide):
  Object Attribute   <SwathName>:<AttrName>
  Group Attribute    <SwathName>:<AttrName>
  Local Attribute    <SwathName>:<AttrName>

[Diagram: layout of the "POINTS" group; the elements are HDF5 groups]
  "POINTS" group
    "Point_1" ... "Point_n"
      Data:     Level 1 ... Level n (level data)
      Linkage:  FWDPOINTER, BCKPOINTER
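
The parent/child linkage described on the Point slides can be sketched in Python with plain dictionaries standing in for level records. Everything here is illustrative: the function names are hypothetical, the pointer datasets themselves are not modeled, and "monotonic" is read as "child records appear in the same order as their parents", which is one interpretation of the slide.

```python
from collections import defaultdict
from typing import Any, Dict, List

Record = Dict[str, Any]


def link_levels(parents: List[Record], children: List[Record],
                link_field: str) -> Dict[Any, List[Record]]:
    """Group child records under the parent record sharing the link-field value."""
    parent_values = [rec[link_field] for rec in parents]
    if len(set(parent_values)) != len(parent_values):
        raise ValueError("link-field values in the parent level must be unique")
    grouped = defaultdict(list)
    for rec in children:
        grouped[rec[link_field]].append(rec)
    return {value: grouped.get(value, []) for value in parent_values}


def children_are_monotonic(parents: List[Record], children: List[Record],
                           link_field: str) -> bool:
    """True if child records follow the parent order (assumed meaning of 'monotonic')."""
    order = {rec[link_field]: i for i, rec in enumerate(parents)}
    positions = [order[rec[link_field]] for rec in children]
    return all(a <= b for a, b in zip(positions, positions[1:]))
```

A parent level might hold one record per station with its coordinates, and the child level the individual observations tagged with the station name as the link field.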

Zonal Average (ZA) Structure
• Generalized array structure with no geolocation linkage (basically a swath-like structure without geolocation)
• The interface is designed to support data that is not associated with specific geolocation information
• Data can be organized by time or track parameter
• Data spacing can be irregular
• Structure
  – Data stored in multidimensional arrays
  – Time stored in a 1-D or 2-D array

Attribute naming (as labeled on the slide):
  Object Attribute   <SwathName>:<AttrName>
  Group Attribute    <DataFields>:<AttrName>
  Local Attribute    <FieldName>:<AttrName>

[Diagram: layout of the "ZAS" group; the elements are HDF5 groups]
  "ZAS" group
    "Za_1" ... "Za_n"
      Data Fields:  ... Data Field.n

"h5dump" output of a simple HDF-EOS5 file
HDF5 "Grid.he5" {
GROUP "/" {
   GROUP "HDFEOS" {
      GROUP "ADDITIONAL" {
         GROUP "FILE_ATTRIBUTES" {
         }
      }
      GROUP "GRIDS" {
         GROUP "TMGrid" {
            GROUP "Data Fields" {
               DATASET "Voltage" {
                  DATATYPE  H5T_IEEE_F32BE
                  DATASPACE  SIMPLE { ( 5, 7 ) / ( 5, 7 ) }
                  DATA {
                  (0,0): -1.11111,
                  (0,5): -1.11111,
                  ...
                  (4,0): -1.11111,
                  (4,5): -1.11111, -1.11111
                  }

"h5dump" output of a simple HDF-EOS5 file (cont.)
                  ATTRIBUTE "_FillValue" {
                     DATATYPE  H5T_IEEE_F32BE
                     DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
                     DATA {
                     (0): -1.11111
                     }
                  }
               }
            }
      GROUP "HDFEOS INFORMATION" {
         ATTRIBUTE "HDFEOSVersion" {
            DATATYPE  H5T_STRING {
               STRSIZE 32;
               STRPAD H5T_STR_NULLTERM;
               CSET H5T_CSET_ASCII;
               CTYPE H5T_C_S1;
            }

"h5dump" output of a simple HDF-EOS5 file (cont.)
            DATASPACE  SCALAR
            DATA {
            (0): "HDFEOS_5.1.11"
            }
         }
         DATASET "StructMetadata.0" {
            DATATYPE  H5T_STRING {
               STRSIZE 32000;
               STRPAD H5T_STR_NULLTERM;
               CSET H5T_CSET_ASCII;
               CTYPE H5T_C_S1;
            }
            DATASPACE  SCALAR
            DATA {
            (0): "GROUP=SwathStructure
               END_GROUP=SwathStructure
               GROUP=GridStructure
                  GROUP=GRID_1
                     GridName="TMGrid"
                     XDim=5
                     YDim=7

"h5dump" output of a simple HDF-EOS5 file (cont.)
                     UpperLeftPointMtrs=(4855670.775390, 9458558.924830)
                     LowerRightMtrs=(5201746.439830, -10466077.249420)
                     Projection=HE5_GCTP_TM
                     ProjParams=(0, 0, 0.999600, 0, -75000000, 0, 0)
                     SphereCode=0
                     GROUP=Dimension
                        OBJECT=Dimension_1
                           DimensionName="Time"
                           Size=10
                        END_OBJECT=Dimension_1
                        OBJECT=Dimension_2
                           DimensionName="Unlim"
                           Size=-1
                        END_OBJECT=Dimension_2
                     END_GROUP=Dimension

"h5dump" output of a simple HDF-EOS5 file (cont.)
                     GROUP=DataField
                        OBJECT=DataField_1
                           DataFieldName="Voltage"
                           DataType=H5T_NATIVE_FLOAT
                           DimList=("XDim", "YDim")
                           MaxdimList=("XDim", "YDim")
                        END_OBJECT=DataField_1
                     END_GROUP=DataField
                     GROUP=MergedFields
                  END_GROUP=GRID_1
               END_GROUP=GridStructure
               GROUP=PointStructure
               END_GROUP=PointStructure
               GROUP=ZaStructure
               END_GROUP=ZaStructure
               END
               "
            }
         }
      }
   }
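
The structural metadata shown in the dump is plain text made of GROUP/OBJECT blocks and KEY=VALUE statements, so it can be read into a nested dictionary with a few lines of Python. This is a minimal sketch, assuming one statement per line and a well-formed, shortened sample modeled on the dump; the function name is hypothetical and all values are kept as strings.

```python
from typing import Any, Dict

SAMPLE = """\
GROUP=GridStructure
    GROUP=GRID_1
        GridName="TMGrid"
        XDim=5
        YDim=7
        GROUP=Dimension
            OBJECT=Dimension_1
                DimensionName="Time"
                Size=10
            END_OBJECT=Dimension_1
        END_GROUP=Dimension
    END_GROUP=GRID_1
END_GROUP=GridStructure
END
"""


def parse_struct_metadata(text: str) -> Dict[str, Any]:
    """Parse GROUP/OBJECT blocks and KEY=VALUE lines into nested dictionaries."""
    root: Dict[str, Any] = {}
    stack = [root]  # stack[-1] is the block currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "END":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            node: Dict[str, Any] = {}
            stack[-1][value] = node
            stack.append(node)
        elif key in ("END_GROUP", "END_OBJECT"):
            if len(stack) == 1:
                raise ValueError(f"unmatched {key}={value}")
            stack.pop()
        else:
            stack[-1][key] = value.strip('"')  # drop surrounding quotes
    return root
```

For the sample text, `parse_struct_metadata(SAMPLE)["GridStructure"]["GRID_1"]["GridName"]` evaluates to the string `TMGrid`.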