The HDF Group: NASA HDF/HDF-EOS Data Access Challenges



































The HDF Group
NASA HDF/HDF-EOS Data Access Challenges
H. Joe Lee (hyokee@hdfgroup.org), Kent Yang (myang6@hdfgroup.org), The HDF Group
July 9, 2013, ESIP 2013 Summer Meeting
www.hdfgroup.org
Hal Varian, Google’s chief economist
“The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades.”
For Earth Science Data Users
The ability to take NASA HDF/HDF-EOS data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s a hugely important skill right now.
Is it easy to take NASA HDF data?
No, not for the Average Joe data user.
Understand
“I'm new to IDL and HDF; and I'm currently working with MODIS L1B data. I found your examples very helpful. Is it possible to show radiance is calculated?”
Process
“I work in NASA/GSFC GES-DISC on AIRS project. We have new idl version 8.1. But got a core dump error when we run EOS function swath name from a AIRS level 2 product file. Need your EOS_SW_INQSWATH to inqure help. Thanks.”
Extract Values
“Hi, I want to use the following TRMM data, http://mirador.gsfc.nasa.gov/...2A25.... Can you provide me some programs that deal with these datasets so that I can obtain the daily convective precipitation in the region 110-180 E, 0-40 N during 2006?”
Visualize
“Can you please make the matlab file for reading ozone hdf5 files obtained from mls available to the public. I wanted to obtain ozone distribution over the world and ozone distributions with height etc. thank you :) …. oh can you tell me which function can i use to plot latitude in the x-axis, pressure in the y-axis and a contour plot of ozone over it?”
Communicate
“Your prog is very helpful to verify my process. I have one more doubt. I am trying to convert this hdf to Geotiff using Matlab. Do have any written code to do the same. Doing it with HEG tool given an error specifying that 5D are only supported for SOM projections. Also I am doing all processing with Matlab. So could you pl. help me.”
NASA HDF Users See Challenges
in accessing satellite-product-specific (MODIS, AIRS, MLS), geo-location/time-specific (lat/lon/height/year) data with their favorite software packages (MATLAB/IDL/ArcGIS).
What Makes Access Challenging?
1. Some files use techniques that end users may not be familiar with, although the techniques may help store data efficiently.
2. Information from a source outside the files is required to retrieve the data in a physically meaningful manner.
3. Attributes do not comply with widely used conventions.
4. Metadata in the HDF file has incorrect information.
Converted File Size Comparison
• NetCDF-3: 656M
• NetCDF-4: 128M
• HDF-EOS2: 72M
The NetCDF-3 file is about 9X the size of the HDF-EOS2 file.
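The "9X" label matches the NetCDF-3 to HDF-EOS2 ratio. A quick check of the ratios, assuming the "M" values on the slide are megabytes (the slide does not state the unit):

```python
# Sizes as listed on the slide; the unit (megabytes) is an assumption.
sizes_mb = {"NetCDF-3": 656, "NetCDF-4": 128, "HDF-EOS2": 72}


def ratio_to_baseline(sizes, baseline="HDF-EOS2"):
    """Return each size divided by the baseline size, rounded to one decimal."""
    base = sizes[baseline]
    return {name: round(size / base, 1) for name, size in sizes.items()}


ratios = ratio_to_baseline(sizes_mb)
print(ratios)
```

By the same arithmetic, the NetCDF-4 conversion is a little under twice the size of the HDF-EOS2 original.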
Challenge 1: Unfamiliar Techniques
Users look for Latitude/Longitude datasets that match variable (e.g., Ozone) datasets. Some HDF products have
• mismatched lat/lon.
• lat/lon information in a metadata attribute.
• duplicate lat/lon information.
Swath Dimension Map Example
An HDF-EOS Swath Dimension Map allows geolocation fields and data fields to have mismatched dimension sizes.
• Latitude[512]
• Longitude[512]
• Data[1024]
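To use such a file, the 512 geolocation values have to be expanded to the 1024 data elements. The sketch below is one way to do that with linear interpolation. It assumes the dimension map says that geolocation element `i` describes data element `offset + increment * i`; the values `offset=0` and `increment=2` are illustrative and would normally be read from the file. The function name `expand_geolocation` is hypothetical.

```python
def expand_geolocation(geo, data_len, offset=0, increment=2):
    """Linearly interpolate a geolocation array onto the data dimension.

    Assumes geo[i] describes data element offset + increment * i
    (positive increment only). Data elements outside that span are
    extrapolated from the nearest pair of geolocation points.
    """
    if increment <= 0 or len(geo) < 2:
        raise ValueError("need a positive increment and at least two points")
    out = []
    last = len(geo) - 2  # index of the last interpolation segment
    for j in range(data_len):
        pos = (j - offset) / increment          # fractional geolocation index
        i = min(max(int(pos // 1), 0), last)    # segment to use, clamped
        frac = pos - i
        out.append(geo[i] + frac * (geo[i + 1] - geo[i]))
    return out


# Hypothetical latitude values with the sizes from the slide.
latitude = [i * 0.5 for i in range(512)]
full_latitude = expand_geolocation(latitude, 1024)
print(len(full_latitude))
```

Plain linear interpolation is only a reasonable approximation for latitude; longitude needs extra care where a swath crosses the ±180° line.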
NSIDC AMSR_E NCL Example
; Read the file as an HDF4 file to obtain dataset attributes.
hdf4_file = addfile("AMSR_E_L3_WeeklyOcean_V03_20020616.hdf", "r")
; Read the file as an HDF-EOS2 file to obtain lat and lon.
hdfeos2_file = addfile("AMSR_E_L3_WeeklyOcean_V03_20020616.hdf.he2", "r")
Users must call both the HDF4 and HDF-EOS2 APIs:
• The HDF4 API alone cannot resolve lat/lon.
• The HDF-EOS2 API alone cannot retrieve some attributes that are added later by HDF4 APIs.
Challenge 2: Information Outside HDF
Users must read the data product manual to find
• fill value / valid ranges
• units or discrete key values
• scale / offset equation
• physical description of data
Some products are not self-describing!
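A minimal sketch of what a user ends up writing once those values have been looked up in the manual. All numbers are hypothetical, and the equation `physical = raw * scale + offset` is an assumption; a real manual may specify a different one.

```python
import math


def to_physical(raw_values, fill_value, valid_min, valid_max, scale, offset):
    """Mask fill and out-of-range raw values, then apply scale and offset.

    Uses physical = raw * scale + offset; a product manual may specify
    a different equation.
    """
    result = []
    for raw in raw_values:
        if raw == fill_value or not (valid_min <= raw <= valid_max):
            result.append(math.nan)  # treat as missing
        else:
            result.append(raw * scale + offset)
    return result


# Hypothetical values standing in for what a product manual would supply.
raw = [120, -9999, 250, 30000]
physical = to_physical(raw, fill_value=-9999, valid_min=0, valid_max=1000,
                       scale=0.5, offset=0.0)
print(physical)
```

When the file carries none of these values as attributes, every reader has to hard-code them, which is what "not self-describing" costs in practice.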
Without Information Outside HDF
With Information Outside HDF
Challenge 3: The CF Conventions
Following the widely accepted CF conventions is important for interoperability, but some HDF products
• use non-alphanumeric characters in names.
• use non-CF attribute names and values.
• use non-CF scale / offset rules.
• use a data type for an attribute (e.g., _FillValue) that differs from the variable's type.
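The scale / offset point is easy to miss because the attribute names can look identical while the equation differs. The sketch below contrasts the CF rule with a subtract-then-scale rule of the kind documented for some products (MODIS radiances are a commonly cited case). The function names and numbers are hypothetical.

```python
def cf_unpack(raw, scale_factor, add_offset):
    """CF convention: physical = raw * scale_factor + add_offset."""
    return raw * scale_factor + add_offset


def subtract_then_scale_unpack(raw, scale_factor, add_offset):
    """A non-CF rule: physical = (raw - add_offset) * scale_factor."""
    return (raw - add_offset) * scale_factor


def convert_to_cf_offset(scale_factor, add_offset):
    """Rewrite a subtract-then-scale offset so cf_unpack gives the same result."""
    return -add_offset * scale_factor


raw, scale, offset = 1000, 0.5, 100
print(cf_unpack(raw, scale, offset))                    # CF reading of the attributes
print(subtract_then_scale_unpack(raw, scale, offset))   # product's intended value
print(cf_unpack(raw, scale, convert_to_cf_offset(scale, offset)))
```

A CF-aware client that applies the first rule to a file written for the second silently produces wrong values, so a converter has to rewrite the offset rather than copy it.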
Attribute Type Mismatch Example
Int16 data[180][360] // Variable
String valid_range "0, 100" // Attribute (Wrong)
Byte _FillValue 255 // Attribute (Wrong)
Int16 data[180][360] // Variable
Int16 valid_range 0, 100 // Attribute (Correct)
Int16 _FillValue 255 // Attribute (Correct)
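A check for this kind of mismatch can be sketched in a few lines. The dictionary layout and the function name `find_type_mismatches` are hypothetical; a real check would read the types through an HDF or netCDF library.

```python
def find_type_mismatches(var_type, attributes,
                         checked=("valid_range", "_FillValue")):
    """Return names of checked attributes whose type differs from the variable's.

    attributes maps attribute name -> (type_name, value).
    """
    return [name for name in checked
            if name in attributes and attributes[name][0] != var_type]


# The two cases from the slide, for an Int16 variable.
wrong = {"valid_range": ("String", "0, 100"), "_FillValue": ("Byte", 255)}
right = {"valid_range": ("Int16", [0, 100]), "_FillValue": ("Int16", 255)}

print(find_type_mismatches("Int16", wrong))
print(find_type_mismatches("Int16", right))
```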
Challenge 4: Incorrect Information
Sometimes, metadata contains incorrect information. This is rare and such information is usually corrected immediately by data producers.
Incorrect Information Example
An NCL user reported that the same code doesn’t work for an older MOP02 HDF-EOS5 file.
In the 2008/01/01 file, StructMetadata has the wrong value: nTime = 250841130416
In the 2008/12/31 file, StructMetadata has the correct value: nTime = 2
LaRC ASDC fixed this already!
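An error like this can be caught by comparing the dimension sizes declared in the metadata with the sizes of the arrays actually stored. The sketch below works on a simplified, hypothetical fragment of ODL-style text using the value quoted above; it is not a full StructMetadata parser, and the actual size of 2 for the older file is an assumption.

```python
import re


def declared_dimensions(struct_metadata):
    """Extract {name: size} from DimensionName/Size pairs in ODL-like text."""
    pattern = re.compile(r'DimensionName="([^"]+)"\s+Size=(-?\d+)')
    return {name: int(size) for name, size in pattern.findall(struct_metadata)}


def suspicious_dimensions(struct_metadata, actual_sizes):
    """Return dimensions whose declared size differs from the stored size."""
    declared = declared_dimensions(struct_metadata)
    return {name: (size, actual_sizes[name])
            for name, size in declared.items()
            if name in actual_sizes and size != actual_sizes[name]}


# Simplified, hypothetical fragment using the value quoted on the slide.
bad_metadata = '''
OBJECT=Dimension_1
    DimensionName="nTime"
    Size=250841130416
END_OBJECT=Dimension_1
'''

# The actual size of 2 is assumed for illustration.
print(suspicious_dimensions(bad_metadata, {"nTime": 2}))
```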
Good News
The recent effort from The HDF Group overcomes many challenges:
• HDF4/HDF5 OPeNDAP Handler with EnableCF option
• H4CF Conversion Toolkit with NcML / NCO examples
• HDF-EOS5 Augmentation Tool
• HDF-EOS2 Dumper tool with Comprehensive Examples for MATLAB/IDL/NCL
The above tools and their examples are available at HDFEOS.org.
Challenge 1: Unfamiliar Techniques
HDF OPeNDAP handlers & H4CF Conversion Toolkit
• provide full geo-location information as explicit datasets.
HDF-EOS5 Augmentation Tool
• provides ways to associate geo-location information with existing datasets or to supply new ones.
HDF-EOS2 Dumper Tool
• prints out geo-location information in ASCII because MATLAB/IDL/NCL can read ASCII text data.
Challenge 2: Information Outside HDF
OPeNDAP handlers
• provide fill value / valid range information.
• apply the CF scale / offset rule.
• calculate latitude and longitude values for some NASA non-EOS products.
• are tested against ncml_handler so that data centers can supply additional information using NcML.
H4CF Conversion Toolkit (h4tonccf)
• provides NcML and NCO examples to add or edit attributes for converted NetCDF files.
Challenge 3: The CF Conventions
HDF OPeNDAP handlers & H4CF Conversion Toolkit
• flatten group hierarchies.
• change variable & attribute types, names, and values.
• add named dimensions.
• add coordinate information.
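As an illustration of the first two bullets, the sketch below flattens a group path into a single name made only of letters, digits, and underscores. It is not the naming scheme the handlers or the toolkit actually use; the function name, the `v_` prefix, and the sample paths are hypothetical.

```python
import re


def cf_friendly_name(path):
    """Flatten a group path into one name of letters, digits, and underscores."""
    parts = [p for p in path.split("/") if p]
    flat = "_".join(parts)
    flat = re.sub(r"[^A-Za-z0-9_]", "_", flat)  # replace spaces and punctuation
    if not flat or not flat[0].isalpha():
        flat = "v_" + flat  # hypothetical prefix so the name starts with a letter
    return flat


print(cf_friendly_name("/HDFEOS/SWATHS/O3 column/Data Fields/O3"))
print(cf_friendly_name("/2D Field"))
```

A production tool also has to handle collisions, since two different paths can flatten to the same name; this sketch does not.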
Challenge 4: Incorrect Information
HDF OPeNDAP handlers & H4CF Conversion Toolkit
• correct errors for old products temporarily.
• catch errors for new products.
Better News
We see fewer and fewer challenges in newer HDF products thanks to open communication and standardization efforts among Earth Science communities through meetings, telecons, and mailing lists.
• HDF – DAACs Telecons
• ESDSWG – H5CF Conventions
• ESIP
• CF (satellite) conventions mailing lists
Future Challenges
• Data Discovery
• Subsetting and Aggregation
• Sharing Research Data
Data Discovery
Some users still don’t know how to search for data or where to download it.
A spatial search in Reverb doesn’t guarantee that the matched HDF data files contain valid values at the specific location the user is looking for.
Browse images are helpful, but users don’t want to examine them one by one.
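The gap between "the granule's bounding box contains the point" and "the granule has a valid value at the point" can be sketched as two separate checks. The bounding box, pixel values, fill value, and function names below are hypothetical; only the approximate coordinates of Seoul are real.

```python
def granule_covers_point(bbox, lat, lon):
    """True when (lat, lon) lies inside bbox = (south, north, west, east)."""
    south, north, west, east = bbox
    return south <= lat <= north and west <= lon <= east


def has_valid_value_near(pixels, lat, lon, fill_value, max_distance_deg=0.5):
    """True when some non-fill pixel lies within max_distance_deg of the point.

    pixels is an iterable of (lat, lon, value). Distance is a plain
    degree-space comparison, adequate only as an illustration.
    """
    for p_lat, p_lon, value in pixels:
        if value == fill_value:
            continue
        if (abs(p_lat - lat) <= max_distance_deg
                and abs(p_lon - lon) <= max_distance_deg):
            return True
    return False


seoul_lat, seoul_lon = 37.57, 126.98
bbox = (30.0, 45.0, 120.0, 135.0)                 # hypothetical granule extent
pixels = [(37.5, 127.0, -9999), (40.0, 130.0, 310)]  # hypothetical samples

print(granule_covers_point(bbox, seoul_lat, seoul_lon))
print(has_valid_value_near(pixels, seoul_lat, seoul_lon, fill_value=-9999))
```

Here the first check passes and the second fails, which is the situation the slide describes: the search matches the file, but the file has nothing usable at the requested location.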
Reverb Browse Image for O3 at Seoul
The returned HDF file has no value at Seoul.
Subsetting and Aggregation
Customized, on-demand HDF product generation based on the user’s query is desired. For example, “Give me all L2 Ozone data at Seoul from 2002 to 2013 and allow me to download it as a single HDF file.”
Most HDF data products are packaged as daily granules covering a large region. A search returns thousands of HDF files, and users cannot download them one by one.
Reverb Query Result for AIRS at Seoul
Showing 1 to 9 of 5,047 granules
Sharing Research Data
How can users easily compose and publish new research data from the different NASA data product sources?
“I’d like to combine AIRS Ozone and OMI Ozone data at Seoul from 2002-2013 and share it with journal editors.”
Can this be shared as a single URL query to a NASA data cloud?
Thanks!
Questions / Comments? eoshelp@hdfgroup.org
Acknowledgements
This work was supported by Subcontract number 114820 under Raytheon Contract number NNG10HP02C, funded by the National Aeronautics and Space Administration (NASA), and by cooperative agreement number NNX08AO77A from NASA. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of Raytheon or the National Aeronautics and Space Administration.