ESMPy and OpenClimateGIS: Python Interfaces for High Performance Grid Remapping and Geospatial Dataset Manipulation

Ryan O’Kuinghttons, Ben Koziol, Robert Oehmke, Cecelia DeLuca, Gerhard Theurich, Peggy Li, Joseph Jacob

Cooperative Institute for Research in Environmental Sciences
NOAA Environmental Software Infrastructure and Interoperability Project

European Geosciences Union General Assembly
Vienna, Austria
April 22, 2016

ESMF and ESMPy

• The Earth System Modeling Framework (ESMF) is open source software for building modeling components and coupling them together to form weather prediction, climate, coastal, and other applications.
  • Provides infrastructure for time management, data communications, metadata and I/O, running models as web services, and grid remapping
  • Provides a full Fortran interface and limited C and Python interfaces
• ESMF provides a mature high performance regridding package
  • Transforms data from one grid to another by generating and applying interpolation weights
  • Supports structured and unstructured, global and regional, 2D and 3D grids, with many options
  • Fully parallel and highly scalable
• The Python interface to ESMF (ESMPy) offers access to the regridding functionality and other related features of ESMF.

OCGIS

• OpenClimateGIS (OCGIS) is a standalone Python package enabling dynamic access to and manipulation of high resolution climate data
  • Subsetting, coordinate transformations, temporal averaging, and other computations
  • Data conversions between CSV, Shapefile, GRIDSPEC, and UGRID
• Data conversions between ESMPy and OCGIS bring together GIS capabilities with high performance regridding functionality to create a more unified set of Python tools for Earth system modeling
• One area of interest is connecting high resolution hydrological models with high performance climate models

ESMPy Overview

• High performance regridding is applied as a callable Python object
• NumPy array access to distributed data (parallelism for free)
• Many regridding methods, including first-order conservative
• Data objects can be created from NetCDF files in standard metadata formats
• Supported grids and methods for regridding with ESMPy include:
  • Bilinear, higher-order patch [1, 2], first-order conservative [3], or nearest neighbor regridding
  • Global or regional 2D or 3D logically rectangular Grids
  • 2D or 3D unstructured Meshes composed of triangles, quadrilaterals, or hexahedrons
  • 1D streams of observational data or unconnected sets of points (LocStream)

[Figures: 2D unstructured mesh (from www.ngdc.noaa.gov); FIM unstructured grid; regional grid]

OpenClimateGIS Overview

• Developed by the NESII Group in association with the NCPP Project under funding provided by the NOAA Climate Program Office.
• Python package designed to ease the “localization” and accessibility of high-dimensional scientific datasets
• Primary features: geospatial subsetting, standardized calculation, bundling, format conversion, access to OpenDAP datasets.
• Additional dependencies: GDAL, Shapely, Fiona, netCDF4, osgeo

https://www.earthsystemcog.org/projects/openclimategis/
https://github.com/NCPP/ocgis

ESMPy – OCGIS Integration

• ESMPy and OCGIS have complementary capabilities
  • OCGIS allows access to and manipulation of high resolution data sets
  • ESMPy provides high performance regridding and access to distributed NumPy data
• There are several ways to create an integrated workflow
  • OCGIS can preprocess data files and convert between data formats
  • The ESMPy Field object is an output format of OCGIS
  • ESMPy can read OCGIS outputs (NetCDF) in parallel, for high performance regridding
  • OCGIS offers serial regridding using ESMPy
• Parallel processing requires clever use of integrated capabilities...
  • OCGIS is implemented and used in single processor mode
  • ESMPy is fully parallel IF objects are created in parallel
  • Conversion between serial and distributed objects is next...

Integrated Workflow Example

** Green text indicates steps that can be done in serial or parallel

1: Preprocess files using OCGIS (subsetting)
2: Read distributed ESMPy objects
3: Compute and apply regridding weights
4: Write parallel object to files for use by downstream applications

[Figure: data files read into, and written from, objects distributed across processor IDs 0 to 3]

• The ESMF command line application allows parallel regrid weight generation, with output to file, in a single step
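The figure on this slide shows one object spread over processor IDs 0 to 3. As a rough illustration of what such a distribution means, the sketch below splits a range of grid rows into contiguous blocks, one per processor. The function name `block_decompose` and the near-equal block rule are assumptions made for illustration; the decomposition ESMF actually chooses is decided by the library and may differ.

```python
def block_decompose(n_items, n_procs):
    """Split range(n_items) into n_procs contiguous, near-equal blocks.

    Returns a list of half-open (start, stop) index ranges, one per processor.
    Hypothetical helper for illustration; not part of ESMPy or OCGIS.
    """
    base, extra = divmod(n_items, n_procs)
    bounds = []
    start = 0
    for rank in range(n_procs):
        # The first `extra` ranks each take one additional item.
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


# Ten grid rows spread over the four processors shown in the figure.
blocks = block_decompose(10, 4)
for rank, (start, stop) in enumerate(blocks):
    print(f"processor {rank}: rows {start} to {stop - 1}")
```

Each processor then only reads, regrids, and writes its own block, which is why steps 2 to 4 of the workflow can run in parallel.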

Supported Data Conventions

ESMPy grid files use the following standard data file formats:
• Climate and Forecast (CF) grid conventions
  • UGRID – candidate CF convention for unstructured grids [4], used to represent grids with arbitrary polygons with no gaps
  • GRIDSPEC – accepted CF convention for logically rectangular grids [5]
• SCRIP – Spherical Coordinate Remapping and Interpolation Package [6]
  • Legacy format for 2D logically rectangular or 2D unstructured grids
• ESMF
  • Custom format for unstructured grids, with more efficient storage than SCRIP or CF when used with ESMF codes

OCGIS has a rich set of conversion routines between the following:
• CF grid conventions (above)
• Shapefile – geospatial vector data format used by GIS software [7]
• CSV – comma separated value

Interfaces

ESMPy has objects for data (Field) and underlying distribution (Grid/Mesh):

• Grid - logically rectangular discretization object

    grid = ESMF.Grid(filename="gridspec.nc", filetype=ESMF.FileFormat.GRIDSPEC)
    grid = ESMF.Grid(max_index=numpy.array([7, 8, 9]), coord_sys=ESMF.CoordSys.CART)

• Mesh - unstructured mesh discretization object

    mesh = ESMF.Mesh(filename="ugrid.nc", filetype=ESMF.FileFormat.UGRID)

• Field - data object built on a grid or mesh with optional mask
  • derived type of numpy.ndarray

    field = ESMF.Field(dstgrid, "dstfield", meshloc=ESMF.MeshLoc.ELEMENT, ndbounds=[1, 365, 1])

OCGIS has a very compact interface for a wide range of capabilities:

    ops = ocgis.OcgOperations(dataset=rd, geom=path_ugid_shp, select_ugid=select_ugid,
                              agg_selection=True, prefix='subset_nc', output_format='nc',
                              add_auxiliary_files=False)

Regridding

    r1to2 = Regrid(field1, field2, regrid_method=RegridMethod.CONSERVE)

where: f(phi, theta) = 2 + cos(theta)**2 * cos(2*phi)

Source grid: fv1.9x2.5_050503.nc - 1.9x2.5 CAM finite volume grid
Destination grid: wr50a_090614.nc - Regional 205x275 grid

Mean relative error    = 3.19E-03
Maximum relative error = 1.93E-02
Conservation error     = 7.11E-15
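The slide reports three error figures without defining them. The sketch below shows one common way such metrics could be computed against the analytic test function. The definitions used here (cell-wise relative error, and the relative difference of area-weighted integrals) are assumptions, as are the helper names and the synthetic "regridded" values, which are simply the exact values perturbed by known amounts.

```python
import math


def analytic_field(phi, theta):
    """Test function from the slide: f(phi, theta) = 2 + cos(theta)**2 * cos(2*phi)."""
    return 2.0 + math.cos(theta) ** 2 * math.cos(2.0 * phi)


def relative_errors(regridded, exact):
    """Return (mean, max) of |regridded - exact| / |exact| over all cells."""
    errs = [abs(r - e) / abs(e) for r, e in zip(regridded, exact)]
    return sum(errs) / len(errs), max(errs)


def conservation_error(src_vals, src_areas, dst_vals, dst_areas):
    """Relative difference between the source and destination integrals."""
    src_total = sum(v * a for v, a in zip(src_vals, src_areas))
    dst_total = sum(v * a for v, a in zip(dst_vals, dst_areas))
    return abs(dst_total - src_total) / abs(src_total)


# Exact values at a few (phi, theta) points; f stays within [1, 3], so the
# division by |exact| is always safe.
points = [(0.0, 0.0), (math.pi / 2.0, 0.0), (1.0, 0.5)]
exact = [analytic_field(phi, theta) for phi, theta in points]

# Stand-in for a regridding result: exact values with known relative perturbations.
perturbations = [0.01, -0.02, 0.0]
regridded = [e * (1.0 + d) for e, d in zip(exact, perturbations)]

mean_err, max_err = relative_errors(regridded, exact)
print(f"mean relative error = {mean_err:.2e}, max relative error = {max_err:.2e}")
```

In a real test the `regridded` list would come from the destination field after the regrid call, and `exact` from evaluating the analytic function at the destination cell centers.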

Conservative Regridding

• Conservative regridding is important in Earth system modeling to preserve the total integral of a field throughout the operation (e.g. water content)
• The algorithm used by ESMF computes the interpolation weight between cell i on the source grid and cell j on the destination grid using:

    w_ij = f_ij * A_i / A_j

  where f_ij is the fraction of the source cell contributing to the destination cell, and A_i and A_j are the relative areas of the source and destination cells.
• Options exist for:
  • Using internally computed (default) or user supplied areas
  • Computing areas and distances using great-circle (default) or straight line distances on the surface of the sphere
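To make the weight definition concrete, here is a minimal one-dimensional sketch. It assumes the weight has the form w_ij = f_ij * A_i / A_j, which matches the symbol descriptions on the slide, and it uses interval lengths in place of cell areas. The function names and the toy grids are hypothetical; ESMF itself works on 2D and 3D cells on the sphere.

```python
import math


def overlap(a, b):
    """Length of the intersection of intervals a = (lo, hi) and b = (lo, hi)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def conservative_weights(src_cells, dst_cells):
    """Return {(i, j): w_ij} with w_ij = f_ij * A_i / A_j for overlapping cells."""
    weights = {}
    for i, src in enumerate(src_cells):
        area_i = src[1] - src[0]
        for j, dst in enumerate(dst_cells):
            area_j = dst[1] - dst[0]
            # f_ij: fraction of source cell i that lies inside destination cell j.
            f_ij = overlap(src, dst) / area_i
            if f_ij > 0.0:
                weights[(i, j)] = f_ij * area_i / area_j
    return weights


def apply_weights(weights, src_vals, n_dst):
    """Destination value j is the sum over i of w_ij * source value i."""
    dst_vals = [0.0] * n_dst
    for (i, j), w in weights.items():
        dst_vals[j] += w * src_vals[i]
    return dst_vals


# Four unit-length source cells regridded onto two uneven destination cells.
src_cells = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
dst_cells = [(0.0, 1.5), (1.5, 4.0)]
src_vals = [1.0, 2.0, 3.0, 4.0]

weights = conservative_weights(src_cells, dst_cells)
dst_vals = apply_weights(weights, src_vals, len(dst_cells))

# The area-weighted integral should be the same before and after regridding.
src_total = sum(v * (c[1] - c[0]) for v, c in zip(src_vals, src_cells))
dst_total = sum(v * (c[1] - c[0]) for v, c in zip(dst_vals, dst_cells))
conserved = math.isclose(src_total, dst_total)
print(dst_vals, src_total, dst_total, conserved)
```

Because each destination value is an area-weighted average of the overlapping source cells, the integral is preserved whenever the destination grid covers the source grid.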

Enabling Hydrological Studies

• Hydrological impact studies can be improved when forced with data from climate models; hydrological feedbacks can affect climate
• A technology and scale gap exists:
  • Many hydrological models have limited scalability, run on desktop computers, and have watershed-sized domains
  • Many climate models are highly parallel, run on high performance supercomputers, and have global domains
• However, scales are slowly converging (e.g. high resolution climate models, hydrological systems of greater extent)
  • Provides scientists opportunities to explore new coupled model configurations and modes of coupling
  • Provides programmers opportunities to develop tools to handle this coupling interface

High Resolution Data

Task: Subset high resolution climate precipitation data to local scale and then regrid to catchment basins

Source data: CF-formatted precipitation data file for the continental United States on a logically rectangular grid (nldas_met_update.obs.daily.pr.1990.nc)

Output: Multi-dimensional precipitation values (including time) on a subset of catchment basins in the region of interest after conservative regridding

High Performance Results

• Test done on IBM iDataPlex (Yellowstone) with 128 and 256 cores
• Source grid has 2,647,454 elements with up to 58,396 nodes
• Weight file generation takes minutes, application takes seconds

[Figure: conservative regridding result with CONUS NHDPlus catchments using exact solution]
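One way to see why applying weights is so much cheaper than generating them: once the weights exist, regridding is a sparse matrix-vector product that can be reused for every time step. The sketch below assumes the weights are stored as (row, col, S) triplets with 1-based indices, where row is the destination index and col the source index. The function name and the numbers are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def apply_sparse_weights(rows, cols, s, src, n_dst):
    """Accumulate dst[row] += S * src[col] for each stored weight (1-based indices)."""
    dst = [0.0] * n_dst
    for r, c, w in zip(rows, cols, s):
        dst[r - 1] += w * src[c - 1]
    return dst


# Hypothetical weights mapping 3 source cells onto 2 destination cells.
rows = [1, 1, 2, 2]            # destination cell of each weight
cols = [1, 2, 2, 3]            # source cell of each weight
s = [0.5, 0.5, 0.25, 0.75]     # weight values

# The same weights are reused for every time step of the source data.
time_series = [[1.0, 3.0, 5.0], [2.0, 2.0, 2.0]]
regridded = [apply_sparse_weights(rows, cols, s, step, 2) for step in time_series]
print(regridded)
```

The cost of each application is proportional to the number of stored weights, whereas weight generation has to search for and intersect overlapping cells.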

Status and Future Work

• Both ESMPy and OCGIS are in production and fully supported
• Upcoming development:
  • Read and write ESMF formatted weight files
  • Write ESMF Fields in parallel
  • Seamless conversions between serial and distributed objects in ESMPy
  • Python 3 support

Requirements, Supported Platforms, Limitations, etc...

ESMPy:
Requirements:
- Python 2.6, 2.7
- Numpy 1.6.1/2 (ctypes)
- ESMF installation (with NetCDF)
Testing:
- Nightly regression testing
Supported Platforms:
- Linux, Darwin, and Cray
- Gfortran
- OpenMP

OCGIS (additional dependencies):
- netCDF4
- Shapely
- Fiona
- osgeo
Testing:
- Travis CI integration
Supported Platforms:
- Linux, Darwin, Windows

Installation:

    ESMPy: python setup.py build --ESMFMKFILE=<path_to_esmf.mk> install
    OCGIS: python setup.py install
    conda install -c conda-forge esmpy ocgis

Selected Users

• UV-CDAT (PCMDI) – Ultrascale Visualization Climate Data Analysis Tools
• cfpython (University of Reading) – Implementation of the CF data model for reading, writing and processing of data and metadata
• Iris (Met Office) – Python library for visualizing meteorological and oceanographic data sets
• PyFerret (NOAA) – Python based interactive visualization and analysis environment
• Community Surface Dynamics Modeling System (CU-Boulder) – Tools for hydrological and other surface modeling processes
• OCGIS – climate4impact portal (IS-ENES): Tools for climate modelers to tailor high resolution climate data
• OCGIS – ClimatePipes (Kitware): User-friendly data access, manipulation, analysis and visualization of community climate models

Contact Us!

Email: esmf_support@list.woc.noaa.gov or ocgis_support@list.woc.noaa.gov
Website: https://earthsystemcog.org/projects/esmpy/ or https://earthsystemcog.org/projects/openclimategis/

References:
1. Khoei S. A., Gharehbaghi A. R., The superconvergent patch recovery technique and data transfer operators in 3D plasticity problems. Finite Elements in Analysis and Design, 43(8), 2007.
2. Hung K. C., Gu H., Zong Z., A modified superconvergent patch recovery method and its application to large deformation problems. Finite Elements in Analysis and Design, 40(5-6), 2004.
3. Ramshaw J. D., Conservative rezoning algorithm for generalized two-dimensional meshes. Journal of Computational Physics, 59, 1985.
4. UGRID documentation: https://github.com/ugrid-conventions, accessed Dec. 19, 2014.
5. GridSpec whitepaper: https://ice.txcorp.com/trac/modave/wiki/CFProposalGridspec, accessed Dec. 19, 2014.
6. Jones P. W., SCRIP: A Spherical Coordinate Remapping and Interpolation Package. http://www.acl.lanl.gov/climate/software/SCRIP. Los Alamos National Laboratory Software Release LACC 98-45.
7. Shapefile whitepaper: http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf, accessed Dec. 19, 2014.

Jupyter Notebooks


ESMPy Regridding

Plotting the solution with matplotlib shows error on the order of 10^-7

OCGIS Utilities

https://github.com/NCPP/ocgis/blob/master/examples/ipynb/USGSTech-Stack-20151114.ipynb


Implementation Details


ctypes bindings to ESMF

• ESMPy is connected to ESMF using ctypes bindings to the C interface

Interfacing with ctypes:

    _ESMF.ESMC_GridGetCoord.restype = ctypes.POINTER(ctypes.c_void_p)
    _ESMF.ESMC_GridGetCoord.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_uint,
                                        numpy.ctypeslib.ndpointer(dtype=numpy.int32),
                                        ctypes.POINTER(ctypes.c_int)]
    gridCoordPtr = _ESMF.ESMC_GridGetCoord(grid.struct.ptr, coordDim, staggerloc,
                                           exclusiveLBound, exclusiveUBound,
                                           ctypes.byref(lrc))
    # adjust bounds to be 0 based
    exclusiveLBound = exclusiveLBound - 1

Allocating NumPy array buffers for memory allocated in ESMF:

    buffer = numpy.core.multiarray.int_asbuffer(
        ctypes.addressof(pointer.contents),
        numpy.dtype(ESMF2PythonType[self.type]).itemsize * size)
    array = numpy.frombuffer(buffer, ESMF2PythonType[self.type])

Switching between Fortran and C array striding:

    array = numpy.reshape(array, self.size_local[stagger], order='F')
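The two ideas in the excerpts above, viewing memory owned by a compiled library without copying it, and reinterpreting column-major (Fortran) data in row-major terms, can be tried without ESMF or NumPy. The sketch below uses only the standard library. The helper names are hypothetical, and a small ctypes array stands in for memory that ESMF would own.

```python
import ctypes


def view_as_doubles(address, count):
    """Wrap `count` C doubles starting at `address` without copying them."""
    return (ctypes.c_double * count).from_address(address)


def fortran_to_rows(flat, nrows, ncols):
    """Read a flat column-major (Fortran order) buffer as a list of rows."""
    return [[flat[r + c * nrows] for c in range(ncols)] for r in range(nrows)]


# Stand-in for memory owned by a compiled library: a 2 x 3 array stored
# column by column, so the first column is (1, 2), the second (3, 4), etc.
owner = (ctypes.c_double * 6)(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

# A second view onto the same memory, built only from its address.
view = view_as_doubles(ctypes.addressof(owner), 6)
rows = fortran_to_rows(view, 2, 3)
print(rows)

# Writing through the view changes the owner's memory: no copy was made.
view[0] = 10.0
print(owner[0])
```

The view is only valid while the owning memory stays alive, which is why a wrapper like this has to keep a reference to whatever allocated the buffer.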