
DATA ANALYSIS USING PYTHON-I Baburao Kamble and Ayse Kilic University of Nebraska-Lincoln GIS in Water Resources Lecture, 2014

Objectives
• To be able to understand and write Python scripts
• Numerical, text (weather), and geospatial (environmental) data manipulation

Agenda
Part I: Basic Python Programming
• Python syntax, strings, arrays, conditionals and control flow, functions
• File input/output and text data processing
Part II: Geospatial Data Analysis
• Installation of GDAL bindings on the Windows operating system
• Reading raster (Landsat and DEM) data
• Using GDAL functions from the command line (interpolation, translation)
• Raster processing
• NDVI calculation
• DN to radiance to reflectance conversion (DN2Rad2Ref)

Why use GDAL/NumPy instead of canned GIS software?
• Not advisable if what you want to do is easily handled within ArcGIS, Imagine, etc., because there is a lot of programming overhead
• Well suited for process-model applications where the cell-based logic is too complex for canned tools
• Examples:
  • Grid algebra: grid1 + grid2 (probably use GIS)
  • Finding nearest neighbors (NN) in multidimensional space (maybe use GDAL/NumPy)
• Also useful if your spatial data is not in a standard GIS format (JPEG, PNG, etc.)
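As a rough illustration of the nearest-neighbor case above, the sketch below assigns each pixel's feature vector to the closest of a set of reference vectors using NumPy broadcasting. The function name and the sample values are made up for illustration and are not from the lecture.

```python
import numpy as np

def nearest_neighbor_index(pixels, references):
    """For each row of `pixels`, return the index of the closest row of `references`.

    pixels:     (n_pixels, n_features) array
    references: (n_refs, n_features) array
    """
    pixels = np.asarray(pixels, dtype=float)
    references = np.asarray(references, dtype=float)
    # Broadcast to shape (n_pixels, n_refs, n_features), then sum squared differences.
    diff = pixels[:, np.newaxis, :] - references[np.newaxis, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

# Hypothetical 3-band values for three pixels and two reference classes.
pixels = np.array([[0.1, 0.2, 0.9], [0.8, 0.7, 0.1], [0.2, 0.1, 0.8]])
references = np.array([[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
labels = nearest_neighbor_index(pixels, references)
print(labels)
```

Building the full distance table is simple but memory-hungry; for large rasters it would normally be applied block by block.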

Geoprocessing with GDAL and NumPy in Python
• GDAL: Geospatial Data Abstraction Library
• NumPy: the N-dimensional array package for scientific computing with Python
• Both are open-source software
• Workflow: read the raster dataset using GDAL → do some calculation using NumPy → output to a geospatial dataset using GDAL

Python Libraries for Geospatial Development
• There are two popular Python libraries for raster and vector data:
• GDAL (Geospatial Data Abstraction Library)
  • Is for raster-based geospatial data; available for download at: http://gdal.org
  • Windows users can use Frank Warmerdam’s “FWTools” open-source GIS binaries for Windows (32-bit), which include GDAL and OGR; available at: http://fwtools.maptools.org
• OGR
  • Is for vector-based geospatial data; available for download at: http://gdal.org/ogr
• NOTE: These are now merged and are downloaded together under the common name GDAL

GDAL (Geospatial Data Abstraction Library)
• Presents an “abstract data model” for processing spatial data
• Can be used directly from C/C++ and can be “wrapped” for use with Python, Perl, VB, C#, R, Java, …
• Early developers chose Python as their scripting language, so the Python documentation is relatively good

GDAL
• Geospatial Data Abstraction Library
• Raster data access
• Supports about 100 different formats
  • ArcInfo grids, ArcSDE raster, Imagine, Idrisi, ENVI, GRASS, GeoTIFF
  • HDF4, HDF5
  • USGS DOQ, USGS DEM
  • ECW, MrSID
  • TIFF, JPEG 2000, PNG, GIF, BMP

NumPy (Numerical Python)
• An array/matrix package for Python
• Well suited for image processing, i.e., one function can operate on the entire array
• Slicing by dimensions and applying functions to these slices is concise and straightforward
• Nearly 400 methods defined for use with NumPy arrays (e.g., type conversions, mathematical, logical, etc.)
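The points above can be seen on a small array. This is only a sketch with a made-up 4 x 4 "image"; the variable names are illustrative.

```python
import numpy as np

# A small made-up 4x4 "image" of digital numbers (values 0..15).
image = np.arange(16, dtype=np.int16).reshape(4, 4)

scaled = image * 0.5                  # one expression operates on every cell
as_float = image.astype(np.float32)   # type conversion
mask = image > 10                     # logical test, gives a boolean array
top_left = image[:2, :2]              # slice: first two rows and first two columns
col_means = image.mean(axis=0)        # function applied along one dimension

print(top_left)
print(col_means)
```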

GDAL and NumPy
• Since GDAL 1.3(?), GDAL has implemented NG (New Generation) Python bindings, which include NumPy support
• Process:
  • Open the GDALDataset
  • Get the raster band(s)
  • Convert the raster band(s) to a NumPy array using ReadAsArray()
  • Process the raster band(s) using NumPy functionality
  • Convert the NumPy array to GDAL raster bands using WriteArray()
  • Write out the GDALDataset
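The process can be sketched end to end as follows. This is a minimal example, not the lecture's own script: the function name scale_band, the choice of band 1, the GeoTIFF output driver, and the "multiply by a factor" processing step are all assumptions made for illustration.

```python
import numpy as np

try:
    from osgeo import gdal
except ImportError:  # GDAL Python bindings are not installed
    gdal = None

def scale_band(src_path, dst_path, factor=2.0):
    """Read band 1 of src_path, multiply it by factor, and write a GeoTIFF to dst_path."""
    if gdal is None:
        raise RuntimeError("GDAL Python bindings are not available")
    src = gdal.Open(src_path)                      # open the GDAL dataset
    if src is None:
        raise IOError("Could not open " + src_path)
    band = src.GetRasterBand(1)                    # get the raster band
    data = band.ReadAsArray().astype(np.float32)   # band -> NumPy array
    result = data * factor                         # process with NumPy

    driver = gdal.GetDriverByName("GTiff")
    dst = driver.Create(dst_path, src.RasterXSize, src.RasterYSize, 1, gdal.GDT_Float32)
    dst.SetGeoTransform(src.GetGeoTransform())     # keep the input georeferencing
    dst.SetProjection(src.GetProjection())         # keep the input projection
    dst.GetRasterBand(1).WriteArray(result)        # NumPy array -> raster band
    dst.FlushCache()                               # write the data to disk
    dst = None                                     # dereference to close the file
    return result
```

A call such as scale_band("input.tif", "output.tif") would then write the scaled raster; both file names here are placeholders.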

GDAL
• GDAL installation
• Make sure NumPy is installed (http://sourceforge.net/projects/numpy/files/)
• Download gdalwin32exe160.zip from http://download.osgeo.org/gdal/win32/1.6/
• Unzip to C:\gdalwin32-1.6

Test Your GDAL in Python
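One simple way to test the installation is to try importing the bindings and print the version. The helper name gdal_version below is made up for this sketch.

```python
def gdal_version():
    """Return the GDAL release string, or None if the bindings cannot be imported."""
    try:
        from osgeo import gdal
    except ImportError:
        return None
    return gdal.VersionInfo("RELEASE_NAME")

version = gdal_version()
if version is None:
    print("GDAL Python bindings were not found")
else:
    print("GDAL version:", version)
```

If the import fails, the usual suspects are a missing NumPy installation or GDAL not being on the system path.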

GDAL
• Supports more than 100 raster formats: http://www.gdal.org/formats_list.html
• Finding available formats: gdalinfo --formats

GDAL Data Model
• GDAL’s data model includes:
  • Dataset: holds the raster data in a collection of raster bands
  • Raster band: represents a band, channel, or layer within an image (e.g., an RGB image has red, green, and blue components)
  • Raster size: specifies the width of the image in pixels and the overall height of the image in lines
  • Georeferencing transform: converts from pixel (x, y) coordinates into georeferenced coordinates (on the surface of the Earth)
  • Affine transformation: mathematical formula allowing operations such as X offset, Y offset, X scale, Y scale, horizontal shear, and vertical shear
  • Coordinate system: includes the projection and datum
  • Metadata: provides additional information about the dataset
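The affine transformation can be made concrete with the six coefficients that GDAL returns from GetGeoTransform(). The sketch below applies them to a pixel position in plain Python; the function name and the sample coefficients (a north-up raster with 30 m pixels) are assumptions for illustration.

```python
def pixel_to_geo(geotransform, col, row):
    """Convert a pixel (col, row) position to georeferenced (x, y) coordinates.

    geotransform follows GDAL's order:
    (x offset, x scale, horizontal shear, y offset, vertical shear, y scale)
    """
    x_offset, x_scale, x_shear, y_offset, y_shear, y_scale = geotransform
    x = x_offset + col * x_scale + row * x_shear
    y = y_offset + col * y_shear + row * y_scale
    return x, y

# Hypothetical north-up raster: 30 m pixels, top-left corner at (500000, 4500000).
# The y scale is negative because row numbers increase downward.
gt = (500000.0, 30.0, 0.0, 4500000.0, 0.0, -30.0)
print(pixel_to_geo(gt, 0, 0))    # top-left corner of the first pixel
print(pixel_to_geo(gt, 10, 20))  # 10 columns right, 20 rows down
```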

GDAL example Python code
• Use GDAL to calculate the average of the height values contained in a DEM:

    import struct
    from osgeo import gdal

    dataset = gdal.Open("DEM.dat")
    band = dataset.GetRasterBand(1)

    # One little-endian 16-bit signed integer ("h") per pixel in a scanline;
    # this assumes the DEM stores heights as 16-bit integers.
    fmt = "<" + ("h" * band.XSize)

    totHeight = 0
    for y in range(band.YSize):
        # Read one full-width row of pixels as raw bytes
        scanline = band.ReadRaster(0, y, band.XSize, 1,
                                   band.XSize, 1, band.DataType)
        values = struct.unpack(fmt, scanline)
        for value in values:
            totHeight = totHeight + value

    average = totHeight / (band.XSize * band.YSize)
    print("Average height =", average)

OGR
• Datasource: represents the file (e.g., a country) and has many:
  • Layers: sets of related data, e.g., terrain, contour lines, roads layers
    • Spatial reference: specifies the datum and projection
    • Each layer holds features, and each feature has:
      • Attributes: additional metadata about the feature
      • Geometry: specifies the shape and location of the feature. Geometries are recursive, i.e., they can have sub-geometries

OGR example Python code

    from osgeo import ogr

    # Open the shapefile and take its first layer
    shapefile = ogr.Open("TM_WORLD_BORDERS-0.3.shp")
    layer = shapefile.GetLayer(0)

    # Print the index, name, and geometry type of every feature
    for i in range(layer.GetFeatureCount()):
        feature = layer.GetFeature(i)
        name = feature.GetField("NAME")
        geometry = feature.GetGeometryRef()
        print(i, name, geometry.GetGeometryName())

Sample 1
• Read two TIFF files (red band and NIR band)
• Calculate NDVI = (NIR - Red) / (NIR + Red)
• Output NDVI in the same projection and georeference as the input file
• NumPy example
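The calculation step can be sketched with NumPy alone. In practice the two arrays would come from ReadAsArray() on the red and NIR bands; here they are small made-up arrays, and setting NDVI to 0 where the denominator is zero is an assumption of this sketch rather than something the lecture specifies.

```python
import numpy as np

def ndvi(red, nir):
    """Compute NDVI = (NIR - Red) / (NIR + Red) cell by cell."""
    red = np.asarray(red, dtype=np.float32)
    nir = np.asarray(nir, dtype=np.float32)
    total = nir + red
    out = np.zeros_like(total)
    # Divide only where the denominator is non-zero; other cells stay 0.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Made-up digital numbers for a 2x2 window.
red = np.array([[50, 0], [30, 100]])
nir = np.array([[150, 0], [90, 100]])
result = ndvi(red, nir)
print(result)
```

Casting to float first matters: with integer inputs the subtraction could wrap around for unsigned types and the ratio would lose its fractional part.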

Raster File Input/Output
• read_raster.py
• read_Write_raster_in_Block.py
• NDVI.py
• DN2rad2ref.py
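The scripts themselves are not reproduced here. As a hedged illustration of what a DN to radiance to reflectance conversion typically involves, the sketch below uses the standard linear rescaling to radiance followed by the top-of-atmosphere reflectance formula. The function names and the gain, bias, ESUN, and sun-elevation values are placeholders; real values come from the scene metadata.

```python
import math
import numpy as np

def dn_to_radiance(dn, gain, bias):
    """Linear rescaling of digital numbers to at-sensor radiance."""
    return gain * np.asarray(dn, dtype=np.float64) + bias

def radiance_to_reflectance(radiance, esun, sun_elevation_deg, earth_sun_dist_au=1.0):
    """Top-of-atmosphere reflectance: pi * L * d^2 / (ESUN * cos(solar zenith))."""
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    return (math.pi * radiance * earth_sun_dist_au ** 2) / (esun * math.cos(solar_zenith))

# Placeholder inputs, not taken from any real scene.
dn = np.array([[100, 120], [140, 160]])
radiance = dn_to_radiance(dn, gain=0.5, bias=1.0)
reflectance = radiance_to_reflectance(radiance, esun=1000.0, sun_elevation_deg=60.0)
print(reflectance)
```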

GDAL Command Line Utilities
• gdalinfo - Report information about a file.
• gdal_translate - Copy a raster file, with control of output format.
• gdaladdo - Add overviews (pyramids) to a file.
• gdalwarp - Warp an image into a new coordinate system.
• gdal_contour - Contours from DEM.
• gdaldem - Tools to analyze and visualize DEMs.
• rgb2pct.py - Convert a 24-bit RGB image to 8-bit paletted.
• pct2rgb.py - Convert an 8-bit paletted image to 24-bit RGB.
• gdal_merge.py - Build a quick mosaic from a set of images.
• gdal_rasterize - Rasterize vectors into a raster file.
• gdaltransform - Transform coordinates.
• nearblack - Convert nearly black/white borders to exact value.
• gdal_grid - Create a raster from scattered data.
• gdal_polygonize.py - Generate polygons from raster.
• gdal_sieve.py - Raster sieve filter.
• gdal_fillnodata.py - Interpolate in nodata regions.
• gdal-config - Get options required to build software using GDAL.
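These utilities can also be driven from a Python script. The sketch below builds a gdalwarp command line and provides a helper that runs a command only if the executable is found on the PATH. The helper names, file names, and the EPSG:4326 target are illustrative assumptions.

```python
import shutil
import subprocess

def build_warp_command(src_path, dst_path, target_srs="EPSG:4326"):
    """Build the argument list for reprojecting a raster with gdalwarp."""
    return ["gdalwarp", "-t_srs", target_srs, src_path, dst_path]

def run_if_available(command):
    """Run the command if its executable is on PATH; return the exit code, else None."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=False).returncode

# Hypothetical file names; the command is only printed here, not executed.
cmd = build_warp_command("input.tif", "output_wgs84.tif")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.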
