Using Python to Retrieve and Visualize Data (Part 2 of 2)
Jeff Horsburgh
Hydroinformatics, Fall 2017
This work was funded by National Science Foundation Grants EPS 1135482 and EPS 1208732

Quick Review
• Last class we covered
  – How to create a time series plot that reads data directly from an ODM MySQL database
  – How to generalize the script to work for any SiteID, VariableID, StartDateTime, and EndDateTime
• Today’s Agenda:
  – Go over the “final” script from last class’s material
  – Show how to create a figure with multiple subplots
  – Show how to use functions to organize your code
  – Briefly introduce the Pandas package for data analysis in Python
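As a reminder of what "generalizing the script" can look like, here is a minimal sketch, not the class script itself. It assumes the ODM 1.1 table and column names (DataValues, LocalDateTime, DataValue, SiteID, VariableID); the function names build_query and fetch_time_series and the connection settings in example_usage are hypothetical placeholders.

```python
def build_query(site_id, variable_id, start_date_time, end_date_time):
    """Return a parameterized SQL string and its parameters.

    Table and column names are assumed to follow the ODM 1.1 schema.
    """
    sql = (
        "SELECT LocalDateTime, DataValue "
        "FROM DataValues "
        "WHERE SiteID = %s AND VariableID = %s "
        "AND LocalDateTime >= %s AND LocalDateTime <= %s "
        "ORDER BY LocalDateTime"
    )
    params = (site_id, variable_id, start_date_time, end_date_time)
    return sql, params


def fetch_time_series(connection, site_id, variable_id,
                      start_date_time, end_date_time):
    """Run the query on an open DB-API connection and return two lists."""
    sql, params = build_query(site_id, variable_id,
                              start_date_time, end_date_time)
    with connection.cursor() as cursor:
        # The driver substitutes the %s placeholders safely
        cursor.execute(sql, params)
        rows = cursor.fetchall()
    local_date_times = [row[0] for row in rows]
    data_values = [row[1] for row in rows]
    return local_date_times, data_values


def example_usage():
    """Hypothetical usage; host, user, password, and IDs are placeholders."""
    import pymysql  # imported here so the sketch loads without a database driver

    connection = pymysql.connect(host='localhost', user='your_user',
                                 password='your_password',
                                 database='loganriverodm')
    try:
        return fetch_time_series(connection, 1, 36,
                                 '2014-01-01', '2014-12-31')
    finally:
        connection.close()
```

Passing the four values as query parameters, rather than pasting them into the SQL string, means the same function works for any site, variable, and date range.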

Solution to Challenge Problems from Last Class
• Open Class15_Example_Final.py for a solution to the challenge problems presented at the end of last class’s lecture
• Let’s walk through the code

One Figure, Multiple Subplots
• It is often valuable to include multiple subplots on a single figure.
• You can do this by 1) creating a figure object, and 2) using the add_subplot method on the figure object, as shown below.

    import matplotlib.pyplot as plt

    # Create the plot figure that we will add 2 panels to
    fig = plt.figure()
    ax = fig.add_subplot(2, 1, 1)

  – First number is the number of rows
  – Second number is the number of columns
  – Third number is the specific plot assigned to the axes object ‘ax’

    # Then plot the first subplot
    ax.plot(localDateTimes, dataValues, color='grey', linestyle='solid', markersize=0)
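The following is a minimal, self-contained sketch of the two-panel pattern described above. The class database is not available in a standalone example, so the two temperature series are synthetic; the helper make_synthetic_series, the site labels, and the output file name are all hypothetical.

```python
import datetime as dt
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt


def make_synthetic_series(offset, n_hours=48):
    """Return (datetimes, values) for a made-up hourly temperature record."""
    start = dt.datetime(2017, 7, 1)
    times = [start + dt.timedelta(hours=h) for h in range(n_hours)]
    values = [offset + 5.0 * math.sin(2 * math.pi * h / 24.0)
              for h in range(n_hours)]
    return times, values


# Create the figure that will hold both panels
fig = plt.figure(figsize=(8, 6))

# 2 rows, 1 column, panel 1 (top)
ax1 = fig.add_subplot(2, 1, 1)
times_a, temps_a = make_synthetic_series(offset=12.0)
ax1.plot(times_a, temps_a, color='grey', linestyle='solid')
ax1.set_ylabel('Temperature (deg C)')
ax1.set_title('Site A (synthetic)')

# 2 rows, 1 column, panel 2 (bottom), sharing the x-axis with panel 1
ax2 = fig.add_subplot(2, 1, 2, sharex=ax1)
times_b, temps_b = make_synthetic_series(offset=18.0)
ax2.plot(times_b, temps_b, color='grey', linestyle='solid')
ax2.set_ylabel('Temperature (deg C)')
ax2.set_title('Site B (synthetic)')

fig.autofmt_xdate()   # rotate the date labels so they do not overlap
fig.tight_layout()    # keep titles and labels from colliding
fig.savefig('two_subplots.png')
```

Sharing the x-axis is optional, but it keeps the two sites aligned in time, which makes visual comparison easier.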

Challenge Problem
• Extend Class15_Example_Final.py to have at least two subplots (temperature at 2 sites)
• Take ~5 minutes to think about this, and then I will present a solution.

Solution
• See Class16_Example_Subplots.py in Canvas

Using Functions to Reduce Repetitive Code
• Syntax for a function in Python:

    def my_function(arg1, arg2):
        x = arg1 + arg2
        return x

• This function takes two arguments, adds them, then returns the sum.

Example of Creating Subplots using Functions
• See Class16_Example_Functions.py for an example of how to use functions to solve the challenge problem.
• We will walk through this code in class.
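To illustrate the idea without the class file at hand, here is one possible sketch (not the contents of Class16_Example_Functions.py). It moves the repeated "add a panel and plot on it" steps into a single function. The function name add_time_series_panel, the site labels, and the data are hypothetical and synthetic.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt


def add_time_series_panel(fig, position, times, values, title,
                          n_rows=2, n_cols=1):
    """Add one time series panel to fig and return its axes object."""
    ax = fig.add_subplot(n_rows, n_cols, position)
    ax.plot(times, values, color='grey', linestyle='solid')
    ax.set_title(title)
    ax.set_ylabel('Temperature (deg C)')
    return ax


# Synthetic hourly data for two made-up sites
start = dt.datetime(2017, 7, 1)
times = [start + dt.timedelta(hours=h) for h in range(24)]
site_data = {
    'Site A (synthetic)': [10.0 + 0.25 * h for h in range(24)],
    'Site B (synthetic)': [15.0 - 0.10 * h for h in range(24)],
}

fig = plt.figure(figsize=(8, 6))
axes = []
# One function call per site replaces a copied-and-pasted block per site
for position, (title, values) in enumerate(site_data.items(), start=1):
    axes.append(add_time_series_panel(fig, position, times, values, title))

fig.tight_layout()
```

Adding a third site then means adding one dictionary entry and changing n_rows, rather than duplicating the plotting code again.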

Introduction to the Pandas Package
• Includes classes designed specifically for data analysis and visualization, including
  – DataFrame
  – Series
• See Class16_DemoPandas.py for an example of how to plot a time series using the Pandas library.
• I found this blog post to be a very helpful introduction to some of the time series functionality in Pandas.
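For a feel of what these classes offer, here is a short sketch (not the contents of Class16_DemoPandas.py) that builds a DataFrame with a datetime index, plots each column in its own subplot, and computes daily means. The data are synthetic and the column names are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import numpy as np
import pandas as pd

# Two days of synthetic 30-minute temperature values for two made-up sites
index = pd.date_range('2017-07-01', periods=96, freq='30min')
daily_cycle = 5.0 * np.sin(2 * np.pi * np.arange(96) / 48.0)
df = pd.DataFrame(
    {
        'Site A (synthetic)': 12.0 + daily_cycle,
        'Site B (synthetic)': 18.0 + daily_cycle,
    },
    index=index,
)

# One call creates a figure with one subplot per column and date-aware tick labels
axes = df.plot(subplots=True, figsize=(8, 6))
for ax in axes:
    ax.set_ylabel('Temperature (deg C)')

# Time series functionality: aggregate the 30-minute values to daily means
daily_means = df.resample('D').mean()
print(daily_means)
```

The datetime index is what enables both the date formatting on the plot and the one-line resampling.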

Resulting Image from Example Pandas Script
Take home message: Creating a similar figure using only matplotlib is possible, but would take many more lines of code.

Assignment 5
• Now you get to try this on your own.
• Build from the examples provided in class (and/or others you find online) to create your own ‘publication ready figure’ from the data stored in the loganriverodm database.
• Exactly what plot you create is up to you.
• Details for the assignment are provided on Canvas.

Summary
• I hope I convinced you that
  – Reproducibility matters and it should be a goal of your data analysis and visualization steps
  – Using PyMySQL and matplotlib, it is possible to automate data visualizations with a script
  – There are many ways to organize your code and functions
    • Messy code is quick to write, but it may come back to haunt you if it is part of a larger project
  – The Pandas library provides classes that make it simpler to handle data analysis and visualization processes
• I hope that you now have some basic concepts that you can build from using online documentation and examples
  – Like all things, practice makes perfect.

Resources
• matplotlib gallery of examples
  – http://matplotlib.org/gallery.html
• Pandas compared to SQL
  – http://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html
• Pandas compared to R
  – http://pandas.pydata.org/pandas-docs/stable/comparison_with_r.html