Bayesian Hierarchical Modeling of Hydroclimate Problems Balaji Rajagopalan

Bayesian Hierarchical Modeling of Hydroclimate Problems Balaji Rajagopalan Department of Civil, Environmental and Architectural Engineering And Cooperative Institute for Research in Environmental Sciences (CIRES) University of Colorado Boulder, CO, USA Bayes by the Bay Conference, Pondicherry January 7, 2013

Co-authors & Collaborators n n n Upmanu Lall and Naresh Devineni – Columbia University, NY Hyun-Han Kwon, Chonbuk National University, South Korea Carlos Lima, Universidade de Brasila, Brazil Pablo Mendoza James Mc. Creight & Will Kleiber – University of Colorado, Boulder, CO Richard Katz – NCAR, Boulder, CO NSF, NOAA, USBReclamation and Korean Science Foundation

Outline n n n Bayesian Hierarchical Modeling n Introduction from GLM Hydroclimate Applications n BHM n Contrast with near Bayesian models currently in vogue Stochastic Rainfall Generator n BHM (Lima and Lall, 2009, WRR) n Latent Gaussian Process Model (Kleiber et al. , 2012, WRR) Riverflow Forecasting (Kwon et al. , 2009, Hydrologic Sciences) n Seasonal Flow n Flow extremes Paleo Reconstruction of Climate (Devineni and Lall, 2012, J. Climate)

Linear Regression Models Suppose the model relating the regressors to the response is In matrix notation this model can be written as

Linear Regression Models where

Linear Regression Models We wish to find the vector of least squares estimators that minimizes: The resulting least squares estimate is

12 -1 Multiple Linear Regression Models 12 -1. 4 Properties of the Least Squares Estimators Unbiased estimators: Covariance Matrix:

12 -1 Multiple Linear Regression Models 12 -1. 4 Properties of the Least Squares Estimators Individual variances and covariances: In general,

Generalized Linear Model (GLM) Bayesian Perspective • Linear Regression is not appropriate • when the dependent variable y is not Normal • Transformations of y to Normal are not possible • Several situations (rainfall occurrence; number of wet/dry days; etc. ) • Hence, GLM • Linear model is fitted to a ‘suitably’ transformed variable of y • Linear model is fitted to the ‘parameters’ of the assumed distribution of y Likelihood

Generalized Linear Model (GLM) Bayesian Perspective Exponential family PDF, parameters All distributions Arise from this Normal, Exponential, Gamma Binomial, Poisson, etc • Noninformative prior on β • Assuming Normal distribution for Y, g (. ) is identity Linear Regression

Generalized Linear Model (GLM) Bayesian Perspective • Log and logit – Canonical Link Functions

Generalized Linear Model (GLM) Bayesian Perspective

Generalized Linear Model (GLM) Bayesian Perspective Inverse Chi-Square

Generalized Linear Model (GLM) Bayesian Perspective

Summary • GLM is hierarchical • Specific Distribution • Link function • With a simple step – i. e. , Providing priors and computing likelihood/posterior BHM • Assuming Normal distribution of dependent variable and uninformative priors • BHM collapses to a standard Linear Regression Model • Thus BHM is a generalized framework • Uncertainty in the model parameters and model Structure automatically obtained.

Generalized Linear Model (GLM) Example - Bayesian Hierarchical Model • Hard to sample from posterior - Use MCMC

Stochastic Weather Generators Precipitation Occurrence, Rain Onset Day (Lima and Lall, 2009) Precipitation Occurrence and Amounts (Kleiber, 2012)

• Users most interested in sectoral/process outcomes (streamflows, crop yields, risk of disease X, etc. ) • Need for a robust spatial weather generator Historical Synthetic series – Conditional on Climate Information Data 28. 5 23. 1 29. 1 25. 8 … … … 12. 4 10. 2 11. 4 9. 7 … Process model Frequency distribution of outcomes

Need for Downscaling n n Seasonal climate forecasts and future climate model projections often have coarse scales: n Spatial: regional n Temporal: seasonal, monthly Process models (hydrologic models, ecological models, crop growth models) often require daily weather data for a given location n There is a scale mismatch! n Stochastic Weather Generators can help bridge this scale gap.

Precipitation Occurrence n n 504 stations in Brazil (Latitude & Longitude shown in figure) Lima and Lall (WRR, 2009) Modeling of rainfall occurrence (0 = dry, 1 = rain, P = 0. 254 mm threshold) using a probabilistic model (logistic regression):

Modeling Occurrence at a Site where yst(n) is a non-homegeneous Bernoulli random variable for station s, day n and year t, being either 1 for a wet state or 0 for a dry state. • pst(n) is the rainfall probability for station s and day n of year t. The seasonal cycle is modeled through Fourier harmonics:

Results from Site #3 Outlier?

Bayesian Hierarchical Model (BHM) But rainfall occurrence is correlated in space – how to model? - partial BHM • Shrinks paramters towards a common mean, reduce uncertainty since we are use more information to estimate model parameters; • Parameter uncertainties are fully accounted during simulations

Bayesian Hierarchical Model (BHM) Likelihood Function Posterior Distribution – Bayes theorem MCMC to obtain posterior distribution

Results for Station #3 – Yearly Probability of Rainfall

Results Station #3 - Average Probability of Rainfall

Clusters on average day of max probability Day of Max Probability of Rainfall • Max Probability of rainfall correlated With climate variables – ENSO, etc. • Characterize rainfall ‘onset’ • Prediction of ‘onset’ • Lima and Lall (2009, WRR) Max Probability of Rainfall

Space-time Precipitation Generator Latent Gaussian Process (Kleiber et al. , 201, WRR)

Latent Gaussian Process n Fit a GLM for Precipitation Occurrence and amounts at each location independently n n n Occurrence logistic regression-based Amounts Gamma link function Spatial Process to smooth the GLM coefficients in space Almost Bayesian Hierarchical Modeling Alpha, gamma – shape and scale parameter of Gamma

Occurrence Model Latent Gaussian Process

Latent Gaussian Process n Parameter Estimation MLE, two step

GLM + Latent Gaussian Process Kleiber et al. (2012)

For Max and Min Temperature Models Conditioned on Precipitation Model - Using Latent Gaussian Process Kleiber et al. (2013, Annals of App. Statistics, in press)

Outline n n n Bayesian Hierarchical Modeling n Introduction from GLM Hydroclimate Applications n BHM n Contrast with near Bayesian models currently in vogue Stochastic Rainfall Generator n BHM (Lima and Lall, 2009, WRR) n Latent Gaussian Process Model (Kleiber et al. , 2012, WRR) Riverflow Forecasting (Kwon et al. , 2009, Hydrologic Sciences) n Seasonal Flow n Flow extremes Paleo Reconstruction of Climate (Devineni and Lall, 2012, J. Climate)

Seasonal average and maximum Streamflow Forecasting (Kwon et al. , 2009, Hydrologic Sciences)

Streamflow Forecasting at Three Gorges Dam Identify Predictors • Correlate seasonal streamflow with large scale climate variables from preceding seaons • JJA flow with MAM climate • Select regions of strong (Grantz et al. , 2005) correlation • predictors

Streamflow Forecasting at Three Gorges Dam

BHM for Seasonoal Streamflow n Model Data showed mild nonlinearity Quadratic terms in the model is distributed as half-Cauchy with parameter 25 “mildly informative” Gelman (2006, Bayesian Analysis) MCMC is used to obtain the posterior distributions

Streamflow Forecasting at Three Gorges Dam Predictors 2, 3, 4 and 5 Show tighter Bounds Uncertainty in predictors (i. e. model) is obtained and propogated in the forecacsts You can use PCA or stepwis etc. to reduce the number of predictors (this can be crude)

Streamflow Forecasting at Three Gorges Dam

Maximum Seasonal Streamflow Extreme Value Analysis – Floods (Kwon et al. , 2010, Hydrologic Sciences)

American River at Fair Oaks - Ann. Max. Flood 100 yr flood estimated from 21 & 51 yr moving windows

Floods n n The time varying (nonstationary) nature of hydrologic (flood) frequency (few examples) n Climate Variability and Climate Change n Climate Mechanisms that lead to changes in flood statistics Adaptation Strategy n ‘Adaptive’ Flood Risk Estimation n Nonstationary Flood Frequency Estimation n Seasonal to Inter-annual Forecasts & Climate Change n Improved Infrastructure Management n Summary / Climate Questions and Issues related to Hydrologic Extremes

Flood Variance given DJF NINO 3 and PDO Flood mean given DJF NINO 3 and PDO NINO 3 PDO Derived using weighted local regression with 30 neighbors Correlations: Log(Q) vs DJF NINO 3 -0. 34 Jain & Lall, 2000 vs DJF PDO -0. 32

Atmospheric River generates flooding CZD Russian River, CA Flood Event of 18 -Feb-04 Slide from Paul Neiman’s talk Russian River flooding in Monte Rio, California 18 February 2004 IWV (cm) GPS IWV data from near CZD: 14 -20 Feb 2004 Atmospheric river Cloverdale 10” rain at CZD in ~48 hours Russian River, CA Flood Event IWV (inches) IWV (cm) Bodega Bay photo courtesy of David Kingsmill

Flood Estimation Under Nonstationarity n Significant interannual/interdecadal variability of floods n n n Stationarity assumptions (i. i. d) are invalid Large scale climate features in the Ocean. Atmosphere-Land system orchestrate floods at all time scales Need tools that can capture the nonstationarity n n Incorporate large scale climate information Year-to-Year time scale (Climate Variability) n n Flood mitigation planning, reservoir operations Interdecadal time scale (Climate Variability and Change) n Facility design, planning and management

Exponential (light, shape = 0), Pareto (heavy, shape > 0) and Beta (bounded, shape < 0)

Generalized extreme value (GEV) can be used to characterize extreme flow distribution (Katz et al. , 2002) 3 Model parameters Location parameter: Scale parameter: Shape parameter: (where distribution is centered) (spread of the distribution) (behavior of distribution tail) Gumbell, Frischet, Weibull “Unconditional” GEV (Coles 2001)

Incorporate covariates into GEV parameters to account for nonstationarity Could apply to any parameter, but location is most intuitive: GLM Framework Hierarchical Bayesian Modeling natural and attractive alternative

GEV fit using ext. Remes toolkit in R (Gilleland Katz, 2011) http: //www. isse. ucar. edu/extremevalues/extreme. html (Gilleland Katz 2005)

Streamflow Forecasting at Three Gorges Dam

BHM for Seasonal Maximum Flow n Model Data showed mild nonlinearity Quadratic terms in the model is distributed as half-Cauchy with parameter 25 “mildly informative” Gelman (2006, Bayesian Analysis) MCMC is used to obtain the posterior distributions

Streamflow Forecasting at Three Gorges Dam Predictors 3 and 5 Show tighter Bounds

Streamflow Forecasting at Three Gorges Dam

Nonstationary Flood Risk at Three Gorges Dam Dynamic 50 -yea flood from BHM and Stationary 50 -year flood

Conditional (nonstationary) Extremes in Water Quality (Towler et al. , 2009, WRR)

Case study location: PWB Towler et al. (2009) “Forest to Faucet” - Rain -Runoff -Storage (2 reservoirs) -Chemical Disinfection (Cl 2, NH 3) -Distribution -No physical filtration (“unfiltered”)

Case study location: PWB Precipitation events High Flows Exceedances (SWTR criterion: turbidity < 5 NTU) Back-up groundwater source (Pumping $$)

GEV Model Uncond Cond. T Cond. RT Cond. R+T β 0+β 1 T β 0+β 1 R β 0+β 1(RT) β 0+β 1 R+β 2 T β 0 (se) 1924 (120) 1930 (1000) 1739 (410) 611. 4 (150) 1911 (880) β 1 (se) - -0. 8914 (27) 61. 08 (32) 3. 716 (0. 36) 141. 2 (14) β 2 (se) - - -36. 45 (24) σ (se) 1245 (84) 1220 (81) 1246 (160) 923. 7 (69) 968. 5 (74) ξ (se) -0. 02246 (0. 065) -0. 01286 (0. 065) -0. 06180 (0. 084) 0. 07009 (0. 082) 0. 01619 (0. 075) llh -1289 -1274 -1250 K 1 2 2 2 3 AIC 2580 2582 2552 2504 2506 M 0* - Uncond Cond. R D - 0 30 78 48 Sig** - No (0. 635) Yes (0. 000) ρ*** - - 0. 5516 0. 5989 0. 5918 Variable * Nested model to which model is compared in likelihood ratio test ** Significance is tested at α=0. 05 level, and ( ) indicates p-value. *** Correlation between the cross-validated z 90 estimates and the observed maximum values

0 Maximum Streamflow (cfs) 2000 4000 6000 8000 Conditional quantiles correspond well to observed record 1970 1980 1990 Year 2000 Uses concurrent climate, but could also be used with seasonal forecast

GEV distribution can be compared for specific historic times

P and T climate change projections from IPCC AR 4 are readily available 12 km 2 resolution (1/8 of a grid cell) http: //gdo-dcp. ucllnl. org/downscaled_cmip 3_projections/#Welcome Bias correct P & T to historic data for PWB watershed area

Results indicate increasing maximum streamflow anomalies Observed 16 GCM models Maximum Streamflow Anomaly (%) -25 0 25 50 75 GCM model average 1950 2000 Year 2050 2100

Streamflow quantiles shift higher under CC projections Observed 16 GCM models

Conditional P(E) Likelihood of Turbidity Spike Probability of a turbidity spike given a certain maximum flow Maximum Flow (CFS) (Ang and Tang 2007)

Likelihood of a turbidity spike increases under CC projections Observed 16 GCM models

Likelihood of a turbidity spike increases 1950 -2007 2070 -2099 95 th (top whisker) 13 28 75 th (box top) 6. 3 11 50 th (box middle) 4. 2 5. 9 P(E) Percentile

Small shifts in risk can result in high expected loss Expected loss can be high, especially for the risk averse

Summary • Bayesian Hierarchical Modeling • Powerful tool for all functional (regression) estimation problems (which is most of forecasting/simulation) • Provides model and parameter uncertainties • Obviates the need for discarding covariates • Enables incorporation of expert opinions • Enables modeling a rich variety of variable types • Continuous, skewed, bounded, categorical, discrete etc. • And distributions (Binomial, Poisson, Gammma, GEV) • Generalized Framework • Traditional linear models are a subset

Paleo Hydrology Reconstruction Devineni and Lall, 2012, J. Climate accepted

Motivation Paleo Hydrology Colorado River Example UC CRSS stream gauges LC CRSS stream gauges Colorado River Demand - Supply

Streamflow and Tree Ring Data

Streamflow and Tree Ring Data Average Summer (JJA) Flows as Predictand Annual Tree Ring Growth Index (Chronology) as Predictor – 246 years common data 1754 1999 1903 1937 1950 Summer Flow = f(tree rings) + error 246 years chronology (Xt) (8 tree ring chronologies) 1999 variable length streamflow record (Yt) (5 sites)

Preliminary Data Analysis – Bayesian Hypothesis (correlation – tree chronology Vs average summer seasonal flow) Station-tree correlations similar! - pooling? Hypothesis No Shrinkage of Regression Coefficients (no pooling) traditional regression (a) Shrinkage of Regression Coefficients across sites (partial pooling) hierarchical model

Bayesian Hierarchical Models Streamflow Log Normal Distribution Regression Coefficients (β) of the hierarchical model - multivariate normal distribution Partial Pooling – Hierarchical Model Shrinkage on the coefficients to incorporate the predictive ability of each tree chronology on multiple stations Key ideas: 1. Streamflow at each site comes from a pdf 2. Parameters of each pdf informed by each tree 3. Common multivariate distribution of parameters across trees 4. Noniformative prior for parameters of multivariate distribution 5. MCMC for parameter estimation

Delaware River Reconstruction and Performance Models Developed • Hierarchical Bayesian Regression (Partial Pooling) • Linear Regression (No Pooling) Model Simulations • Win. BUGS : Bayesian Inference Using Gibbs Sampler • 7500 simulations with 3 chains and convergence tests. Cross Validated Performance Metrics • Reduction of Error (RE), Coefficient of Efficiency (CE)

Delaware River Reconstruction and Performance Posterior PDF (Model Level 1)

Delaware River Reconstruction and Performance Regression Coefficients Model Level 2 No Pooling Partial Pooling

Delaware River Reconstruction Cross-Validated Performance Canonsville Pepacton

Paleo Hydrology Reconstruction Traditional Methods Linear/Nonlinear Regression PCA of Tree Rings Regression on leading PCs

Slide 88 of 49 Objective 1: Tree-ring Reconstructions LCBR Naturalize streamflow 9 nodes in CRSS 5 are well correlated with precipitation (>0. 5) Referred to as “good nodes” (blue) 4 are not correlated (<0. 1) Referred to as “noise nodes” (yellow)

Slide 89 of 49 Tree-Ring Reconstruction Approaches Multiple Linear Regression Individual chronologies are added in a stepwise fashion Principle Component Linear Regression Eliminates multicollinearity Parsimonious model since the majority of the variance is represented in fewer variables. K-nearest neighbor nonparametric approach No assumption of distribution Captures nonlinearities Removes undue influence of outliers

Slide 90 of 49 New Approach Cluster analysis on the tree-ring chronologies to find distinct, coherent climate signals. K-means clustering approach Increases the amount of climate signal that can be extracted Perform PCA on each cluster, provide the leading PCs from each cluster as potential predictors Signal that may have been washed out during PCA on the entire pool of predictors is preserved

Slide 91 of 49

Slide 92 of 49 Regression Methods Present two regression methods to add to the treering reconstruction repertoire Local Polynomial regression. Extreme Value Analysis (EVA)

Slide 93 of 49 Method 1: Local Polynomial Regression Find the K-nearest neighbors, fit a polynomial to the neighborhood Polynomials are fitted in the GLM framework, where Y can be of any distribution in the exponential family (normal, gamma, binomial, etc) G(E(Y))=f(Y)+ G(. ) = link function, X = set of predictors/independent variables E(Y) is the expected value of the response/dependent variable is the error, assumed to be normally distributed Improvement over K-NN resampling Values beyond those found in the historical record can be generated

Slide 94 of 49

Slide 95 of 49

Summary • Bayesian Hierarchical Modeling • Powerful tool for all functional (regression) estimation problems (which is most of forecasting/simulation) • Provides model and parameter uncertainties • Obviates the need for discarding covariates • Enables incorporation of expert opinions • Enables modeling a rich variety of variable types • Continuous, skewed, bounded, categorical, discrete etc. • And distributions (Binomial, Poisson, Gammma, GEV) • Generalized Framework • Traditional linear models are a subset