
Very Basic Geostatistics

Marginal and Conditional Probability Distributions

A two-dimensional random vector: each of its two elements, $x_1$ and $x_2$, is a random variable.

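Two random variables with a multi-Gaussian distribution
[Figure: scatter plot of dots, each dot a joint sample of $x_1$ (horizontal axis) and $x_2$ (vertical axis).]

Count the dots: the number of dots falling in a small region of the plot is proportional to the probability density there.

Two-dimensional probability density function
[Figure: probability density surface plotted over the $x_1$, $x_2$ plane; the vertical axis is probability.]

Counting the dots in each interval of $x_1$, whatever their value of $x_2$, leads to the marginal distribution of $x_1$.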

Marginal probability density function of $x_1$
[Figure: probability (vertical axis) against $x_1$ (horizontal axis), marked with the mean of $x_1$ and the standard deviation of $x_1$.]
• Area under the pdf = 1

Marginal probability density function of $x_2$
[Figure: probability against $x_2$, obtained by counting the dots along the $x_2$ axis, marked with the mean of $x_2$ and the standard deviation of $x_2$.]
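The "count the dots" idea can be tried numerically. The following is a minimal sketch, assuming NumPy is available; the mean vector, covariance matrix and number of dots are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
mean = np.array([1.0, 2.0])              # means of x1 and x2
cov = np.array([[1.0, 0.6],
                [0.6, 0.5]])             # variance-covariance matrix

rng = np.random.default_rng(seed=0)
# Each row is one "dot": a joint sample of (x1, x2).
dots = rng.multivariate_normal(mean, cov, size=200_000)

# Counting the dots in bins along x1 (ignoring x2) gives an
# empirical marginal pdf of x1.
density, edges = np.histogram(dots[:, 0], bins=60, density=True)
area = np.sum(density * np.diff(edges))  # area under the empirical pdf

x1_mean = dots[:, 0].mean()              # estimate of the mean of x1
x1_std = dots[:, 0].std()                # estimate of the standard deviation of x1
```

With these assumed inputs, `area` is 1 up to round-off, and `x1_mean` and `x1_std` should both come out close to 1.0.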

Two random variables with a multi-Gaussian distribution
[Figure: scatter plot of $x_1$ against $x_2$ with a horizontal line drawn at $x_2 = a$.]

Count the dots along the line $x_2 = a$: this gives the probability density function of $x_1$ conditional on $x_2 = a$.
[Figure: conditional pdf of $x_1$ drawn along the line $x_2 = a$, marked with the conditional mean of $x_1$ and the conditional standard deviation of $x_1$.]

The same construction is repeated for $x_2 = b$ and $x_2 = c$, each giving a probability density function of $x_1$ conditional on that value, marked with the conditional mean and the conditional standard deviation of $x_1$.
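For two multi-Gaussian random variables the conditional mean and conditional standard deviation have a well-known closed form. The symbols below are assumptions introduced here for illustration: $\mu_1$, $\mu_2$ for the means, $\sigma_1$, $\sigma_2$ for the standard deviations, and $\sigma_{12}$ for the covariance between $x_1$ and $x_2$.

```latex
\begin{aligned}
E[x_1 \mid x_2 = a] &= \mu_1 + \frac{\sigma_{12}}{\sigma_2^2}\,(a - \mu_2) \\
\operatorname{sd}[x_1 \mid x_2 = a] &= \sqrt{\sigma_1^2 - \frac{\sigma_{12}^2}{\sigma_2^2}}
\end{aligned}
```

Under these formulas the conditional mean shifts linearly with the conditioning value, while the conditional standard deviation is the same for $a$, $b$ and $c$ and is never larger than $\sigma_1$.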

To fully characterize a multi-Gaussian random vector we need to characterize the randomness of each of its elements. We also need to characterize their statistical interrelatedness.

To characterize $x_1$ we need:
• the mean of $x_1$
• the standard deviation of $x_1$ or, equivalently, the variance of $x_1$, $\sigma_1^2$

To characterize $x_2$ we need:
• the mean of $x_2$
• the standard deviation of $x_2$ or, equivalently, the variance of $x_2$, $\sigma_2^2$

[Figure: scatter plot of $x_1$ against $x_2$.]

Vector of means and 2D variance-covariance matrix
• The diagonal elements of the variance-covariance matrix are the variances of $x_1$ and $x_2$.
• The off-diagonal elements express the ability of $x_1$ to condition $x_2$, and the ability of $x_2$ to condition $x_1$.
• The off-diagonal elements depend on the elongation and on the angle of the pdf.

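Variance-covariance matrix

$$C(x) = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{bmatrix}$$

Here $\sigma_{12} = \sigma_{21}$ denotes the covariance between $x_1$ and $x_2$.

Properties of the variance-covariance matrix:
• Symmetrical
• Positive definite: $x^t C(x)\, x > 0$ for any $x \neq 0$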
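The two properties listed above can be checked numerically. This is a minimal sketch, assuming NumPy; the function name `is_valid_covariance` and the two example matrices are hypothetical and serve only as illustration.

```python
import numpy as np

def is_valid_covariance(C, tol=1e-10):
    """Return True if C is square, symmetric and positive definite."""
    C = np.asarray(C, dtype=float)
    if C.ndim != 2 or C.shape[0] != C.shape[1]:
        return False
    if not np.allclose(C, C.T, atol=tol):
        return False
    # A symmetric matrix is positive definite exactly when
    # all of its eigenvalues are greater than zero.
    return bool(np.all(np.linalg.eigvalsh(C) > tol))

good = [[1.0, 0.6],
        [0.6, 0.5]]
bad = [[1.0, 0.9],
       [0.9, 0.5]]   # the implied correlation exceeds 1
```

Here `is_valid_covariance(good)` is true, while `bad` fails because its off-diagonal term is too large for the two variances.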

“Regionalized Random Variables”

Domain of a Groundwater Model
[Figure: model domain with property values $k_i$, $k_j$ and $k_m$ labelled at individual locations.]
Many random variables $k_i$, i.e. a high-dimensional random vector $k$.

m-dimensional random vector $k$
• m-dimensional mean vector
• $m \times m$ covariance matrix
• The diagonal elements of the covariance matrix are variances (variance = square of standard deviation).
• The off-diagonal elements indicate spatial correlation between elements of $k$.

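Domain of a Groundwater Model
[Figure: model domain with the correlation length marked.]
Assume (for convenience) that the spatial covariance between $k_i$ and $k_j$ is a function only of the distance between them (and possibly the direction between them). That is: $C_{ij} = C(h_{ij})$, where $h_{ij}$ is the separation between the locations of $k_i$ and $k_j$.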

With this assumption, filling the $m \times m$ covariance matrix of the m-dimensional random vector becomes easy: each element follows from the separation between two points.
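Filling the matrix from a covariance function can be sketched as follows. This is a minimal illustration, assuming NumPy and an isotropic exponential covariance function; the function names, the parameter `a` and the three point coordinates are hypothetical.

```python
import numpy as np

def exponential_covariance(h, sill=1.0, a=1.0):
    """Assumed covariance function: C(h) = sill * exp(-h / a)."""
    return sill * np.exp(-np.asarray(h, dtype=float) / a)

def build_covariance_matrix(coords, cov_func):
    """Fill C so that C[i, j] = cov_func(distance between points i and j)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise coordinate differences
    h = np.sqrt((diff ** 2).sum(axis=-1))            # pairwise separation distances
    return cov_func(h)

# Three hypothetical point locations (x, y).
coords = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
C = build_covariance_matrix(coords, exponential_covariance)
```

The diagonal of `C` holds C(0), the variance, and the matrix is symmetric because the distance from i to j equals the distance from j to i.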

Fundamental Property of a Covariance Matrix
A covariance matrix must be positive definite: $k^t C(k)\, k > 0$ for any non-zero $k$. This ensures that all probabilities are positive and finite.

Once we have a covariance matrix we can do some “statistical things”…
1. Generate (conditional) random realizations

[Figure: a realization of $k$.]

[Figure: another realization of $k$.]
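One common way to draw realizations from a covariance matrix is through its Cholesky factor. The sketch below assumes NumPy, a one-dimensional row of 50 points, a zero mean and an exponential covariance function; all of these are hypothetical choices for illustration, not the method used to produce the figures.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 50)            # hypothetical 1D point locations
h = np.abs(x[:, None] - x[None, :])       # separation distances h_ij
C = 1.0 * np.exp(-h / 2.0)                # C_ij = C(h_ij), assumed exponential model
mean = np.zeros(x.size)                   # assumed mean of k

# C = L L^T; this factorization requires C to be positive definite.
L = np.linalg.cholesky(C)
rng = np.random.default_rng(seed=1)

def realization():
    """Draw one realization k = mean + L z, where z is standard normal."""
    return mean + L @ rng.standard_normal(x.size)

k1 = realization()   # a realization of k
k2 = realization()   # another realization of k
```

Each call gives a different field, but all of them share the same mean and the same covariance matrix `C`.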

Conditioning one random variable by acquiring knowledge of another

[Figure: conditional realization of random vector $k$, with the direct measurements of $k_i$ marked.]

[Figure: another conditional realization of $k$, with the direct measurements of $k_i$ marked.]
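Conditional realizations can be sketched by sampling from the multi-Gaussian distribution of the unmeasured values given the measured ones. The example assumes NumPy, a zero mean and an exponential covariance function; the measurement locations and values are hypothetical.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 41)              # hypothetical 1D point locations
h = np.abs(x[:, None] - x[None, :])
C = np.exp(-h / 2.0)                        # assumed exponential covariance
mean = np.zeros(x.size)                     # assumed mean of k

obs_idx = np.array([4, 20, 36])             # hypothetical measurement locations
obs_val = np.array([1.2, -0.5, 0.8])        # hypothetical measured values of k_i
unk_idx = np.setdiff1d(np.arange(x.size), obs_idx)

C_oo = C[np.ix_(obs_idx, obs_idx)]          # measured vs measured
C_uo = C[np.ix_(unk_idx, obs_idx)]          # unmeasured vs measured
C_uu = C[np.ix_(unk_idx, unk_idx)]          # unmeasured vs unmeasured

W = np.linalg.solve(C_oo, C_uo.T).T         # equals C_uo C_oo^{-1}
cond_mean = mean[unk_idx] + W @ (obs_val - mean[obs_idx])
cond_cov = C_uu - W @ C_uo.T

# A small diagonal term guards the factorization against round-off.
L = np.linalg.cholesky(cond_cov + 1e-10 * np.eye(unk_idx.size))
rng = np.random.default_rng(seed=2)

def conditional_realization():
    """Draw one realization of k that reproduces the measured values."""
    k = np.empty(x.size)
    k[obs_idx] = obs_val
    k[unk_idx] = cond_mean + L @ rng.standard_normal(unk_idx.size)
    return k

k1 = conditional_realization()
k2 = conditional_realization()
```

The two realizations differ away from the measurement points but agree exactly at them.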

Once we have a covariance matrix we can do some “statistical things”…
2. Estimate values between measurement points
Referred to as: “Estimation”, “Interpolation”, “Kriging”

[Figure: a realization of a random variable.]

[Figure: another realization.]

[Figure: samples of $k_i$ at a few points.]

Estimation, i.e. interpolation (or kriging): calculate the conditional mean.

[Figure: “geostatistically-based interpolation” (i.e. kriging) between the sample points.]
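Calculating the conditional mean at a single target location can be sketched as simple kriging, which assumes a known mean. The covariance function, its parameters, the function names and the sample data below are all hypothetical; NumPy is assumed.

```python
import numpy as np

def exp_cov(h, sill=1.0, a=3.0):
    """Assumed covariance function: C(h) = sill * exp(-h / a)."""
    return sill * np.exp(-np.asarray(h, dtype=float) / a)

def simple_kriging(obs_xy, obs_val, target_xy, mean=0.0):
    """Return the conditional mean and error variance at target_xy."""
    obs_xy = np.asarray(obs_xy, dtype=float)
    obs_val = np.asarray(obs_val, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Covariances among the samples, and between samples and target.
    C_oo = exp_cov(np.linalg.norm(obs_xy[:, None] - obs_xy[None, :], axis=-1))
    c_to = exp_cov(np.linalg.norm(obs_xy - target_xy, axis=-1))
    weights = np.linalg.solve(C_oo, c_to)             # kriging weights
    estimate = mean + weights @ (obs_val - mean)      # conditional mean
    variance = exp_cov(0.0) - weights @ c_to          # error variance
    return estimate, variance

obs_xy = [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]]   # hypothetical sample locations
obs_val = [1.0, 2.0, 0.5]                       # hypothetical sampled values

est_at_obs, var_at_obs = simple_kriging(obs_xy, obs_val, [4.0, 0.0], mean=1.0)
est_far, var_far = simple_kriging(obs_xy, obs_val, [1000.0, 1000.0], mean=1.0)
```

At a sample location the estimate reproduces the sampled value with zero error variance; far from all samples it reverts to the assumed mean, with error variance equal to the sill.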

Note the difference between “estimation” and “simulation”
[Figure: a single estimated field (“just one of these”) beside several realizations (“many of these”).]
Estimation gives just one field; simulation gives many. The estimated field is the average of all of the realizations, i.e. the:
• best linear unbiased estimator of $k$
• expected value of $k$
• minimum error variance estimate of $k$

In summary
What treating earth properties as (regionalized) random variables allows us to do:
• Generate realizations of these properties
• Condition these realizations by point measurements of these properties
• Interpolate between measurement points to obtain estimates of properties between them that are of minimized error variance
• Quantify the uncertainties of these estimates

Assumptions
• Spatial correlation is a function only of distance between points, and possibly direction (“stationarity”)
• $k$ is multi-Gaussian

Some variants of kriging

| Kriging type | What it does |
| --- | --- |
| ordinary | Best linear unbiased interpolator; does not require knowledge of area-wide mean value |
| simple | Used if we know the mean value everywhere |
| kriging with trend | Estimates a small number of trend parameters as it interpolates |
| cokriging | Estimates of one variable are conditioned by point measurements of that same variable, and by point or widespread measurements of another variable with which it is correlated |
| indicator kriging | Estimates expected values of a categorical regionalized variable based on point samples of this variable |

Stochastic Characterization of Regionalized Random Variables

Characterization of media properties – Option 1
• Mean value
• Covariance vs separation, $C(h)$
[Figure: $C(h)$ plotted against separation $h$.]

Characterization of media properties – Option 2
• [Mean value]
• Semivariogram, $\gamma(h) = \tfrac{1}{2} E\{[k(x+h) - k(x)]^2\}$
[Figure: $\gamma(h)$ plotted against separation $h$, with the nugget, sill and range marked.]

Relationship between semivariogram and spatial covariance function
$\gamma(h) = C(0) - C(h)$
[Figure: $\gamma(h)$ and $C(h)$ plotted against $h$ side by side. The $\gamma(h)$ plot is marked with the nugget, sill and range; the $C(h)$ plot is marked with $C(0)$ = sill, the nugget and the range.]
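The relationship follows in a few steps from the definition of the semivariogram. The derivation below is a sketch that assumes stationarity, so that $k(x)$ and $k(x+h)$ have the same mean, the same variance $C(0)$, and covariance $C(h)$; the operators $\operatorname{Var}$ and $\operatorname{Cov}$ are notation introduced here.

```latex
\begin{aligned}
\gamma(h) &= \tfrac{1}{2} E\{[k(x+h) - k(x)]^2\} \\
          &= \tfrac{1}{2}\bigl(\operatorname{Var}[k(x+h)] + \operatorname{Var}[k(x)]\bigr)
             - \operatorname{Cov}[k(x+h), k(x)] \\
          &= \tfrac{1}{2}\bigl(C(0) + C(0)\bigr) - C(h) \\
          &= C(0) - C(h)
\end{aligned}
```

The second line uses the fact that the difference $k(x+h) - k(x)$ has zero mean when the mean is the same at both points.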

Constructing a variogram
$\gamma(h) = \tfrac{1}{2} E\{[k(x+h) - k(x)]^2\}$
[Figure: map of sites where we have sampled $k$, and the resulting plot of $\gamma(h)$ against $h$.]
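In practice the expectation is replaced by an average over pairs of sampled sites grouped by separation. The sketch below uses only the Python standard library; the function name, the binning rule (nearest multiple of `lag_width`) and the three sample sites are hypothetical choices for illustration.

```python
import math
from collections import defaultdict

def empirical_semivariogram(coords, values, lag_width):
    """Average 0.5 * (k_i - k_j)^2 over all pairs, grouped by separation."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])       # separation of the pair
            b = int(h / lag_width + 0.5)              # nearest lag bin
            sums[b] += 0.5 * (values[i] - values[j]) ** 2
            counts[b] += 1
    # Map each lag (bin index times lag width) to its average semivariance.
    return {b * lag_width: sums[b] / counts[b] for b in sorted(counts)}

# Three hypothetical sample sites along a line, with sampled values of k.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [1.0, 2.0, 4.0]
gamma_hat = empirical_semivariogram(coords, values, lag_width=1.0)
```

With so few sites each lag is supported by only one or two pairs, which is one reason an empirical variogram built from sparse data can be unreliable.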

Situation in groundwater modelling
• Very few points at which hydraulic properties are measured
• Use of these measurements in a groundwater model is often questionable
In groundwater modelling, we often just have to guess the properties of the semivariogram using common sense as our guide.
Note: for kriging, the sill does not matter, because multiplying the semivariogram by a constant leaves the kriging weights unchanged.
[Figure: $\gamma(h)$ plotted against $h$.]

What functions can we use to represent these?
[Figure: $\gamma(h)$ and $C(h)$ plotted against $h$.]
A crucial condition: the covariance matrix must be positive definite.

Semivariograms (sill = 1.0 and a = 1.0 in all cases)
[Figure: four panels showing the spherical, exponential, power and Gaussian semivariograms.]
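The four model types can be written as short functions. The formulas below are commonly used textbook forms and are an assumption here, since the slides do not state them; some references scale the parameter `a` differently (for example with a factor of 3 in the exponential model). The power model has no sill, so its multiplier is named `c`.

```python
import math

def spherical(h, sill=1.0, a=1.0):
    """Rises to the sill and reaches it exactly at h = a."""
    if h >= a:
        return sill
    r = h / a
    return sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, sill=1.0, a=1.0):
    """Approaches the sill asymptotically."""
    return sill * (1.0 - math.exp(-h / a))

def gaussian(h, sill=1.0, a=1.0):
    """Approaches the sill asymptotically, with a flat start near h = 0."""
    return sill * (1.0 - math.exp(-(h / a) ** 2))

def power(h, c=1.0, a=1.0):
    """Grows without bound; with a = 1 it is a straight line."""
    return c * h ** a

# Evaluate each model at a few separations, using the default parameters.
lags = [0.0, 0.5, 1.0, 2.0]
table = {f.__name__: [f(h) for h in lags]
         for f in (spherical, exponential, gaussian, power)}
```

All four functions return zero at zero separation; a nugget, if wanted, would be added as a separate constant for h > 0.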

How to Do It – Some Options

See…

Some Programs from the PEST Suite

| Program | What it does |
| --- | --- |
| PLPROC | Interpolates from pilot points to structured/unstructured grid using kriging |
| FIELDGEN | Generates cell-by-cell stochastic property field (optionally conditioned by known values at certain locations) for structured MODFLOW grids |
| PPCOV | Calculates a covariance matrix for 2D pilot points from a two-dimensional variogram |
| PPCOV_SVA | Calculates a covariance matrix for 2D pilot points from a variogram whose properties vary in space |
| PPCOV3D | Calculates a covariance matrix for 3D pilot points from a 3D variogram |
| PPCOV3D_SVA | Calculates a covariance matrix for 3D pilot points from a 3D variogram whose properties vary in space |
| RANDPAR1, RANDPAR2, RANDPAR3, RANDPAR4 | Generate realizations of random variables based on mean values and PPCOV*-generated covariance matrices |

End of Part 1