Extreme Value Theory for High Frequency Financial Data

Abhinay Sawant
April 1, 2009
Economics 201FS

Extreme Value Theory
- Given i.i.d. log returns $\{r_1, \ldots, r_n\}$, the suitably normalized maximum return converges weakly to the Generalized Extreme Value (GEV) distribution:
- Two methods of estimation: (1) Block Maxima, (2) Points Over Thresholds
- Applications include financial risk management measures involving tail estimation, such as Value at Risk and Expected Shortfall. EVT typically provides better tail estimates than other methods, especially for very high quantiles.
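For reference, the GEV distribution function named on this slide is commonly written in the form below. This is a sketch of the standard textbook parameterization, not a recovery of the slide's own equation. It assumes that ξ is the shape parameter, α > 0 the scale parameter, and β the location parameter, which matches the parameter names used later in the slides.

```latex
F(x) =
\begin{cases}
\exp\!\left\{-\left[1 + \xi\,\dfrac{x-\beta}{\alpha}\right]^{-1/\xi}\right\},
  & \xi \neq 0,\quad 1 + \xi\,\dfrac{x-\beta}{\alpha} > 0,\\[2ex]
\exp\!\left\{-\exp\!\left[-\dfrac{x-\beta}{\alpha}\right]\right\},
  & \xi = 0.
\end{cases}
```

Under this sign convention, ξ > 0 is the heavy-tailed Fréchet case, ξ = 0 the Gumbel case, and ξ < 0 the Weibull case. Some software uses the opposite sign for the shape parameter, so the convention should be checked before comparing estimates.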

Why High-Frequency Data?
1. Better Estimation
  - More data increases the availability of extreme values
  - Captures risk from intraday movements
  - More data from more recent years
2. Intraday Measures
  - Concepts such as intraday VaR have been proposed in the recent literature

Standardization of Data
- To generate a data set of i.i.d. data $\{x_1, \ldots, x_n\}$, adjust all of the intraday returns by the previous day's daily realized volatility:
- Both the standardized and the unstandardized data were still analyzed
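The adjustment described on this slide can be sketched as follows. The exact scaling used in the slides is not recoverable from the text, so this example assumes one plausible form: realized volatility is the square root of the sum of squared intraday returns, and each intraday return is divided by the previous day's value. The function names and the toy data are hypothetical.

```python
import math

def realized_volatility(intraday_returns):
    """Square root of the sum of squared intraday returns for one day."""
    return math.sqrt(sum(r * r for r in intraday_returns))

def standardize_by_previous_day(returns_by_day):
    """Divide each day's intraday returns by the previous day's realized volatility.

    `returns_by_day` is a list of days, each a list of intraday log returns.
    The first day is dropped because it has no previous day to scale by.
    """
    standardized = []
    for prev_day, day in zip(returns_by_day, returns_by_day[1:]):
        rv = realized_volatility(prev_day)
        if rv == 0.0:
            continue  # assumed handling: skip days that cannot be scaled
        standardized.append([r / rv for r in day])
    return standardized

# Hypothetical toy data: three days of three intraday returns each.
days = [[0.003, -0.004, 0.0], [0.001, 0.002, -0.002], [0.006, 0.0, -0.003]]
x = standardize_by_previous_day(days)
```

Using the previous day's volatility rather than the same day's keeps the scaling factor known before the returns it is applied to are observed.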

Block Maxima Method
- Divide the time series into $g$ subgroups of $n$ observations each and determine the maximum/minimum of each subgroup: $\{x_1, \ldots, x_n \mid x_{n+1}, \ldots, x_{2n} \mid \ldots \mid x_{(g-1)n+1}, \ldots, x_{ng}\}$
- If $x_{n,i}$ is the maximum/minimum of the $i$th subsample, then $x_{n,i}$ should follow a GEV distribution for sufficiently large $n$.
- Maximum likelihood estimation through the built-in MATLAB function "gevfit" is used to determine the values of the parameters ξ, α, and β
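The blocking step can be sketched in a few lines. This is a minimal illustration in plain Python rather than the MATLAB workflow the slides describe. The function names are hypothetical, and discarding a trailing partial block is an assumption, not something the slides state.

```python
def block_maxima(values, block_size):
    """Return the maximum of each consecutive, non-overlapping block.

    A trailing partial block is discarded so that every maximum comes
    from exactly `block_size` observations.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_blocks = len(values) // block_size
    return [
        max(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]

def block_minima(values, block_size):
    """Block minima via the identity min(x) = -max(-x), for the left tail."""
    return [-m for m in block_maxima([-v for v in values], block_size)]

# Hypothetical returns; the seventh value falls in a partial block and is dropped.
series = [0.1, -0.3, 0.2, 0.5, -0.1, 0.0, 0.4]
maxima = block_maxima(series, 3)  # blocks: [0.1, -0.3, 0.2] and [0.5, -0.1, 0.0]
minima = block_minima(series, 3)
```

With 8-minute returns and a block size of 48, each block corresponds to one trading day, so the output is a series of daily maxima ready to be passed to a GEV fitting routine.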

Estimated GEV CDF
- Citigroup (C)
- 8-minute frequency
- Maximum Values (Right Tail)
- Standardized Data
- n = 48 (one day)
- Estimates:
  - ξ = 0.1198
  - α = 0.0012
  - β = 0.0013

Residuals From GEV CDF

Estimated GEV PDF

Shape Parameter

Scaling Parameter

Location Parameter

Value at Risk
- To determine the 0.01% VAR, set $P(r_t < r_n^*) = 0.999$. The value $r_n^*$ is the Value at Risk and can be determined in one of two ways:
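The two routes the slide refers to are not recoverable from the text. As an illustration only, one standard way to obtain such a quantile from a GEV fitted to block maxima is sketched below. It assumes i.i.d. returns, so that P(max ≤ r*) = (1 − p)^n for blocks of size n, and it assumes the shape/scale/location roles of ξ, α, and β with ξ ≠ 0. The function names and parameter values are hypothetical and are not the slide's estimates.

```python
import math

def gev_cdf(x, xi, alpha, beta):
    """GEV distribution function, assuming shape xi != 0, scale alpha, location beta."""
    z = 1.0 + xi * (x - beta) / alpha
    if z <= 0.0:
        # Outside the support: below it when xi > 0, above it when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-z ** (-1.0 / xi))

def var_from_block_maxima(p, n, xi, alpha, beta):
    """Return r* with P(r <= r*) = 1 - p, given a GEV fitted to maxima of blocks of size n.

    Sets the GEV CDF at r* equal to (1 - p) ** n and solves for r*.
    """
    y = -n * math.log(1.0 - p)
    return beta - (alpha / xi) * (1.0 - y ** (-xi))

# Hypothetical parameter values, for illustration only.
xi, alpha, beta, n, p = 0.2, 0.001, 0.002, 48, 0.001
r_star = var_from_block_maxima(p, n, xi, alpha, beta)
```

The closed form comes from inverting the GEV distribution function, so evaluating `gev_cdf` at `r_star` recovers `(1 - p) ** n` up to rounding.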

8-Minute 0.01% VAR

8-Minute 0.01% VAR
- According to our EVT estimation, there is a 0.01% chance that we'll see an 8-minute standardized log return of 0.00726 or higher, with a sample standard deviation of 0.00004.
- Empirically, we find that about 0.0105% of standardized 8-minute returns are above 0.00726.

Value at Risk for One Day
- Given a VAR with a one-day horizon, the VAR with a time horizon of $L$ days can be calculated as follows: $\mathrm{VAR}(L) = L^{\xi}\,\mathrm{VAR}$
- Since there are 48 8-minute returns in a day, I multiplied my 8-minute VAR by $48^{\xi}$ to determine a 1-day VAR.
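The scaling step in the second bullet is a one-line calculation. The sketch below uses hypothetical inputs rather than the slide's estimates, and the function name is an assumption for illustration.

```python
def scale_var(var_short, n_periods, xi):
    """Scale a short-horizon VaR to n_periods using the factor n_periods ** xi."""
    return (n_periods ** xi) * var_short

# Hypothetical inputs: an 8-minute VaR of 0.005 and a shape parameter of 0.25.
one_day_var = scale_var(0.005, 48, 0.25)
```

Because the exponent is the shape parameter ξ, a heavier tail (larger ξ) makes the VaR grow faster with the horizon.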

One-Day 0.01% VAR

One-Day 0.01% VAR
- According to our EVT estimation, there is a 0.01% chance that we'll see a one-day standardized log return of 0.0198 or higher, with a sample standard deviation of 0.0014.
- Empirically, we find that about 2.32% of standardized daily log returns are above 0.0198. Empirically, about 0.01% of the data are above 0.0260.

Other VAR Methods
- Sliding Window: Use the previous 50 days' worth of data to determine the VAR for the 51st day, then count the number of exceedances to assess the effectiveness of the VAR model.
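The sliding-window check can be sketched as below. The slide does not say which VaR estimator is applied inside each window, so this example substitutes a simple empirical quantile (historical simulation) as a stand-in. The 95% level is also an assumption, chosen because 50 observations cannot resolve a very high quantile empirically. All names and the simulated data are hypothetical.

```python
import math
import random

def empirical_quantile(sample, q):
    """Smallest order statistic with at least a fraction q of the sample at or below it."""
    ordered = sorted(sample)
    k = math.ceil(q * len(ordered)) - 1
    k = max(0, min(len(ordered) - 1, k))
    return ordered[k]

def count_exceedances(daily_returns, window=50, q=0.95):
    """Return (exceedances, forecasts) for a rolling right-tail VaR.

    For each day t >= window, the VaR is the q-quantile of the previous
    `window` days; an exceedance is a return strictly above that VaR.
    """
    exceedances = 0
    forecasts = 0
    for t in range(window, len(daily_returns)):
        var_t = empirical_quantile(daily_returns[t - window:t], q)
        forecasts += 1
        if daily_returns[t] > var_t:
            exceedances += 1
    return exceedances, forecasts

# Hypothetical data: 300 simulated daily returns, seeded for repeatability.
rng = random.Random(0)
returns = [rng.gauss(0.0, 0.01) for _ in range(300)]
hits, total = count_exceedances(returns, window=50, q=0.95)
hit_rate = hits / total  # compare with the nominal rate 1 - q = 0.05
```

A hit rate well above the nominal rate suggests the model understates tail risk, and a rate well below it suggests the model is too conservative.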

Points Over Thresholds
- For a sufficiently high threshold $u$, the distribution function of the excesses $(X - u)$ may be approximated by a Generalized Pareto Distribution:
- The $p$th quantile can be computed from the parameters:
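The slide's own formulas are not recoverable from the text. The sketch below shows the standard Generalized Pareto form and the usual tail-quantile estimator as one possible reading. It assumes a shape parameter ξ ≠ 0 and a scale parameter written here as `sigma`, which is a symbol not used in the slides. `n_total` is the sample size and `n_exceed` the number of observations above the threshold. All values are hypothetical.

```python
import math

def gpd_cdf(y, xi, sigma):
    """Generalized Pareto CDF of an excess y >= 0, assuming xi != 0 and sigma > 0."""
    z = 1.0 + xi * y / sigma
    if z <= 0.0:
        return 1.0  # beyond the upper endpoint, which exists only when xi < 0
    return 1.0 - z ** (-1.0 / xi)

def pot_quantile(p, u, xi, sigma, n_total, n_exceed):
    """p-th quantile from the tail estimator F(x) = 1 - (N_u / n) * (1 - G(x - u)).

    Only meaningful for quantiles above the threshold, i.e. p > 1 - n_exceed / n_total.
    """
    return u + (sigma / xi) * (((n_total / n_exceed) * (1.0 - p)) ** (-xi) - 1.0)

# Hypothetical values, for illustration only.
u, xi, sigma, n_total, n_exceed, p = 0.004, 0.2, 0.001, 10000, 200, 0.999
x_p = pot_quantile(p, u, xi, sigma, n_total, n_exceed)
```

The quantile formula is the inverse of the tail estimator, so the estimated tail probability at `x_p` equals `1 - p` up to rounding.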

Further Research
- Different Frequencies
- Different Stocks