Hydrological Frequency Analysis: Model Selection Criteria
Professor Ke-sheng Cheng
Department of Bioenvironmental Systems Engineering, National Taiwan University
Selection of the best-fit distribution
• Methods of model selection based on loss of information:
  • Akaike information criterion (AIC)
  • Schwarz's Bayesian information criterion (BIC)
  • Hannan-Quinn information criterion (HQIC)
  • Anderson-Darling criterion (ADC)
2/25/2021 Dept. of Bioenvironmental Systems Engineering, National Taiwan University
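Information-criteria-based model selection
$$\mathrm{AIC} = -2\,\ell(\hat\theta) + 2p$$
$$\mathrm{BIC} = -2\,\ell(\hat\theta) + p \ln n$$
$$\mathrm{HQIC} = -2\,\ell(\hat\theta) + 2p \ln(\ln n)$$
where $\ell(\hat\theta)$ is the log-likelihood function evaluated at the maximum-likelihood estimate $\hat\theta$ of the parameter $\theta$ associated with the model, n is the sample size, and p is the dimension of the parametric space. The model with the smallest criterion value is preferred.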
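As a rough illustration of how these criteria are evaluated, the sketch below fits two candidate distributions that have closed-form maximum-likelihood estimates (normal and exponential) and computes AIC = -2ℓ + 2p, BIC = -2ℓ + p ln n and HQIC = -2ℓ + 2p ln(ln n) from the maximized log-likelihood ℓ. The sample values, the choice of candidates and all function names are assumptions made for illustration; they are not data or code from the lecture.

```python
import math

def aic(loglik, p):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """Schwarz's Bayesian information criterion."""
    return -2.0 * loglik + p * math.log(n)

def hqic(loglik, p, n):
    """Hannan-Quinn information criterion."""
    return -2.0 * loglik + 2.0 * p * math.log(math.log(n))

def normal_loglik(x):
    """Maximized log-likelihood of a normal model (MLE variance divides by n)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((xi - mu) ** 2 for xi in x) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def exponential_loglik(x):
    """Maximized log-likelihood of an exponential model (rate = 1 / mean)."""
    n = len(x)
    rate = n / sum(x)
    return n * math.log(rate) - n

# Hypothetical annual-maximum sample, for illustration only.
sample = [12.1, 15.3, 9.8, 20.4, 14.7, 11.2, 17.9, 13.5, 16.0, 10.6]
n = len(sample)

# (maximized log-likelihood, number of parameters) for each candidate
candidates = {
    "normal": (normal_loglik(sample), 2),
    "exponential": (exponential_loglik(sample), 1),
}

scores = {
    name: {"AIC": aic(ll, p), "BIC": bic(ll, p, n), "HQIC": hqic(ll, p, n)}
    for name, (ll, p) in candidates.items()
}

# The preferred model is the one with the smallest criterion value.
best_by_aic = min(scores, key=lambda name: scores[name]["AIC"])

for name, crit in scores.items():
    print(name, {k: round(v, 2) for k, v in crit.items()})
print("Preferred by AIC:", best_by_aic)
```

All three criteria share the term -2ℓ and differ only in how strongly they penalize the number of parameters p.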
Model selection based on information criteria using R
• The nsRFA package
• MSClaio2008(x)
MSClaio2008
• When the sample size n is small relative to the number of estimated parameters p, the AIC may perform inadequately. In those cases a second-order variant of AIC, called AICc, should be used:
$$\mathrm{AICc} = -2\,\ell(\hat\theta) + 2p\,\frac{n}{n-p-1}$$
where $\ell(\hat\theta)$ is the maximized log-likelihood. As a rule of thumb, AICc should be used when n/p < 40.
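The small-sample correction can be sketched as follows, assuming the common form AICc = -2ℓ + 2p·n/(n - p - 1), which equals AIC + 2p(p + 1)/(n - p - 1). The function names and the helper that applies the n/p < 40 rule of thumb are hypothetical, introduced only for this example.

```python
def aic(loglik, p):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * p

def aicc(loglik, p, n):
    """Second-order (small-sample) variant of AIC."""
    if n - p - 1 <= 0:
        raise ValueError("AICc requires n > p + 1")
    return -2.0 * loglik + 2.0 * p * n / (n - p - 1)

def preferred_criterion(n, p, threshold=40.0):
    """Apply the indicative rule: use AICc when n / p is below the threshold."""
    return "AICc" if n / p < threshold else "AIC"

# Example: 30 observations, 3-parameter model, assumed log-likelihood of -85.
loglik, p, n = -85.0, 3, 30
print(preferred_criterion(n, p))          # n / p = 10, so the corrected form applies
print(aic(loglik, p), aicc(loglik, p, n))
```

The correction term vanishes as n grows with p fixed, so AICc and AIC agree for large samples.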
Rationale of the information criteria
• The Akaike information criterion uses the Kullback-Leibler divergence as the discrepancy measure between the true model f(x) and the approximating model g(x).
• Information and entropy
What is information?
• Consider the following statements:
  • I will eat some food tomorrow.
  • A major earthquake will strike Taiwan tomorrow.
• Which statement conveys more information?
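One way to make the question concrete is Shannon's self-information, I(A) = -log P(A): the less probable an event, the more information its occurrence conveys. The sketch below uses made-up probabilities for the two statements; both the numbers and the function name are assumptions for illustration.

```python
import math

def self_information(prob, base=2.0):
    """Information content (in bits for base 2) of an event with probability prob."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must lie in (0, 1]")
    return -math.log(prob) / math.log(base)

# Assumed probabilities, for illustration only.
p_eat = 0.999     # "I will eat some food tomorrow."
p_quake = 0.001   # "A major earthquake will strike Taiwan tomorrow."

info_eat = self_information(p_eat)
info_quake = self_information(p_quake)

print(f"eat:   {info_eat:.4f} bits")
print(f"quake: {info_quake:.4f} bits")
```

Under these assumed probabilities the near-certain statement carries almost no information, while the rare event carries a great deal.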
Definition of entropy
• 侯如真, 2001. A theoretical study of the application of information entropy to rain gauge network design. Master's thesis, Graduate Institute of Agricultural Engineering, National Taiwan University (in Chinese).
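For reference, the standard Shannon definition of entropy is given below for a discrete random variable with probabilities p_1, ..., p_k and for a continuous random variable with density f(x). The symbols are assumptions and may differ from the notation used in the lecture.

```latex
% Discrete case: expected self-information of X
H(X) = -\sum_{i=1}^{k} p_i \ln p_i

% Continuous case (differential entropy)
H(X) = -\int_{-\infty}^{\infty} f(x) \ln f(x) \, dx
```

Entropy is the expected information content of an outcome; in the discrete case it is largest when all outcomes are equally likely.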
Kullback-Leibler Divergence
• If there are several candidate distributions, we only need to calculate H(X|q_i(X)), since H(X|p(X)) is a constant.
• In practical applications, the above term is estimated, up to a factor of 2n, by the Akaike information criterion (Akaike, 1973):
$$\mathrm{AIC}_j = -2\,\ell_j(\hat\theta_j) + 2p_j$$
where $\ell_j(\hat\theta_j)$ is the maximized log-likelihood of the jth model and p_j is the number of parameters of the jth model.
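The point that only the candidate-dependent term matters can be checked numerically. The sketch below assumes that H(X|q) denotes the cross-entropy -Σ p ln q and H(X|p) the entropy -Σ p ln p of the true distribution, so that the Kullback-Leibler divergence is their difference. The discrete distributions and all names are made up for illustration.

```python
import math

def entropy(p):
    """H(X|p) = -sum p ln p, the entropy of the true distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def cross_entropy(p, q):
    """H(X|q) = -sum p ln q, evaluated under the true distribution p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0.0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) = sum p ln(p / q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical true distribution and two candidate models.
p_true = [0.5, 0.3, 0.2]
candidates = {"q1": [0.4, 0.4, 0.2], "q2": [0.2, 0.3, 0.5]}

ranking_by_kl = sorted(candidates, key=lambda k: kl_divergence(p_true, candidates[k]))
ranking_by_ce = sorted(candidates, key=lambda k: cross_entropy(p_true, candidates[k]))

for name, q in candidates.items():
    print(name, round(kl_divergence(p_true, q), 4), round(cross_entropy(p_true, q), 4))
print(ranking_by_kl, ranking_by_ce)
```

Because the entropy of the true distribution is the same for every candidate, ranking by cross-entropy and ranking by Kullback-Leibler divergence give the same order.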
Divergence
• Divergence is a measure of the separability of a pair of distributions that has its basis in their degree of overlap. It is defined in terms of the likelihood ratio
$$L_{ij}(x) = \frac{p(x \mid \omega_i)}{p(x \mid \omega_j)}$$
where $p(x \mid \omega_i)$ and $p(x \mid \omega_j)$ are respectively the probability densities of classes $\omega_i$ and $\omega_j$ at the position x.
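A minimal sketch of this measure, assuming the symmetric form common in pattern recognition, d_ij = E[ln L_ij | class i] + E[ln L_ji | class j], which for discrete distributions reduces to Σ (p_i - p_j) ln(p_i / p_j) and equals the sum of the two directed Kullback-Leibler divergences. The class distributions and function names are hypothetical.

```python
import math

def kl_divergence(p, q):
    """Directed Kullback-Leibler divergence D(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def divergence(p, q):
    """Symmetric divergence: sum of (p - q) * ln(p / q) over all outcomes.

    Assumes every probability in p and q is strictly positive.
    """
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical class-conditional distributions over four outcomes.
class_i = [0.10, 0.20, 0.30, 0.40]
class_j = [0.40, 0.30, 0.20, 0.10]
class_k = [0.12, 0.22, 0.28, 0.38]   # overlaps heavily with class_i

print(round(divergence(class_i, class_j), 4))  # well-separated pair
print(round(divergence(class_i, class_k), 4))  # strongly overlapping pair
```

Each term (p - q) ln(p / q) is non-negative, so the divergence is zero only when the two distributions coincide and grows as their overlap shrinks.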
Since divergence is never negative, it follows that the divergence $d_{ij}$ between classes i and j satisfies
$$d_{ij}(x_1, \dots, x_N, x_{N+1}) \ge d_{ij}(x_1, \dots, x_N)$$
In other words, divergence never decreases as the number of features is increased.
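This monotonicity can be illustrated under a simplifying assumption that is not part of the lecture: if the features are statistically independent and normally distributed within each class, the divergence is the sum of the per-feature divergences, each of which is non-negative. The class parameters and function names below are made up for the example.

```python
def gaussian_divergence_1d(mu_i, var_i, mu_j, var_j):
    """Symmetric divergence between two univariate normal densities."""
    spread = 0.5 * (var_i - var_j) * (1.0 / var_j - 1.0 / var_i)
    shift = 0.5 * (1.0 / var_i + 1.0 / var_j) * (mu_i - mu_j) ** 2
    return spread + shift

def cumulative_divergence(class_i, class_j):
    """Divergence after adding features one at a time.

    class_i and class_j are lists of (mean, variance) pairs, one per feature;
    features are assumed independent, so per-feature divergences add up.
    """
    total = 0.0
    steps = []
    for (mi, vi), (mj, vj) in zip(class_i, class_j):
        total += gaussian_divergence_1d(mi, vi, mj, vj)
        steps.append(total)
    return steps

# Hypothetical (mean, variance) per feature for two classes.
class_a = [(0.0, 1.0), (2.0, 1.5), (5.0, 2.0)]
class_b = [(1.0, 1.0), (2.0, 1.5), (4.0, 0.5)]

steps = cumulative_divergence(class_a, class_b)
print(steps)
```

The second feature has identical distributions in both classes, so adding it leaves the divergence unchanged; it never lowers it.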