Hybrid Methods
Daryl Kleist, Univ. of Maryland-College Park, Dept. of Atmos. & Oceanic Science
JCSDA Summer Colloquium on Data Assimilation, Fort Collins, CO, 27 July – 7 August 2015
Thanks to Kayo Ide (UMD), Jeff Whitaker (NOAA/ESRL), David Parrish (NOAA/NCEP), Steve Penny (UMD), and Matt Kretschmer (UMD) for inspiration, collaboration, and several slides.
Outline
I. Motivation for Hybrid Covariance
II. Hybrid and EnVar Formulation
   I. Single Observations
   II. Toy Model Results
   III. Extension to 4D
III. Real World Application and Examples from NCEP GFS
IV. Hybrid Alternatives
   I. Gain-Hybrid
   II. Ensemble-Based Hybrid (caLETKF)
V. Summary
Perspectives of Data Assimilation
• Two main perspectives of practical data assimilation, and the hybrid approach
• What are some of the pros/cons of the maximum likelihood (ML) and minimum variance (MV) approaches?
• Variational approach: least squares estimation [maximum likelihood]
• Sequential (KF) approach: minimum variance estimate [least uncertainty]
[Figure labels: EnKF (or RRKF); Variational; p(x)]
3DVar vs EnKF: Single Observation GSI Example
[Figure panels: 3DVAR; EnKF. Axis label: xl position]
• B determines the quality of Δxa
• Single 850 mb Tv observation (1 K O-F, 1 K error)
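The point that B determines the quality of the increment can be illustrated with one observation: the analysis increment is then just a scaled column of B. The sketch below is a minimal illustration, not GSI code. The 1-D grid, the Gaussian-shaped static B, and the length scale are assumptions made for the example, and the slide's "1 K error" is read as an observation error variance of 1 K².

```python
import numpy as np

def single_ob_increment(B, ob_index, innovation, ob_error_var):
    """Analysis increment for one observation of a single state element.

    With H selecting element `ob_index`, the solution
    dx = B H^T (H B H^T + R)^-1 d reduces to a scaled column of B.
    """
    column = B[:, ob_index]              # B H^T
    hbht = B[ob_index, ob_index]         # H B H^T (a scalar here)
    return column * innovation / (hbht + ob_error_var)

# Hypothetical static covariance: unit variance, Gaussian correlations on a 1-D grid.
n = 41
grid = np.arange(n)
length_scale = 4.0                       # assumed, in grid units
B = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / length_scale) ** 2)

# 1 K innovation (O-F) and 1 K^2 observation error variance.
dx = single_ob_increment(B, ob_index=20, innovation=1.0, ob_error_var=1.0)
print(dx[20])                            # increment at the observation location
```

Swapping in a different B (for example a flow-dependent ensemble estimate) changes the shape of `dx` without changing the observation, which is what the 3DVAR and EnKF panels contrast.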
Kalman Filter from Var Perspective
• Forecast step
• E/EnKF analysis
• Recast the problem in terms of a variational framework (cost function)
• B_KF: time-evolving background error covariance
• A_KF: inverse Hessian of J_KF(x’)
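The statement that the analysis error covariance is the inverse Hessian of the cost function can be checked numerically. The sketch below assumes the standard quadratic form J(x') = ½ x'ᵀ B⁻¹ x' + ½ (H x' − d)ᵀ R⁻¹ (H x' − d) with a linear H. The slide's own equations were not recoverable, so this form and all matrices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 3                                   # state and observation sizes (arbitrary)

# Hypothetical symmetric positive-definite B, diagonal R, and linear H.
G = rng.standard_normal((n, n))
B = G @ G.T + n * np.eye(n)
R = np.diag(rng.uniform(0.5, 1.5, size=p))
H = rng.standard_normal((p, n))

# Variational view: the Hessian of J is B^-1 + H^T R^-1 H, and A is its inverse.
hessian = np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H
A_var = np.linalg.inv(hessian)

# Kalman filter view: A = (I - K H) B with K = B H^T (H B H^T + R)^-1.
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
A_kf = (np.eye(n) - K @ H) @ B

print(np.allclose(A_var, A_kf))
```

The two expressions agree for any valid B, R, and linear H, which is why the Kalman filter analysis step can be recast as a minimization.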
Use of Be in Var
• Or, substitute the ensemble estimate of the error covariance instead
• This is in the full physical space, which we can work around by introducing a new control variable, where:
  – α is the local weight for the ensemble members
  – L is the localization on the extended control variable
  – xe are the ensemble perturbations that represent Be (as in EnKF)
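One way to see what the extended control variable buys: if the increment is written as a sum over members of α_k ∘ x_k^e (element-wise product) and each α_k has covariance L, the implied background covariance is the localized ensemble covariance L ∘ Be. The sketch below checks that identity on a small example. The grid size, ensemble size, and Gaussian-shaped L are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M = 8, 4                                   # grid points, ensemble members (arbitrary)

# Ensemble perturbations x^e_k as columns, scaled so that Be = X X^T.
X = rng.standard_normal((n, M))
X -= X.mean(axis=1, keepdims=True)
X /= np.sqrt(M - 1)

# Hypothetical localization correlation matrix L (Gaussian in grid distance).
dist = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
L = np.exp(-0.5 * (dist / 2.0) ** 2)

# Covariance implied by x' = sum_k alpha_k ∘ x^e_k when every alpha_k has
# covariance L and the members are independent: sum_k diag(x_k) L diag(x_k).
B_implied = sum(np.diag(X[:, k]) @ L @ np.diag(X[:, k]) for k in range(M))

# Localized ensemble covariance: Schur (element-wise) product of L and Be.
B_localized = L * (X @ X.T)

print(np.allclose(B_implied, B_localized))
print(np.linalg.matrix_rank(X @ X.T), np.linalg.matrix_rank(B_localized))
```

The second print compares the rank of the raw ensemble covariance with the rank after localization, which shows how L relieves some of the rank deficiency.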
Be and Bc Hybridization
• We have already demonstrated that Be is powerful for providing flow-dependent estimates of the background error covariance (and multivariate correlations)
  – However, it suffers from severe rank deficiency
• Alternatively, Bc is full-rank (in the space of the entire model state)
  – However, it is typically taken to be static in time, derived from climatological (and usually time-averaged) statistics
  – It often does not represent multivariate correlations well (e.g., linking humidity to wind)
• So, why not try to combine them? (Hamill and Snyder, 2000)
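A small numerical sketch of the combination: blending a rank-deficient sample covariance with a full-rank static one yields a full-rank hybrid. The weight name `beta` (weight on the static part), the convention that the two weights sum to one, and the identity matrix standing in for Bc are all assumptions for illustration.

```python
import numpy as np

def hybrid_covariance(B_c, B_e, beta):
    """Blend static and ensemble covariances; beta is the weight on the static part."""
    return beta * B_c + (1.0 - beta) * B_e

rng = np.random.default_rng(2)
n, n_members = 40, 10                          # small state, small ensemble (assumed)

X = rng.standard_normal((n, n_members))
X -= X.mean(axis=1, keepdims=True)
B_e = X @ X.T / (n_members - 1)                # sample covariance, rank <= n_members - 1

B_c = np.eye(n)                                # stand-in for a full-rank static B

B_h = hybrid_covariance(B_c, B_e, beta=0.3)
print(np.linalg.matrix_rank(B_e), np.linalg.matrix_rank(B_h))
```

Any positive weight on a full-rank Bc makes the hybrid full rank; the flow-dependent structure enters through the Be term.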
Single Observation (Hamill and Snyder)
[Figure panels: 100% Bc; 90% Be, 10% Bc]
Toy Model Demo (L96): 3DVAR Result, 50% obs
• Left shows the “NMC”-derived Bc; right is a snapshot of truth, background, analysis, and observations at cycle 500 for the 3DVAR configuration
• 3DVAR does well in observed regions, but struggles in unobserved regions due to Bc
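For readers who want to reproduce a toy setup like this one, the Lorenz (1996) model behind "L96" can be written in a few lines. The sketch below uses the conventional 40 variables, forcing F = 8, and a fourth-order Runge-Kutta step of 0.05 time units. These values are common choices and are assumed here, since the slides do not state the configuration.

```python
import numpy as np

def l96_tendency(x, forcing=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, with cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def l96_step(x, dt=0.05, forcing=8.0):
    """Advance the state one time step with fourth-order Runge-Kutta."""
    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = l96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = l96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Spin up from a slightly perturbed rest state to reach the chaotic attractor.
x = np.full(40, 8.0)
x[19] += 0.01
for _ in range(1000):
    x = l96_step(x)
print(x.mean(), x.std())
```

A "truth" run from this model, with observations drawn at every other grid point, would give a 50% observed configuration like the one in the slides.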
Toy Model Demo (L96): ETKF Result, 50% obs, M=20
• Left shows a snapshot of the ETKF-derived Be; right is a snapshot of truth, background, analysis, and observations at cycle 500 for the ETKF configuration
• ETKF does well everywhere, thanks to the time-evolving B
Toy Model Demo (L96): Hybrid Result, 50% obs, M=20
• Left shows a snapshot of Bh; right is a snapshot of truth, background, analysis, and observations at cycle 500 for the hybrid configuration
• Hybrid does much better than 3DVAR, comparable to ETKF
Toy Model Demo (L96): Hybrid Result, 50% obs, M=20
– Hybrid (small β) as good as ETKF
– Hybrid (larger β) in between 3DVAR and ETKF
[Figure panels: 3DVAR; β = 0.3; β = 0.7; ETKF]
Analysis RMSE (x10) over 1800 cases
β        0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
Hybrid   3.321   3.764   4.074   4.633   5.060   5.770   7.044   8.218   9.595
3DVAR    12.08
ETKF     3.871
– Small-β hybrid (β = 0.1, 0.2) better than ETKF
Toy Model Demo (L96): Hybrid Result, 50% obs, M=10
– Hybrid can help mitigate small ensemble size (like localization)
– Not shown: hybrid with localization would be even better
[Figure panels: 3DVAR; β = 0.3; β = 0.7; ETKF; LETKF]
Analysis RMSE (x10) over 1800 cases
β        0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
Hybrid   23.65   10.64   7.077   7.270   7.930   7.434   7.426   8.826   9.809
3DVAR    12.08
ETKF     42.37
LETKF    3.481
Hybrid EnVar
• Starting from the EnVar cost function, how could we combine the static and ensemble components?
• Solution: add a second background term (one for ensemble, and one for static). Here, we’ll drop the k subscript to help differentiate between climatological (c) and ensemble (e) contributions
Hybrid EnVar
Lorenc (2003), Buehner (2005), Wang et al. (2007)
• βc & βe: weighting coefficients for the climatological (var) and ensemble covariance, respectively
• xt’: total increment, the sum of the increment from the fixed/static B (xc’) and the increment from the ensemble B
• αk: extended control variable, analogous to the weights in the LETKF formulation
• xe: ensemble perturbations
• L: correlation matrix [effectively the localization of ensemble perturbations]
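The cost function these symbols belong to did not survive extraction. One common way to write it, following the extended control variable form in Wang (2010), is sketched below. Treat it as an assumed reconstruction rather than the slide's exact equation: the member index k, the member count K, the innovation y', and the constraint 1/βc + 1/βe = 1 are conventions taken from that literature.

```latex
\begin{align}
J(\mathbf{x}_c', \boldsymbol{\alpha}) &=
  \beta_c \tfrac{1}{2}\, \mathbf{x}_c'^{\mathrm{T}} \mathbf{B}_c^{-1} \mathbf{x}_c'
  + \beta_e \tfrac{1}{2}\, \boldsymbol{\alpha}^{\mathrm{T}} \mathbf{L}^{-1} \boldsymbol{\alpha}
  + \tfrac{1}{2}\, (\mathbf{H}\mathbf{x}_t' - \mathbf{y}')^{\mathrm{T}} \mathbf{R}^{-1}
    (\mathbf{H}\mathbf{x}_t' - \mathbf{y}') \\
\mathbf{x}_t' &= \mathbf{x}_c' + \sum_{k=1}^{K} \boldsymbol{\alpha}_k \circ \mathbf{x}_k^{e},
\qquad \frac{1}{\beta_c} + \frac{1}{\beta_e} = 1
\end{align}
```

Under this convention the implied background covariance is (1/βc) Bc + (1/βe) L ∘ Be, so the inverse weights play the role of the blending fractions.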
Preconditioning Sidebar
• For the double conjugate gradient (GSI default), the inverses of B and L are not needed, and the solution is preconditioned by the full B.
• This formulation differs from that of the UKMO and the Canadians, who use a square-root formulation.
• Also, the weights can be applied to the increments themselves.
Single Temperature Observation
[Figure panels: 3DVAR; EnVar; Hybrid. Panel labels: βf^-1 = 0.0; βf^-1 = 0.5]
Single Observation TC Example
[Figure panels: 3DVAR; EnVar; Hybrid]
Single 850 mb zonal wind observation (3 m/s O-F, 1 m/s error) in Hurricane Ike circulation
Why Hybrid?
Columns: VAR (3D, 4D); EnKF; Hybrid; References
• Benefit from use of flow-dependent ensemble covariance instead of static B: EnKF x; Hybrid x. References: Hamill and Snyder 2000; Wang et al. 2007b, 2008ab, 2009b; Wang 2011; Buehner et al. 2010ab
• Robust for small ensemble *: Hybrid x. References: Wang et al. 2007b, 2009b; Buehner et al. 2010b
• Better localization (physical space) for integrated measures, e.g. satellite radiance: Hybrid x. References: Campbell et al. 2009
• Easy framework to add various constraints: VAR x; Hybrid x. References: Kleist and Ide 2015
• Use of various existing capabilities in VAR: VAR x; Hybrid x. References: Kleist and Ide 2015
So what’s the catch?
• Most configurations of hybrid DA systems require the development and maintenance of two DA systems
  – EnKF + Var
• Still need to deal with localization and other sampling-related issues (though somewhat mitigated by use of the full-rank Bc)
• Even more parameters to explore
  – Trade-off between ensemble size, resolution, hybrid weights, etc.
NCEP operational 3D EnVar Hybrid
• A cycling 80-member EnKF (T574, ~40 km) provides an ensemble-based estimate of the Jb term in 3DVar.
• 3DVar with ensemble Jb updates a T1534 (~13 km) control forecast. The EnKF analysis ensemble is re-centered around the high-res analysis.
• A combination of multiplicative inflation and stochastic physics is used to represent missing sources of uncertainty in the EnKF ensemble.
Operational Configuration
• Full B preconditioned double conjugate gradient minimization
• Spectral filter for horizontal part of L, with level-dependent decorrelation distances
• Recursive filter used for vertical
  – 0.5 scale heights
• Same localization used in Hybrid (L) and EnSRF
  – Applied using Gaspari-Cohn (GC) compact functions
• TLNMC (Kleist et al. 2009) applied to total analysis increment*
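The GC compact function mentioned above is usually taken to be the fifth-order piecewise rational function of Gaspari and Cohn (1999). The sketch below implements that standard form. The argument name `half_width` (the length scale c, with the taper reaching zero at 2c) is a naming assumption, and nothing here is taken from the GSI or EnSRF source code.

```python
import numpy as np

def gaspari_cohn(distance, half_width):
    """Gaspari-Cohn fifth-order compactly supported correlation function.

    Returns an array of taper values: 1 at zero separation, exactly 0 at
    separations of 2 * half_width and beyond.
    """
    z = np.abs(np.atleast_1d(np.asarray(distance, dtype=float))) / half_width
    taper = np.zeros_like(z)

    inner = z <= 1.0
    zi = z[inner]
    taper[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                    - (5.0 / 3.0) * zi**2 + 1.0)

    outer = (z > 1.0) & (z < 2.0)
    zo = z[outer]
    taper[outer] = ((1.0 / 12.0) * zo**5 - 0.5 * zo**4 + 0.625 * zo**3
                    + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return taper

separations = np.linspace(0.0, 3.0, 7)        # in units of the half-width
print(gaspari_cohn(separations, half_width=1.0))
```

Multiplying ensemble covariances element-wise by this taper removes spurious long-range correlations while keeping the result a valid covariance.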
Dual-Res Coupled Hybrid Var/EnKF Cycling
[Diagram, previous cycle to current update cycle]
• T574L64 ensemble: member 1 ... member N forecasts → EnKF member update (generate new ensemble perturbations given the latest set of observations and first-guess ensemble) → member 1 ... member N analyses
• The ensemble provides the ensemble contribution to the background error covariance for the GSI Hybrid EnVar
• T1534L64 control: high-res forecast → GSI Hybrid EnVar → high-res analysis
• Recenter analysis ensemble: replace the EnKF ensemble mean analysis and inflate
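The recentering step in the cycle can be sketched as follows: the ensemble mean analysis is replaced by the control analysis, and the perturbations about the mean are inflated. This is a simplified illustration only. It assumes the control analysis has already been interpolated to the ensemble grid and uses a single scalar inflation factor, whereas the operational inflation is more elaborate.

```python
import numpy as np

def recenter_and_inflate(ens_analysis, control_analysis, inflation=1.0):
    """Replace the ensemble mean with the control analysis and inflate perturbations.

    ens_analysis: array of shape (n_state, n_members).
    control_analysis: array of shape (n_state,), on the ensemble grid.
    inflation: scalar multiplicative factor applied to the perturbations (assumed).
    """
    ens_mean = ens_analysis.mean(axis=1, keepdims=True)
    perturbations = ens_analysis - ens_mean
    return control_analysis[:, None] + inflation * perturbations

rng = np.random.default_rng(3)
ens = rng.standard_normal((5, 80))            # 80 members, as in the NCEP EnKF
control = np.arange(5, dtype=float)           # stand-in for the hybrid analysis

recentered = recenter_and_inflate(ens, control, inflation=1.1)
print(np.allclose(recentered.mean(axis=1), control))
```

After this step the ensemble carries the control analysis as its mean while retaining its own (inflated) spread for the next forecast.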
Hybrid Impact in Pre-implementation Tests
Figure 01: Percent change in root-mean-square error (experimental GFS minus operational GFS) for the period covering 01 February 2012 through 15 May 2012 in the northern hemisphere (green), southern hemisphere (blue), and tropics (red) for selected variables as a function of forecast lead time. The forecast variables include 1000 hPa geopotential height (a, b), 500 hPa geopotential height (c, d), 200 hPa vector wind (e, f, h), and 850 hPa vector wind (g). All verification is performed using self-analysis. The error bars represent the 95% confidence threshold for a significance test.
Hybrid Impact in Pre-implementation Tests
Figure 02: Mean tropical cyclone track errors (nautical miles) covering the 2010 and 2011 hurricane seasons for the operational GFS (black) and experimental GFS including hybrid data assimilation (red) for the a) Atlantic basin, b) eastern Pacific basin, and c) western Pacific basin. The number of cases is specified by the blue numbers along the abscissa. Error bars indicate the 5th and 95th percentiles of a resampled block bootstrap distribution.
4D Ensemble Var (Liu et al., 2008)
GSI Hybrid 4D-EnVar: Wang and Lei (2014); Kleist and Ide (2015)
• The Hybrid EnVar cost function can be easily extended to 4D and include a static contribution (ignoring preconditioning)
• Jo term divided into observation “bins” as in 4DVAR
• The 4D increment is prescribed through linear combinations of the 4D ensemble perturbations plus a static contribution, i.e. it is not itself a model trajectory
• Here, the static contribution is time invariant. C represents the TLNMC balance operator.
• No TL/AD in Jo term (M and M^T)
Ensemble-Var methods: nomenclature
Lorenc (2013)
• En-4DVar: Propagate ensemble Pb from one assimilation window to the next (updated using EnKF, for example), replace static Pb with the ensemble estimate of Pb at the start of the 4DVar window; Pb is propagated with the tangent linear model within the window.
• 4D-EnVar: Pb at every time in the assimilation window comes from the ensemble estimate (TLM no longer used).
• As above, with hybrid in the name: Pb is a linear combination of static and ensemble components.
• 3D-EnVar: same as 4D ensemble Var, but Pb is assumed to be constant through the assimilation window (current NCEP implementation).
GSI Hybrid En-4DVar
Wang and Lei (2014); Kleist and Ide (2015)
• The traditional 4DVar cost function can be manipulated to use an ensemble to help prescribe the error covariance at the beginning of the window
• Jo term divided into observation “bins” as in 4DVAR
• Here, the hybrid error covariance is applied at the beginning of the window, and the TL/AD propagate within the observation window (M and M^T) in the Jo term
4DVAR
The 4D analysis increment is a trajectory of the perturbation forecast (PF) model.
Lorenc & Payne 2007
4D EnVar
The 4D analysis is a (localised) linear combination of nonlinear trajectories. It is not itself a trajectory.
Courtesy: Andrew Lorenc
4D Hybrids
• In the alpha control variable method, one uses the ensemble perturbations to estimate Pb only at the start of the 4DVar assimilation window: the evolution of Pb inside the window is due to the tangent linear dynamics (Pb(t) ≈ M Pb M^T)
• In 4D-EnVar, Pb is sampled from ensemble trajectories throughout the assimilation window (nonlinear dynamics)
From: D. Barker
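The relation Pb(t) ≈ M Pb M^T can be compared against ensemble sampling directly. In the sketch below the model is a made-up linear operator, so evolving the covariance explicitly and sampling it from evolved perturbations give the same answer. All sizes and the operator itself are assumptions. With a nonlinear model the two routes would differ, which is the distinction the slide draws.

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_members = 5, 200

# Ensemble perturbations and sample Pb at the start of the window.
X0 = rng.standard_normal((n, n_members))
X0 -= X0.mean(axis=1, keepdims=True)
P0 = X0 @ X0.T / (n_members - 1)

# Hypothetical linear propagator standing in for the tangent linear model.
M_tl = np.eye(n) + 0.1 * rng.standard_normal((n, n))

# Tangent-linear route: evolve the covariance explicitly.
P_tl = M_tl @ P0 @ M_tl.T

# Ensemble route: evolve each perturbation, then sample the covariance.
Xt = M_tl @ X0
P_ens = Xt @ Xt.T / (n_members - 1)

print(np.allclose(P_tl, P_ens))
```

In the ensemble route no tangent linear or adjoint code is needed; the cost is storing the ensemble trajectories at every time level used in the window.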
Single Observation (-3 h) Example for 4D Variants
[Figure panels: 4DVAR; 4DEnVar; H-4DVAR_AD (βf^-1 = 0.25); H-4DEnVar (βf^-1 = 0.25)]
Time Evolution of Increment
[Figure columns: H-4DVAR_AD; H-4DEnVar. Rows: t = -3 h; t = 0 h; t = +3 h]
• Solution at the beginning of the window (t = -3 h) is the same to within round-off (because the observation is taken at that time, and the same weighting parameters are used)
• Evolution of the increment is qualitatively similar between dynamic and ensemble specification
4D Hybrid Summary
• The 4D EnVar analysis is a localized, linear combination of ensemble perturbations (similar to EnKF/LETKF)
• Traditional 4DVar (and hybrid 4DVars) requires sequential (and repeated) runs of the TL/AD. Ensemble trajectories can be pre-computed in parallel (but must be stored: IO/memory)
• Developing/maintaining the TL/AD is demanding
• Still unsolved issues: Is ensemble sampling of nonlinear dynamics better than TL evolution? Other ensemble-related issues in EnVar…
4D Hybrid at Major NWP Centers
• Hybrid 4D EnVar
  – Implemented at CMC (Canada); replaced 4DVAR
  – To be implemented at NCEP (early 2016)
• Hybrid En-4DVAR (operational or in testing)
  – UKMO
  – ECMWF*
  – Meteo-France*
  – US NAVGEM
  – JMA
(Planned) Implementation of 4D EnVar at NCEP
• Hybrid 4D EnVar to become operational for GFS/GDAS by January 2016 (tentative)
  – Tests at low resolution helped design the configuration (results not shown, publication in prep)
  – Real-time and retrospective runs already underway, operational package quasi-frozen
• Package Configuration
  – T1534 deterministic GFS with 80-member T574L46 ensemble with fully coupled (two-way) EnKF update (87.5% ensemble & 12.5% static), same localization as current operations
  – Incremental normal mode initialization (TLNMC) on total increment
  – Multiplicative inflation and stochastic physics for EnKF perturbations
  – Full field digital filter
  – All-sky radiance assimilation, aircraft temperature bias correction
  – Minor model changes
Full Resolution (T1534/T574) Trials: 500 hPa anomaly correlation (AC) for the operational GFS (black, 3D Hybrid) and the test 4D configuration (red) for the period covering 02-01-2015 through 04-29-2015.
Full Resolution (T1534/T574) Trials: Summary Scorecard (02-01 through 04-29, 2015)
Note on Hybrid at ECMWF and Meteo-France
• ECMWF and Meteo-France use ensembles of data assimilations (EDA) to estimate background errors
• They do not employ the extended control variable method, but instead prescribe aspects of B from the flow-dependent ensemble
  – This includes significant efforts on filtering of raw statistics, since they use a 25-member ensemble
  – Filtered variances used since 2012
  – Filtered correlations (in their wavelet Jb) used since 2013
ECMWF EDA
[Figure panels: raw EDA standard deviation, vorticity at 500 hPa; filtered EDA standard deviation, vorticity at 500 hPa]
From Bonavita (Course 2014 - EDA)
Flow-dependent Jb: Correlation lengths of vorticity errors, ~500 mb
[Figure panels: online wavelet Jb (2012011000); static wavelet Jb]
Courtesy ECMWF
Alternate Implementation: “Gain Hybrid”
[Equation labels: Ensemble; Climo/Static]
• The ensemble and climatological/static gains are combined in a specific way to arrive at a hybrid gain
• This is the basis for the Hybrid-LETKF of Penny (2014)
[Figure labels: LETKF; 3DVAR; HYBRID]
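The gain equations on this slide were lost in extraction. One form in the spirit of Penny (2014) can be sketched as a two-step update: an ensemble analysis, followed by a 3DVar-like correction that uses the ensemble analysis as its background, blended with a weight `alpha`. The two-step ordering, the name `alpha`, and every matrix below are assumptions for illustration, not the slide's exact formulation.

```python
import numpy as np

def gain(B, H, R):
    """Kalman gain K = B H^T (H B H^T + R)^-1."""
    return B @ H.T @ np.linalg.inv(H @ B @ H.T + R)

rng = np.random.default_rng(5)
n, p, n_members = 6, 3, 4
alpha = 0.2                                    # assumed weight on the static correction

X = rng.standard_normal((n, n_members))
X -= X.mean(axis=1, keepdims=True)
B_e = X @ X.T / (n_members - 1)                # ensemble covariance
B_c = np.eye(n)                                # stand-in static covariance
H = np.eye(p, n)                               # observe the first p state elements
R = 0.5 * np.eye(p)

x_b = rng.standard_normal(n)
y = rng.standard_normal(p)

K_e, K_c = gain(B_e, H, R), gain(B_c, H, R)

# Two-step form: ensemble analysis, then a static-B update that uses it as background.
x_e = x_b + K_e @ (y - H @ x_b)
x_v = x_e + K_c @ (y - H @ x_e)
x_hybrid = (1.0 - alpha) * x_e + alpha * x_v

# Equivalent single gain acting on the original innovation.
K_hybrid = K_e + alpha * K_c @ (np.eye(p) - H @ K_e)
print(np.allclose(x_hybrid, x_b + K_hybrid @ (y - H @ x_b)))
```

Because the blend acts on gains rather than covariances, it can be wrapped around existing ensemble and variational solvers without modifying either one.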
L96 Results: Ob Count vs Ens Size
[Figure panels: LETKF; HYBRID (0.2)] From Penny (2014)
• Hybrid really helps for small ensembles; it is comparable in skill (slightly worse) for regimes that are well observed and have very large ensemble sizes
Hybrid Gain Testing at ECMWF
[Figure: EnKF, 4DVar, and Hybrid. Panels: NHem; Trop; SHem]
Ensemble-Based Hybrids
Kretschmer et al. (2015)
• Climatologically Augmented LETKF (caLETKF)
  – Supplement the dynamic ensemble with orthogonal eigenvectors derived from Bc (i.e. they do not require model integration!)
Summary
• Hybrid methods attempt to combine the best aspects of variational and ensemble-based DA solvers
  – They have been shown to be robust for small ensemble sizes and in the presence of model error
  – One drawback is that they require the maintenance of two DA schemes
    • There has been some work on “filter-free” or more cost-effective alternatives
• EnVar schemes have a 4D extension that does not require the TL/AD
  – Though competitive, it has yet to be shown that hybrid 4D EnVar can truly be better than hybrid En-4DVAR
• There are alternate hybrid formulations, such as the hybrid gain and caLETKF
Summary (cont.)
• Hybrid methods have become a popular scheme for NWP, and are now being extended to other earth system models
  – How will they fare for coupled DA schemes?
• These methods are still fairly “new”, and there is still a lot of work that can be done to improve them
  – Estimation of weighting parameters/localization, building solvers that can update the ensemble and mean simultaneously, etc.
Sampling of References
• Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background-error covariances: Evaluation in a quasi-operational setting. Quart. J. Roy. Meteor. Soc., 131, 1013-1043.
• Buehner, M., J. Morneau, and C. Charette, 2013: Four-dimensional ensemble-variational data assimilation for global deterministic weather prediction. Nonlin. Processes Geophys., 20, 669-682, doi:10.5194/npg-20-669-2013.
• Clayton, A. M., A. C. Lorenc, and D. M. Barker, 2013: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. Quart. J. Roy. Meteor. Soc., 139, 1445-1461.
• Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter-3D variational analysis scheme. Mon. Wea. Rev., 128, 2905-2919.
• Kleist, D. T., and K. Ide, 2015: An OSSE-based evaluation of hybrid variational-ensemble data assimilation for the NCEP GFS, Part I: System description and 3D-hybrid results. Mon. Wea. Rev., 143, 433-451.
• Kleist, D. T., and K. Ide, 2015: An OSSE-based evaluation of hybrid variational-ensemble data assimilation for the NCEP GFS, Part II: 4D EnVar and hybrid variants. Mon. Wea. Rev., 143, 452-470.
• Kretschmer, M., B. R. Hunt, and E. Ott, 2015: Data assimilation using climatologically augmented local ensemble transform Kalman filter. Tellus, 67, 9 pp.
• Liu, C., Q. Xiao, and B. Wang, 2008: An ensemble-based four-dimensional variational data assimilation scheme. Part I: Technique formulation and preliminary test. Mon. Wea. Rev., 136, 3363-3373.
• Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP – a comparison with 4D-VAR. Quart. J. Roy. Meteor. Soc., 129, 3183-3203.
• Lorenc, A. C., 2013: Recommended nomenclature for EnVar data assimilation methods. Research Activities in Atmospheric and Oceanic Modeling, WGNE, 2 pp. URL: http://www.wcrp-climate.org/WGNE/BlueBook/2013/individualarticles/01_Lorenc_Andrew_EnVar_nomenclature.pdf
• Penny, S., 2014: A hybrid local ensemble transform Kalman filter. Mon. Wea. Rev., 142, 2139-2149.
Sampling of References (cont.)
• Wang, X., C. Snyder, and T. M. Hamill, 2007a: On theoretical equivalence of differently proposed ensemble/3D-Var hybrid analysis schemes. Mon. Wea. Rev., 135, 222-227.
• Wang, X., 2010: Incorporating ensemble covariance in the Gridpoint Statistical Interpolation (GSI) variational minimization: A mathematical framework. Mon. Wea. Rev., 138, 2990-2995.
• Zhang, F., M. Zhang, and J. Poterjoy, 2013: E3DVar: Coupling an ensemble Kalman filter with three-dimensional variational data assimilation in a limited-area weather prediction model and comparison to E4DVar. Mon. Wea. Rev., 141, 900-917.
• Zhang, M., and F. Zhang, 2012: E4DVar: Coupling an ensemble Kalman filter with four-dimensional variational data assimilation in a limited-area weather prediction model. Mon. Wea. Rev., 140, 587-600.