Hybrid variational-ensemble data assimilation at NCEP
Daryl T. Kleist, daryl.kleist@noaa.gov
National Monsoon Mission Scoping Workshop, IITM, Pune, India, 11-15 April 2011
Variational Data Assimilation
J: Penalty (fit to background + fit to observations + constraints)
x': Analysis increment (xa - xb), where xb is the background
BVar: Background error covariance
H: Observation (forward) operator
R: Observation error covariance (instrument + representativeness)
yo': Observation innovations
Jc: Constraints (physical quantities, balance/noise, etc.)
B is typically static and estimated a priori/offline
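With the symbols defined on this slide, the incremental variational penalty is conventionally written as below. This is a sketch of the standard textbook form, assumed here rather than copied from the slide; the factor of 1/2 and the bold matrix notation are the usual conventions.

```latex
J(\mathbf{x}') =
  \frac{1}{2}\,\mathbf{x}'^{T}\mathbf{B}_{Var}^{-1}\mathbf{x}'
  + \frac{1}{2}\left(\mathbf{H}\mathbf{x}' - \mathbf{y}'_o\right)^{T}
    \mathbf{R}^{-1}\left(\mathbf{H}\mathbf{x}' - \mathbf{y}'_o\right)
  + J_c
```

The first term is the fit to the background, the second the fit to the observations, and J_c collects the constraints.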
Motivation from Var
• Current background error covariance (applied operationally) in VAR
  – Isotropic recursive filters
  – Poor handle on cross-variable covariance
  – Minimal flow-dependence added
    • Implicit flow-dependence through linearization in normal mode constraint (Kleist et al. 2009)
    • Flow-dependent variances (only for wind, temperature, and pressure) based on background tendencies
  – Tuned NMC-based estimate (lagged forecast pairs)
Current background error for GFS
• Although flow-dependent variances are used, they are confined to a rescaling of the fixed estimate based on time tendencies
  – No multivariate or length scale information used
  – Does not necessarily capture ‘errors of the day’
[Figure: plots valid 00 UTC 12 September 2008]
Kalman Filter in Var Setting
Extended Kalman Filter: Forecast Step and Analysis
• Analysis step in variational framework (cost function)
• BKF: Time-evolving background error covariance
• AKF: Inverse [Hessian of JKF(x')]
Motivation from KF
• Problem: dimensions of AKF and BKF are huge, making this practically impossible for large systems (GFS, for example).
• Solution: sample and update using an ensemble instead of evolving AKF/BKF explicitly
  – Forecast Step: Ensemble Perturbations
  – Analysis Step
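The sampling idea on this slide can be illustrated with a minimal pure-Python sketch: the background error covariance is estimated from the deviations of forecast members about their mean instead of being evolved explicitly. The function names and the tiny 3-member ensemble are hypothetical, chosen only for illustration; operational systems never form the full matrix.

```python
def ensemble_perturbations(members):
    """Return the ensemble mean and each member's deviation from it."""
    m = len(members)
    n = len(members[0])
    mean = [sum(member[i] for member in members) / m for i in range(n)]
    perts = [[member[i] - mean[i] for i in range(n)] for member in members]
    return mean, perts


def sample_covariance(perts):
    """Estimate the error covariance as X' X'^T / (M - 1)."""
    m = len(perts)
    n = len(perts[0])
    return [
        [sum(p[i] * p[j] for p in perts) / (m - 1) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3-member ensemble of a 2-variable state (illustration only).
members = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, perts = ensemble_perturbations(members)
cov = sample_covariance(perts)
```

With M members the estimate has rank at most M - 1, which is one reason small ensembles need localization or a static supplement.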
Why Hybrid?
Benefit | VAR (3D, 4D) | EnKF | Hybrid | References
Benefit from use of flow-dependent ensemble covariance instead of static B | | x | x | Hamill and Snyder 2000; Wang et al. 2007b, 2008a,b, 2009b; Wang 2011; Buehner et al. 2010a,b
Robust for small ensemble | | | x | Wang et al. 2007b, 2009b; Buehner et al. 2010b
Better localization for integrated measure, e.g. satellite radiance | | | x | Campbell et al. 2009
Easy framework to add various constraints | x | | x |
Framework to treat non-Gaussianity | x | | x |
Use of various existing capabilities in VAR | x | | x |
Hybrid Variational-Ensemble
• Incorporate ensemble perturbations directly into variational cost function through extended control variable
  – Lorenc (2003), Buehner (2005), Wang et al. (2007), etc.
βf & βe: weighting coefficients for fixed and ensemble covariance respectively
xt: (total increment) sum of increment from fixed/static B (xf) and ensemble B
αk: extended control variable; x_k^e: ensemble perturbation
L: correlation matrix [localization on ensemble perturbations]
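One way to write the extended control variable cost function with the symbols on this slide is sketched below. It assumes that βf and βe are direct weights on the two covariances with βf + βe = 1, and that the ensemble perturbations are already normalized by the square root of (K - 1); neither assumption is stated on the slide, and other papers scale the two background terms differently.

```latex
% Assumed convention: \beta_f + \beta_e = 1, perturbations \mathbf{x}^e_k
% already divided by \sqrt{K-1}. Not taken from the slide.
\begin{aligned}
J(\mathbf{x}'_f, \boldsymbol{\alpha}) &=
  \frac{1}{2\beta_f}\,\mathbf{x}'^{T}_f \mathbf{B}_{Var}^{-1}\mathbf{x}'_f
  + \frac{1}{2\beta_e}\sum_{k=1}^{K}\boldsymbol{\alpha}_k^{T}\mathbf{L}^{-1}\boldsymbol{\alpha}_k
  + \frac{1}{2}\left(\mathbf{H}\mathbf{x}'_t - \mathbf{y}'_o\right)^{T}
    \mathbf{R}^{-1}\left(\mathbf{H}\mathbf{x}'_t - \mathbf{y}'_o\right) + J_c \\
\mathbf{x}'_t &= \mathbf{x}'_f + \sum_{k=1}^{K}\boldsymbol{\alpha}_k \circ \mathbf{x}^e_k
\end{aligned}
```

Under these assumptions the implied background covariance of the total increment is βf BVar plus βe times the localized ensemble covariance, where the localization enters as an element-wise product with L.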
Experiments with toy model
• Lorenz ‘96
  – 40 variable model, F=8.0, dt=0.025 (“3 hours”)
  – 4th order Runge-Kutta
• OSSE: observations generated from truth run every 2*dt (“6 hours”)
  – Observation errors drawn from N(0, 1)
• Experimental design
  – Assimilate single time level observations every 6 hours, at appropriate time, R=1.0
  – F=7.8 (imperfect model) for DA runs
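The toy model setup above can be sketched in a few lines of pure Python. The model equation is the standard Lorenz '96 form; the function names, the initial state, and the size of the initial perturbation are illustrative assumptions, not details from the slides.

```python
def lorenz96_tendency(x, forcing=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, cyclic in i."""
    n = len(x)
    # Negative indices wrap around in Python, giving the cyclic boundary.
    return [
        (x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
        for i in range(n)
    ]


def rk4_step(x, dt=0.025, forcing=8.0):
    """Advance the state one step with classical 4th-order Runge-Kutta."""
    def shifted(base, k, scale):
        return [b + scale * ki for b, ki in zip(base, k)]

    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(shifted(x, k1, 0.5 * dt), forcing)
    k3 = lorenz96_tendency(shifted(x, k2, 0.5 * dt), forcing)
    k4 = lorenz96_tendency(shifted(x, k3, dt), forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# 40-variable state started at the fixed point x_i = F with one variable
# nudged (the 0.01 nudge is an arbitrary illustrative choice).
truth = [8.0] * 40
truth[19] += 0.01
for _ in range(2):  # two steps of dt = one "6 hour" observation interval
    truth = rk4_step(truth, dt=0.025, forcing=8.0)
```

An imperfect-model run for the assimilation experiments would call `rk4_step` with `forcing=7.8` instead.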
Analysis Error (50% observation coverage)
[Figure legend: 3DVAR, βf = 0.7, βf = 0.3, ETKF]
• M (ensemble size) = 20, r (inflation factor) = 1.1
  – Hybrid (small alpha) as good as/better than ETKF (faster spinup)
  – Hybrid (larger alpha) in between 3DVAR and ETKF
Sensitivity to β
Analysis RMSE (x10) over 1800 cases
βf      0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
Hybrid  3.321  3.764  4.074  4.633  5.060  5.770  7.044  8.218  9.595
3DVAR   12.08
ETKF    3.871
• 50% observation coverage (M = 20, r = 1.1)
  – Improvement is a near-linear function of the weighting parameter
• Small enough weighting (on static error estimate) improves upon ETKF
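The role of the weighting parameter can be illustrated with a small sketch that blends a static covariance with an ensemble estimate. It assumes the hybrid covariance is the linear combination βf * B_static + (1 - βf) * P_ens, which is consistent with the table (small βf approaches ETKF-like behaviour) but is not spelled out on the slide; the function name and the 2x2 matrices are hypothetical.

```python
def hybrid_covariance(b_static, p_ens, beta_f):
    """Blend static and ensemble covariances with weight beta_f on static B."""
    if not 0.0 <= beta_f <= 1.0:
        raise ValueError("beta_f must lie in [0, 1]")
    return [
        [beta_f * s + (1.0 - beta_f) * e for s, e in zip(row_s, row_e)]
        for row_s, row_e in zip(b_static, p_ens)
    ]


# Hypothetical 2x2 covariances (illustration only).
b_static = [[2.0, 0.0], [0.0, 2.0]]
p_ens = [[1.0, 0.5], [0.5, 1.0]]
blended = hybrid_covariance(b_static, p_ens, beta_f=0.5)
```

Setting `beta_f=1.0` recovers the static covariance used by 3DVAR and `beta_f=0.0` the pure ensemble estimate.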
Importance of Ensemble Generation Method?
• GEFS (already operational)
  – 80 cycled members
  – ETR
    • Virtually no computational cost
    • Uses analysis error mask derived for 500 mb streamfunction
    • Tuned for medium range forecast spread and fast “error growth”
  – T190L28 version of the GFS model
  – Viable for hybrid paradigm?
• EnKF
  – 80 cycled members
  – Perturbations specifically designed to represent analysis and background errors
  – T254L64 version of the GFS
  – Extra computational costs worth it for hybrid?
EnKF/ETR Comparison
[Figure: EnKF (green) versus ETR (red) spread/standard deviation for surface pressure (mb), valid 2010101312]
[Figure: Surface pressure spread normalized difference (ETR has much less spread, except poleward of 70N)]
EnKF/ETR Comparison
[Figure: EnKF zonal wind (m/s) ensemble standard deviation, valid 2010101312]
[Figure: ETR zonal wind (m/s) ensemble standard deviation, valid 2010101312]
EnKF/NMC B Compare
[Figure: EnKF zonal wind (m/s) ensemble standard deviation, valid 2010101312]
[Figure: Zonal wind standard deviation (m/s) from “NMC-method”]
Hybrid with (global) GSI
• Control variable has been implemented into GSI 3DVAR*
  – Full B preconditioning
    • Working on extensions to B^(1/2) preconditioned minimization options
  – Spectral filter for horizontal part of A
    • Eventually replace with (anisotropic) recursive filters
  – Recursive filter used for vertical
  – Dual resolution capability
    • Ensemble can be from different resolution than background/analysis (vertical levels are the exception)
  – Various localization options for A
    • Grid units or scale height
    • Level dependent (plans to expand)
  – Option to apply TLNMC (Kleist et al. 2009) to analysis increment
*Acknowledgement: Dave Parrish for original implementation of extended control variable
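The effect of localization can be illustrated with an explicit element-wise (Schur) product between a distance-based correlation matrix and a raw ensemble covariance. This is only a conceptual sketch: GSI applies localization through filters on the extended control variable rather than by forming matrices, and the Gaussian shape, the grid-unit length scale, the function names, and the 4-point covariance below are all assumptions made for illustration.

```python
import math


def gaussian_localization(n, length_scale):
    """Correlation matrix with entries exp(-0.5 * (d / length_scale)**2),
    where d = |i - j| is the separation in grid units."""
    return [
        [math.exp(-0.5 * ((i - j) / length_scale) ** 2) for j in range(n)]
        for i in range(n)
    ]


def schur_product(a, b):
    """Element-wise (Schur/Hadamard) product of two equally sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical raw ensemble covariance for 4 grid points; the 0.4 in the
# corners stands in for a spurious long-range sample correlation.
raw_cov = [
    [1.0, 0.6, 0.3, 0.4],
    [0.6, 1.0, 0.6, 0.3],
    [0.3, 0.6, 1.0, 0.6],
    [0.4, 0.3, 0.6, 1.0],
]
loc = gaussian_localization(4, length_scale=1.0)
localized_cov = schur_product(loc, raw_cov)
```

Variances on the diagonal are unchanged, while covariances are damped increasingly with separation, which suppresses spurious long-range correlations from a small ensemble.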
Single Observation
[Figure: Single 850 mb Tv observation (1 K O-F, 1 K error)]
Single Observation
[Figure: Single 850 mb zonal wind observation (3 m/s O-F, 1 m/s error) in Hurricane Ike circulation]
Dual-Res Coupled Hybrid
[Diagram labels]
Previous Cycle: member 1 forecast, member 2 forecast, member 3 forecast, high res forecast
Current Update Cycle: EnKF member update; GSI Hybrid Ens/Var; recenter analysis ensemble
Outputs: member 1 analysis, member 2 analysis, member 3 analysis, high res analysis
Hybrid Var-EnKF GFS experiment
• Model
  – GFS deterministic (T574L64; post July 2010 version, the current operational version)
  – GFS ensemble (T254L64)
    • 80 ensemble members, EnKF update, GSI for observation operators
• Observations
  – All operationally available observations (including radiances)
  – Includes early (GFS) and late (GDAS/cycled) cycles as in production
• Dual-resolution/Coupled
  – High resolution control/deterministic component
    • Includes TC Relocation on guess
  – Ensemble is recentered every cycle about hybrid analysis
    • Discard ensemble mean analysis
• Satellite bias corrections
  – Coefficients come from GSI/VAR
• Parameter settings
  – 1/3 static B, 2/3 ensemble
  – Fixed localization: 800 km & 1.5 scale heights
• Test Period
  – 15 July 2010 to 15 October 2010 (first two weeks ignored for “spin-up”)
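The recentering step listed under the dual-resolution settings can be sketched as follows: each EnKF analysis member keeps its perturbation, but the ensemble mean is replaced by the hybrid analysis. The sketch assumes the hybrid analysis has already been brought to the ensemble grid (the slides describe a T574 analysis and a T254 ensemble, so some resolution change would be needed first); the function name and the toy numbers are hypothetical.

```python
def recenter(members, new_center):
    """Shift every member so that the ensemble mean equals new_center,
    preserving each member's deviation from the original mean."""
    m = len(members)
    n = len(new_center)
    mean = [sum(member[i] for member in members) / m for i in range(n)]
    return [
        [member[i] - mean[i] + new_center[i] for i in range(n)]
        for member in members
    ]


# Hypothetical 2-member, 2-variable EnKF analysis ensemble and a hybrid
# analysis assumed to be on the same grid (illustration only).
enkf_analyses = [[1.0, 2.0], [3.0, 6.0]]
hybrid_analysis = [10.0, 20.0]
recentered = recenter(enkf_analyses, hybrid_analysis)
```

The ensemble spread, and therefore the covariance sampled from it, is unchanged by this shift; only the EnKF mean analysis is discarded.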
500 hPa Anom. Corr.
[Figure panels: Northern Hemisphere, Southern Hemisphere]
AC Frequency Distributions
[Figure panels: Northern Hemisphere, Southern Hemisphere]
Geopotential Height RMSE
[Figure panels: Northern Hemisphere, Southern Hemisphere]
Significant reduction in mean height errors
Stratospheric Fits
Improved fits to stratospheric observations
Forecast Fits to Obs (Tropical Winds)
Forecasts from hybrid analyses fit observations much better.
HVEDAS (3D) for GDAS/GFS
• Prototype dual-resolution, two-way coupled hybrid Var/EnKF system outperforms standard 3DVAR in GFS experiments
  – 2010 Hurricane Season (August 15 through October 31 2010) run offsite
  – Emphasis on AC, RMSE, TC Tracks
• Plan underway to implement into GDAS/GFS operationally
  – Target: Spring 2012 (subject to many potential issues)
    • Porting of codes/scripts back to IBM P6
    • Cost analysis (will everything fit in production suite?)
    • More thorough (pre-implementation) testing and evaluation
      – More test periods (including NH winter)
      – Other/more verification metrics
    • Potential moratorium associated with move to new NCEP facility
• Issues
  – Weighting between ensemble and static B
  – Localization
  – How should EnKF be used within ensemble forecasting paradigm?
Cost
• Analysis/GSI side
  – Minimal additional cost
    • Reading in ensemble (3-9 hour forecast from previous cycle)
  – Working on building an ensemble post-processor to prep files for GSI
    • Coding already complete/in place
  – Optimize localization (?)
• Additional “GDAS” Ensemble (T254L64 GDAS)
  – EnKF-based perturbation update
    • Cost comparable to current analysis [say 8-10 nodes, something <40 minutes]
      – Includes ensemble of GSI runs to get O-F and actual ensemble update step
    • Work ongoing to optimize coding and scripting
  – 9 hr forecasts, needs to be done only in time for next (not current) cycle
HVEDAS Extensions and Improvements
• Expand hybrid to 4D
  – Hybrid within ‘traditional 4DVAR’ (with adjoint)
  – Pure ensemble 4DVAR (non-adjoint)
  – Ensemble 4DVAR with static B supplement (non-adjoint)*
• EnKF improvements
  – Explore alternatives such as LETKF
  – Adaptive localization and inflation
• Non-GFS applications in development
  – Other global models (NASA GEOS-5, NOAA FIM)
  – NAM/Mesoscale Modeling
  – Hurricanes/HWRF
  – Storm-scale initialization
  – Rapid Refresh
• NCEP strives to have single DA system to develop, maintain, and run operationally (global, mesoscale, severe weather, hurricanes, etc.)
  – GSI (including hybrid development) is community code supported through DTC
  – EnKF used for GFS-based hybrid being expanded for use with other applications