Comparison of methods to model aboveground biomass for derivation of 20+ year trajectories

Powell, S. L.(1), Kennedy, R. E.(2), Healey, S. P.(3), Pierce, K. B.(4), Cohen, W. B.(1), Moisen, G. G.(3), Ohmann, J. L.(1)

(1) U.S.D.A. Forest Service, Pacific Northwest Research Station, Corvallis, OR 97331
(2) Dept. of Forest Science, Oregon State University, Corvallis, OR 97331
(3) U.S.D.A. Forest Service, Rocky Mountain Research Station, Ogden, UT 84401
(4) WA Dept. of Fish and Wildlife, Olympia, WA 98501

Introduction and Background

- North American Carbon Program (NACP) and Forest Inventory and Analysis (FIA) goals:
  - Improve methods for quantifying forest disturbance and regrowth processes (Goward et al., 2008).
  - Characterize forest disturbance and regrowth dynamics by integrating FIA data and Landsat time-series data to model aboveground forest biomass dynamics.
- Many approaches exist for empirical modeling of biomass using optical remote sensing data, but there is little consensus on the best method.
- Little is known about multi-temporal prediction of biomass.
- Three study objectives:
  1. Evaluate a suite of six statistical techniques for modeling biomass.
     - Reduced Major Axis (RMA) regression
       - Orthogonal regression technique.
       - Maintains data variance structure.
     - Generalized Additive Models (GAMs)
       - Generalization of multiple regression.
       - Useful for detecting non-linear relationships.
     - Gradient Nearest Neighbor (GNN) imputation
       - Assigns plot attributes to each unmapped pixel based upon a multivariate distance in gradient space.
       - Maintains covariance structure among response variables.
     - Cubist
       - Regression tree technique.
       - Fits a multivariate linear model to each rule.
     - Stochastic Gradient Boosting (SGB)
       - Refinement of traditional regression tree analysis.
       - Potentially less sensitive to outliers or unbalanced data and more resistant to over-fitting.
     - Random Forests (RF)
       - Ensemble regression tree approach.
       - Constructs numerous small regression trees from which predictions are averaged.
  2. Evaluate the effect of including ancillary predictor variables.
     - Spectral variables, derived spectral indices, topographically derived variables, and climate variables.
  3. Assess how the choices of model and predictor variables affect predictions of biomass change.
- Limitations of empirical modeling of biomass using optical remote sensing data are well known.
  - A solution hinges upon leveraging the temporal information contained within the Landsat time series (Kennedy et al., 2007).
- Trends across 20+ year trajectories of biomass are noteworthy, especially in instances of forest disturbance or regrowth.
- Smoothing biomass trajectories with a curve-fitting algorithm enables a more accurate evaluation of biomass dynamics (Figure 1).
  - The algorithm, Landsat Detection of Trends in Disturbance and Recovery (LanDTrenDR), uses segmentation rules to summarize the temporal progression of any spectral or derived spectral index (e.g., biomass) of Landsat imagery (Kennedy et al., in prep.).
  - Captures both abrupt and slow disturbance as well as diverse sequences of disturbance and regrowth.

Figure 1. Sample biomass trajectories demonstrating the effect of curve-fitting (solid lines) on raw biomass predictions (dashed lines) for two recently disturbed FIA plots (plot-level biomass observations shown as stars).

FIA data development

- Plot-level estimates of live aboveground tree biomass were developed for four inventories (Table 2).
- Only homogeneous “single condition plots” were used.
- Biomass change was calculated for a subset of plots that were remeasured between successive inventories.
- Summary biomass statistics for the model-building inventories are shown in Table 3.

Table 2. Summary of field measurement years, inventory type, how the data were used, and number of plots for each of the four FIA inventories used in this study.

State  Meas. Yrs.  Inventory Type    Use                   # Plots
AZ     1995        Periodic Cycle 2  Remeasure Validation  36
AZ     2001-2004   Annual Cycle 3    Model                 136
MN     1998-2003   Annual Cycle 12   Model                 978
MN     2003-2005   Annual Cycle 13   Remeasure Validation  285

Table 3. Biomass summary statistics for AZ Cycle 3 and MN Cycle 12 inventories.

Stat.  n    Mean  Stdev  Min  Max
AZ     136  47    59     0    320
MN     978  64    46     0    230

Empirical Modeling

- Data were divided into model (2/3) and validation (1/3) sets.
- Empirical biomass models were developed for each of six modeling techniques and eight variable permutations.
- The eight variable permutations were based on four categories of predictor variables (Table 4):
  1. RAW: Raw Landsat bands only (1)
  2. RAW.SPECT: Raw Landsat bands + Derived spectral indices (1 + 2)
  3. RAW.30: Raw Landsat bands + Derived spectral indices + Topographic variables (1 + 2 + 3)
  4. RAW.ALL: Raw Landsat bands + Derived spectral indices + Topographic variables + Climate variables (1 + 2 + 3 + 4)
  5. TC: Tasseled Cap bands only (Brightness, Greenness, Wetness)
  6. TC.SPECT: TC bands + Derived spectral indices (TC bands + 2)
  7. TC.30: TC bands + Derived spectral indices + Topographic variables (TC bands + 2 + 3)
  8. TC.ALL: TC bands + Derived spectral indices + Topographic variables + Climate variables (TC bands + 2 + 3 + 4)

Figure 5. Variance ratio by sample scene, model type, and permutation.

Table 4. Four groups of predictor variables used to construct the eight predictor variable permutations for empirical modeling.

1. Raw Landsat Bands (28.5 m)  2. Derived Spectral Indices (28.5 m)          3. Topographic Variables (28.5 m)  4. Climate Variables (1 km)
Band 1                         NDVI                                          Elevation                          Temperature
Band 2                         Tasseled-cap Brightness                       Slope                              Growing Degree Days
Band 3                         Tasseled-cap Greenness                        Potential Relative Radiation       Water Vapor Pressure
Band 4                         Tasseled-cap Wetness                                                             Precipitation
Band 5                         Tasseled-cap Angle (TCG/TCB)                                                     Shortwave Radiation
Band 7                         Tasseled-cap Distance (sqrt(TCB^2 + TCG^2))
                               Disturbance Index
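The derived spectral indices in Table 4 are simple band arithmetic, which a short Python sketch can make concrete. This is only an illustration under stated assumptions: the function name `derived_indices` and its argument names are hypothetical, the inputs are assumed to be NumPy-compatible arrays of reflectance (bands 3 and 4) and tasseled-cap brightness and greenness, NDVI uses the standard (band 4 - band 3) / (band 4 + band 3) form for Landsat TM/ETM+, and the arctangent in the angle is an assumption, since Table 4 lists only the ratio TCG/TCB.

```python
import numpy as np


def derived_indices(band3, band4, tcb, tcg):
    """Return (ndvi, tc_angle, tc_distance) for matching arrays of inputs.

    band3, band4: red and near-infrared reflectance (Landsat TM/ETM+ bands 3 and 4).
    tcb, tcg: tasseled-cap brightness and greenness.
    All names are illustrative, not taken from the poster.
    """
    band3 = np.asarray(band3, dtype=float)
    band4 = np.asarray(band4, dtype=float)
    tcb = np.asarray(tcb, dtype=float)
    tcg = np.asarray(tcg, dtype=float)

    # NDVI, with NaN wherever the denominator is zero.
    denom = band4 + band3
    ndvi = np.divide(band4 - band3, denom,
                     out=np.full_like(denom, np.nan), where=denom != 0)

    # Tasseled-cap angle: assumed here to be the arctangent of TCG/TCB.
    # arctan2 equals arctan(TCG/TCB) when TCB > 0 and avoids dividing by zero.
    tc_angle = np.arctan2(tcg, tcb)

    # Tasseled-cap distance: sqrt(TCB^2 + TCG^2), as listed in Table 4.
    tc_distance = np.hypot(tcb, tcg)

    return ndvi, tc_angle, tc_distance


ndvi, angle, distance = derived_indices(
    band3=[0.1, 0.2], band4=[0.3, 0.2], tcb=[3.0, 1.0], tcg=[4.0, 1.0]
)
```

If the plain ratio TCG/TCB is what was intended for the angle, replacing the `arctan2` line with an element-wise division would match Table 4 literally.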
Notes to Table 4: Climate variables were 18-year mean annual values derived from DAYMET (Thornton et al., 1997). Disturbance Index from Healey et al., 2005. Potential Relative Radiation from Pierce et al., 2005.

- Biomass predictions for each model type/permutation were validated in terms of RMSE, variance ratio, bias, and observed vs. predicted R².
- Performance of each permutation was ranked to determine the “best” permutation per model type and per sample scene.
- RMA, GNN, and RF models were selected for creation of 20+ year biomass trajectories that were smoothed with LanDTrenDR.
- Biomass trajectories were validated using biomass change data derived from successively remeasured FIA data.

Methods

Landsat data development

- Biennial Landsat time-series stacks were acquired for two sample scenes (Table 1).
  - Arizona and Minnesota (Figure 2).
- Geometrically corrected and radiometrically calibrated to surface reflectance with the LEDAPS algorithm (Masek et al., 2006).
- Relative radiometric normalization to a common reference image (noted by * in Table 1) using the Multivariate Alteration Detection (MAD) method (Canty et al., 2004).

Table 1. Landsat time-series stacks for two sample scenes, with normalization reference images noted by *.

Arizona      Minnesota
06/21/1985   06/28/1984
07/26/1986   08/21/1986
07/02/1989   09/14/1989
06/19/1990   07/31/1990
06/22/1991   08/19/1991
06/27/1993   08/24/1993
06/14/1994   08/14/1995
06/22/1997   09/04/1997
06/25/1998   08/06/1998
06/14/2000*  07/24/1999
06/12/2002   07/05/2001
07/09/2003   09/05/2003*
08/31/2005   08/06/2004
09/03/2006   09/13/2006

Figure 2. Arizona (Landsat path/row 37/35) and Minnesota (Landsat path/row 27/27) study scenes.

Results and Discussion

- Accuracy of biomass predictions varied by model type and variable permutation.
  - In terms of RMSE (Figure 4), the differences between model types were generally smaller than the differences between variable permutations within a model type.
  - Regression tree methods (Cubist, SGB, and RF) generally favored more complex variable permutations to achieve the lowest error among model types (Figure 4).
  - Other methods (RMA, GNN, and GAMs) tended towards less complex variable permutations and had slightly higher error rates.
- In Arizona:
  - All model types and permutations fared poorly with respect to preservation of variance (Figure 5).
  - Nearly all model types and permutations were biased towards under-prediction (Figure 6).
- In Minnesota:
  - All model types and permutations deflated the prediction variance, with the exception of RMA (Figure 5).
  - There were virtually no differences between model types and permutations with respect to bias (Figure 6).

Figure 4. RMSE by sample scene, model type, and permutation.

Figure 6. Bias by sample scene, model type, and permutation.

- The “best” permutation for each model type was determined by ranking RMSE, variance ratio, bias, and R² (Table 5).
  - Arizona favored less complex variable permutations (except for certain regression tree methods).
    - Likely due to stronger empirical relationships between biomass and Landsat spectral data compared to the Minnesota data.
- The “best” permutations were compared across model types using the same four criteria.
  - Regression tree methods yielded the lowest error for both AZ and MN.
  - GNN best preserved variance in Arizona.
  - RMA best preserved variance in Minnesota.
  - RMA and SGB were the least biased in Arizona.
  - GNN was the least biased in Minnesota.
- Overall, RF was the best model type for optimizing all four criteria, and was superior to other methods at minimizing prediction error.

Table 5. “Best” variable permutation for each model type and sample scene.

Model Type  AZ         MN
RMA         TC         TC.SPECT
GAMs        TC.SPECT   TC.ALL
GNN         TC         RAW.ALL
Cubist      RAW.ALL    TC.ALL
SGB         TC         TC.ALL
RF          TC.ALL     RAW.ALL

- For all model types, the error in predicted biomass change was greatly reduced by curve-fitting with LanDTrenDR (Figure 7).
  - GNN and RMA exhibited the greatest improvements in Arizona and Minnesota, respectively (Figure 8).
  - RF exhibited the least improvement, but was the most accurate model for predicting biomass change.
- All model types were similarly affected by curve-fitting, with an overall reduction in the range of biomass predictions.

Figure 7. Observed biomass change (from remeasured FIA inventories) vs. biomass change predictions pre- and post-fit by LanDTrenDR curve-fitting. Panels: AZ and MN, pre-fit and post-fit, for RMA, GNN, and RF.

Figure 8. Effect of LanDTrenDR on biomass change RMSE for AZ (left) and MN (right) model types.

References

Canty, M. J., A. A. Nielsen, and M. Schmidt. 2004. Automatic radiometric normalization of multitemporal satellite imagery. Remote Sensing of Environment, 91(3-4): 441-451.
Goward, S. N., J. G. Masek, W. Cohen, G. Moisen, G. J. Collatz, S. Healey, R. A. Houghton, C. Huang, R. Kennedy, B. Law, S. Powell, D. Turner, and M. A. Wulder. 2008. Forest disturbance and North American carbon flux. EOS, 89(11): 105-106.
Healey, S. P., W. B. Cohen, Y. Zhiqiang, and O. Krankina. 2005. Comparison of Tasseled Cap-based Landsat data structures for use in forest disturbance detection. Remote Sensing of Environment, 97: 301-310.
Kennedy, R. E., W. B. Cohen, and T. A. Schroeder. 2007. Trajectory-based change detection for automated characterization of forest disturbance dynamics. Remote Sensing of Environment, 110: 370-386.
Kennedy, R. E., W. B. Cohen, Y. Zhiqiang, M. Fiorella, E. Pfaff, and M. Melinda. In prep. Characterizing trends in disturbance and recovery in forests using the Landsat archive.
Masek, J. G., E. F. Vermote, N. Saleous, R. Wolfe, F. G. Hall, F. Huemmrich, F. Gao, J. Kutler, and T. K. Lim. 2006. Landsat surface reflectance data set for North America, 1990-2000. Geoscience and Remote Sensing Letters, 3: 68-72.
Pierce, K. B., T. Lookingbill, and D. L. Urban. 2005. A simple method for estimating potential relative radiation (PRR) for landscape-scale vegetation analysis. Landscape Ecology, 20: 137-147.
Thornton, P. E., S. W. Running, and M. A. White. 1997. Generating surfaces of daily meteorological variables over large regions of complex terrain. Journal of Hydrology, 190: 214-251.

Funding provided by NASA’s Carbon Cycle Science Program
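
The four validation criteria used throughout this poster (RMSE, variance ratio, bias, and observed vs. predicted R²) can be illustrated with a short Python sketch. The poster does not give formulas, so the definitions below are assumptions: variance ratio is taken as the standard deviation of predictions divided by the standard deviation of observations, bias as mean predicted minus mean observed, and R² as the squared Pearson correlation between observed and predicted values. The function name `validation_metrics` and the sample numbers are hypothetical.

```python
import numpy as np


def validation_metrics(observed, predicted):
    """Compute four validation criteria for paired observed/predicted biomass.

    The formulas are assumed definitions, not taken from the poster.
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)

    # Root mean squared error of the predictions.
    rmse = float(np.sqrt(np.mean((pred - obs) ** 2)))

    # Ratio of predicted to observed sample standard deviation;
    # values below 1 mean the predictions are less variable than the data.
    variance_ratio = float(np.std(pred, ddof=1) / np.std(obs, ddof=1))

    # Mean predicted minus mean observed; negative means under-prediction.
    bias = float(np.mean(pred) - np.mean(obs))

    # Squared Pearson correlation between observed and predicted values.
    r = np.corrcoef(obs, pred)[0, 1]
    r2 = float(r ** 2)

    return {"rmse": rmse, "variance_ratio": variance_ratio, "bias": bias, "r2": r2}


# Hypothetical validation plots (made-up values, for illustration only).
metrics = validation_metrics(observed=[0, 50, 100, 150], predicted=[25, 50, 100, 125])
```

Under these assumed definitions, ranking permutations would mean preferring low RMSE, a variance ratio near 1, bias near 0, and high R².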