Random Forests and Nearest Neighbors: Methods for mapping the West Cascades of Oregon
Emilie Grossmann, Oregon State University
Janet Ohmann, U.S. Forest Service
James Kagan, Oregon State University
Kenneth Pierce, U.S. Forest Service
Heather May, Oregon State University
Matthew Gregory, Oregon State University
The West Cascades
[Map: the West Cascades study area; map labels: Madison, The West Cascades]
USGS Pacific Northwest ReGAP
• The GAP project needs vegetation base-maps that are broad-scale but also detailed.
• Consistent classification system: NatureServe’s Ecological Systems
North Pacific Mesic-Wet Douglas-fir Western Hemlock Forest
• This ecological system is a significant component of the lowland and low montane forests of western Washington, northwestern Oregon, and southwestern British Columbia.
• ... In Oregon, it occurs on the western slopes of the Cascades, around the margins of the Willamette Valley, and on the west side of the Coast Ranges, and is reduced to locally small patches in southwestern Oregon. ...
(continued)
North Pacific Mesic-Wet Douglas-fir Western Hemlock Forest (continued)
• They differ from North Pacific Maritime Dry-Mesic Douglas-fir-Western Hemlock Forest primarily in having more hydrophilic undergrowth species. ...
• In many rather drier climatic areas, it occurs as small to large patches within a matrix of North Pacific Maritime Dry-Mesic Douglas-fir-Western Hemlock Forest; in dry areas, it can occur adjacent to or in a mosaic with North Pacific Dry Douglas-fir Forest and Woodland, and at higher elevations it intermingles with either North Pacific Dry-Mesic Silver Fir-Western Hemlock-Douglas-fir Forest or North Pacific Mesic Western Hemlock-Silver Fir Forest.
Can you see the problem?
• We need more information than Landsat alone provides.
• This is why we need statistical models to build the GAP maps.
What type of model to use?
Objective
• Compare Random Forest (RF) and Gradient Nearest Neighbor (GNN) modeling techniques with respect to:
1) classification accuracy
2) class area representation
3) spatial patterns
4) explanatory variables used
Methods
– GNN and RF models were built from:
– 4222 records from our plot database
– mapped explanatory variables, selected from 115 possible layers:
  Landsat: bands, transformations, texture
  Climate: means, seasonal variability
  Topography: elevation, slope, aspect, solar
  Disturbance: past fires, harvest, insects and disease
  Location: X, Y
  Soil parent material: e.g., ultramafic rocks, sandstone, basalt, etc.
Methods: Random Forest
• One Classification Tree.
[Figure: example classification tree. Split rules shown: Elevation < 1244; August Maximum Temp < 2560; August Maximum Temp < 2324; Summer Mean Temperature < 1279; Aug. to Dec. Temp Differential < 1279; Elevation < 1625; Landsat Band 7 < 24. Node labels: 4224, 4215, 4272, 4228, 4215, 4267. Named leaf: North Pacific Dry-Mesic Silver Fir-Western Hemlock-Douglas-fir Forest.]
Methods: Random Forest
• A “Forest” of classification trees.
• Each tree is built from a random subset of plots and variables.
• When the model is applied to a pixel, each tree ‘votes’ for an Ecological System.
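The voting idea can be sketched in a few lines of Python. This is a toy illustration, not the model used for the maps: each "tree" here is a single-split stump, and the plot values, variable names (`elevation`, `aug_max_temp`) and class names (`system_A`, `system_B`) are made up for the example.

```python
import random
from collections import Counter

def fit_stump(rows, labels, features):
    """Return (feature, threshold, left_label, right_label) with the fewest errors."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [lab for r, lab in zip(rows, labels) if r[f] < t]
            right = [lab for r, lab in zip(rows, labels) if r[f] >= t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(x != l_lab for x in left) + sum(x != r_lab for x in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (None, None, majority, majority)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    if f is None:
        return l_lab
    return l_lab if row[f] < t else r_lab

def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    all_features = list(rows[0].keys())
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # random subset of plots (bootstrap)
        feats = rng.sample(all_features, n_features)     # random subset of variables
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def vote(forest, row):
    """Count how many trees vote for each class at one pixel."""
    return Counter(predict_stump(s, row) for s in forest)

# Hypothetical plots: low, warm sites are system_A; high, cool sites are system_B.
plots = [{"elevation": e, "aug_max_temp": t}
         for e, t in [(200, 30), (300, 29), (400, 28), (500, 27),
                      (1500, 15), (1600, 14), (1700, 13), (1800, 12)]]
labels = ["system_A"] * 4 + ["system_B"] * 4

forest = fit_forest(plots, labels)
votes = vote(forest, {"elevation": 100, "aug_max_temp": 35})
winner = votes.most_common(1)[0][0]
```

The per-class vote counts in `votes` are the same quantity that gets mapped when building single-system maps; the mapped class is simply the one with the most votes.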
Methods: Adjusting The Random Forest Map
• The Random Forest model tends to over-map some systems, and under-map others.
• We can map the votes for the under-mapped systems, creating single-system maps.
• ... which can be used to expand their area in the final map.
Methods: Adjusting The Random Forest Map
[Figure: single-system map of North Pacific Mesic Western Hemlock-Silver Fir Forest]
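One way the vote maps could be used to expand an under-mapped system is sketched below. The slides do not state the exact adjustment rule, so the ranking rule here (reassign the pixels with the highest vote share for the system until a target pixel count is reached) is an assumption, and the function name `expand_system` and the vote values are hypothetical.

```python
def expand_system(assigned, votes, system, target_pixels):
    """Reassign the pixels with the most votes for `system` until it covers `target_pixels`."""
    result = list(assigned)
    current = sum(1 for s in result if s == system)
    # Candidate pixels: not yet mapped as `system`, but at least one tree voted for it.
    candidates = [i for i, s in enumerate(result)
                  if s != system and votes[i].get(system, 0) > 0]
    candidates.sort(key=lambda i: votes[i][system], reverse=True)
    for i in candidates[: max(0, target_pixels - current)]:
        result[i] = system
    return result

# Hypothetical vote fractions for five pixels and two systems.
pixel_votes = [
    {"A": 0.60, "B": 0.40},
    {"A": 0.55, "B": 0.45},
    {"A": 0.90, "B": 0.10},
    {"A": 0.30, "B": 0.70},
    {"A": 1.00},
]
unadjusted = [max(v, key=v.get) for v in pixel_votes]   # plain majority vote
adjusted = expand_system(unadjusted, pixel_votes, "B", target_pixels=2)
```

In this toy case system B is mapped on one pixel by majority vote; asking for two pixels moves the pixel with the next-highest share of B votes (0.45) into B and leaves the rest unchanged.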
Methods: Gradient Nearest Neighbor Imputation
(1) conduct gradient analysis of plot data
(2) calculate axis scores of pixel from mapped data layers
(3) find nearest-neighbor plot in gradient space
(4) impute nearest neighbor’s value to pixel
[Figure: plots and a pixel shown in gradient space (CCA Axis 1, e.g., elevation, Y; CCA Axis 2, e.g., climate) and in geographic space (study area)]
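Steps (2) to (4) can be sketched as follows, assuming the gradient analysis in step (1) has already produced axis coefficients and plot scores. The coefficients, scores and class names are invented for illustration, and plain Euclidean distance in axis space is an assumption rather than a statement of how the GNN software weights axes.

```python
import math

def axis_scores(layer_values, coefficients):
    """Step 2: axis scores as linear combinations of the mapped layer values."""
    return [sum(c * v for c, v in zip(axis, layer_values)) for axis in coefficients]

def impute(pixel_scores, plot_scores, plot_values):
    """Steps 3 and 4: find the nearest plot in gradient space and return its value."""
    nearest = min(range(len(plot_scores)),
                  key=lambda i: math.dist(pixel_scores, plot_scores[i]))
    return plot_values[nearest]

# Hypothetical two-axis ordination built from two mapped layers.
coefficients = [[0.8, 0.2], [-0.1, 0.9]]            # one row of weights per axis
plot_scores = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0)]  # plot positions in gradient space
plot_values = ["system_A", "system_B", "system_C"]  # value recorded at each plot

pixel = axis_scores([1.0, 2.0], coefficients)       # approximately [1.2, 1.7]
mapped_value = impute(pixel, plot_scores, plot_values)
```

Because every pixel takes its value from a real plot, a GNN map can only contain combinations of attributes that were actually observed in the plot database.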
The Maps
[Figure: map panels grouped under "Without Landsat TM" and "With Landsat TM"; panel labels: RF, RF_TM, RF_ADJ_TM, GNN_TM]
Results
Map         Kappa   Fuzzy Kappa
RF          0.34    0.68
RF_ADJ      0.34    0.70
RF_TM       0.38    0.73
RF_ADJ_TM   0.38    0.70
GNN         0.30    0.63
GNN_TM      0.29    0.60
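For reference, the standard (non-fuzzy) Kappa statistic can be computed from a confusion matrix of observed versus mapped classes as in the sketch below. The matrix is a made-up two-class example, not data from this study, and the fuzzy variant reported alongside it is not covered here.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: observed, columns: mapped)."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n          # overall agreement
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)  # chance agreement
    return (observed - expected) / (1 - expected)

# Hypothetical two-class confusion matrix: 35 of 50 plots agree.
example = [[20, 5],
           [10, 15]]
kappa = cohen_kappa(example)
```

Here overall agreement is 0.70 and chance agreement is 0.50, so Kappa is 0.40: the map does 40% better than chance, relative to the best possible improvement over chance.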
Best Maps: Class By Class (assigned by Kappa)
[Chart values as shown on the slide:]
RF: 0, 0
RF_Adj: 2, 2
RF_TM: 9, 20
RF_Adj_TM: 10, 9
GNN: 1, 0
GNN_TM: 3, 1
Ecological Systems listed:
Mediterranean California Mixed Evergreen Forest
Mediterranean California Mesic Mixed Conifer Forest and Woodland
Northern Rocky Mountain Western Larch Savanna
Northern Rocky Mountain Subalpine Woodland Parkland
North Pacific Maritime Dry-Mesic Douglas-fir-Western Hemlock Forest
East Cascades Mesic Montane Mixed-Conifer Forest and Woodland
Sierra Nevada Subalpine Lodgepole Pine Forest and Woodland
North Pacific Mesic Western Hemlock-Silver Fir Forest
Mediterranean California Dry-Mesic Mixed Conifer Forest and Woodland
California Montane Woodland Chaparral
Rocky Mountain Lodgepole Pine Forest
Mediterranean California Red Fir Forest
Rocky Mountain Poor-Site Lodgepole Pine Forest
North Pacific Dry Douglas-fir Forest and Woodland
North Pacific Dry-Mesic Silver Fir-Western Hemlock-Douglas-fir Forest
North Pacific Maritime Mesic-Wet Douglas-fir-Western Hemlock Forest
North Pacific Broadleaf Landslide Forest and Shrubland
North Pacific Mountain Hemlock Forest
North Pacific Wooded Volcanic Flowage
Northern California Mesic Subalpine Woodland
North Pacific Lowland Mixed Hardwood Conifer Forest and Woodland
Northern Rocky Mountain Ponderosa Pine Woodland Savanna
Rocky Mountain Subalpine Dry-Mesic Spruce-Fir Forest and Woodland
North Pacific Maritime Mesic Subalpine Parkland
Mediterranean California Lower Montane Black Oak-Conifer Forest and Woodland
[Figure series: bar charts of mapped area by Ecological System. Y-axis: Hectares (0 to 80,000). Panels: Actual Area (est. from Inventory Plots); Random Forest No Imagery; Random Forest With Imagery; Random Forest Adjusted No Imagery; Random Forest Adjusted With Imagery; GNN No Imagery; GNN With Imagery.
X-axis systems: North Pacific Maritime Dry-Mesic Douglas-fir-Western Hemlock Forest; Mediterranean California Mesic Mixed Conifer Forest and Woodland; North Pacific Maritime Mesic-Wet Douglas-fir-Western Hemlock Forest; North Pacific Mountain Hemlock Forest; North Pacific Dry-Mesic Silver Fir-Western Hemlock-Douglas-fir Forest; Mediterranean California Dry-Mesic Mixed Conifer Forest and Woodland; North Pacific Dry Douglas-fir Forest and Woodland; North Pacific Mesic Western Hemlock-Silver Fir Forest; North Pacific Maritime Mesic Subalpine Parkland.]
Random Forest Unadjusted No Imagery
Random Forest Adjusted No Imagery
Random Forest Unadjusted With Imagery
Random Forest Adjusted With Imagery
GNN No Imagery
GNN With Imagery
Top 5 Variables
Models compared: RF, RF_TM, GNN_TM
Variables listed: Y; Y; Mean annual temperature; Annual short-wave radiation; Annual short-wave radiation; December minimum temperature; Summer moisture stress; X; Elevation; Summer moisture stress; Annual vapor pressure; Summer temperature; Summer moisture stress
First TM Variables: Median Filtered Tasseled Cap axis 2 (26 out of 36); Median Filtered Landsat band 3 (13 out of 23)
[Summary table comparing the six maps.
Accuracy: RF: OK; RF_ADJ: OK; RF_TM: best; RF_ADJ_TM: good; GNN: OK; GNN_TM: least accurate.
Other cell entries: Area lousy; Area OK; Area good; Area good; Coarse-grained; Fine-grained; Incorporates Imagery; No Imagery; X X X.]
Conclusions
• Buyer Beware.
– The patterns in a map are at least partly a function of model choice.
• The most appropriate map depends upon intended application.
– Importance of area estimations vs. incorporation of imagery
– For some applications, the GNN base-map may be better.
• We chose RF_Adj_TM, because it balanced a variety of concerns well.
Acknowledgements:
• USGS GAP analysis program
• LEMMA (Landscape Ecology, Modeling, Mapping & Analysis) research group at Oregon State University
• Jimmy Kagan – reality-check and systems identification
• Brendan Ward – programming help