Land cover map of southern hemisphere Africa using SPOT-4 VEGETATION data
Ana Cabral 1, Maria J. P. de Vasconcelos 1,2, José M. C. Pereira 1,2, Étienne Bartholomé 3 and Philippe Mayaux 3
1 Centro de Cartografia, Instituto de Investigação Científica Tropical, Portugal
2 Centro de Estudos Florestais, ISA, Portugal
3 Institute for Environment and Sustainability, Joint Research Center, EC
GLC 2000 Workshop 24-26 March 2003

Framework and objective
• The main task is to produce a land cover map of southern hemisphere Africa at 1 km spatial resolution.
• Our strategy is to work with twelve monthly composite images derived from S1 data.
• The land cover map legend follows the FAO/LCCS.

Methods
• Spectral land cover characterization using a dataset of 12 monthly composites and an ISODATA unsupervised classifier.

Unsupervised classification
• The ISODATA unsupervised classification was based on the 36 composited channels, using 40 spectral classes. The forty classes identified were used to guide the selection of training areas for the land cover map legend.
(Figure: Unsupervised classification map produced from the twelve monthly composite data)
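The clustering step can be illustrated with a minimal sketch. It uses plain k-means as a simplified stand-in: ISODATA additionally splits and merges clusters between iterations, which is not shown here. The synthetic 36-channel pixels, the function name `kmeans`, and the use of 2 clusters instead of 40 are all assumptions made for illustration, not the authors' data or software.

```python
import random


def kmeans(pixels, k, iterations=20, seed=0):
    """Group feature vectors into k spectral classes with plain k-means.

    Simplified stand-in for ISODATA: no cluster splitting or merging.
    """
    rng = random.Random(seed)
    # Start from k pixels picked at random as initial cluster centers.
    centers = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins the nearest center
        # (squared Euclidean distance over all channels).
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical data: 40 pixels with 36 channels each, drawn around two
# clearly different reflectance levels.
data_rng = random.Random(1)


def synthetic_pixel(level):
    return [data_rng.gauss(level, 0.02) for _ in range(36)]


pixels = [synthetic_pixel(0.1) for _ in range(20)]
pixels += [synthetic_pixel(0.8) for _ in range(20)]

labels, centers = kmeans(pixels, k=2)
print(sorted(set(labels)))
```

In the workflow described on the slide, the resulting cluster map is not the final product; it only guides where training areas are drawn.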

Methods
• Spectral land cover characterization using a dataset of 12 monthly composites and an ISODATA unsupervised classifier.
• Use the resulting unsupervised classification map, along with Landsat images and ancillary maps, to guide the selection of training areas for the legend categories defined with LCCS.

Ancillary maps
• The selection of the training areas was also based on White's vegetation map (White, 1983), other land cover maps, Landsat images and phytogeographic data.

Methods
• Determine the spectral classes obtainable with the 12-month data set. An ISODATA classification was done with 40 classes.
• Use the resulting unsupervised classification map, along with Landsat images and ancillary maps, to guide the selection of training areas for the legend categories defined with LCCS.
• Spectral values were extracted from the training areas and used in a classification tree algorithm (CART) (Breiman et al., 1984). The parameters used were:
  – Priors were set equal.
  – Twoing index as the splitting criterion.
  – The number of observations in terminal nodes was set to 30.
  – Linear combinations between variables were allowed.
  – 10-fold cross-validation.
• The rules generated by the classification tree were used to build the land cover map.
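To make the twoing criterion concrete, here is a small sketch of how a candidate split can be scored. It follows the standard definition from Breiman et al. (1984): the score is (p_L · p_R / 4) · (Σ_k |p(k|left) − p(k|right)|)². The function names and the toy values are hypothetical. The sketch is deliberately limited: it uses empirical class proportions rather than equal priors, scans a single variable rather than linear combinations, and ignores the terminal node size setting.

```python
from collections import Counter


def twoing(left_labels, right_labels):
    """Twoing score of a binary split; larger means a better separation."""
    n_left, n_right = len(left_labels), len(right_labels)
    if n_left == 0 or n_right == 0:
        return 0.0
    n = n_left + n_right
    left_counts = Counter(left_labels)
    right_counts = Counter(right_labels)
    classes = set(left_counts) | set(right_counts)
    # Sum over classes of the absolute difference in class proportions
    # between the two child nodes.
    diff = sum(
        abs(left_counts[c] / n_left - right_counts[c] / n_right)
        for c in classes
    )
    return (n_left / n) * (n_right / n) / 4.0 * diff ** 2


def best_threshold(values, labels):
    """Scan thresholds on one variable; return (threshold, score)."""
    best = (None, 0.0)
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = twoing(left, right)
        if score > best[1]:
            best = (t, score)
    return best


# Hypothetical values of one spectral channel for six training pixels.
values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = ["grassland"] * 3 + ["forest"] * 3

threshold, score = best_threshold(values, labels)
print(threshold, score)  # 0.3 0.25
```

A split that separates the two classes perfectly into equal halves reaches the maximum score of 0.25, which is what the toy example produces.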

Methods
• Evaluate the labeling of land cover classes by comparing the land cover map with the training areas.
• Filter the training dataset (Brodley, C., 1999), removing the mislabeled pixels, and fit a new classification tree to the filtered data.
• The new classification tree was applied to the filtered data to produce the final land cover map.
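The filtering idea can be sketched as follows: each training pixel is classified by a model that did not see it, and pixels whose predicted class disagrees with their assigned label are dropped. This is only an illustration under stated assumptions. A nearest-centroid classifier replaces the classification tree to keep the example short, and the function names, fold assignment, and toy data are hypothetical.

```python
def fit_centroids(samples, labels):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def filter_mislabeled(samples, labels, n_folds=10):
    """Return indices of samples whose label matches a held-out prediction."""
    keep = []
    for fold in range(n_folds):
        # Every n_folds-th sample, starting at `fold`, is held out.
        held_out = list(range(fold, len(samples), n_folds))
        held_set = set(held_out)
        train = [i for i in range(len(samples)) if i not in held_set]
        model = fit_centroids(
            [samples[i] for i in train], [labels[i] for i in train]
        )
        keep.extend(
            i for i in held_out if predict(model, samples[i]) == labels[i]
        )
    return sorted(keep)


# Hypothetical training pixels with two features each.
samples = [[0.20 + 0.01 * i, 0.30] for i in range(10)]
samples += [[0.80 + 0.01 * i, 0.70] for i in range(10)]
labels = ["grassland"] * 10 + ["forest"] * 10
labels[15] = "grassland"  # inject one labeling error

kept = filter_mislabeled(samples, labels, n_folds=5)
print(kept)
```

A new classifier would then be fitted only to the retained samples, which mirrors the second and third bullets of the slide.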

Results - Land cover maps, before and after the filtering
(Figure panels: Before filtering; After filtering)

Results – visual interpretation
(Figure panels: Before filtering; After filtering; Land cover map)

Results – visual interpretation
(Figure panels: Before filtering; After filtering; Land cover map; Landsat image)

Results – visual interpretation
(Figure panels: Before filtering; After filtering; Phytogeographic map. Legend: Grassland, Tree savanna, Shrub savanna, Forest – Dry evergreen, Woodland)

Results – visual interpretation
(Figure panels: Before filtering; After filtering; Land cover map)

Results – visual interpretation
(Figure panels: Before filtering; After filtering; Landsat image)

Results – visual errors detection
(Figure panels: Before filtering; After filtering; SPOT-4 VEGETATION image, R-G-B = MIR-IR-R)

Results – accuracy assessment of the tree classifier
Class | Designation | Percent of cleaned pixels | Percent correct (before filtering) | Percent correct (after filtering)
1 | Mosaic evergreen forest-semi-deciduous forest | 15.2 | 79.79 | 82.04
2 | Mosaic croplands-grasslands | 3.96 | 82.17 | 84.53
3 | Sparse grassland | 12.3 | 80.08 | 85.84
4 | Mosaic evergreen forest-open grassland | 10.4 | 87.03 | 91.00
5 | Mixed forest | 6.1 | 84.36 | 80.81
6 | Mangrove | 2.2 | 93.33 | 98.86
7 | Mosaic deciduous forest-open grassland-evergreen forest | 3.01 | 94.36 | 94.39
8 | Mosaic deciduous forest-open grassland | 7.3 | 85.39 | 87.74
9 | Salt hardpans | 7.1 | 92.06 | 93.37
10 | Sparse shrubs | 1.3 | 90.81 | 93.10
11 | Mosaic open forest-open shrubs | 5.9 | 92.47 | 92.57
12 | Mosaic deciduous forest-open shrubs-open grassland | 7.7 | 81.98 | 83.93
13 | Mosaic open forest-open grassland with shrubs | 13.4 | 83.48 | 84.63
14 | Swamp bushland grassland | 2.8 | 92.38 | 92.64
15 | Waterbodies | 2.9 | 94.86 | 96.68

Results – accuracy assessment of the tree classifier
Class | Designation | Percent of cleaned pixels | Percent correct (before filtering) | Percent correct (after filtering)
16 | Bare soil | 2.2 | 95.93 | 95.38
17 | Mosaic closed shrubland-open forest | 10.3 | 83.48 | 85.52
18 | Open shrubs | 14.3 | 84.98 | 85.65
19 | Broadleaved deciduous shrubland | 5.03 | 94.57 | 95.76
20 | Closed shrubland | 33.2 | 68.09 | 72.88
21 | Closed grassland with shrubs | 1.9 | 87.95 | 94.28
22 | Closed grassland | 10.0 | 77.83 | 80.77
23 | Closed evergreen forest | 6.7 | 87.69 | 87.53
24 | Closed grassland with sparse trees | 8.8 | 84.79 | 86.43
25 | Open grassland | 19.7 | 77.83 | 83.94
26 | Agriculture | 22.1 | 75.21 | 84.32
27 | Mosaic croplands-woody vegetation | 19.2 | 69.28 | 74.18
28 | Closed deciduous woodland | 30.1 | 65.23 | 70.77
29 | Open deciduous woodland | 8.5 | 84.59 | 85.31
30 | Mosaic open forest-open grassland | 2.6 | 88.54 | 95.47

Discussion
• Labeling errors can occur for several reasons, including subjectivity, data entry error, or inadequacy of the information used to label each training area.
• Identifying and eliminating mislabeled pixels before applying the classification tree algorithm improves the quality of the training data and increases classification accuracy.
• Removing mislabeled pixels from the training data resulted in higher predictive accuracy than was achieved without "cleaning" the training data.
• Visual interpretation shows that the land cover map improved after the filtering process. However, some visual errors were detected, mainly in the agriculture class.

Current status and future development
• The land cover map will be improved using an ensemble of classification trees, where the final map is obtained through a majority voting scheme:
  – Randomly select 10 mutually exclusive testing sets, each with 10% of the training data.
  – Produce 10 classification trees (using 10-fold cross-validation), each with the remaining 90% of the data.
  – Classify each of the 10 test sets (which together correspond to the total initial data set) with the 10 classification trees and select records by majority voting.
  – Build the final land cover map with the most voted class.
• Validation: the accuracy assessment will be derived from a confusion matrix, a simple cross-tabulation of the mapped class label against the reference data. Data for accuracy assessment will be collected using a systematic sampling scheme. The validation data are not independent, since they were also used to collect training data.
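The voting and validation steps can be sketched in a few lines. The example below assumes the per-tree predictions are already available; the votes, reference labels, and function names are hypothetical and only illustrate how a majority vote and a confusion matrix are computed.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most voted class; ties go to the class seen first."""
    return Counter(votes).most_common(1)[0][0]


def confusion_matrix(mapped, reference):
    """Cross-tabulate mapped labels (columns) against reference (rows)."""
    classes = sorted(set(mapped) | set(reference))
    matrix = {r: {m: 0 for m in classes} for r in classes}
    for m, r in zip(mapped, reference):
        matrix[r][m] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[c][c] for c in matrix)
    return correct / total


# Hypothetical votes of 10 classification trees for 4 pixels.
votes_per_pixel = [
    ["forest"] * 7 + ["woodland"] * 3,
    ["grassland"] * 6 + ["agriculture"] * 4,
    ["agriculture"] * 8 + ["grassland"] * 2,
    ["woodland"] * 4 + ["forest"] * 6,
]
mapped = [majority_vote(v) for v in votes_per_pixel]

# Hypothetical reference labels for the same pixels.
reference = ["forest", "grassland", "agriculture", "woodland"]

matrix = confusion_matrix(mapped, reference)
print(mapped)
print(overall_accuracy(matrix))  # 0.75
```

Off-diagonal cells, such as the woodland pixel mapped as forest here, show which classes are confused with each other, which overall accuracy alone does not reveal.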