ABSTRACT: In this paper, we present a method for the automatic refinement of training data. Many machine learning classifiers used in remote sensing applications rely on previously labelled training data. This labelling is often done by human operators under time constraints. Hence, the selection of training data must be kept practical, which implies a certain inaccuracy and results in erroneously tagged regions enclosed within competing classes. To address this, we propose a method that removes outliers from training data using an iterative training-classification scheme. Outliers are detected by their newly determined class membership as well as through analysis of the uncertainty of classified samples. A sample selection method that incorporates the quality of neighbouring samples is presented and compared to alternative strategies. Because iterative approaches tend to propagate errors, which might lead to degenerating classes, a robust stopping criterion based on training data characteristics is described. Our experiments using a support vector machine (SVM) show that outliers are reliably removed, allowing a more convenient sample selection. The classification result for unknown scenes of the corresponding validation set improves from 70.36% to 79.12% on average. Additionally, the average complexity of the SVM model is decreased by 82.75%, resulting in a similar reduction of processing time.
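The iterative training-classification scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a nearest-centroid classifier stands in for the SVM so the example needs only the Python standard library, and the function names, the relative-margin uncertainty measure, and the `min_margin`, `min_keep` and `max_iter` parameters are all assumptions made for illustration. The neighbourhood-based sample selection is not modelled.

```python
from math import dist


def centroids(samples, labels):
    """Train the stand-in model: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def refine(samples, labels, min_margin=0.1, min_keep=0.5, max_iter=10):
    """Return indices of samples kept after iterative outlier removal.

    A sample is dropped when the retrained model assigns it to another
    class, or when its relative margin (an assumed uncertainty measure)
    falls below min_margin. Iteration stops when nothing is removed or
    when a class would shrink below min_keep of its original size
    (a hypothetical stopping rule, not the one from the paper).
    """
    keep = list(range(len(samples)))
    original = {y: labels.count(y) for y in set(labels)}
    for _ in range(max_iter):
        model = centroids([samples[i] for i in keep],
                          [labels[i] for i in keep])
        survivors = []
        for i in keep:
            ranked = sorted((dist(samples[i], c), y) for y, c in model.items())
            predicted = ranked[0][1]
            if len(ranked) > 1:
                margin = (ranked[1][0] - ranked[0][0]) / (ranked[1][0] + 1e-12)
            else:
                margin = 1.0
            if predicted == labels[i] and margin >= min_margin:
                survivors.append(i)
        remaining = {y: 0 for y in original}
        for i in survivors:
            remaining[labels[i]] += 1
        if any(remaining[y] < min_keep * original[y] for y in original):
            break  # a class is degenerating: keep the previous sample set
        if len(survivors) == len(keep):
            break  # converged: no further outliers found
        keep = survivors
    return keep


# Two well-separated classes; the last sample is labelled "a" but lies in "b".
samples = [(0, 0), (0, 1), (1, 0), (1, 1),
           (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "a"]
kept = refine(samples, labels)
```

In this toy run the mislabelled sample is reclassified as "b" in the first pass and removed, and the second pass removes nothing, so the loop stops.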
ABSTRACT: As a consequence of the widespread application of digital geo-data in geographic information systems (GIS), quality control has become increasingly important to enhance the usefulness of the data. For economic reasons, a high degree of automation is required for the quality control process. This goal can be achieved by automatic image analysis techniques. This paper gives an example of how this can be done in the context of quality assessment of cropland and grassland GIS objects. The quality assessment of these objects of a topographic dataset is carried out based on multi-temporal information. The multi-temporal approach combines the channels of all available images into a multilayer image and applies a pixel-based SVM classification. In this way, multispectral as well as multi-temporal information is processed in parallel. The features used for the classification are spectral, textural (Haralick features) and structural (features derived from a semi-variogram). After the SVM classification, the pixel-based result is mapped to the GIS objects. Finally, a simple rule-based approach is used to verify the objects of a GIS database. The approach was tested using a multi-temporal data set consisting of one 5-channel RapidEye image (GSD 5 m) and two 3-channel Disaster Monitoring Constellation (DMC) images (GSD 32 m). All images were taken within one year. The results show that quality control of GIS cropland and grassland objects is possible with our approach and that the human operator saves time compared to a completely manual quality assessment.
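The two bookkeeping steps in this pipeline, stacking the channels of all images into one feature vector per pixel and mapping pixel-based results back to GIS objects, can be sketched as follows. This is an illustrative sketch only: it assumes the images have already been co-registered and resampled to a common pixel grid, it leaves out the SVM and the textural and structural features, and the function names and the `min_agreement` threshold are hypothetical, since the abstract does not state the actual verification rule.

```python
from collections import Counter


def stack_channels(images):
    """Concatenate the channels of co-registered images pixel by pixel.

    Each image is a list of pixels and each pixel a list of channel values;
    all images are assumed to share the same pixel grid and ordering.
    """
    return [[value for pixel in pixels for value in pixel]
            for pixels in zip(*images)]


def verify_objects(pixel_classes, pixel_object_ids, object_labels,
                   min_agreement=0.5):
    """Accept a GIS object if enough of its pixels match its database label.

    min_agreement is an assumed threshold standing in for the paper's rule.
    """
    votes = {}
    for cls, oid in zip(pixel_classes, pixel_object_ids):
        votes.setdefault(oid, Counter())[cls] += 1
    return {oid: counter[object_labels[oid]] / sum(counter.values())
            >= min_agreement
            for oid, counter in votes.items()}


# One 5-channel image and two 3-channel images, two pixels each (toy values).
rapideye = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
dmc_1 = [[11, 12, 13], [14, 15, 16]]
dmc_2 = [[17, 18, 19], [20, 21, 22]]
stacked = stack_channels([rapideye, dmc_1, dmc_2])  # 11 values per pixel

# Pixel classes as a classifier might return them, with their object IDs.
classes = ["cropland", "cropland", "grassland",
           "grassland", "grassland", "cropland"]
object_ids = [1, 1, 1, 2, 2, 2]
status = verify_objects(classes, object_ids, {1: "cropland", 2: "cropland"})
```

Here object 1 is accepted because two of its three pixels agree with its cropland label, while object 2 is flagged for operator review because only one of three does.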