Information about the location and extent of informal settlements is necessary to guide decision making and resource allocation for their upgrading. Very high resolution (VHR) satellite images can provide this information; however, different urban settlement types are difficult to discriminate and extract automatically from VHR imagery because their semantic class definitions are abstract. State-of-the-art classification techniques rely on hand-engineered spatial-contextual features to improve the results of pixel-based methods. In this paper, we propose to use convolutional neural networks (CNNs) to learn discriminative spatial features and perform automatic detection of informal settlements. The experimental analysis is carried out on a QuickBird image acquired over Dar es Salaam, Tanzania. The proposed technique is compared against support vector machines (SVMs) using texture features extracted from the grey level co-occurrence matrix (GLCM) and local binary patterns (LBP), which result in accuracies of 86.65% and 90.48%, respectively. The CNN leads to better classification, with an overall accuracy of 91.71%. A sensitivity analysis shows that deeper networks result in higher accuracies when large training sets are used. The study concludes that training a CNN in an end-to-end fashion can automatically learn spatial features from the data that are capable of discriminating complex urban land use classes.
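As a minimal sketch of the GLCM texture features behind the SVM baseline described above: the co-occurrence matrix counts how often pairs of grey levels appear at a fixed pixel offset, and scalar statistics such as contrast are then derived from it. The function names, the 8-level quantisation, and the single-offset choice are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Grey level co-occurrence matrix for one pixel offset.

    image: 2D array of integer grey levels in [0, levels).
    Returns a (levels, levels) matrix of normalised co-occurrence
    frequencies for the offset (dx, dy).
    """
    h, w = image.shape
    m = np.zeros((levels, levels), dtype=float)
    for y in range(h - dy):
        for x in range(w - dx):
            m[image[y, x], image[y + dy, x + dx]] += 1
    total = m.sum()
    return m / total if total else m

def glcm_contrast(m):
    """Contrast feature: sum over (i - j)^2 * p(i, j).

    Zero for a perfectly uniform texture, large when neighbouring
    pixels differ strongly.
    """
    levels = m.shape[0]
    i, j = np.mgrid[0:levels, 0:levels]
    return float(((i - j) ** 2 * m).sum())
```

In practice such statistics (contrast, homogeneity, entropy, and so on) would be computed over a sliding window per pixel and fed to the SVM as a feature vector; library implementations such as scikit-image's `graycomatrix` do the same computation efficiently.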
Accurate spatial information on agricultural fields in smallholder farms is important for providing actionable information to farmers, managers, and policymakers. Very high resolution (VHR) satellite images can capture such information. However, the automated delineation of fields in smallholder farms is a challenging task because of their small size, irregular shapes, and the use of mixed-cropping systems, which leave their boundaries vaguely defined. Physical edges between smallholder fields are often indistinct in satellite imagery, and contours need to be identified by considering the transition of the complex textural pattern between fields. In these circumstances, standard edge-detection algorithms fail to extract accurate boundaries. This article introduces a strategy to detect field boundaries using a fully convolutional network in combination with a globalisation and grouping algorithm. The convolutional network, which uses an encoder-decoder structure, is capable of learning complex spatial-contextual features from the image and accurately detects sparse field contours. A hierarchical segmentation is derived from the contours using the oriented watershed transform and by iteratively merging adjacent regions based on the average strength of their common boundary. Finally, field segments are obtained by adopting a combinatorial grouping algorithm that exploits the information in the segmentation hierarchy. An extensive experimental analysis is performed in two study areas in Nigeria and Mali using WorldView-2/3 images, comparing several state-of-the-art contour detection algorithms. The algorithms are compared using a precision-recall accuracy assessment strategy that tolerates small localisation errors in the detected contours. The proposed strategy shows promising results, automatically delineating field boundaries with F-scores higher than 0.7 and 0.6 on the two test areas, respectively, and outperforming alternative techniques.
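The iterative merging step described above can be sketched as a greedy procedure over a region adjacency graph: boundaries are processed from weakest to strongest, and adjacent regions are fused while their common boundary falls below a threshold. This simplified sketch uses union-find and does not recompute average boundary strengths after each merge, as a full hierarchical implementation would; the function name and data layout are assumptions for illustration.

```python
def merge_regions(adjacency, threshold):
    """Greedy region merging on a region adjacency graph.

    adjacency: dict mapping frozenset({a, b}) -> average strength of
    the common boundary between regions a and b (low = weak edge).
    Regions joined by boundaries weaker than `threshold` end up in
    the same cluster. Returns {region id: cluster id}.
    """
    parent = {}

    def find(r):
        # Union-find with path halving.
        parent.setdefault(r, r)
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    # Fuse across the weakest boundaries first.
    for pair, strength in sorted(adjacency.items(), key=lambda kv: kv[1]):
        if strength > threshold:
            break  # all remaining boundaries are stronger
        a, b = pair
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    regions = {r for pair in adjacency for r in pair}
    return {r: find(r) for r in regions}
```

Sweeping the threshold from low to high yields the nested family of segmentations that the combinatorial grouping stage can then exploit.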
Classification of very high resolution (VHR) satellite images has three major challenges: 1) inherent low intra-class and high inter-class spectral similarities, 2) mismatching resolutions of the available bands, and 3) the need to regularize noisy classification maps. Conventional methods have addressed these challenges by adopting separate stages of image fusion, feature extraction, and post-classification map regularization. These processing stages, however, are not jointly optimized for the classification task at hand. In this study, we propose a single-stage framework that embeds the processing stages in a recurrent multiresolution convolutional network trained in an end-to-end manner. The feedforward version of the network, called FuseNet, aims to match the resolutions of the panchromatic and multispectral bands in a VHR image using convolutional layers with corresponding downsampling and upsampling operations. Contextual label information is incorporated into FuseNet by means of a recurrent version called ReuseNet. We compared FuseNet and ReuseNet against the use of separate processing steps for both image fusion, e.g. pansharpening and resampling through interpolation, and map regularization, e.g. conditional random fields. We carried out our experiments on a land cover classification task using a WorldView-3 image of Quezon City, Philippines, and the ISPRS 2D semantic labeling benchmark dataset of Vaihingen, Germany. FuseNet and ReuseNet surpass the baseline approaches in both quantitative and qualitative results.
Index Terms: Convolutional networks, recurrent networks, land cover classification, VHR image, deep learning.
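The resolution-matching idea behind FuseNet can be illustrated with fixed interpolation standing in for the learned downsampling and upsampling layers: the coarser multispectral bands are brought onto the panchromatic grid and stacked into one tensor that later layers can classify jointly. The function name, the nearest-neighbour upsampling, and the scale factor of 4 (typical of VHR pan/MS ratios) are illustrative assumptions, not the network's actual learned operations.

```python
import numpy as np

def match_resolution(pan, ms, scale=4):
    """Stack panchromatic and multispectral bands on one grid.

    pan: (H, W) panchromatic band at full resolution.
    ms:  (H // scale, W // scale, B) multispectral bands.
    Returns an (H, W, B + 1) array: pan first, then the upsampled
    multispectral bands.
    """
    # Nearest-neighbour upsampling: repeat each MS pixel in both
    # spatial dimensions to reach the panchromatic grid.
    ms_up = np.repeat(np.repeat(ms, scale, axis=0), scale, axis=1)
    return np.concatenate([pan[..., None], ms_up], axis=2)
```

In FuseNet the analogous step is performed by convolutional layers with strided downsampling and upsampling, so the resampling is learned jointly with the classifier rather than fixed in advance.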