Feature extraction from remote sensing images, such as extracting road features, is one of the most challenging research subjects in remote sensing. Road extraction supports many applications, including map updating, traffic management, emergency response, and road monitoring. This study therefore conducts a systematic review of deep learning techniques applied to common remote sensing benchmarks for road extraction. The review covers four main families of deep learning methods: generative adversarial network (GAN) models, deconvolutional networks, fully convolutional networks (FCNs), and patch-based convolutional neural network (CNN) models. We also compare these models on remote sensing datasets to identify which perform best at extracting roads from high-resolution remote sensing images, and we describe research gaps and future research directions. The results indicate that the highest reported performance is achieved by deconvolutional networks, and that the GAN model, the DenseNet method, and FCN-32 attain high F1 scores on UAV and Google Earth images: 96.08%, 95.72%, and 94.59%, respectively.
Road extraction is one of the most important tasks in advanced transportation systems. Extracting road regions from high-resolution remote sensing imagery is challenging because of complicated backgrounds (buildings, tree shadows, pedestrians, and vehicles) and rural road networks whose heterogeneous forms exhibit low interclass and high intraclass differences. Deep learning techniques have recently brought notable improvements in image segmentation, yet most still cannot preserve boundary information or produce high-resolution road segmentation maps from remote sensing imagery. In the present study, we introduce a new deep convolutional network, the VNet model, to produce high-resolution road segmentation maps. We also define a new dual loss function, the cross-entropy dice loss (CEDL), which combines cross-entropy (CE) and dice loss (DL) so that both local information (CE) and global information (DL) are considered, reducing the influence of class imbalance and improving road extraction results. The proposed VNet+CEDL model is evaluated on two road datasets, Massachusetts and Ottawa. It achieves an average F1 score of 90.64% on the Massachusetts dataset and 92.41% on the Ottawa dataset. Compared with other state-of-the-art deep learning frameworks, namely FCN, SegNet, and U-Net, the proposed approach improves the results by 1.09%, 2.45%, and 0.39% on the Massachusetts dataset and by 7.21%, 1.86%, and 2.68% on the Ottawa dataset, respectively. We also compare the proposed method with state-of-the-art road extraction techniques; the results show that it outperforms other deep learning-based techniques in road extraction.
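The abstract does not specify the exact form or weighting of the CEDL terms; the following is a minimal sketch of one plausible reading, assuming binary road masks, a sigmoid output, an unweighted sum of the two terms, and an illustrative smoothing constant `eps` (PyTorch):

```python
import torch
import torch.nn.functional as F

def cedl_loss(logits, target, eps=1e-6):
    """Combined cross-entropy + dice loss for binary road segmentation.

    logits: raw network outputs, shape (N, 1, H, W)
    target: binary ground-truth road mask (float), same shape
    """
    # Pixel-wise (local) term: binary cross-entropy on the logits.
    ce = F.binary_cross_entropy_with_logits(logits, target)

    # Region-wise (global) term: soft dice loss over each map.
    probs = torch.sigmoid(logits)
    inter = (probs * target).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

    # Assumed unweighted sum; the paper's exact weighting is not
    # given in the abstract.
    return ce + dice
```

The dice term depends on region overlap rather than the road/background pixel ratio, which is why pairing it with the pixel-wise CE term counteracts class imbalance.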
High-accuracy building extraction via semantic segmentation of high-resolution remotely sensed imagery has a wide range of applications, such as urban planning, geospatial database updating, and disaster management. However, automatically extracting buildings with noise-free segmentation maps and accurate boundary information is a major challenge for most popular deep learning methods, owing to obstructions such as cars, vegetation cover, and tree shadows in high-resolution remote sensing imagery. In this study, we therefore introduce an end-to-end generative adversarial network (GAN) to tackle these issues. In the generator, we use a SegNet model with a bi-directional convolutional LSTM (BConvLSTM) to produce segmentation maps from the Massachusetts buildings dataset of high-resolution aerial imagery. BConvLSTM combines encoded features (carrying more local information) with decoded features (carrying more semantic information) to improve performance even in the presence of complex backgrounds and obstructions. Adversarial training enforces long-range spatial label contiguity, addressing buildings partially covered by occlusions such as trees, cars, and shadows, and yields high-quality building segmentation in complex areas. With an average F1 score of 96.81%, the proposed technique, which detects and corrects differences between the segmentation output and the reference map, outperforms state-of-the-art approaches such as an autoencoder method (91.36%), SegNet+BConvLSTM (95.96%), FCN-CRFs (95.36%), SegNet (94.77%), and the GAN-SCA model (96.36%).
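The abstract names the components but not the training objective; below is a minimal sketch of one adversarial update step for such a segmentation GAN, assuming a generator `G` (e.g., the SegNet+BConvLSTM network) that outputs per-pixel probabilities, a discriminator `D` that scores (image, map) pairs, and an adversarial weight `adv_weight`. All three are illustrative assumptions, not taken from the paper:

```python
import torch
import torch.nn.functional as F

def adversarial_step(G, D, opt_g, opt_d, image, mask, adv_weight=0.1):
    """One training step of a segmentation GAN (hypothetical setup).

    G: segmentation network returning per-pixel probabilities in [0, 1]
    D: discriminator returning a logit for an (image, map) pair
    adv_weight: assumed hyperparameter balancing the adversarial term
    """
    pred = G(image)

    # --- Discriminator: real reference maps vs. generated maps ---
    d_real = D(image, mask)
    d_fake = D(image, pred.detach())
    loss_d = (
        F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
        + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    )
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # --- Generator: segmentation loss plus fooling the discriminator ---
    d_fake = D(image, pred)
    loss_g = F.binary_cross_entropy(pred, mask) + adv_weight * (
        F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    )
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_g.item(), loss_d.item()
```

The adversarial term penalizes segmentation maps that the discriminator can tell apart from reference maps as a whole, which is the mechanism behind the long-range label contiguity mentioned above.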
Urban vegetation mapping is critical in many applications, such as preserving biodiversity, maintaining ecological balance, and mitigating the urban heat island effect. Extracting accurate vegetation cover from aerial imagery with traditional classification approaches remains challenging, because urban vegetation categories have complex spatial structures and similar spectral properties. Deep neural networks (DNNs) have markedly improved remote sensing image classification outcomes in recent years. These methods are promising in this domain, yet they can be unreliable for various reasons, such as the use of irrelevant descriptor features in model construction and poor-quality labeled images. Explainable AI (XAI) can provide insight into these limitations and thus guide adjustments to the training dataset and the model. In this work, we therefore show how an explanation model, Shapley additive explanations (SHAP), can be used to interpret the output of a DNN designed for classifying vegetation cover. Our aim is not only to produce high-quality vegetation maps but also to rank the input parameters and select appropriate features for classification. We test the method on vegetation mapping from aerial imagery using spectral and textural features; texture features help overcome the poor spectral resolution of aerial imagery for vegetation mapping. The model achieved an overall accuracy (OA) of 94.44% for vegetation cover mapping. The SHAP plots demonstrate the high contribution of features such as Hue, Brightness, GLCM_Dissimilarity, GLCM_Homogeneity, and GLCM_Mean to the output of the proposed model. The study thus indicates that vegetation mapping strategies based only on spectral characteristics are insufficient to classify vegetation cover appropriately.
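As an illustration, a SHAP feature ranking of this kind can be computed with the `shap` library; the sketch below assumes a trained classifier exposed as a single-output `predict_fn` (returning the vegetation-class probability) and uses the feature names listed in the abstract, with the background/sample split chosen arbitrarily:

```python
import numpy as np
import shap  # pip install shap

# Illustrative feature names taken from the abstract; the full study
# may use additional spectral and GLCM texture features.
FEATURES = ["Hue", "Brightness", "GLCM_Dissimilarity",
            "GLCM_Homogeneity", "GLCM_Mean"]

def rank_features(predict_fn, X_background, X_samples):
    """Rank input features of a vegetation classifier by SHAP importance.

    predict_fn: callable mapping an (n, n_features) array to a 1-D
                array of predicted vegetation-class probabilities.
    X_background: small reference set used to approximate expectations.
    X_samples: instances whose predictions are to be explained.
    """
    # KernelExplainer is model-agnostic; for a TensorFlow/Keras DNN,
    # shap.DeepExplainer would be a faster alternative.
    explainer = shap.KernelExplainer(predict_fn, X_background)
    shap_values = explainer.shap_values(X_samples)  # (n, n_features)

    # Global importance: mean absolute SHAP value per feature,
    # the same quantity a SHAP summary plot is sorted by.
    importance = np.abs(shap_values).mean(axis=0)
    return sorted(zip(FEATURES, importance), key=lambda t: -t[1])
```

A call such as `shap.summary_plot(shap_values, X_samples, feature_names=FEATURES)` would then produce summary plots of the kind the abstract draws its conclusions from.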