Zhongyuan Lu scite author profile

A deep neural network is suitable for remote sensing image pixel-wise classification because it effectively extracts features from the raw data. However, remote sensing images with higher spatial resolution exhibit smaller inter-class differences and greater intra-class differences; thus, feature extraction becomes more difficult. The attention mechanism, as a method that simulates the manner in which humans comprehend and perceive images, is useful for the quick and accurate acquisition of key features. In this study, we propose a novel neural network that incorporates two kinds of attention mechanisms in its mask and trunk branches; i.e., control gate (soft) and feedback attention mechanisms, respectively, based on the branches’ primary roles. Thus, a deep neural network can be equipped with an attention mechanism to perform pixel-wise classification for very high-resolution remote sensing (VHRRS) images. The control gate attention mechanism in the mask branch is utilized to build pixel-wise masks for feature maps, to assign different priorities to different locations on different channels for feature extraction recalibration, to apply stress to the effective features, and to weaken the influence of other profitless features. The feedback attention mechanism in the trunk branch allows for the retrieval of high-level semantic features. Hence, additional aids are provided for lower layers to re-weight the focus and to re-update higher-level feature extraction in a target-oriented manner. These two attention mechanisms are fused to form a neural network module. By stacking various modules with different-scale mask branches, the network utilizes different attention-aware features under different local spatial structures. The proposed method is tested on the VHRRS images from the BJ-02, GF-02, Geoeye, and Quickbird satellites, and the influence of the network structure and the rationality of the network design are discussed. Compared with other state-of-the-art methods, our proposed method achieves competitive accuracy, thereby proving its effectiveness.

show abstract

DenseNet-Based Depth-Width Double Reinforced Deep Learning Neural Network for High-Resolution Remote Sensing Image Per-Pixel Classification

Tao

et al. 2018

Remote Sensing

View full text Add to dashboard Cite

Abstract:Deep neural networks (DNNs) face many problems in the very high resolution remote sensing (VHRRS) per-pixel classification field. Among the problems is the fact that as the depth of the network increases, gradient disappearance influences classification accuracy and the corresponding increasing number of parameters to be learned increases the possibility of overfitting, especially when only a small amount of VHRRS labeled samples are acquired for training. Further, the hidden layers in DNNs are not transparent enough, which results in extracted features not being sufficiently discriminative and significant amounts of redundancy. This paper proposes a novel depth-width-reinforced DNN that solves these problems to produce better per-pixel classification results in VHRRS. In the proposed method, densely connected neural networks and internal classifiers are combined to build a deeper network and balance the network depth and performance. This strengthens the gradients, decreases negative effects from gradient disappearance as the network depth increases and enhances the transparency of hidden layers, making extracted features more discriminative and reducing the risk of overfitting. In addition, the proposed method uses multi-scale filters to create a wider neural network. The depth of the filters from each scale is controlled to decrease redundancy and the multi-scale filters enable utilization of joint spatio-spectral information and diverse local spatial structure simultaneously. Furthermore, the concept of network in network is applied to better fuse the deeper and wider designs, making the network operate more smoothly. The results of experiments conducted on BJ02, GF02, geoeye and quickbird satellite images verify the efficacy of the proposed method. The proposed method not only achieves competitive classification results but also proves that the network can continue to be robust and perform well even while the amount of labeled training samples is decreasing, which fits the small training samples situation faced by VHRRS per-pixel classification.

show abstract

Crisscross-Global Vision Transformers Model for Very High Resolution Aerial Image Semantic Segmentation

Zhong

et al. 2023

IEEE Trans. Geosci. Remote Sensing

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Zhongyuan Lu

Attention-Mechanism-Containing Neural Networks for High-Resolution Remote Sensing Image Classification

DenseNet-Based Depth-Width Double Reinforced Deep Learning Neural Network for High-Resolution Remote Sensing Image Per-Pixel Classification

Crisscross-Global Vision Transformers Model for Very High Resolution Aerial Image Semantic Segmentation

Contact Info

Product

Resources

About