RETRACTED: Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

Zhu, Jun; Yan, Victoria C.; Mo, H. J.; Liu, Ming

doi:10.3390/rs11171996

Cited by 36 publications

(29 citation statements)

References 64 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…CNN is widely used in remote sensing scene classification [21][22][23] because of its powerful feature extraction capability in various fields such as object tracking [24][25][26], detection [27], and classification [28]. Compared to fine-tuning existing models [29], many researchers focus on the design of the network model [2,13,30], aiming to obtain a higher classification accuracy via additional modules. For example [13], in addition to using Resnet [31] to extract feature maps from the image, Zhang et al [13] designed a convolutional network to extract feature maps from attention maps [32].…”

Section: Motivationmentioning

confidence: 99%

“…Compared to fine-tuning existing models [29], many researchers focus on the design of the network model [2,13,30], aiming to obtain a higher classification accuracy via additional modules. For example [13], in addition to using Resnet [31] to extract feature maps from the image, Zhang et al [13] designed a convolutional network to extract feature maps from attention maps [32]. These feature maps will be fused with the feature maps extracted by Resnet for classification.…”

Section: Motivationmentioning

confidence: 99%

“…Therefore, many researchers pay attention to this field [1,2,[7][8][9][10][11][12]. At present, many researchers use neural network technology to model remote sensing images and learn a nonlinear input-output mapping model from massive remote sensing data, thus realizing an automatic classification system of remote sensing scenes [1,2,13].…”

Section: Introductionmentioning

confidence: 99%

“…Some researchers use a support vector machine (SVM) [18] as a classifier to classify the features extracted by the neural network [19,20]. Unlike using SVM, an end-to-end model can be constructed by using a fully connected layer for classification [2,13].…”

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Training Convolutional Neural Networks with Multi-Size Images and Triplet Loss for Remote Sensing Scene Classification

Zhang

Wang

et al. 2020

Sensors

View full text Add to dashboard Cite

Many remote sensing scene classification algorithms improve their classification accuracy by additional modules, which increases the parameters and computing overhead of the model at the inference stage. In this paper, we explore how to improve the classification accuracy of the model without adding modules at the inference stage. First, we propose a network training strategy of training with multi-size images. Then, we introduce more supervision information by triplet loss and design a branch for the triplet loss. In addition, dropout is introduced between the feature extractor and the classifier to avoid over-fitting. These modules only work at the training stage and will not bring about the increase in model parameters at the inference stage. We use Resnet18 as the baseline and add the three modules to the baseline. We perform experiments on three datasets: AID, NWPU-RESISC45, and OPTIMAL. Experimental results show that our model combined with the three modules is more competitive than many existing classification algorithms. In addition, ablation experiments on OPTIMAL show that dropout, triplet loss, and training with multi-size images improve the overall accuracy of the model on the test set by 0.53%, 0.38%, and 0.7%, respectively. The combination of the three modules improves the overall accuracy of the model by 1.61%. It can be seen that the three modules can improve the classification accuracy of the model without increasing model parameters at the inference stage, and training with multi-size images brings a greater gain in accuracy than the other two modules, but the combination of the three modules will be better.

show abstract

Section: Motivationmentioning

confidence: 99%

Section: Motivationmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 2 more Smart Citations

Training Convolutional Neural Networks with Multi-Size Images and Triplet Loss for Remote Sensing Scene Classification

Zhang

Wang

et al. 2020

Sensors

View full text Add to dashboard Cite

show abstract

“…Compared with traditional feature acquisition methods, the pre-trained ConvNets can extract more representative features. Therefore, the authors extracted the fully connected layer features of remote sensing images from pre-trained ConvNets to classify scenes in [36][37][38]. The literature [39][40][41] used the traditional feature coding methods [42,43] to encode and classify features of convolutional layers.…”

mentioning

confidence: 99%

A Multi-Scale Approach for Remote Sensing Scene Classification Based on Feature Maps Selection and Region Representation

Zhang

Shi

et al. 2019

Remote Sensing

View full text Add to dashboard Cite

Scene classification is one of the bases for automatic remote sensing image interpretation. Recently, deep convolutional neural networks have presented promising performance in high-resolution remote sensing scene classification research. In general, most researchers directly use raw deep features extracted from the convolutional networks to classify scenes. However, this strategy only considers single scale features, which cannot describe both the local and global features of images. In fact, the dissimilarity of scene targets in the same category may result in convolutional features being unable to classify them into the same category. Besides, the similarity of the global features in different categories may also lead to failure of fully connected layer features to distinguish them. To address these issues, we propose a scene classification method based on multi-scale deep feature representation (MDFR), which mainly includes two contributions: (1) region-based features selection and representation; and (2) multi-scale features fusion. Initially, the proposed method filters the multi-scale deep features extracted from pre-trained convolutional networks. Subsequently, these features are fused via two efficient fusion methods. Our method utilizes the complementarity between local features and global features by effectively exploiting the features of different scales and discarding the redundant information in features. Experimental results on three benchmark high-resolution remote sensing image datasets indicate that the proposed method is comparable to some state-of-the-art algorithms.

show abstract

High Spatial Resolution Remote Sensing Classification with Lightweight CNN Using Dilated Convolution

Zhang

Dong

et al. 2021

Mobile Multimedia Communications

View full text Add to dashboard Cite

RETRACTED: Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

Cited by 36 publications

References 64 publications

Training Convolutional Neural Networks with Multi-Size Images and Triplet Loss for Remote Sensing Scene Classification

Training Convolutional Neural Networks with Multi-Size Images and Triplet Loss for Remote Sensing Scene Classification

A Multi-Scale Approach for Remote Sensing Scene Classification Based on Feature Maps Selection and Region Representation

High Spatial Resolution Remote Sensing Classification with Lightweight CNN Using Dilated Convolution

Contact Info

Product

Resources

About