Remote Sensing Scene Classification by Local–Global Mutual Learning

Chen, Xiumei; Zheng, Xiangtao; Zhang, Yue; Lu, Xiaoqiang

doi:10.1109/lgrs.2022.3150801

Cited by 13 publications

(6 citation statements)

References 15 publications

“…Emerging deep learning methods such as graph convolutional networks (GCNs) [34], neural architecture search (NAS) [35], generative adversarial networks (GANs) [36], local-global learning [37], [38] and others [39]-[41] have also been used in scene classification. Xu et al. [42] design a deep feature aggregation framework based on GCN.…”

Section: Deep Learning-based Remote Sensing Scene Classification (mentioning)

confidence: 99%

“…In order to improve global representation of CNNs, Lv et al. [37] propose the local-global-fusion feature extraction network, which leverages RNNs to capture contextual information. And Chen et al. [38] propose the local-global mutual learning (LML) method to obtain different features and learn from each other through KL. However, they are still difficult to improve the extraction of CNN for long-range features.…”

Section: Deep Learning-based Remote Sensing Scene Classification (mentioning)

confidence: 99%
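The citation statement above describes two feature branches that "learn from each other through KL". A minimal sketch of that general idea, in the style of deep mutual learning, is given below. It is not the LML authors' implementation: the function names, the single-sample interface, and the `weight` parameter are assumptions made for illustration. Each branch is trained with its own cross-entropy loss plus a KL term that pulls its prediction toward the other branch's prediction.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def cross_entropy(p, label, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -math.log(p[label] + eps)

def mutual_learning_losses(local_logits, global_logits, label, weight=1.0):
    """Return (loss_local, loss_global) for one sample.

    Hypothetical interface: `local_logits` and `global_logits` are the class
    scores of the two branches; `weight` balances the KL term (an assumption).
    """
    p_local = softmax(local_logits)
    p_global = softmax(global_logits)
    # Each branch treats the other branch's prediction as a soft target.
    loss_local = cross_entropy(p_local, label) + weight * kl_divergence(p_global, p_local)
    loss_global = cross_entropy(p_global, label) + weight * kl_divergence(p_local, p_global)
    return loss_local, loss_global
```

When both branches produce the same prediction, the KL terms vanish and each loss reduces to plain cross-entropy; the mutual term only contributes when the branches disagree.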

Local and Long-Range Collaborative Learning for Remote Sensing Scene Classification

Zhao, Meng, Zhang, et al. 2023

IEEE Trans. Geosci. Remote Sensing

With the development of high-resolution satellites, more and more attention has been paid to remote sensing (RS) scene classification. Convolutional neural networks (CNNs), which replace the traditional handcrafted features with a learning-based feature extraction mechanism, are widely used in scene classification. But CNNs are less effective in deriving long-range contextual relations, which limits the further improvement. Visual transformer (VT), an emerging image processing method, provides a new perspective for RS scene classification by directly acquiring long-range features. Although there have been limited works combining CNN and VT through simple concatenation, the collaborations between them are insufficient. To address these issues, we propose a local and long-range collaborative framework (L2RCF). First, we design a dual-stream structure to extract the local and long-range features. Second, a cross-feature calibration (CFC) module is designed for them to improve representation of the fusion features. Then, combining deep supervision (DS) and deep mutual learning (DML), a novel joint loss is proposed to enhance the dual-stream feature extractor and further improve the fused features. Finally, a two-stage semi-supervised training strategy is designed to improve performance with unlabeled samples. To demonstrate the effectiveness of L2RCF, we conducted experiments on three widely used RS scene classification data sets: RSSCN7, AID, and NWPU. The results show that L2RCF performs significantly better compared with some state-of-the-art scene classification methods.

“…Among these deep learning based methods, CNNs are the most commonly-utilized [2], [18]-[21], [44] as the convolutional filters are effective to extract multi-level features from the image. In the past two years, CNN based methods (e.g., DSENet [45], MS2AP [46], MSDFF [47], CADNet [48], LSENet [5], GBNet [49], MBLANet [50], MG-CAP [51], Contourlet CNN [52], STHP [53], SAGM [54], DARTS [55], LML [56], GCSANet [57]) still remain heated for aerial scene classification. On the other hand, recurrent neural network (RNN) based [25], auto-encoder based [58], [59] and generative adversarial network (GAN) based [60], [61] approaches have also been reported effective for aerial scene classification.…”

Section: A. Aerial Scene Classification (mentioning)

confidence: 99%

“…We compare the performance of our AGOS with three handcrafted features (PLSA, BOW, LDA) [17], [87], three typical CNN models (AlexNet, VGG, GoogLeNet) [17], [87], twenty-two latest CNN-based state-of-the-art approaches (MIDCNet [2], RANet [29], APNet [88], SPPNet [20], DCNN [28], TEXNet [89], MSCP [18], VGG+FV [21], DSENet [45], MS2AP [46], MSDFF [47], CADNet [48], LSENet [5], GBNet [49], MBLANet [50], MG-CAP [51], Contourlet CNN [52], STHP [53], SAGM [54], DARTS [55], LML [56], GCSANet [57]), one RNN-based approach (ARCNet [25]), two autoencoder based approaches (SGUFL [59], PARTLETS [58]) and two GAN-based approaches (MARTA [60], AGAN [61]) respectively. The performance under the backbone of ResNet-50, ResNet-101 and DenseNet-121 is all reported for fair evaluation as some latest methods [47], [48] use much deeper networks as backbone.…”

Section: Comparison With State-of-the-art Approaches (mentioning)

confidence: 99%

All Grains, One Scheme (AGOS): Learning Multigrain Instance Representation for Aerial Scene Classification

Zhou, Qin, et al. 2022

IEEE Trans. Geosci. Remote Sensing

Aerial scene classification remains challenging as: 1) the size of key objects in determining the scene scheme varies greatly; 2) many objects irrelevant to the scene scheme are often flooded in the image. Hence, how to effectively perceive the region of interests (RoIs) from a variety of sizes and build more discriminative representation from such complicated object distribution is vital to understand an aerial scene. In this paper, we propose a novel all grains, one scheme (AGOS) framework to tackle these challenges. To the best of our knowledge, it is the first work to extend the classic multiple instance learning into multi-grain formulation. Specially, it consists of a multigrain perception module (MGP), a multi-branch multi-instance representation module (MBMIR) and a self-aligned semantic fusion (SSF) module. Firstly, our MGP preserves the differential dilated convolutional features from the backbone, which magnifies the discriminative information from multi-grains. Then, our MBMIR highlights the key instances in the multi-grain representation under the MIL formulation. Finally, our SSF allows our framework to learn the same scene scheme from multi-grain instance representations and fuses them, so that the entire framework is optimized as a whole. Notably, our AGOS is flexible and can be easily adapted to existing CNNs in a plug-and-play manner. Extensive experiments on UCM, AID and NWPU benchmarks demonstrate that our AGOS achieves a comparable performance against the state-of-the-art methods.

“…To improve representational power, multi-branch methods employ multi-branch architecture to consider some different inputs such as multi-scale of an image [13], [14], or different images [15], [16]. Wang et al. [17] proposed a multiscale representation by a global local dual-branch architecture.…”

Section: Cnn Cnn (mentioning)

confidence: 99%

Pairwise Comparison Network for Remote-Sensing Scene Classification

Zhang, Zheng², 2022

IEEE Geosci. Remote Sensing Lett.

Self Cite

Remote sensing scene classification aims to assign a specific semantic label to a remote sensing image. Recently, convolutional neural networks have greatly improved the performance of remote sensing scene classification. However, some confused images may be easily recognized as the incorrect category, which generally degrade the performance. The differences between image pairs can be used to distinguish image categories. This paper proposed a pairwise comparison network, which contains two main steps: pairwise selection and pairwise representation. The proposed network first selects similar image pairs, and then represents the image pairs with pairwise representations. The self-representation is introduced to highlight the informative parts of each image itself, while the mutual-representation is proposed to capture the subtle differences between image pairs. Comprehensive experimental results on two challenging datasets (AID, NWPU-RESISC45) demonstrate the effectiveness of the proposed network. The codes are provided in https://github.com/spectralpublic/PCNet.git.
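The pairwise selection step mentioned in the abstract above can be illustrated with a small sketch. The abstract does not state how similarity is measured, so the use of cosine similarity over feature vectors, and the function names below, are assumptions made purely for illustration; the authors' actual code is in the repository linked above. For each image, the sketch picks the most similar other image in the batch.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def select_similar_pairs(features):
    """Pair each image index with the index of its most similar other image.

    `features` is a list of equal-length feature vectors, one per image
    (hypothetical input; any embedding could be used here).
    """
    pairs = []
    for i, feat_i in enumerate(features):
        best_j, best_sim = None, -math.inf
        for j, feat_j in enumerate(features):
            if i == j:
                continue  # an image is never paired with itself
            sim = cosine_similarity(feat_i, feat_j)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            pairs.append((i, best_j))
    return pairs
```

For a batch with features `[[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]`, the first two images select each other, and the third selects the second, since that is its nearest neighbour under this measure.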
