CSDS: End-to-End Aerial Scenes Classification With Depthwise Separable Convolution and an Attention Mechanism

Wang, Xinyu; Yuan, Liming; Xu, Haixia; Wen, Xianbin

doi:10.1109/jstars.2021.3117857

Cited by 31 publications

(18 citation statements)

References 59 publications


“…They used two kinds of attention modules (channel and spatial attention modules) to explore the correlations between image pixels derived from the channel and spatial dimensions, respectively. Wang et al [13] combined a channel-spatial attention algorithm and DS-Conv to propose a lightweight network called CSDS for remote sensing image classification.…”

Section: Attention Mechanisms (mentioning)

confidence: 99%


A lightweight and stochastic depth residual attention network for remote sensing scene classification

Wang, Yuan, et al. 2023. IET Image Processing


Due to the rapid development of satellite technology, high‐spatial‐resolution remote sensing (HRRS) images have highly complex spatial distributions and multiscale features, making the classification of such images a challenging task. The key to scene classification is to accurately understand the main semantic information contained in images. Convolutional neural networks (CNNs) have outstanding advantages in this field. Deep CNNs (D‐CNNs) with better performance tend to have more parameters and higher complexity. However, shallow CNNs have difficulty extracting the key features of complex remote sensing images. In this paper, we propose a lightweight network with a random depth strategy for remote sensing scene classification (LRSCM). We construct a convolutional feature extraction module, DCAB, which incorporates depthwise separable convolutional and inverted residual structures, effectively reducing the numbers of required parameters and computations, and retains and utilizes low‐level features. In addition, coordinate attention (CA) is integrated into the module, thereby further improving the network's ability to extract key local information. To further reduce the complexity of model training, the residual module adopts a stochastic depth strategy, providing the network with a random depth. Comparative experiments on five public datasets show that the LRSCM network can achieve results comparable to those of other state‐of‐the‐art methods.
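The abstract above credits depthwise separable convolution with reducing the number of parameters. As a rough illustration of that claim (not the authors' code), the sketch below counts the weights of a standard convolution against a depthwise convolution followed by a 1 x 1 pointwise convolution. The function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are assumptions chosen for the example, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these sizes the separable layer needs 8768 weights against 73728 for the standard layer, which matches the general ratio 1/c_out + 1/k².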


Section: Attention Mechanisms (mentioning)

confidence: 99%

“…Many classic networks, such as the Visual Geometry Group Network (VGGNet), AlexNet [9], and GoogLeNet [2], have demonstrated strong feature extraction capabilities and have been applied to remote sensing image classification. Many improved methods based on classic networks have also achieved state-of-the-art performance [10][11][12][13].…”

Section: Introduction (mentioning)

confidence: 99%

A lightweight and stochastic depth residual attention network for remote sensing scene classification

Wang, Yuan, et al. 2023. IET Image Processing


“…In FACNN, a supervised convolutional feature encoding module and a progressive aggregation strategy are proposed to aggregate intermediate features with semantic label information to achieve advanced classification accuracy. Wang et al 41 proposed a network for aerial scene classification based on deep separable convolution and spatial attention mechanism called CSDS. The model uses depth-separable convolution to extract features from channels and a residual pyramidal structure to connect and associate multiple layers of features with achieving state-of-the-art recognition accuracy.…”

Section: Classification Based On Depth Features (mentioning)

confidence: 99%

Section: Related Work (mentioning)

confidence: 99%

Remote sensing scene image classification model based on multi-scale features and attention mechanism

Wang et al. 2022. J. Appl. Rem. Sens.


Remote sensing scene classification has received more and more attention as important fundamental research in recent years. However, the redundant background information and complex spatial scale variability of remote sensing scene images make the existing convolutional neural network models, which mainly concentrate on global features, perform poorly. To effectively alleviate these problems, we proposed an MSRes-SplitNet model based on multiscale features and attention mechanisms for remote sensing scene image classification. First, MSRes blocks are constructed for the extraction of multi-scale features. Then, the multi-channel local features are fused by the Split-Attention block. Finally, the global and local feature information is aggregated by convolution, thus obtaining multi-scale features while alleviating the small-sample learning problem. Experiments are conducted on three publicly available datasets and compared with other state-of-the-art methods, showing that the proposed method MSRes-SplitNet has better performance while effectively reducing a large number of parameters.


“…After the twenty-first century, with the continuous development and maturity of hyperspectral images (HSI) technology and related theories, it has broad application prospects in the field of grassland ecology 5 . Hyperspectral for parameter detection has the advantages of multiple bands, high sensitivity and non-destructiveness 6 , 7 . It facilitates grass classification with study at close range.…”

Section: Introduction (mentioning)

confidence: 99%

Visible-NIR hyperspectral classification of grass based on multivariate smooth mapping and extreme active learning approach

Zhao, Pan, Wei, et al. 2022. Sci Rep


Grass community classification is the basis for the development of animal husbandry and dynamic monitoring of environment, which has become a critical problem to further strengthen the intelligent management of grassland. Compared with grass survey based on satellite remote sensing, the visible near infrared (NIR) hyperspectral not only monitor dynamically in a short distance, but also have high dimensions and detailed spectral information in each pixel. However, the hyperspectral labeled sample for classification is expensive and manual selection is more subjective. In order to solve above limitations, we proposed a visible-NIR hyperspectral classification model for grass based on multivariate smooth mapping and extreme active learning (MSM–EAL). Firstly, MSM is used to preprocess and reconstruct the spectrum. Secondly, by jointing XGBoost and active learning (AL), the advanced samples with the largest amount of information are actively selected to improve the performance of target classification. Innovation lies in: (1) MSM global enhanced preprocessing spectral reconstruction algorithm is proposed, in which isometric feature mapping is effectively applied to the grass hyperspectral for the first time. (2) EAL framework is constructed to solve the issue of high cost and small number for hyperspectral labeled samples, at the same time, enhance the physical essence behind spectral classification more intuitively. A field hyperspectral collection platform is assembled to establish nm resolution visible-NIR hyperspectral dataset of grass, Grass1, containing 750 samples, which to verify the effectiveness of the model. Experiments on the Grass1 dataset confirmed that compared with the full spectrum, the time consumption of MSM was reduced by 9.471 s with guaranteed overall accuracy (OA). Comparing EAL with AL, and other classification algorithms, EAL improves OA 22.2% over AL, and XAL has the best performance value on Kappa, Macro, Recall and F1-score, respectively. 
Altogether, the lightweight MSM–EAL model realizes intelligent and real-time classification, providing a new method for obtaining high-precision inter group classification of grass.

