Convolutional neural networks (CNNs) have exhibited excellent performance in hyperspectral image classification. However, because labeled hyperspectral data are scarce, it is difficult to achieve high classification accuracy with few training samples. In addition, although deep learning techniques have been applied to hyperspectral image classification, the abundant information contained in hyperspectral images means that spectral–spatial features are often insufficiently extracted. To address these issues, a spectral–spatial attention fusion with deformable convolution residual network (SSAF-DCR) is proposed for hyperspectral image classification. The proposed network is composed of three sequentially connected parts. In the first part, a dense spectral block reuses spectral features as fully as possible, followed by a spectral attention block that refines and optimizes them. In the second part, spatial features are extracted by a dense spatial block and selected by a spatial attention block. The results of the first two parts are then fused and sent to the third part, where deep spatial features are extracted by the DCR block. Together, these three parts realize effective spectral–spatial feature extraction, and experimental results on four commonly used hyperspectral datasets demonstrate that the proposed SSAF-DCR method is superior to several state-of-the-art methods with very few training samples.
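As a rough illustration of the kind of spectral attention block described above, the sketch below re-weights spectral bands with squeeze-and-excitation-style channel attention in PyTorch; the module name, reduction ratio, and layer choices are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical spectral attention sketch (squeeze-and-excitation style):
# each spectral band is re-weighted by a learned scalar in (0, 1).
import torch
import torch.nn as nn

class SpectralAttention(nn.Module):
    def __init__(self, n_bands: int, reduction: int = 8):  # reduction is an assumption
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze the spatial dimensions
        self.fc = nn.Sequential(
            nn.Linear(n_bands, n_bands // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(n_bands // reduction, n_bands),
            nn.Sigmoid(),  # per-band weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, bands, height, width), with bands acting as channels
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # refine features by emphasizing informative bands

if __name__ == "__main__":
    x = torch.randn(2, 103, 9, 9)  # e.g. a 9x9 patch with 103 bands
    print(SpectralAttention(103)(x).shape)  # torch.Size([2, 103, 9, 9])
```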
In recent years, deep learning methods have been widely used in hyperspectral image classification tasks owing to their powerful feature extraction ability. However, the features extracted by classical deep learning methods have limited discriminative ability, resulting in unsatisfactory classification performance. In addition, because labeled samples of hyperspectral images (HSIs) are limited, achieving high classification performance with few samples remains an active research topic. To solve these problems, this paper proposes a deep learning framework named the three-dimensional coordination attention mechanism network (3DCAMNet). A three-dimensional coordination attention mechanism (3DCAM) is designed that captures not only the long-range dependencies of HSI spatial positions along the vertical and horizontal directions but also the differences in importance between spectral bands. To extract the spectral and spatial information of HSIs more fully, a convolution module based on a convolutional neural network (CNN) is adopted, and a linear module is introduced after the convolution module to extract finer high-level features. To verify the effectiveness of 3DCAMNet, experiments were carried out on five datasets: Indian Pines (IP), Pavia University (UP), Kennedy Space Center (KSC), Salinas Valley (SV), and University of Houston (HT). The OAs obtained by the proposed method on the five datasets were 95.81%, 97.01%, 99.01%, 97.48%, and 97.69%, respectively, which are 3.71%, 9.56%, 0.67%, 2.89%, and 0.11% higher than those of the state-of-the-art A2S2K-ResNet. The experimental results show that, compared with some state-of-the-art methods, 3DCAMNet not only achieves higher classification performance but also exhibits stronger robustness.
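For intuition, a minimal sketch of a coordinate-attention-style module is given below: it pools along the vertical and horizontal spatial directions to capture long-range positional dependencies and, because spectral bands occupy the channel dimension, its attention maps also encode per-band importance. This follows the generic coordinate attention idea; all names and hyperparameters are assumptions rather than the published 3DCAM code.

```python
# Sketch of coordinate-style attention for an HSI patch, assuming bands
# occupy the channel dimension; not the authors' 3DCAM implementation.
import torch
import torch.nn as nn

class CoordSpectralAttention(nn.Module):
    def __init__(self, n_bands: int, reduction: int = 8):
        super().__init__()
        mid = max(n_bands // reduction, 4)
        self.conv1 = nn.Conv2d(n_bands, mid, kernel_size=1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, n_bands, kernel_size=1)  # vertical branch
        self.conv_w = nn.Conv2d(mid, n_bands, kernel_size=1)  # horizontal branch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, bands, H, W)
        b, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                       # (b, c, h, 1): pool over W
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (b, c, w, 1): pool over H
        y = self.act(self.conv1(torch.cat([x_h, x_w], dim=2)))  # shared encoding
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # per-row, per-band weights
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # per-column, per-band weights
        return x * a_h * a_w  # joint positional and spectral re-weighting
```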
In recent years, the convolutional neural network (CNN) has become one of the most popular deep learning frameworks and has been widely used in hyperspectral image classification tasks. Convolution (Conv) in a CNN uses filter weights to extract features within a local receptive field, with the weight parameters shared globally, and it focuses more on the high-frequency information of the image. In contrast, the Transformer can capture long-range dependencies between distant features through modelling and can adaptively focus on different regions. Moreover, the Transformer can be regarded as a low-pass filter that focuses more on the low-frequency information of the image. Given the complementary characteristics of Conv and the Transformer, the two can be integrated for fuller feature extraction. In addition, the most important image features correspond to discriminative regions, while secondary image features represent important but easily overlooked regions that are also conducive to the classification of HSIs. In this study, a complementary integrated Transformer network (CITNet) for hyperspectral image classification is proposed. Firstly, three-dimensional convolution (Conv3D) and two-dimensional convolution (Conv2D) are utilised to extract the shallow semantic information of the image. To enhance secondary features, a channel Gaussian modulation attention module is proposed and embedded between Conv3D and Conv2D. This module not only enhances secondary features but also suppresses the most important and least important ones. Then, considering the distinct and complementary characteristics of Conv and the Transformer, a complementary integrated Transformer module is designed. Finally, extensive experiments evaluate the classification performance of CITNet and several state-of-the-art networks on five common datasets. The experimental results show that, compared with these classification networks, CITNet provides better classification performance.
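One plausible reading of the channel Gaussian modulation idea is sketched below: per-channel descriptors are standardised and passed through a Gaussian centred at zero, so mid-importance (secondary) channels receive the largest weights while the strongest and weakest responses are damped. The formula, module name, and sigma value are assumptions, not the CITNet source.

```python
# Hedged sketch of channel Gaussian modulation attention: a Gaussian over
# standardised channel statistics peaks for mid-importance channels.
import torch
import torch.nn as nn

class ChannelGaussianModulation(nn.Module):
    def __init__(self, sigma: float = 1.0):  # sigma is an illustrative choice
        super().__init__()
        self.sigma = sigma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W)
        s = x.mean(dim=(2, 3))  # per-channel importance descriptor
        z = (s - s.mean(dim=1, keepdim=True)) / (s.std(dim=1, keepdim=True) + 1e-5)
        w = torch.exp(-z.pow(2) / (2 * self.sigma ** 2))  # largest for mid-level channels
        return x * w.unsqueeze(-1).unsqueeze(-1)  # damp extremes, keep secondary features
```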
Convolutional neural networks (CNNs) have been widely used in hyperspectral image classification in recent years. Training a CNN relies on a large amount of labeled sample data, yet the number of labeled samples in hyperspectral data is relatively small. Moreover, for hyperspectral images, fully extracting spectral and spatial feature information is the key to achieving high classification performance. To solve these issues, a deep spectral spatial inverted residuals network (DSSIRNet) is proposed. In this network, a data block random erasing strategy is introduced to alleviate the problem of limited labeled samples through data augmentation of small spatial blocks. In addition, a deep inverted residuals (DIR) module for spectral spatial feature extraction is proposed, which retains the effective features of each layer while avoiding network degradation. Furthermore, a global 3D attention module is proposed, which realizes fine extraction of spectral and spatial global context information while keeping the number of input and output feature maps the same. Experiments are carried out on four commonly used hyperspectral datasets. Extensive experimental results show that, compared with some state-of-the-art classification methods, the proposed method provides higher classification accuracy for hyperspectral images.
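The data block random erasing strategy reads like standard random erasing applied to the small spatial patches fed to the network; a sketch under that assumption follows, with the erasing probability and maximum region size chosen purely for illustration.

```python
# Hedged sketch of random erasing on a small HSI spatial block; the
# hyperparameters are illustrative, not the paper's exact strategy.
import torch

def random_erase(patch: torch.Tensor, p: float = 0.5, max_frac: float = 0.4) -> torch.Tensor:
    # patch: (bands, H, W) block cut from the hyperspectral cube
    if torch.rand(1).item() > p:
        return patch  # leave the patch unchanged with probability 1 - p
    _, h, w = patch.shape
    eh = max(1, int(h * max_frac * torch.rand(1).item()))  # erased-region height
    ew = max(1, int(w * max_frac * torch.rand(1).item()))  # erased-region width
    top = torch.randint(0, h - eh + 1, (1,)).item()
    left = torch.randint(0, w - ew + 1, (1,)).item()
    out = patch.clone()
    out[:, top:top + eh, left:left + ew] = patch.mean()  # overwrite with the mean value
    return out
```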