Multiscale Information Fusion for Hyperspectral Image Classification Based on Hybrid 2D-3D CNN

Gong, Hang; Li, Qiuxia; Li, Chunlai; Dai, Haishan; He, Zhiping; Wang, Wenjing; Li, Haoyang; Han, Feng; Tuniyazi, Abudusalamu; Mu, Tingkui

doi:10.3390/rs13122268

Cited by 44 publications

(18 citation statements)

References 47 publications

(48 reference statements)

“…CNN. In the early days, a classic attempt to apply deep learning to RGB video was to extend 2D CNN to form a two-stream architecture to obtain spatial features of video frames and motion features between frames, respectively [30][31][32]. An image is a projection from real-world 3D coordinates to 2D plane coordinates.…”

Section: Action Recognition Based On 3D Lightweight Multiscale (mentioning)

confidence: 99%

Analytical Model of Action Fusion in Sports Tennis Teaching by Convolutional Neural Networks

Li, Guo, Huang 2022

Computational Intelligence and Neuroscience

To improve the effectiveness of tennis teaching and strengthen students’ understanding and mastery of standard tennis movements, this work studies action recognition based on a three-dimensional (3D) convolutional neural network architecture. First, the recognition of human poses in tennis videos with OpenPose is discussed, and an athlete tracking algorithm is designed to follow the target player. Using the tracking data together with the movement characteristics of tennis, real-time semantic analysis discriminates movement types from the displacement of human key points. Second, tennis movement types are analyzed through 2D pose estimation of the players. Finally, a lightweight multiscale convolutional model is proposed for tennis player action recognition, along with a key frame segment network (KFSN) that fuses local information based on keyframes and improves the efficiency of learning from whole action videos. In simulation experiments on the public dataset UCF101, the proposed 3DCNN-based KFSN achieves a recognition rate of 94.8%. The average time per iteration is only 1/3 of that of the C3D network, and the model converges significantly faster. The 3DCNN-based information-fusion method for action recognition can effectively improve the recognition of tennis actions and improve students’ learning and understanding of actions during teaching.

“…Their results showed that the proposed method performed better than the other HSI classification methods. In another study, Gong et al. [55] proposed a multiscale squeeze-and-excitation pyramid pooling network (MSPN), and used a hybrid 2D-3D-CNN MSPN framework (which can learn and fuse deeper hierarchical spatial-spectral features with fewer training samples). The results demonstrated that the proposed method obtained a classification accuracy of 97.31% using only 0.1% of the training samples.…”

Section: Existing Deficiencies and Future Prospects (mentioning)

confidence: 99%

Three-Dimensional Convolutional Neural Network Model for Early Detection of Pine Wilt Disease Using UAV-Based Hyperspectral Images

Luo et al. 2021

Remote Sensing

As one of the most devastating disasters to pine forests, pine wilt disease (PWD) has caused tremendous ecological and economic losses in China. An effective way to prevent large-scale PWD outbreaks is to detect and remove the damaged pine trees at the early stage of PWD infection. However, early infected pine trees do not show obvious changes in morphology or color in the visible wavelength range, making early detection of PWD tricky. Unmanned aerial vehicle (UAV)-based hyperspectral imagery (HI) has great potential for early detection of PWD. However, the commonly used methods, such as the two-dimensional convolutional neural network (2D-CNN), fail to simultaneously extract and fully utilize the spatial and spectral information, whereas the three-dimensional convolutional neural network (3D-CNN) is able to collect this information from raw hyperspectral data. In this paper, we applied the residual block to 3D-CNN and constructed a 3D-Res CNN model, the performance of which was then compared with that of 3D-CNN, 2D-CNN, and 2D-Res CNN in identifying PWD-infected pine trees from the hyperspectral images. The 3D-Res CNN model outperformed the other models, achieving an overall accuracy (OA) of 88.11% and an accuracy of 72.86% for detecting early infected pine trees (EIPs). Using only 20% of the training samples, the OA and EIP accuracy of 3D-Res CNN can still achieve 81.06% and 51.97%, which is superior to the state-of-the-art method in the early detection of PWD based on hyperspectral images. Collectively, 3D-Res CNN was more accurate and effective in early detection of PWD. In conclusion, 3D-Res CNN is proposed for early detection of PWD in this paper, making the prediction and control of PWD more accurate and effective. This model can also be applied to detect pine trees damaged by other diseases or insect pests in the forest.

“…Each branch contains two modules, namely the pyramidal spectral block (the spectral attention) and the pyramidal spatial block (the spatial attention). To solve the limitation that the pyramidal convolutional layer has a single-size receptive field, Gong et al. proposed a pyramid pooling module, which can aggregate multiple receptive fields of different scales and obtain more discriminative spatial context information [53]. The pyramid pooling module is mainly implemented by average pooling layers of different sizes, and then the feature map is restored to the original image size through deconvolution.…”

Section: Pyramidal Network Structure (mentioning)

confidence: 99%
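
The pyramid pooling idea in the statement above (average pooling at several sizes, then restoring each result to the original size) can be sketched in plain Python. This is an illustrative sketch only, not the module from Gong et al.: the function names, the bin sizes `(1, 2, 4)`, and the single-channel list-of-rows feature map are assumptions, and nearest-neighbour upsampling is swapped in for the learned deconvolution that the statement describes.

```python
def avg_pool_to_bins(fmap, bins):
    """Average-pool a 2D map (list of rows) down to a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Row range of cell i: floor of the start, ceiling of the end.
        r0, r1 = (i * h) // bins, -((-(i + 1) * h) // bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, -((-(j + 1) * w) // bins)
            cell = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Restore a pooled grid to h x w by nearest-neighbour copying.

    Assumption: stands in for the deconvolution step, which would be learned.
    """
    bins = len(pooled)
    return [[pooled[y * bins // h][x * bins // w] for x in range(w)]
            for y in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2, 4)):
    """Return the input map plus one restored context map per pooling scale."""
    h, w = len(fmap), len(fmap[0])
    branches = [fmap]
    for bins in bin_sizes:
        branches.append(upsample_nearest(avg_pool_to_bins(fmap, bins), h, w))
    return branches  # a real network would concatenate these along channels


feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
branches = pyramid_pool(feature_map)
```

Each branch has the same spatial size as the input, so coarse context (the 1 x 1 branch holds the global mean everywhere) and finer context can be stacked per pixel.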

Densely Connected Pyramidal Dilated Convolutional Network for Hyperspectral Image Classification

et al. 2021

Recently, with the extensive application of deep learning techniques, particularly the convolutional neural network (CNN), in the hyperspectral image (HSI) field, research on HSI classification has entered a new stage. To overcome the small receptive field of naive convolution, dilated convolution was introduced into HSI classification. However, dilated convolution usually generates blind spots in the receptive field, so the spatial information obtained is discontinuous. To solve this problem, a densely connected pyramidal dilated convolutional network (PDCNet) is proposed in this paper. Firstly, a pyramidal dilated convolutional (PDC) layer that integrates different numbers of sub-dilated convolutional layers is proposed, where the dilated factor of the sub-dilated convolution increases exponentially, achieving multi-scale receptive fields. Secondly, the number of sub-dilated convolutional layers increases in a pyramidal pattern with the depth of the network, thereby capturing more comprehensive hyperspectral information in the receptive field. Furthermore, a feature fusion mechanism combining pixel-by-pixel addition and channel stacking is adopted to extract more abstract spectral–spatial features. Finally, in order to reuse the features of the previous layers more effectively, dense connections are applied in densely pyramidal dilated convolutional (DPDC) blocks. Experiments on three well-known HSI datasets indicate that the proposed PDCNet has good classification performance compared with other popular models.
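
The blind-spot problem mentioned in this abstract can be illustrated with a small 1D calculation. The sketch below is not PDCNet; it assumes odd-sized kernels applied one after another and simply enumerates which input offsets can influence one output position. The function names and the example dilation schedules are hypothetical.

```python
def reachable_offsets(dilations, kernel_size=3):
    """Input offsets that reach one output after stacking dilated 1D convs.

    Assumes an odd kernel_size and layers applied in sequence.
    """
    half = kernel_size // 2
    offsets = {0}
    for d in dilations:
        # A kernel with dilation d samples positions -half*d, ..., 0, ..., half*d.
        taps = [t * d for t in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return offsets


def blind_spots(dilations, kernel_size=3):
    """Offsets inside the receptive-field span that are never sampled."""
    offsets = reachable_offsets(dilations, kernel_size)
    span = set(range(min(offsets), max(offsets) + 1))
    return sorted(span - offsets)


# Same dilation repeated: only even offsets are ever sampled.
constant_gaps = blind_spots([2, 2, 2])
# Dilation growing exponentially (1, 2, 4): the span is covered without gaps.
exponential_gaps = blind_spots([1, 2, 4])
exponential_span = len(reachable_offsets([1, 2, 4]))
```

Under these assumptions, repeating one dilation leaves every odd offset unsampled, while the exponentially growing schedule covers a contiguous span of 15 positions.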
