2020
DOI: 10.1109/access.2020.2977856

A Discriminative Deep Model With Feature Fusion and Temporal Attention for Human Action Recognition

Abstract: Activity recognition, which aims to accurately distinguish human actions in complex environments, plays a key role in human-robot/computer interaction. However, long-lasting and similar actions cause poor feature sequence extraction and thus reduce recognition accuracy. We propose a novel discriminative deep model (D3D-LSTM) based on 3D-CNN and LSTM for both single-target and interaction action recognition to improve the spatiotemporal processing performance. Our models have several nota…

Cited by 35 publications (9 citation statements)
References 56 publications (61 reference statements)
“…In future work, we will improve the feature extraction by introducing the multiscale features segmentation method and the bionic mechanism, such as [42], [43] and [44]. Besides, we will collect more dataset with data preprocessing, which is expected to be a benchmark dataset for the metal surface defects detection.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Hence, many deep learning methods are proposed for facial expression recognition and achieve state-of-the-art results. Many deep networks are very effective for key feature extraction, such as the EmotiW, the DBNs, the AUDNs, the E3D-LSTM, 3D-MM, and the D-ConvLSTM [38]-[43]. Deep learning-based models focus on the following research aspects: 1) Improve the deep model by re-designing the network structure and random weight initialization; 2) Mining more discriminative facial features; 3) Reduce the dependence of the deep model on a large number of data samples; 4) Publish datasets that have high-quality labels and are collected under actual world.…”
Section: B Feature Extraction
Citation type: mentioning
confidence: 99%
“…DenseNet also made the network narrower, the number of parameters compared to other models was significantly reduced, and the training efficiency is improved at the same time. In 2020, J. Yu et al [27] used improved Inception ResNet layers for automatic recognition, which further improved the performance of the DCNN algorithm, but it could not merge the multilevel and multiscale features. Thus, we proposed the CFPN architecture to further improve the performance of the FER algorithm.…”
Section: Convolutional Neural Network For Classification
Citation type: mentioning
confidence: 99%
“…However, end-to-end learning tasks often require a large dataset, which poses a great challenge to the field of expression recognition lacking label data; additional depth information could potentially improve the performance. These aspects will be the main focus in the future research [40], [41]. Alexnet [39] 0.243178s 0.74 Inception v2 [22] 0.565114s 0.81…”
Section: Conclusion and Feature Work
Citation type: mentioning
confidence: 99%