2018
DOI: 10.1007/s10489-018-1347-3

Learning multi-temporal-scale deep information for action recognition

Cited by 24 publications (11 citation statements)
References 33 publications
“…On this dataset, the proposed method (Inception-Resnet-v2 plus WVLGTP) greatly outperforms the VLBP [1], MBP [2], ML-HDP [25], two-stream CNN [10], and multi-resolution CNN [12] by 16.5%, 13.7%, 5.6%, 6.9%, and 30.4%, respectively. In contrast, our approach slightly beats the TDD [31], TC3D [32], Res3D [33], ActionVLAD [35], and Sequential VLAD [36] since these approaches also achieved more discriminative power by considering the deep features and motion feature with CNN. Furthermore, ATW CNN [34] shows almost similar accuracy with our approach, since their approach incorporates the temporal attention with CNN.…”
Section: Methods (mentioning)
confidence: 93%
“…Similarly, WVLGTP shows competitive performance with Dense Trajectories [27], iDT [9], and Line Pooling [37]. However, TDD [31], Res3D [33], Action VLAD [35], and Sequential VLAD [36] show better accuracy than the proposed WVLGTP due to their discriminative power while employing a large number of action categories. On this dataset, the proposed method (Inception-Resnet-v2 plus WVLGTP) greatly outperforms the VLBP [1], MBP [2], ML-HDP [25], two-stream CNN [10], and multi-resolution CNN [12] by 16.5%, 13.7%, 5.6%, 6.9%, and 30.4%, respectively.…”
Section: Methods (mentioning)
confidence: 99%
“…Using 3D convolutional network to carry out the above operations will lead to a significant increase in time and space complexity. In order to effectively solve the above problems, the convolutional neural network is comprehensively improved [7,8]. After the improvement, multiple image features can be input at the same time, and the convolutional kernel is two-dimensional, the overall complexity of the algorithm is also effectively reduced, and the computational efficiency is greatly improved [9].…”
Section: Human Movement Recognition (mentioning)
confidence: 99%
“…Every atomic action was denoted as a composite latent state consisted by a latent semantic attribute and a latent geometric attribute. In their work, hidden Markov model (HMM) with AdaBoost, dynamic temporal warping, and recur- Yao et al [17] studied parallel pair discriminant correlation analysis (PPDCA) to fuse the multi-temporal-scale information with a lower dimension. However, the multi-temporal-scale in this method means features related to different numbers of frames.…”
Section: Related Work (mentioning)
confidence: 99%