2018
DOI: 10.1109/tnnls.2017.2740318

Deep Manifold Learning Combined With Convolutional Neural Networks for Action Recognition

Abstract: Deep representation learning has been widely applied to action recognition. However, there have been few investigations into how to use the structural manifold information among different action videos to improve recognition accuracy and efficiency. In this paper, we propose incorporating the manifold of the training samples into deep learning, an approach we define as deep manifold learning (DML). The proposed DML framework can be adapted to most existing deep networks to learn more discriminative features …

Cited by 67 publications (23 citation statements)
References 46 publications
“…Action recognition has attracted much attention in the past decade [22], [29]. The targets of action recognition evolute from simple background to real-world video sequences, while the recognition methods shift from hand-crafted to learning based.…”
Section: Related Work (mentioning)
confidence: 99%
“…Action classification in video had been one of the most challenging problems next to the image classification [10]. Recent deep learning approaches including 3D CNN [11], two-stream CNNs [2], C3D [12], TDD [13], TSN [14], ST-ResNet+iDT [15], L2STM [16], ST-VLMPF [17], P3D ResNet [18], I3D [19], 3D ResNeXt [20], R(2+1)D-TwoStream [7], CO2FI+ASYN [21], and DML [22] have shown state-of-the-art performances in action recognition. The recent development of CNNs with spatio-temporal 3D convolutional kernels (3D CNNs) rapidly grows and contributes to significant advances in video recognition [7], [18]-[20] because 3D CNNs can be used to directly extract spatio-temporal features from raw videos.…”
Section: Introduction (mentioning)
confidence: 99%
“…It has resulted in structured sparse patterns that can accelerate the online inference. However, such a magnitude-based measurement (e.g., ℓ1-norm) is too simple and inefficient to determine the importance of each filter, due to the existence of nonlinear activation functions (e.g., rectifier linear unit (ReLU) [40]) and other complex operations (e.g., pooling and batch normalization [41]). To explain, filters with small ℓ1-norm values may have large responses in the output.…”
Section: Introduction (mentioning)
confidence: 99%
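
The point made in the last statement above, that a filter's ℓ1-norm can disagree with the size of its response once a nonlinearity is applied, can be illustrated with a toy example. This is only a sketch and is not taken from the citing paper: the 2x2 filters, the input patch, and the helper names `l1_norm` and `relu_response` are hypothetical and were chosen purely for illustration.

```python
import numpy as np


def l1_norm(filt):
    """Magnitude-based importance score: sum of absolute filter weights."""
    return float(np.abs(filt).sum())


def relu_response(filt, patch):
    """Response of one filter on one patch: dot product followed by ReLU."""
    return float(max(0.0, np.sum(filt * patch)))


# Hypothetical input patch and filters (illustrative values only).
patch = np.array([[1.0, 1.0],
                  [1.0, 1.0]])

# Small weights, all aligned with the patch.
small_filter = np.array([[0.5, 0.5],
                         [0.5, 0.5]])

# Large weights, but the pre-activation is negative, so ReLU zeroes it.
large_filter = np.array([[2.0, -3.0],
                         [2.0, -3.0]])

for name, filt in [("small_filter", small_filter), ("large_filter", large_filter)]:
    print(name, "l1 =", l1_norm(filt), "response =", relu_response(filt, patch))
```

On this patch the filter with the smaller ℓ1-norm produces the larger post-ReLU response, so ranking filters by ℓ1-norm alone would remove the one that actually contributes to the output here.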