2015
DOI: 10.1007/s11263-015-0867-0

Fusing $\mathcal{R}$ Features and Local Features with Context-Aware Kernels for Action Recognition

Abstract: The performance of action recognition in video sequences depends significantly on the representation of actions and the similarity measurement between the representations. In this paper, we combine two kinds of features extracted from the spatio-temporal interest points with context-aware kernels for action recognition. For the action representation, local cuboid features extracted around interest points are very popular using a Bag of Visual Words (BOVW) model. Such representations, however, ignore potentiall…

Cited by 9 publications
(3 citation statements)
References 56 publications
“…The outcome of this study shows that it had good ability that understands an action from unseen views. In the same way the work of Yuan et al [42], the authors have used the multiple techniques which including R-features transform with the context-aware kernel learning algorithm that captures the distributed actions and to classify the similarities between the representations of activity in the video sequence. The experimental performance delivers that the proposed approach is effective for human action recognition from the captured videos.…”
Section: Existing Approaches (mentioning)
confidence: 99%
“…In the sequential method, the temporal features such as appearance and pose are obtained from the hidden Markov model [54–56], conditional random fields [57–60], and structured support vector machine [61–64]. Furthermore, representative key poses are learned for efficient representation of human actions [33, 34, 65–72] to build a compact pose sequence.…”
Section: Introduction (mentioning)
confidence: 99%
“…Deep learning techniques [73] such as 2D ConvNets [21, 74] and 3D ConvNets [26] perform feature learning via convolution operator and temporal modeling [75]. The initialization of a deep neural network [72] is crucial for training the model. To ensure that the state of the hidden layers follow a uniform distribution, a model parameter [76–78] is initialized.…”
Section: Introduction (mentioning)
confidence: 99%