2017
DOI: 10.1109/tsmc.2016.2625840

Group Sparse-Based Mid-Level Representation for Action Recognition

Cited by 17 publications (8 citation statements)
References 71 publications
“…Similar to this notion, we learn sparse dictionaries too but only to integrate them with the feature vectors and not to use them as a mode of recognition. In a recent work by Zhang et al (2017), a saliency-driven max-pooling scheme is used to represent a video, and a sparse classifier model is applied to select the discriminative parts of the person. To model complex activities in videos, Wang et al (2014) defines a locally weighted word context descriptor to improve interest-point-based representation and learn action units using the graph regularized nonnegative matrix factorization.…”
Section: Related Methods (mentioning)
confidence: 99%
“…Dense sampling method has also been adopted in extracting local features. Examples included Nguyen et al [38], Liu et al [72], Hu et al [37], Iosifidis et al [85], Luo and Lu [64], and Zhang et al [36] in which they used a dense sampling method to obtain HOG, HOF as well as motion boundary. In a similar approach, Zhang et al [40] also applied dense sampling method and tracked these points to form dense trajectory feature points and then incorporated the Fisher vector encoding for feature reduction.…”
Section: Updated Review (mentioning)
confidence: 99%
“…Ding and Qu extracted SIFT features from the selected interest points and then used BoW to construct the visual word in order to build the vocabulary for the feature descriptors. A more direct approach is to simply concatenate the low‐level features such as HOG, HOF, MBH and trajectories to form feature vectors [36].…”
Section: Updated Review (mentioning)
confidence: 99%
“…However, this kind of mapping has some shortcomings: first, the stepping characteristic is semantic and can easily be understood and cracked. Second, the calculation complexity of stepping recognition is still relatively high, although the motion analysis and tracking of a video have achieved significant progress recently [31,32]. Third, the capacity of information hiding is low since only one bit is hidden in every frame.…”
Section: Introduction (mentioning)
confidence: 99%