Group Sparse-Based Mid-Level Representation for Action Recognition

Zhang, Shiwei; Chen, Feifei; Luo, Sihui

doi:10.1109/tsmc.2016.2625840

Cited by 17 publications

(8 citation statements)

References 71 publications

“…Similar to this notion, we learn sparse dictionaries too but only to integrate them with the feature vectors and not to use them as a mode of recognition. In a recent work by Zhang et al (2017), a saliency-driven max-pooling scheme is used to represent a video, and a sparse classifier model is applied to select the discriminative parts of the person. To model complex activities in videos, Wang et al (2014) defines a locally weighted word context descriptor to improve interest-point-based representation and learn action units using the graph regularized nonnegative matrix factorization.…”

Section: Related Methods (mentioning)

confidence: 99%

A sparse coded composite descriptor for human activity recognition

et al. 2021

This paper proposes a novel algorithm for computing discriminative descriptors named as a sparse coded composite descriptor (SCCD) for robust human activity recognition. The proposed method blends the state-of-the-art handcrafted features and the discriminative nature of the sparse representation of visual information. The human activity is firstly modelled using any handcrafted feature, and then the sparse codes computed on a discriminative sparse dictionary of these features are embedded to provide discrimination in the feature set. Finally, a support vector machine (SVM) is trained using the proposed SCCDs to perform classification of different human activities. A new feature named as differential motion descriptor (DMD) is also proposed to extract the motion as well as spatial information from an activity video. The simulation results reveal that in comparison with the handcrafted feature, the corresponding SCCD improves the recognition accuracy significantly. The proposed method is compared with state-of-the-art methods on KTH, Ballet, UCF50, and HMDB51 datasets and the proposed methodology of composite features outperforms these methods in terms of recognition accuracy.

“…Dense sampling method has also been adopted in extracting local features. Examples included Nguyen et al [38], Liu et al [72], Hu et al [37], Iosifidis et al [85], Luo and Lu [64], and Zhang et al [36] in which they used a dense sampling method to obtain HOG, HOF as well as motion boundary. In a similar approach, Zhang et al [40] also applied dense sampling method and tracked these points to form dense trajectory feature points and then incorporated the Fisher vector encoding for feature reduction.…”

Section: Updated Review (mentioning)

confidence: 99%

“…Ding and Qu extracted SIFT features from the selected interest points and then used BoW to construct the visual word in order to build the vocabulary for the feature descriptors. A more direct approach is to simply concatenate the low‐level features such as HOG, HOF, MBH and trajectories to form feature vectors [36].…”

Section: Updated Review (mentioning)

confidence: 99%

“…Presently, the most popular approach that has shown reliable and consistent performance for classification of human action has been the SVM classifier. Due to the complexity of the problems, there have been also many works that employ different instantiations of SVM such as single‐class SVM [9, 10, 14, 16, 23, 28, 34, 36–39, 42, 45, 47, 53, 60, 63, 64, 66, 67], latent SVM [46], multiclass SVM [12, 13, 19, 26, 35, 44, 51, 57], multiclass Kernel SVM [38, 62]. Wu et al [50] applied a two‐stage classifier in their work.…”

Section: Updated Review (mentioning)

confidence: 99%

Advances in human action recognition: an updated survey

Abu-Bakar, 2019, IET Image Processing

Research in human activity recognition (HAR) has seen tremendous growth and continuously receiving attention from both the Computer Vision and the Image Processing communities. Due to the existence of numerous publications in this field, undoubtedly, there have been a number of review papers on this subject that categorise these techniques. Many of the recent works have started to tackle more challenging problems and these proposed techniques are addressing more realistic real‐world scenarios. Conspicuously, an updated survey that covers these methods is timely due. To simplify the categorisation, this study takes a two‐layer hierarchical approach. At the top level, the categorisation is based on the basic process flow of HAR, i.e. input data‐type, features‐type, descriptor‐type, and classifier‐type. At the second layer, each of these components is further subcategorised based on the diversity of the proposed methods. Finally, a remark on the coming popularity of deep learning approach in this field is also given.

“…However, this kind of mapping has some shortcomings: first, the stepping characteristic is semantic and can easily be understood and cracked. Second, the calculation complexity of stepping recognition is still relatively high, although the motion analysis and tracking of a video have achieved significant progress recently [31,32]. Third, the capacity of information hiding is low since only one bit is hidden in every frame.…”

Section: Introduction (mentioning)

confidence: 99%

Coverless Steganography Based on Motion Analysis of Video

Tan, Qin, Xiang, et al. 2021, Security and Communication Networks

With the rapid development of interactive multimedia services and camera sensor networks, the number of network videos is exploding, which has formed a natural carrier library for steganography. In this study, a coverless steganography scheme based on motion analysis of video is proposed. For every video in the database, the robust histograms of oriented optical flow (RHOOF) are obtained, and the index database is constructed. The hidden information bits are mapped to the hash sequences of RHOOF, and the corresponding indexes are sent by the sender. At the receiver, through calculating hash sequences of RHOOF from the cover video, the secret information can be extracted successfully. During the whole process, the cover video remains original without any modification and has a strong ability to resist steganalysis. The capacity is investigated and shows good improvement. The robustness performance is prominent against most attacks such as pepper and salt noise, speckle noise, MPEG-4 compression, and motion JPEG 2000 compression. Compared with the existing coverless information hiding schemes based on images, the proposed method not only obtains a good trade-off between hiding information capacity and robustness but also can achieve higher hiding success rate and lower transmission data load, which shows good practicability and feasibility.
