2016
DOI: 10.1049/iet-cvi.2015.0416

Multi‐view human action recognition using 2D motion templates based on MHIs and their HOG description

Abstract: In this paper, a new multi-view human action recognition approach is proposed that exploits low-dimensional motion information of actions. Before feature extraction, pre-processing steps are performed to remove noise from the silhouettes, incurred by imperfect but realistic segmentation. 2D motion templates based on the Motion History Image (MHI) are computed for each view/action video, which copes with the high-dimensionality issue arising from multi-camera data. Histograms of Oriented Gradients (HOGs) are …
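To make the templates-plus-description pipeline concrete, the sketch below computes a Bobick-Davis style MHI from a stack of cleaned binary silhouettes and then describes it with a HOG vector. This is a minimal illustration under assumed parameters (τ, the decay step, and the HOG cell/block sizes are not taken from the paper), not the authors' implementation.

```python
import numpy as np
from skimage.feature import hog

def motion_history_image(silhouettes, tau=255.0, decay=8.0):
    """Bobick-Davis style MHI: recent motion is bright, older motion fades.

    silhouettes: (T, H, W) binary foreground masks, already denoised.
    """
    mhi = np.zeros(silhouettes.shape[1:], dtype=np.float32)
    prev = silhouettes[0].astype(bool)
    for frame in silhouettes[1:]:
        cur = frame.astype(bool)
        motion = cur ^ prev                      # pixels that changed between frames
        mhi = np.where(motion, tau, np.maximum(mhi - decay, 0.0))
        prev = cur
    return mhi

def mhi_hog(mhi):
    """Low-dimensional HOG description of the 2D motion template."""
    return hog(mhi, orientations=9, pixels_per_cell=(16, 16),
               cells_per_block=(2, 2), block_norm='L2-Hys')
```

Because a whole video collapses into a single H×W template before HOG is applied, the per-view feature stays low-dimensional regardless of sequence length, which is the dimensionality argument made in the abstract.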

Cited by 57 publications (56 citation statements)
References 47 publications
“…Experiments show that PMHI outperforms the recall rate of recent methods on the MuHAVi-uncut dataset [20] as well as the CVPR 2012 Change Detection dataset [21]. The MuHAVi-uncut dataset is relatively new [22] and thus, to our knowledge, has not yet been used for temporal segmentation purposes. Therefore, our work can also serve as a baseline for the temporal segmentation of the MuHAVi-uncut dataset.…”
Section: Non-action
confidence: 99%
“…MuHAVi-uncut is a dataset of long RGB video recordings (8 cameras) of people doing prescribed actions [22]. The dataset provides a set of silhouettes obtained by a good but not perfect foreground estimation algorithm.…”
Section: A. Datasets and Evaluation Measure
confidence: 99%
“…Then they utilized histograms of the dominant angle and intensity to represent each sequence and concatenated the histograms of all views as the final feature of the multi-view sequences. Murtaza et al. [21] developed a silhouette-based, view-independent action recognition scheme. They computed Motion History Images (MHIs) for each view and employed Histograms of Oriented Gradients (HOGs) to extract a low-dimensional description of them.…”
Section: Related Work
confidence: 99%
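The view-concatenation strategy described in this excerpt can be expressed in a few lines. The helper below is hypothetical glue (the `describe` argument could be, e.g., the MHI+HOG sketch given earlier), not code from either cited paper.

```python
import numpy as np

def multiview_feature(view_sequences, describe):
    """Concatenate per-view descriptors into one multi-view feature vector.

    view_sequences: list with one silhouette stack per camera view.
    describe: maps a single view's sequence to a 1-D feature vector.
    """
    return np.concatenate([describe(seq) for seq in view_sequences])
```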
“…Since our method takes human-centered subvolumes recorded from multiple views as input, each decision tree is grown from the cuboids extracted from these subvolumes. Given predefined parameters dep_max and three thresholds in (0, 1), Algorithm 1 outputs one decision tree:

(1) Build a bootstrap dataset by random sampling from the training cuboids with replacement;
(2) Create a root node, set its depth to 1, then assign all bootstrapped cuboids to it;
(3) Initialize an unsettled node queue Υ = ∅ and push the root node into Υ;
(4) while Υ ≠ ∅ do
(5)   Pop the first node in Υ;
(6)   if its depth is larger than dep_max, or the cuboids assigned to it belong to the same action and position, then
(7)     Label the node as a leaf, then calculate P and Q from the cuboids at the node;
(8)     Add the corresponding triple into the decision tree;
(9)   else
(10)    Initialize the feature candidate set Δ = ∅;
(11)    if a random number is below the first threshold then
(12)      Add a set of randomly selected optical flow features to Δ;
(13)    else
(14)      Add a set of randomly selected HOG3D features to Δ;
(15)    end if
(16)    if a random number is below the second threshold then
(17)      Add two-dimensional temporal context features to Δ;
(18)    end if
(19)    Set maxgain = −∞ and generate a random number;
(20)    for each feature in Δ do
(21)      if the random number is below the third threshold then
(22)        Search for the corresponding threshold and compute the information gain in terms of the action labels of the cuboids arriving at the node;
(23)      else
(24)        Search for the corresponding threshold and compute the information gain in terms of the positions of the cuboids arriving at the node;
(25)      end if
(26)      if the gain exceeds maxgain then
(27)        Record the current feature and threshold as the best split;
(28)      end if
(29)    end for
(30)    Create left and right child nodes, set their depth to dep + 1, assign each cuboid arriving at the node to a child according to the best feature and threshold, then push both children into Υ;
(31)    Add the corresponding quintuple into the decision tree;
(32)  end if
(33) end while
(34) return the decision tree;

Algorithm 1: Construction of a decision tree.…”
Section: Experimental Setting
confidence: 99%
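For readers who want to trace Algorithm 1's control flow, here is a simplified, runnable Python sketch. It keeps the bootstrap, the unsettled-node queue, the random choice between optical-flow and HOG3D channels, the random choice between action-label and position gains, and the depth/purity stopping rule, but it simplifies the threshold search to the median and omits the temporal-context branch; all names and default values are assumptions, not the cited authors' code.

```python
import numpy as np
from collections import deque

def entropy(labels):
    """Shannon entropy of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def info_gain(values, threshold, labels):
    """Gain of splitting `labels` by the test `values <= threshold`."""
    left = values <= threshold
    if left.all() or not left.any():
        return 0.0
    n = len(labels)
    return (entropy(labels)
            - left.sum() / n * entropy(labels[left])
            - (~left).sum() / n * entropy(labels[~left]))

def build_tree(X_flow, X_hog, actions, positions, dep_max=10,
               p_flow=0.5, p_action=0.5, n_cand=10, rng=None):
    """Grow one randomized tree in the spirit of Algorithm 1.

    X_flow / X_hog: (N, D) per-cuboid feature matrices; `actions` and
    `positions` are integer NumPy label arrays of length N.
    """
    rng = rng or np.random.default_rng(0)
    N = len(actions)
    root = {'idx': rng.integers(0, N, N), 'depth': 1}   # bootstrap sample
    queue = deque([root])                               # unsettled nodes
    while queue:
        node = queue.popleft()
        idx, depth = node.pop('idx'), node['depth']
        pure = (np.unique(actions[idx]).size == 1
                and np.unique(positions[idx]).size == 1)
        if depth >= dep_max or pure:
            node.update(leaf=True, P=np.bincount(actions[idx]),
                        Q=np.bincount(positions[idx]))  # leaf distributions
            continue
        X = X_flow if rng.random() < p_flow else X_hog  # feature channel
        y = actions if rng.random() < p_action else positions
        best_gain, f_star, t_star = -np.inf, None, None
        for f in rng.choice(X.shape[1], size=n_cand, replace=False):
            t = float(np.median(X[idx, f]))             # simplified threshold search
            g = info_gain(X[idx, f], t, y[idx])
            if g > best_gain:
                best_gain, f_star, t_star = g, f, t
        left = idx[X[idx, f_star] <= t_star]
        right = idx[X[idx, f_star] > t_star]
        if left.size == 0 or right.size == 0:           # degenerate split -> leaf
            node.update(leaf=True, P=np.bincount(actions[idx]),
                        Q=np.bincount(positions[idx]))
            continue
        node.update(leaf=False, feature=int(f_star), threshold=t_star,
                    channel='flow' if X is X_flow else 'hog3d')
        node['left'] = {'idx': left, 'depth': depth + 1}
        node['right'] = {'idx': right, 'depth': depth + 1}
        queue.extend([node['left'], node['right']])
    return root
```

A forest is then just repeated calls, e.g. `trees = [build_tree(Xf, Xh, a, p, rng=np.random.default_rng(s)) for s in range(T)]`, with each tree seeing its own bootstrap sample.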
“…This motion descriptor is based on the motion direction and a histogram of motion intensity, followed by a support vector machine for classification. Another method, based on 2D motion templates using motion history images and histograms of oriented gradients, was proposed in [94]. In [50], an action recognition method was proposed based on the key elements of motion encoding and local changes in motion direction, encoded with the bag-of-words technique.…”
Section: Motion-based Approaches
confidence: 99%
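The bag-of-words step mentioned for [50] is a standard codebook encoding. Below is a generic sketch; the codebook size, the use of k-means, and the L1 normalization are assumptions for illustration, not details from the cited work.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(train_descriptors, k=200, seed=0):
    """Cluster local motion descriptors into a k-word visual vocabulary."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(train_descriptors)

def bow_histogram(codebook, video_descriptors):
    """Represent one video as a normalized histogram of visual words."""
    words = codebook.predict(video_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)              # L1-normalized histogram
```

The resulting fixed-length histograms can then feed a standard classifier, such as the support vector machine mentioned in the first sentence of the excerpt.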