2016
DOI: 10.1049/iet-cvi.2015.0416

Multi‐view human action recognition using 2D motion templates based on MHIs and their HOG description

Abstract: In this paper, a new multi-view human action recognition approach is proposed that exploits low-dimensional motion information of actions. Before feature extraction, pre-processing steps are performed to remove noise from the silhouettes, incurred by imperfect but realistic segmentation. 2D motion templates based on the Motion History Image (MHI) are computed for each view/action video, which copes with the high-dimensionality issue arising from multi-camera data. Histograms of Oriented Gradients (HOGs) are …
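To make the templates-plus-description pipeline concrete, the sketch below computes a Bobick-Davis style MHI from a stack of cleaned binary silhouettes and then describes it with a HOG vector. This is a minimal illustration under assumed parameters (τ, the decay step, and the HOG cell/block sizes are not taken from the paper), not the authors' implementation.

```python
import numpy as np
from skimage.feature import hog

def motion_history_image(silhouettes, tau=255.0, decay=8.0):
    """Bobick-Davis style MHI: recent motion is bright, older motion fades.

    silhouettes: (T, H, W) binary foreground masks, already denoised.
    """
    mhi = np.zeros(silhouettes.shape[1:], dtype=np.float32)
    prev = silhouettes[0].astype(bool)
    for frame in silhouettes[1:]:
        cur = frame.astype(bool)
        motion = cur ^ prev                      # pixels that changed between frames
        mhi = np.where(motion, tau, np.maximum(mhi - decay, 0.0))
        prev = cur
    return mhi

def mhi_hog(mhi):
    """Low-dimensional HOG description of the 2D motion template."""
    return hog(mhi, orientations=9, pixels_per_cell=(16, 16),
               cells_per_block=(2, 2), block_norm='L2-Hys')
```

Because a whole video collapses into a single H×W template before HOG is applied, the per-view feature stays low-dimensional regardless of sequence length, which is the dimensionality argument made in the abstract.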

Cited by 57 publications (56 citation statements)
References 47 publications
“…Experiments show that PMHI outperforms the recall rate of recent methods on the MuHAVi-uncut dataset [20] as well as the CVPR 2012 Change Detection dataset [21]. The MuHAVi-uncut dataset is relatively new [22] and thus, to our knowledge, has not yet been used for temporal segmentation purposes. Therefore, our work can also serve as a baseline for the temporal segmentation of the MuHAVi-uncut dataset.…”
Section: Non-action
confidence: 99%
“…MuHAVi-uncut is a dataset of long RGB video recordings (8 cameras) of people doing prescribed actions [22]. The dataset provides a set of silhouettes obtained by a good but not perfect foreground estimation algorithm.…”
Section: A. Datasets and Evaluation Measure
confidence: 99%
“…Then they utilized histograms of the dominant angle and intensity to represent each sequence and concatenated the histograms of all views as the final feature of the multi-view sequences. Murtaza et al. [21] developed a silhouette-based, view-independent action recognition scheme. They computed Motion History Images (MHIs) for each view and employed Histograms of Oriented Gradients (HOGs) to extract a low-dimensional description of them.…”
Section: Related Work
confidence: 99%
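The view-concatenation strategy described in this excerpt can be expressed in a few lines. The helper below is hypothetical glue (the `describe` argument could be, e.g., the MHI+HOG sketch given earlier), not code from either cited paper.

```python
import numpy as np

def multiview_feature(view_sequences, describe):
    """Concatenate per-view descriptors into one multi-view feature vector.

    view_sequences: list with one silhouette stack per camera view.
    describe: maps a single view's sequence to a 1-D feature vector.
    """
    return np.concatenate([describe(seq) for seq in view_sequences])
```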
“…Since our method takes human-centered subvolumes recorded from multiple views as input, each decision tree is grown from the cuboids extracted from these subvolumes. Given predefined parameters dep_max and three thresholds in (0, 1), Algorithm 1 outputs one decision tree:

(1) Build a bootstrap dataset by random sampling from the training cuboids with replacement;
(2) Create a root node, set its depth to 1, then assign all bootstrapped cuboids to it;
(3) Initialize an unsettled node queue Υ = ∅ and push the root node into Υ;
(4) while Υ ≠ ∅ do
(5)   Pop the first node in Υ;
(6)   if its depth is larger than dep_max, or the cuboids assigned to it belong to the same action and position, then
(7)     Label the node as a leaf, then calculate P and Q from the cuboids at the node;
(8)     Add the corresponding triple into the decision tree;
(9)   else
(10)    Initialize the feature candidate set Δ = ∅;
(11)    if a random number is below the first threshold then
(12)      Add a set of randomly selected optical flow features to Δ;
(13)    else
(14)      Add a set of randomly selected HOG3D features to Δ;
(15)    end if
(16)    if a random number is below the second threshold then
(17)      Add two-dimensional temporal context features to Δ;
(18)    end if
(19)    Set maxgain = −∞ and generate a random number;
(20)    for each feature in Δ do
(21)      if the random number is below the third threshold then
(22)        Search for the corresponding threshold and compute the information gain in terms of the action labels of the cuboids arriving at the node;
(23)      else
(24)        Search for the corresponding threshold and compute the information gain in terms of the positions of the cuboids arriving at the node;
(25)      end if
(26)      if the gain exceeds maxgain then
(27)        Record the current feature and threshold as the best split;
(28)      end if
(29)    end for
(30)    Create left and right child nodes, set their depth to dep + 1, assign each cuboid arriving at the node to a child according to the best feature and threshold, then push both children into Υ;
(31)    Add the corresponding quintuple into the decision tree;
(32)  end if
(33) end while
(34) return the decision tree;

Algorithm 1: Construction of a decision tree.…”
Section: Experimental Setting
confidence: 99%
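For readers who want to trace Algorithm 1's control flow, here is a simplified, runnable Python sketch. It keeps the bootstrap, the unsettled-node queue, the random choice between optical-flow and HOG3D channels, the random choice between action-label and position gains, and the depth/purity stopping rule, but it simplifies the threshold search to the median and omits the temporal-context branch; all names and default values are assumptions, not the cited authors' code.

```python
import numpy as np
from collections import deque

def entropy(labels):
    """Shannon entropy of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def info_gain(values, threshold, labels):
    """Gain of splitting `labels` by the test `values <= threshold`."""
    left = values <= threshold
    if left.all() or not left.any():
        return 0.0
    n = len(labels)
    return (entropy(labels)
            - left.sum() / n * entropy(labels[left])
            - (~left).sum() / n * entropy(labels[~left]))

def build_tree(X_flow, X_hog, actions, positions, dep_max=10,
               p_flow=0.5, p_action=0.5, n_cand=10, rng=None):
    """Grow one randomized tree in the spirit of Algorithm 1.

    X_flow / X_hog: (N, D) per-cuboid feature matrices; `actions` and
    `positions` are integer NumPy label arrays of length N.
    """
    rng = rng or np.random.default_rng(0)
    N = len(actions)
    root = {'idx': rng.integers(0, N, N), 'depth': 1}   # bootstrap sample
    queue = deque([root])                               # unsettled nodes
    while queue:
        node = queue.popleft()
        idx, depth = node.pop('idx'), node['depth']
        pure = (np.unique(actions[idx]).size == 1
                and np.unique(positions[idx]).size == 1)
        if depth >= dep_max or pure:
            node.update(leaf=True, P=np.bincount(actions[idx]),
                        Q=np.bincount(positions[idx]))  # leaf distributions
            continue
        X = X_flow if rng.random() < p_flow else X_hog  # feature channel
        y = actions if rng.random() < p_action else positions
        best_gain, f_star, t_star = -np.inf, None, None
        for f in rng.choice(X.shape[1], size=n_cand, replace=False):
            t = float(np.median(X[idx, f]))             # simplified threshold search
            g = info_gain(X[idx, f], t, y[idx])
            if g > best_gain:
                best_gain, f_star, t_star = g, f, t
        left = idx[X[idx, f_star] <= t_star]
        right = idx[X[idx, f_star] > t_star]
        if left.size == 0 or right.size == 0:           # degenerate split -> leaf
            node.update(leaf=True, P=np.bincount(actions[idx]),
                        Q=np.bincount(positions[idx]))
            continue
        node.update(leaf=False, feature=int(f_star), threshold=t_star,
                    channel='flow' if X is X_flow else 'hog3d')
        node['left'] = {'idx': left, 'depth': depth + 1}
        node['right'] = {'idx': right, 'depth': depth + 1}
        queue.extend([node['left'], node['right']])
    return root
```

A forest is then just repeated calls, e.g. `trees = [build_tree(Xf, Xh, a, p, rng=np.random.default_rng(s)) for s in range(T)]`, with each tree seeing its own bootstrap sample.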
“…This motion descriptor is based on the motion direction and a histogram of motion intensity, followed by a support vector machine for classification. Another method, based on 2D motion templates using motion history images and histograms of oriented gradients, was proposed in [94]. In [50], an action recognition method was proposed based on the key elements of motion encoding and local changes in motion direction, encoded with the bag-of-words technique.…”
Section: Motion-based Approaches
confidence: 99%
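The bag-of-words step mentioned for [50] is a standard codebook encoding. Below is a generic sketch; the codebook size, the use of k-means, and the L1 normalization are assumptions for illustration, not details from the cited work.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(train_descriptors, k=200, seed=0):
    """Cluster local motion descriptors into a k-word visual vocabulary."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(train_descriptors)

def bow_histogram(codebook, video_descriptors):
    """Represent one video as a normalized histogram of visual words."""
    words = codebook.predict(video_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)              # L1-normalized histogram
```

The resulting fixed-length histograms can then feed a standard classifier, such as the support vector machine mentioned in the first sentence of the excerpt.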