2012
DOI: 10.1007/s00138-012-0449-x

Classifying web videos using a global video descriptor

Abstract: Computing descriptors for videos is a crucial task in computer vision. In this paper, we propose a global video descriptor for classification of videos. Our method bypasses the detection of interest points, the extraction of local video descriptors, and the quantization of descriptors into a code book; it represents each video sequence as a single feature vector. Our global descriptor is computed by applying a bank of 3-D spatiotemporal filters to the frequency spectrum of a video sequence; hence it integrates…
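The idea in the abstract can be illustrated with a rough sketch: take the 3-D Fourier transform of a video volume, multiply it by each filter in a bank, and pool the responses into one vector. The abstract is truncated and does not give the filter design or the pooling scheme, so the Gaussian band-pass filters, their center frequencies, the 2x2x2 averaging grid, and the function names gaussian_bandpass_bank and global_descriptor are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def gaussian_bandpass_bank(shape, centers, sigma=0.1):
    """Build 3-D Gaussian filters in the frequency domain (assumed design)."""
    # Normalized frequency coordinates (cycles per sample) along t, y, x.
    ft, fy, fx = np.meshgrid(
        np.fft.fftfreq(shape[0]),
        np.fft.fftfreq(shape[1]),
        np.fft.fftfreq(shape[2]),
        indexing="ij",
    )
    bank = []
    for ct, cy, cx in centers:
        dist2 = (ft - ct) ** 2 + (fy - cy) ** 2 + (fx - cx) ** 2
        bank.append(np.exp(-dist2 / (2.0 * sigma ** 2)))
    return bank


def global_descriptor(video, bank, grid=(2, 2, 2)):
    """Return one feature vector for a (frames, height, width) grayscale video."""
    spectrum = np.fft.fftn(video)
    features = []
    for filt in bank:
        # Filtering is a pointwise product in the frequency domain.
        response = np.abs(np.fft.ifftn(spectrum * filt))
        # Assumed pooling: average the response over a coarse grid of sub-volumes.
        for t_block in np.array_split(response, grid[0], axis=0):
            for y_block in np.array_split(t_block, grid[1], axis=1):
                for x_block in np.array_split(y_block, grid[2], axis=2):
                    features.append(x_block.mean())
    return np.asarray(features)


# Synthetic stand-in for a video: 16 frames of 32x32 pixels.
rng = np.random.default_rng(0)
video = rng.random((16, 32, 32))
# Hypothetical center frequencies (t, y, x), one per filter.
centers = [(0.0, 0.0, 0.25), (0.0, 0.25, 0.0), (0.25, 0.0, 0.0)]
bank = gaussian_bandpass_bank(video.shape, centers)
descriptor = global_descriptor(video, bank)
print(descriptor.shape)  # (24,): 3 filters x 8 grid cells
```

Under these assumptions the whole video maps to a fixed-length vector with no interest-point detection or code book, and that vector could be passed directly to a standard classifier.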

Citations: cited by 90 publications (47 citation statements)
References: 25 publications
“…The comparisons to available related works are described in Table II. [14]: group-wise cross validation, 57.90%; Todorovic [23]: 2/3 training and 1/3 testing for each class, 81.03%; Solmaz et al. [24]: Leave One Group Out cross validation (25 cross-validations), 73.70%…”
Section: Simulation Results (mentioning)
confidence: 99%
“…While there is a large body of literature on human action/activity recognition, such as [25,41,48,44], the problem of recognizing human interactions is a relatively less studied topic in computer vision. Related work on human interaction recognition typically addresses one of the following two interaction types: (i) human-object interactions, and (ii) human-human interactions.…”
Section: Related Work (mentioning)
confidence: 99%
“…Another recently proposed video descriptor for human action recognition is Gist3D [16]. This is a global descriptor based on a 3D filter bank and describes the spatio-temporal 'gist' of a video.…”
Section: Spatio-temporal Detectors (mentioning)
confidence: 99%
“…It should however be noted here that MBH performance comprises a complex multiple kernel combination of a horizontal MBHx and vertical MBHy component. In [16], a recognition accuracy of 73.7% is reported for a combination of Gist3D and Harris STIP + HOG/HOF descriptors. However, performance of the individual descriptors is Per-class recognition performances on UCF50 dataset.…”
Section: H. UCF50 (mentioning)
confidence: 99%