2010 23rd SIBGRAPI Conference on Graphics, Patterns and Images 2010
DOI: 10.1109/sibgrapi.2010.38

Violence Detection in Video Using Spatio-Temporal Features

Abstract: In this paper we present a violence detector built on the concept of visual codebooks using linear support vector machines. It differs from existing work on violence detection in its data representation, as none has considered local spatio-temporal features with bags of visual words. The importance of local spatio-temporal features for characterizing the multimedia content is evaluated through cross-validation. The results obtained confirm that motion patterns ar…

Citations: cited by 109 publications (45 citation statements)
References: 12 publications (18 reference statements)
“…Frame-level audio features both from the time and the frequency domain are employed and a polynomial SVM is used as the classifier. In [8], de Souza et al adopt their own definition of violence, and designate violent scenes as those containing fights (i.e., aggressive human actions), regardless of the context and the number of people involved. Their SVM approach is based on the use of Bag-of-Words (BoW), where local Spatial-Temporal Interest Point Features (STIP) are used as feature representations.…”
Section: Related Work (mentioning)
confidence: 99%
“…In many of the existing works, next to audio or static visual cues, motion information is also used in the detection of violent scenes. Employed motion features range from simplistic features such as motion changes, shot length, camera motion or motion intensity [5,12,20,22], to more elaborated descriptors such as STIP, ViF [8,9] or dense trajectories, which have recently enjoyed great popularity. Dense trajectory features [25] have indeed received attention even among the VSD participants (e.g., [26,27,28]).…”
Section: Addressed Research Questions and Contributions of This Paper (mentioning)
confidence: 99%
“…In recent studies, some approaches based on spatiotemporal interest points (STIPs) [15] have been proposed for violence detection. Generally, after extracting interest points over the frames, the Bag-of-Words (BoW) approach is used for recognizing violence.…”
Section: Introduction (mentioning)
confidence: 99%
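
The Bag-of-Words step described in the statement above can be illustrated with a minimal sketch. It is not the implementation of the cited paper: it assumes the local spatio-temporal descriptors have already been extracted (STIP extraction is not shown), uses plain k-means to build the codebook, and the function names `build_codebook` and `bow_histogram` are hypothetical. The resulting histogram is the kind of fixed-length vector that would then be passed to a classifier such as a linear SVM.

```python
import random
from math import dist


def build_codebook(descriptors, k, iters=20, seed=0):
    """Cluster local descriptors with plain k-means; return k centroids (visual words)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct descriptors (assumption: simple random init).
    centroids = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            nearest = min(range(k), key=lambda j: dist(d, centroids[j]))
            clusters[nearest].append(d)
        for j, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bow_histogram(descriptors, codebook):
    """Return the L1-normalised histogram of visual-word assignments for one clip."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda j: dist(d, codebook[j]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy 2-D "descriptors" standing in for real spatio-temporal features.
training = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
codebook = build_codebook(training, k=2)

# One clip with three descriptors near the first group and one near the second.
clip = [(0, 0), (1, 1), (0, 1), (10, 10)]
hist = bow_histogram(clip, codebook)
```

Each clip, whatever its number of interest points, is reduced to a vector whose length equals the codebook size, which is what makes a standard classifier applicable.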
“…Approaches based on local spatiotemporal descriptors are traditionally combined with models of Bag-of-Words (BoWs) and have achieved promising performance in violence detection [2], [15]. However, the conventional BoW methods rely on the discriminative power of local spatiotemporal descriptors and how often these descriptors occur in the video.…”
Section: Introduction (mentioning)
confidence: 99%
“…This causes researchers difficulty in terms of working on a common ground [1]. Some of the violence interpretations include violent actions by humans where there is blood [2], scenes containing gunshots, fights and explosions [3], person to person harmful acts like threatening and physical harm [4], and fighting scenes regardless of number of individuals involved and context [5,6]. These different interpretations lead to different techniques for VSD, which makes it difficult to conduct a comparative study.…”
Section: Introduction (mentioning)
confidence: 99%