It is very important to automatically detect violent behaviors in video surveillance scenarios, for instance, railway stations, gymnasiums and psychiatric centers. However, the previous detection methods usually extract descriptors around the spatiotemporal interesting points or extract statistic features in the motion regions, leading to limited abilities to effectively detect video-based violence activities. To address this issue, we propose a novel method to detect violence sequences. Firstly, the motion regions are segmented according to the distribution of optical flow fields. Secondly, in the motion regions, we propose to extract two kinds of low-level features to represent the appearance and dynamics for violent behaviors. The proposed low-level features are the Local Histogram of Oriented Gradient (LHOG) descriptor extracted from RGB images and the Local Histogram of Optical Flow (LHOF) descriptor extracted from optical flow images. Thirdly, the extracted features are coded using Bag of Words (BoW) model to eliminate redundant information and a specific-length vector is obtained for each video clip. At last, the video-level vectors are classified by Support Vector Machine (SVM). Experimental results on three challenging benchmark datasets demonstrate that the proposed detection approach is superior to the previous methods.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.