Multi-frame feature-fusion-based model for violence detection

Asad, Mujtaba; Yang, Jie; Jiang, He; Shamsolmoali, Pourya; He, Xiangjian

doi:10.1007/s00371-020-01878-6

Cited by 46 publications

(26 citation statements)

References 42 publications

“…Li et al [36] fuse attention based spatial RGB stream with commonly used temporal and spatial streams to propose a multi-mode fusion method to detect violence. Asad et al [37] detect violence by combining a CNN and a wide-dense residual block to learn spatial features and LSTM units to learn temporal features. Ullah et al [38] use a lightweight CNN to identify frames containing persons and pass those frames to a 3D CNN for detection of abnormalities.…”

Section: B Deep Learning-based Methods (mentioning)

confidence: 99%

Efficient Anomaly Detection in Crowd Videos Using Pre-Trained 2D Convolutional Neural Networks

Mehmood

2021

IEEE Access

Surveillance of crowded places can benefit from improved techniques of anomaly detection in crowd videos. Several existing methods have detected various types of crowd abnormal behaviors by using spatial and temporal information got from videos. So far as real-time detection of anomalies is concerned, special attention must be given to reducing the model complexity that leads to computational and memory loads. This paper proposes a low computational cost approach to detect crowd anomalies. The proposed approach avoids the expensive optical flow calculations by adopting a pre-trained 2D convolutional neural network (CNN) for motion information and implements a lighter form of the 2D CNN to achieve high recognition accuracy at low computational cost. Experiments on public datasets show that the proposed model outperforms the existing approaches in terms of detection accuracy alongside providing better performance in generating input frames.

“…An LSTM network finally learned and classified the violent activity patterns over a period. Asad et al [13] adopted a multi-level feature fusion approach to integrate local motion patterns from an equally spaced sequence of input frames. They combined a wide-dense residual block with a 2D-CNN to learn combined features obtained from pairs of input frames.…”

Section: Deep Learning-based Methods (mentioning)

confidence: 99%

“…So, for example, the acts of balancing attempts made by a person falling have much in common with the patterns commonly found in suspicious and violent behaviors. Therefore, the solutions targeting this problem mostly aim at providing inclusive methods for detecting multiple anomalies [3,10,11], customizing datasets to learn specific features of targeted behaviors [10], and using advanced techniques of learning the motion patterns [12][13][14], often by incorporating both spatial and temporal features. The second difficulty involves the computational complexity of behavior representation and detection algorithms, resulting in the high expense of computing resources, thus impeding their utilization in many real-world scenarios.…”

Section: Introduction (mentioning)

confidence: 99%

LightAnomalyNet: A Lightweight Framework for Efficient Abnormal Behavior Detection

Mehmood

2021

Sensors

The continuous development of intelligent video surveillance systems has increased the demand for enhanced vision-based methods of automated detection of anomalies within various behaviors found in video scenes. Several methods have appeared in the literature that detect different anomalies by using the details of motion features associated with different actions. To enable the efficient detection of anomalies, alongside characterizing the specificities involved in features related to each behavior, the model complexity leading to computational expense must be reduced. This paper provides a lightweight framework (LightAnomalyNet) comprising a convolutional neural network (CNN) that is trained using input frames obtained by a computationally cost-effective method. The proposed framework effectively represents and differentiates between normal and abnormal events. In particular, this work defines human falls, some kinds of suspicious behavior, and violent acts as abnormal activities, and discriminates them from other (normal) activities in surveillance videos. Experiments on public datasets show that LightAnomalyNet yields better performance comparative to the existing methods in terms of classification accuracy and input frames generation.

“…However, the application of this approach of abnormal behavior detection is limited, which is not conducive to its popularization. Xia et al [14][15][16][17] all used deep learning to detect abnormal actions of the human body and achieved good results with strong robustness to noises in the environment. However, due to the large number of parameters in the above model, it does not consider the problem of real-time performance in small devices.…”

Section: Introduction (mentioning)

confidence: 99%

JRL‐YOLO: A Novel Jump‐Join Repetitious Learning Structure for Real‐Time Dangerous Object Detection

Zeng, Zhang, Zhao, et al.

2021

Computational Intelligence and Neuroscience

Campus security incidents occur from time to time, which seriously affect the public security. In recent years, the rapid development of artificial intelligence has brought technical support for campus intelligent security. In order to quickly recognize and locate dangerous targets on campus, an improved YOLOv3-Tiny model is proposed for dangerous target detection. Since the biggest advantage of this model is that it can achieve higher precision with very fewer parameters than YOLOv3-Tiny, it is one of the Tinier-YOLO models. In this paper, the dangerous targets include dangerous objects and dangerous actions. The main contributions of this work include the following: firstly, the detection of dangerous objects and dangerous actions is integrated into one model, and the model can achieve higher accuracy with fewer parameters. Secondly, to solve the problem of insufficient YOLOv3-Tiny target detection, a jump-join repetitious learning (JRL) structure is proposed, combined with the spatial pyramid pooling (SPP), which serves as the new backbone network of YOLOv3-Tiny and can accelerate the speed of feature extraction while integrating features of different scales. Finally, the soft-NMS and DIoU-NMS algorithm are combined to effectively reduce the missing detection when two targets are too close. Experimental tests on self-made datasets of dangerous targets show that the average MAP value of the JRL-YOLO algorithm is 85.03%, which increases by 3.22 percent compared with YOLOv3-Tiny. On the VOC2007 dataset, the proposed method has a 9.29 percent increase in detection accuracy compared to that using YOLOv3-Tiny and a 2.38 percent increase compared to that employing YOLOv4-Tiny, respectively. These results all evidence the great improvement in detection accuracy brought by the proposed method. 
Moreover, when testing the dataset of dangerous targets, the model size of JRL-YOLO is 5.84 M, which is about one-fifth of the size of YOLOv3-Tiny (33.1 M) and one-third of the size of YOLOv4-Tiny (22.4 M), separately.
