Identifying events in surveillance video is a key means of reducing crime and other illegal activity. Abnormal event detection in particular has attracted growing attention because it enables immediate responses. Conventional video-processing techniques can identify events but fail to categorize them. Recent deep-learning-based video-processing applications achieve excellent performance; however, their architectures typically consider either spatial or temporal features for event detection. To improve both the detection rate and the classification accuracy of abnormal event detection from video keyframes, it is essential to consider spatial and temporal features together. Earlier approaches rely on only one of these feature types to detect anomalies in video frames, and their results are consequently inaccurate and error-prone under adverse environmental and other conditions. This work therefore presents a two-stream hybrid deep learning architecture that handles both spatial and temporal features in the video anomaly detection process to attain enhanced detection performance. The proposed hybrid models extract spatial features using YOLO-V4 with VGG-16 and temporal features using optical FlowNet with VGG-16. The extracted features are fused and classified using a hybrid CNN-LSTM model. Experiments on the benchmark UCF-Crime dataset validate the proposed model against existing anomaly detection methods: it attains a maximum accuracy of 95.6%, indicating better performance than state-of-the-art techniques.
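The two-stream pipeline described above can be sketched numerically. This is a minimal illustration, not the authors' implementation: the feature dimensions, hidden size, class count, and random stand-in features are all assumptions (real inputs would come from the YOLO-V4+VGG-16 and FlowNet+VGG-16 backbones), and the LSTM forward pass is a hand-rolled stand-in for the hybrid CNN-LSTM classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions (not specified in the abstract):
T = 16   # keyframes per clip
D = 4096 # VGG-16 fc-layer feature size per stream
H = 128  # LSTM hidden size
C = 14   # e.g. UCF-Crime: 13 anomaly classes + normal

# Stand-in features for the two streams (placeholders for the
# spatial YOLO-V4+VGG-16 and temporal FlowNet+VGG-16 outputs).
spatial = rng.standard_normal((T, D))
temporal = rng.standard_normal((T, D))

# Fusion: simple per-keyframe concatenation of the two streams.
fused = np.concatenate([spatial, temporal], axis=1)  # shape (T, 2*D)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Minimal LSTM forward pass over the fused feature sequence.
Din = 2 * D
Wx = rng.standard_normal((Din, 4 * H)) * 0.01
Wh = rng.standard_normal((H, 4 * H)) * 0.01
b = np.zeros(4 * H)
h = np.zeros(H)
c = np.zeros(H)
for t in range(T):
    z = fused[t] @ Wx + h @ Wh + b
    i, f, g, o = np.split(z, 4)          # input, forget, cell, output gates
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c = f * c + i * g
    h = o * np.tanh(c)

# Linear classifier + softmax on the final hidden state.
Wy = rng.standard_normal((H, C)) * 0.01
logits = h @ Wy
probs = np.exp(logits - logits.max())
probs /= probs.sum()                     # class probabilities over C events
```

In practice the fusion step and classifier head would be trained end to end; the sketch only shows how the two feature streams meet and how a sequence model consumes the fused keyframe features.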