Computer technologies are advancing rapidly, particularly in the field of artificial intelligence, which is increasingly narrowing the gap between humans and machines by supporting decision making. One such gap lies in video surveillance: human operators cannot stay attentive indefinitely, so violent actions in camera footage often go undetected at the moment they occur. In this paper we present an end-to-end deep neural network for detecting violent scenes in surveillance camera footage. The proposed system consists of several phases: it extracts a set of selectively distributed frames from the video clip, extracts spatio-temporal features from them, and passes these features to a fully connected neural network that classifies the video as violent or non-violent. The model is evaluated on several datasets, including Real Life Violence Situations (RLVS) and Hockey Fight Detection, on which it achieves accuracies of 92% and 94.5% respectively, outperforming previous related work.
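As a minimal illustration of the frame-extraction phase, the following Python sketch picks frame indices spread across a clip. It assumes uniform spacing (the centre frame of each equal-length segment), which is only one possible reading of "selectively distributed"; the function name `sample_frame_indices` and its parameters are hypothetical and do not come from the paper.

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Return up to num_samples frame indices spread evenly across a clip.

    Assumption: frames are chosen at the centre of equal-length segments.
    The paper's actual selection scheme may differ.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Clip is shorter than the requested sample count: keep every frame.
        return list(range(total_frames))
    step = total_frames / num_samples
    # Take the middle frame of each of the num_samples segments.
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # Example: choose 4 frames from a 100-frame clip.
    print(sample_frame_indices(100, 4))
```

The selected frames would then be decoded and passed to the feature-extraction stage; sampling by segment centre avoids clustering all samples at the start of the clip.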