Anomaly detection in surveillance videos remains challenging because of the diversity of possible events. We propose a deep convolutional neural network (CNN) that addresses this problem by learning a correspondence between common object appearances (e.g. pedestrian, background, tree) and their associated motions. Our model combines a reconstruction network and an image translation model that share the same encoder. The former sub-network determines the most significant structures appearing in video frames, and the latter attempts to associate motion templates with those structures. The training stage uses only videos of normal events, and the model is then capable of estimating frame-level scores for an unknown input. Experiments on six benchmark datasets demonstrate the competitive performance of the proposed approach with respect to state-of-the-art methods.
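As a minimal sketch of how such frame-level scores might be computed — assuming the per-frame appearance (reconstruction) and motion (translation) errors are each min-max normalized over a clip and fused with a hypothetical weight `w` (the exact scoring function is not given in this abstract):

```python
import numpy as np

def frame_level_scores(frames, recon, motion_gt, motion_pred, w=0.5):
    """Hypothetical frame-level anomaly score: a weighted sum of the
    appearance (reconstruction) error and the motion (translation)
    error, each min-max normalized over the clip."""
    def per_frame_mse(a, b):
        # mean squared error per frame, averaged over spatial dimensions
        return ((a - b) ** 2).reshape(a.shape[0], -1).mean(axis=1)

    def minmax(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    app = minmax(per_frame_mse(frames, recon))
    mot = minmax(per_frame_mse(motion_gt, motion_pred))
    return w * app + (1 - w) * mot
```

Frames with high combined error stand out as anomalies; the weight `w` would in practice be tuned on a validation set.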
Abstract. This supplementary material provides the following content: ROC curves of our frame-level scores on the CUHK Avenue and UCSD Ped2 datasets, and Precision-Recall (PR) curves on the traffic datasets.
Human gait analysis plays an important role in musculoskeletal disorder diagnosis. Detecting anomalies in human walking, such as a shuffling gait, stiff leg, or unsteady gait, can be difficult when prior knowledge of the gait pattern is not available. We propose an approach for detecting abnormal human gait based on a normal gait model. Instead of employing color images, silhouettes, or spatio-temporal volumes, our model is built on human joint positions (skeleton) over time. We decompose each sequence of normal gait images into gait cycles. Each instantaneous human posture is represented by a feature vector describing relationships between pairs of bone joints in the lower body. These vectors are then converted into codewords using a clustering technique. The normal human gait model is created from multiple sequences of codewords corresponding to different gait cycles. In the detection stage, a gait cycle whose normality likelihood falls below a threshold, determined automatically in the training step, is flagged as an anomaly. The experimental results on both marker-based mocap data and Kinect skeletons show that our method is very promising in distinguishing normal from abnormal gaits, with an overall accuracy of 90.12%.
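The posture-to-codeword step and the cycle-level likelihood test could be sketched as follows. The nearest-centroid quantization matches the clustering idea above, while the first-order Markov model over codewords is only one plausible instantiation of the normal-gait model, assumed here purely for illustration:

```python
import numpy as np

def quantize_postures(features, codebook):
    """Map each posture feature vector to the index of its nearest
    codeword (Euclidean distance). The codebook would come from a
    clustering step (e.g. k-means) run on normal-gait training data."""
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def cycle_likelihood(codeword_seq, transition_probs, start_probs):
    """Toy normality log-likelihood of a gait cycle under an assumed
    first-order Markov model over codewords; a cycle is flagged
    abnormal when this falls below a threshold learned in training."""
    logp = np.log(start_probs[codeword_seq[0]])
    for a, b in zip(codeword_seq[:-1], codeword_seq[1:]):
        logp += np.log(transition_probs[a, b])
    return logp
```

A threshold on this likelihood, chosen from the distribution of scores on normal training cycles, then separates normal from abnormal cycles.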
This paper presents initial work on gait normality assessment in which the human body motion is represented by a sequence of enhanced depth maps. The input data are provided by a system consisting of a Time-of-Flight (ToF) depth camera and two mirrors. The approach uses two feature types to describe the characteristics of localized points of interest and the level of posture symmetry. These two features are computed over a sequence of enhanced depth maps with the support of a sliding window, providing two corresponding scores. The gait assessment is finally performed based on a weighted combination of these two scores. The evaluation is carried out on six simulated abnormal gaits.
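A minimal sketch of the sliding-window segmentation and the weighted fusion of the two scores, with `alpha` as a hypothetical weight not taken from the paper:

```python
import numpy as np

def sliding_windows(seq, win):
    """Split a sequence of enhanced depth maps (here represented by
    indices) into consecutive overlapping windows of length 'win'."""
    return [seq[i:i + win] for i in range(len(seq) - win + 1)]

def gait_score(poi_scores, sym_scores, alpha=0.6):
    """Weighted combination of the two per-window scores: the
    points-of-interest score and the posture-symmetry score.
    'alpha' is a hypothetical weight, not specified in the paper."""
    return alpha * np.asarray(poi_scores) + (1 - alpha) * np.asarray(sym_scores)
```

In practice the weight would be chosen so that the combined score best separates normal from simulated abnormal gaits on held-out data.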
This paper proposes an approach for estimating a gait abnormality index based on skeletal information provided by a depth camera. Unlike related works, where hand-crafted features must be extracted to describe gait characteristics, our method performs that stage automatically with the support of a deep auto-encoder. To obtain visually interpretable features, we embed a sparsity constraint into the model. As in most gait-related studies, the temporal factor is also considered, as a post-processing step in our system. The method provided promising results in experiments on a dataset containing nearly one hundred thousand skeleton samples.
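Two pieces of such a pipeline can be illustrated concretely: a KL-divergence penalty, which is a standard way to impose a sparsity constraint on an auto-encoder's hidden activations, and a moving-average smoothing as one simple form of temporal post-processing. Both are assumptions about details the abstract leaves open, not the paper's exact formulation:

```python
import numpy as np

def sparsity_penalty(activations, rho=0.05):
    """KL-divergence sparsity penalty, commonly added to an
    auto-encoder loss so that hidden units activate rarely,
    yielding sparse and more interpretable features."""
    rho_hat = activations.mean(axis=0)          # mean activation per hidden unit
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)  # numerical safety
    kl = (rho * np.log(rho / rho_hat)
          + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return kl.sum()

def abnormality_index(errors, win=5):
    """Temporal post-processing: moving average of per-frame
    reconstruction errors over a sliding window."""
    kernel = np.ones(win) / win
    return np.convolve(errors, kernel, mode='same')
```

High smoothed reconstruction error over a window of frames then serves as the abnormality index for that part of the gait.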
Computers are widely used in all fields, yet human-machine interaction is still performed mainly through traditional input devices such as the mouse and keyboard. To satisfy users' requirements, computers need more convenient ways to interact, such as speech or body language (e.g. gestures, posture). In this paper, we propose a new method for static hand gesture recognition using an artificial neural network. The proposed solution has been tested with high accuracy (98%) and is promising.
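The abstract does not specify the network architecture. As an illustration only — assuming a one-hidden-layer feed-forward classifier, which is not necessarily the paper's exact model — prediction would look like this:

```python
import numpy as np

def mlp_predict(x, W1, b1, W2, b2):
    """Forward pass of a minimal one-hidden-layer neural network for
    static gesture classification; the weights would come from a
    prior training phase on labeled gesture features."""
    h = np.maximum(0.0, x @ W1 + b1)   # ReLU hidden layer
    logits = h @ W2 + b2               # one logit per gesture class
    return logits.argmax(axis=1)       # predicted class index per sample
```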