We describe a novel method for directing the attention of an automated surveillance system. Our starting premise is that the attention of people in a scene can be used as an indicator of interesting areas and events. To determine people's attention from passive visual observations, we develop a system that automatically detects and tracks individual heads and infers their gaze direction. Detection and tracking are achieved by combining a histogram of oriented gradients (HOG) based head detector with frame-to-frame tracking using multiple point features to provide stable head images. Gaze inference is achieved using a head pose classification method that uses randomised ferns with decision branches based on both HOG and colour-based features to determine a coarse gaze direction for each person in the scene. By building both static and temporally varying maps of the areas where people look, we are able to identify interesting regions.
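As an illustration of the map-building idea in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes each person is reduced to a grid position and one of eight coarse gaze classes, and that attention is accumulated by voting along a straight ray. The names `accumulate_attention`, `peak`, `n_directions` and `max_range` are hypothetical.

```python
import math


def accumulate_attention(grid_w, grid_h, observations, n_directions=8, max_range=20):
    """Build a static attention map by voting along each coarse gaze ray.

    observations: iterable of (x, y, direction) with direction in
    [0, n_directions). Direction 0 points along +x; directions advance in
    steps of 2*pi/n_directions towards +y (an assumed convention).
    """
    grid = [[0.0] * grid_w for _ in range(grid_h)]
    for x, y, direction in observations:
        angle = 2.0 * math.pi * direction / n_directions
        dx, dy = math.cos(angle), math.sin(angle)
        for step in range(1, max_range + 1):
            cx = int(round(x + step * dx))
            cy = int(round(y + step * dy))
            if not (0 <= cx < grid_w and 0 <= cy < grid_h):
                break  # the ray has left the map
            grid[cy][cx] += 1.0
    return grid


def peak(grid):
    """Return (x, y, votes) for the most-voted cell."""
    votes, x, y = max(
        (v, x, y) for y, row in enumerate(grid) for x, v in enumerate(row)
    )
    return x, y, votes


# Two people whose gaze rays cross at cell (5, 5).
attention = accumulate_attention(11, 11, [(0, 5, 0), (5, 0, 2)])
print(peak(attention))
```

A temporally varying map could be approximated under the same assumptions by multiplying every cell by a decay factor before adding each new frame's votes.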
This paper presents an algorithm for the classification of head pose in low-resolution video. Invariance to skin, hair and background colours is achieved by classifying with an ensemble of randomised ferns that have been trained on labelled images. The ferns are used simultaneously to classify the head pose and to identify the most likely hypothesis for the mapping between colours and labels. Results from video sequences demonstrate that improved posterior estimation using learnt colour distributions reduces classification error and provides accurate pose information in images where the head occupies as little as 10 pixels square.
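For readers unfamiliar with randomised ferns, the sketch below shows the generic technique only: each fern applies a small set of random binary tests, the test outcomes index a leaf, and per-class leaf counts are combined across ferns by summing log-probabilities. It assumes pairwise feature comparisons as the binary tests and a uniform class prior, and it omits the paper's colour-to-label hypothesis selection. The class `FernEnsemble` and all its parameters are hypothetical names.

```python
import math
import random


class FernEnsemble:
    """Generic random-fern classifier over fixed-length feature vectors."""

    def __init__(self, n_ferns, n_tests, n_features, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_classes = n_classes
        n_leaves = 2 ** n_tests
        # Each binary test compares two distinct, randomly chosen features.
        self.tests = [
            [tuple(rng.sample(range(n_features), 2)) for _ in range(n_tests)]
            for _ in range(n_ferns)
        ]
        # counts[fern][class][leaf], starting at 1 (Laplace smoothing).
        self.counts = [
            [[1] * n_leaves for _ in range(n_classes)] for _ in range(n_ferns)
        ]

    def _leaf(self, fern_tests, x):
        """Pack the outcomes of one fern's tests into a leaf index."""
        index = 0
        for i, j in fern_tests:
            index = (index << 1) | (1 if x[i] > x[j] else 0)
        return index

    def train(self, samples, labels):
        for x, label in zip(samples, labels):
            for f, fern_tests in enumerate(self.tests):
                self.counts[f][label][self._leaf(fern_tests, x)] += 1

    def log_scores(self, x):
        """Unnormalised log-posterior per class, assuming a uniform prior."""
        scores = [0.0] * self.n_classes
        for f, fern_tests in enumerate(self.tests):
            leaf = self._leaf(fern_tests, x)
            for c in range(self.n_classes):
                total = sum(self.counts[f][c])
                scores[c] += math.log(self.counts[f][c][leaf] / total)
        return scores

    def classify(self, x):
        scores = self.log_scores(x)
        return max(range(self.n_classes), key=scores.__getitem__)


# Toy data: class 0 vectors increase, class 1 vectors decrease.
ferns = FernEnsemble(n_ferns=10, n_tests=3, n_features=4, n_classes=2)
ferns.train([[0, 1, 2, 3]] * 5 + [[3, 2, 1, 0]] * 5, [0] * 5 + [1] * 5)
print(ferns.classify([1, 5, 7, 9]), ferns.classify([9, 7, 5, 1]))
```

In a head pose setting the feature vector would be derived from the head image and the classes would be the coarse pose bins; the toy vectors here only exercise the mechanics.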
We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor. Asynchronous interprocess communication and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database. Visual tracking data from static views are stored dynamically in tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines whether active zoom cameras should be dispatched to observe a particular target, and this message is conveyed by writing demands into another database table. We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its applicability to multi-camera systems for intelligent surveillance.
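The repository pattern described in this abstract can be sketched roughly as follows. This is an assumption-laden illustration rather than the system's actual schema: it uses an in-memory SQLite database as a stand-in for the SQL server, and the table names (`observations`, `demands`), column names, and the "most recent target" supervisor policy are all hypothetical.

```python
import sqlite3


def open_repository():
    """Create the shared repository (in-memory SQLite as a stand-in)."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE observations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            camera_id INTEGER NOT NULL,
            target_id INTEGER NOT NULL,
            t REAL NOT NULL,
            x REAL NOT NULL,
            y REAL NOT NULL
        );
        CREATE TABLE demands (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            ptz_camera_id INTEGER NOT NULL,
            target_id INTEGER NOT NULL,
            x REAL NOT NULL,
            y REAL NOT NULL,
            served INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return db


def report_observation(db, camera_id, target_id, t, x, y):
    """Static-camera tracker client: store one tracking result."""
    db.execute(
        "INSERT INTO observations (camera_id, target_id, t, x, y) "
        "VALUES (?, ?, ?, ?, ?)",
        (camera_id, target_id, t, x, y),
    )
    db.commit()


def supervise(db, ptz_camera_id):
    """Supervisor: demand a close-up of the most recently seen target."""
    row = db.execute(
        "SELECT target_id, x, y FROM observations "
        "ORDER BY t DESC, id DESC LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute(
        "INSERT INTO demands (ptz_camera_id, target_id, x, y) "
        "VALUES (?, ?, ?, ?)",
        (ptz_camera_id, *row),
    )
    db.commit()
    return row[0]


def next_demand(db, ptz_camera_id):
    """PTZ client: fetch and mark the oldest unserved demand, if any."""
    row = db.execute(
        "SELECT id, target_id, x, y FROM demands "
        "WHERE ptz_camera_id = ? AND served = 0 ORDER BY id LIMIT 1",
        (ptz_camera_id,),
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE demands SET served = 1 WHERE id = ?", (row[0],))
    db.commit()
    return row[1:]
```

Because every process only reads and writes tables, the tracker, supervisor and PTZ clients never call each other directly, which is what makes the communication asynchronous and leaves an archive of the data as a side effect.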