Camera network systems generate large volumes of potentially useful data, but extracting value from multiple related videos can be a daunting task for a human reviewer. Multicamera video summarization seeks to make this task more tractable by generating a reduced set of output summary videos that concisely capture the important portions of the input set. We present a system that approaches summarization at the level of detected activity motifs and shortens the input videos by compacting the representation of individual activities. Additionally, redundancy is removed across camera views by omitting from the summary those activity occurrences that can be predicted by other occurrences. The system also detects anomalous events within a unified framework and can highlight them in the summary. Our contributions are a method for selecting, using activity motifs, the useful parts of an activity to present to a viewer, and a novel framework that scores the importance of activity occurrences and allows importance to transfer between temporally related activities without solving the correspondence problem. We provide summarization results for a two-camera network, an eleven-camera network, and data from PETS 2001. We also include results from Amazon Mechanical Turk experiments that evaluate how our visualization decisions affect human task performance.
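To make the cross-view redundancy-removal idea concrete, the sketch below assumes each detected activity occurrence has been reduced to a (camera, time interval, motif) tuple with a base importance score; the greedy suppression rule (keep the highest-scoring occurrence of a motif and drop temporally overlapping occurrences of the same motif in other views) is a hypothetical stand-in for the paper's actual importance-transfer framework, not its implementation.

```python
# Hypothetical sketch of cross-view redundancy removal for multicamera summarization.
from dataclasses import dataclass

@dataclass
class Occurrence:
    camera_id: int
    start: float      # seconds
    end: float        # seconds
    motif_id: int
    score: float      # base importance, e.g. motion energy

def overlaps(a: Occurrence, b: Occurrence) -> bool:
    """True if the two occurrences overlap in time."""
    return a.start < b.end and b.start < a.end

def select_for_summary(occurrences: list[Occurrence]) -> list[Occurrence]:
    """Greedily keep the highest-scoring occurrence of each motif and drop
    occurrences in other views that it temporally 'predicts'."""
    kept: list[Occurrence] = []
    for occ in sorted(occurrences, key=lambda o: o.score, reverse=True):
        redundant = any(
            k.motif_id == occ.motif_id
            and k.camera_id != occ.camera_id
            and overlaps(k, occ)
            for k in kept
        )
        if not redundant:
            kept.append(occ)
    return kept

if __name__ == "__main__":
    occs = [
        Occurrence(0, 10.0, 14.0, motif_id=3, score=0.9),
        Occurrence(1, 11.0, 15.0, motif_id=3, score=0.4),  # same activity seen in another view
        Occurrence(1, 40.0, 43.0, motif_id=7, score=0.7),
    ]
    for o in select_for_summary(occs):
        print(o)
```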
We introduce UCSB's Visual Sensor Network (VISNET) and discuss current research being conducted with the system. VISNET is a ten-node experimental camera network at UCSB used for a variety of vision-related research. The mission of VISNET is to provide an easy-to-use, multi-node camera network to the vision research community at UCSB. This paper briefly discusses design and setup considerations before turning to current research, which includes operation visualization, camera network calibration, tracked-object modeling, and multiple-object/multiple-camera tracking.
Outdoor surveillance cameras have become prevalent as part of the urban infrastructure and provide a rich data source for studying urban dynamics. In this work, we present a spatio-temporal analysis of 8 weeks of video data collected from the large outdoor camera network on the UCSB campus, which consists of 27 cameras. We first apply a simple vision algorithm to extract crowdedness information from each scene. We then explore the relationship between the traffic patterns observed by the cameras and activities in the nearby area, using additional knowledge such as the campus class schedule. Finally, we investigate the potential of discovering aggregate human movement patterns by assuming a simple probabilistic model. Experiments show promising results for the proposed method.
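A minimal sketch of the kind of "simple vision algorithm" such an analysis might use: per-frame crowdedness estimated as the fraction of foreground pixels from a background-subtraction mask. The MOG2 subtractor and the foreground-ratio measure are illustrative assumptions, not the authors' exact pipeline.

```python
# Hypothetical per-frame crowdedness measure via background subtraction.
import cv2

def crowdedness_series(video_path: str) -> list[float]:
    """Return one crowdedness value (foreground pixel ratio) per frame."""
    cap = cv2.VideoCapture(video_path)
    bg = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)
    series = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = bg.apply(frame)
        series.append(float((mask > 0).mean()))  # fraction of moving pixels
    cap.release()
    return series
```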
We describe SCALLOPSNet (Scalable Large Optical Sensor Network), a wide-area camera network in a campus setting. With about 100 stationary cameras, it covers an expansive area that can be divided into three distinct regions: inside a building, along urban paths, and in a remote natural reserve. Some of these regions lack power and communications connections and therefore necessitate wireless, battery-powered camera nodes. In our exploration of available solutions, we found existing smart cameras to be insufficient for this task, and instead designed our own battery-powered camera nodes that communicate using 802.11b. The camera network uses the Internet Protocol over both wired and wireless links to communicate with our central cluster, which runs cluster and cloud computing infrastructure. Frameworks such as Apache Hadoop are well suited to large distributed and parallel tasks, including many computer vision algorithms. We discuss the design and implementation details of this network, together with the challenges faced in deploying such a large-scale network on a research campus. We plan to make the datasets available to the computer vision research community in the near future.
Humans use context and scene knowledge to easily localize moving objects under complex illumination changes, scene clutter, and occlusions. In this paper, we present a method to leverage human knowledge, in the form of annotated video libraries, in a novel search-and-retrieval setting to track objects in unseen video sequences. For every video sequence, a document that represents its motion information is generated. Documents of the unseen video are queried against the library at multiple scales to find videos with similar motion characteristics, which provides a coarse localization of objects in the unseen video. We further adapt these retrieved object locations to the new video using an efficient warping scheme. The proposed method is validated on in-the-wild video surveillance data sets, where we outperform state-of-the-art appearance-based trackers. We also introduce a new challenging data set with complex object appearance changes.
Index Terms: Data-driven methods, video search and retrieval, visual object tracking.
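To illustrate the retrieval step, the sketch below treats each video's "motion document" as a fixed-length histogram over quantized motion features and ranks library videos by cosine similarity to the query. The single-scale lookup and the 64-bin representation are assumptions made for illustration, not the paper's multi-scale query scheme.

```python
# Hypothetical nearest-neighbor lookup over precomputed "motion documents".
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two motion-document vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def retrieve(query_doc: np.ndarray, library: dict[str, np.ndarray], k: int = 3):
    """Return the k library videos whose motion documents best match the query."""
    ranked = sorted(library.items(), key=lambda kv: cosine(query_doc, kv[1]), reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    library = {f"clip_{i}": rng.random(64) for i in range(10)}  # toy motion documents
    query = rng.random(64)
    for name, doc in retrieve(query, library):
        print(name, round(cosine(query, doc), 3))
```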