Abstract: This paper describes the dataset and vision challenges that form part of the PETS 2014 workshop. The datasets are multisensor sequences containing different activities around a parked vehicle in a parking lot. The dataset scenarios were filmed from multiple cameras mounted on the vehicle itself and involve multiple actors. In the PETS 2014 workshop, 22 acted scenarios of abnormal behaviour around the parked vehicle are provided. The aim in PETS 2014 is to provide a standard benchmark that indicates how detection, t…
“…Furthermore, we evaluated the performance of the system on crowded environments by applying it to one of the challenging sequences of the PETS-benchmark [6] on people counting and compared it to the system published by [8].…”
Section: Implementation Details and Experimental Results (mentioning)
In this paper we present a system that tracks multiple persons by detection in real time. We introduce a similarity measure for detections that segments significant information from background clutter by using statistical information obtained during the learning phase of the detector. To track multiple persons, we map the detections into flow networks using this measure. Continuous real-time processing of video streams is accomplished by analyzing only small chunks of detections consecutively, using different networks. By propagating the result of one network into the subsequent one, a temporally consistent association is achieved. The system was evaluated using a standard video sequence containing a crowded scene and our own dataset with very long sequences. The results demonstrate that the system performs comparably to other systems while meeting real-time requirements.
“…Additionally, α also defines the self-transition probability for each state. The second hyper-parameter γ, is employed in the Beta distribution of (13) and controls the size of the stick-break defined in (11), which furthermore defines the contribution of the remaining probability.…”
Section: B Methodology (mentioning)
confidence: 99%
“…Application domains vary from surveillance [11] to processing entertainment movies [29] and TV shows [34]. Sign language recognition has also been explored [6].…”
Abstract: We propose four variants of a novel hierarchical hidden Markov models strategy for rule induction in the context of automated sports video annotation including a multilevel Chinese takeaway process (MLCTP) based on the Chinese restaurant process and a novel Cartesian product label-based hierarchical bottom-up clustering (CLHBC) method that employs prior information contained within label structures. Our results show significant improvement by comparison against the flat Markov model: optimal performance is obtained using a hybrid method, which combines the MLCTP generated hierarchical topological structures with CLHBC generated event labels. We also show that the methods proposed are generalizable to other rule-based environments including human driving behavior and human actions.
“…Some methods proposed in literature for crowd detection perform image segmentation without actual counting or localization [1], while others simply estimate the coarse density range within local regions [24]. In terms of experimental data, most of the existing algorithms for exact counting have been tested on low to medium density crowds, e.g., USCD dataset with density of 11 − 46 people per frame [4], Mall dataset with density of 13 − 53 individuals per frame [5], and PETS dataset containing 3 − 40 people per frame [9]. In contrast to these images and videos, our algorithm has been tested on still images containing between 94 and 4543 people per image, with an average of 1280 people over fifty images in the dataset.…”