Privacy-preserving, high-quality people detection is a vital computer vision task for various indoor scenarios, e.g. people counting, customer behavior analysis, ambient assisted living, or smart homes. In this work, a novel approach for people detection in multiple overlapping depth images is proposed. We present a probabilistic framework utilizing a generative scene model to jointly exploit the multi-view image evidence, allowing us to detect people from arbitrary viewpoints. Our approach uses mean-field variational inference not only to estimate the maximum a posteriori (MAP) state but also to approximate the posterior probability distribution of the people present in the scene. Evaluation shows state-of-the-art results on a novel data set for indoor people detection and tracking in depth images from the top view with strong perspective distortions. Furthermore, we demonstrate that our approach (compared to the mono-view setup) successfully exploits the multi-view image evidence and robustly converges in only a few iterations.
INDEX TERMS Depth sensor indoor surveillance, depth sensor networks, generative scene model, joint multi-view person detection, mean-field variational inference, multi-camera person detection, people detection in top-view, vertical top-view pedestrian detection.
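The mean-field idea mentioned in this abstract can be illustrated with a minimal sketch: approximate a posterior over binary variables by a fully factorized distribution and iterate coordinate-wise fixed-point updates on the marginals. The toy occupancy model below (function names, the unary/coupling parameterization, and all numbers are hypothetical, not the paper's actual scene model) shows the mechanics only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field_occupancy(unary, coupling, n_iters=50):
    """Mean-field coordinate updates for a binary occupancy vector.

    unary[i]   : log-odds evidence that cell i is occupied (toy stand-in
                 for per-view image evidence)
    coupling   : symmetric pairwise interaction matrix, zero diagonal
                 (negative entries discourage joint occupancy)
    Returns approximate posterior marginals q[i] = P(occupied_i | evidence).
    """
    q = sigmoid(unary)  # initialize from the unary evidence alone
    for _ in range(n_iters):
        for i in range(len(q)):
            # coordinate-ascent fixed point of the mean-field free energy
            q[i] = sigmoid(unary[i] + coupling[i] @ q)
    return q

# toy scene: cells 0 and 1 both see evidence but repel each other
# (e.g. two ground-plane cells too close to hold two people at once)
unary = np.array([2.0, 1.5, -3.0])
coupling = np.array([[0.0, -4.0, 0.0],
                     [-4.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])
q = mean_field_occupancy(unary, coupling)
```

Unlike a plain MAP estimate, the returned marginals retain graded uncertainty: the stronger-evidence cell wins, the competing cell is suppressed rather than hard-zeroed, and the unsupported cell stays near its prior.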
In this work, a novel approach for multi-depth-sensor person detection and tracking from the top view is presented. We propose a probabilistic framework formulating people detection in multiple overlapping depth images as an inverse problem. As a generative forward model, we employ a simple differentiable 3D person model, allowing us to detect people from arbitrary viewpoints. Furthermore, we extend our probabilistic framework to allow for tracking of individuals over time. Finally, we show how to solve for the global person trajectories by exploiting differentiable rendering. The preliminary evaluation shows promising qualitative results of our approach on samples from three stereo-vision-based depth sensors observing an indoor scene.
Index Terms: multi-camera person detection and tracking; multi-sensor fusion; network of depth cameras; inverse graphics; inverse problem; generative model; differentiable rendering
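The inverse-problem formulation via differentiable rendering can be sketched in one dimension: a differentiable forward model renders a person hypothesis into observation space, and gradient descent on the image residual recovers the person parameters. The Gaussian "footprint" renderer, its analytic derivative, and all constants below are hypothetical stand-ins for the paper's 3D person model and depth renderer:

```python
import numpy as np

def render(pos, grid):
    """Toy differentiable forward model: a unit-width Gaussian
    'person footprint' centered at pos, sampled on a 1D grid."""
    return np.exp(-0.5 * (grid - pos) ** 2)

def d_render_d_pos(pos, grid):
    """Analytic derivative of the rendering w.r.t. the person position
    (the role autodiff plays in a full differentiable renderer)."""
    return render(pos, grid) * (grid - pos)

grid = np.linspace(0.0, 10.0, 101)
observed = render(6.5, grid)   # synthetic observation, true position 6.5

pos = 4.0                      # poor initial guess
lr = 0.02
for _ in range(1000):
    residual = render(pos, grid) - observed
    # gradient of the squared-error image loss w.r.t. the position
    grad = 2.0 * np.sum(residual * d_render_d_pos(pos, grid))
    pos -= lr * grad
```

After optimization, `pos` converges to the true position. The same render-compare-backpropagate loop, lifted to 3D person models and multiple depth views (and with time-indexed positions), is what solving for global trajectories via differentiable rendering amounts to.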
We present an RGBD infant head reconstruction method using a mobile phone depth sensor, evaluated on a novel dataset. Acquiring 3D models of infants enables many important medical tasks, such as automatic cranial asymmetry classification for estimating plagiocephaly therapy progress. Existing methods for 3D infant head reconstruction employ synchronized multi-view configurations or hand-held laser scanners, making their widespread adoption difficult. In contrast, RGBD reconstruction methods either rely on static scenes, failing on this task due to rapid infant head movements, or employ dynamic methods that lack the high-fidelity surface reconstructions required for accurate cranial measurements. We propose a domain-specific 3D reconstruction method that augments static RGBD methods by focusing on the rigid parts of the head and exploiting scene knowledge about the data acquisition methodology. We evaluate our approach using the provided ground-truth anthropometric measurements of the biparietal diameter and report competitive accuracy.
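To make the evaluation target concrete: the biparietal diameter is the maximum left-right width of the skull, which, given a reconstructed head surface, reduces to measuring the lateral extent of a point cloud. The sketch below is a simplified illustration on synthetic data, not the paper's measurement protocol; the head-aligned coordinate frame (x-axis as the lateral direction) and the ellipsoid dimensions are assumptions of this example:

```python
import numpy as np

def biparietal_diameter(points):
    """Toy estimate: widest lateral extent of a head point cloud.
    Assumes points (N, 3) are in a head-aligned frame where the
    x-axis is the left-right direction."""
    return points[:, 0].max() - points[:, 0].min()

# synthetic ellipsoidal 'head' surface, semi-axes in cm:
# 7 (lateral) x 9 (front-back) x 8 (vertical)
rng = np.random.default_rng(0)
u = rng.normal(size=(5000, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)  # uniform points on unit sphere
head = u * np.array([7.0, 9.0, 8.0])           # scale onto the ellipsoid
bpd = biparietal_diameter(head)                # close to 2 * 7 = 14 cm
```

In practice the sensitivity of such extremal measurements to surface noise is exactly why the abstract stresses high-fidelity reconstruction of the rigid head regions.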