We present a probabilistic multi-cue tracking approach that combines a novel randomized template tracker with a particle filter driven by a constant color model. The approach derives simple binary confidence measures for each tracker, which govern priority-based switching between the two fundamental cues for state estimation: at each tracking step, the object state is estimated from one of the two distributions associated with the cues. This switching also induces interaction between the cues at irregular intervals in the form of cross sampling. Within this scheme, we address dynamic target-model adaptation under randomized template tracking, which, by construction, can adapt to changing object appearance. Furthermore, to track the object through occlusions, we interrupt sequential resampling and achieve relock using the color cue. To evaluate the efficacy of this scheme, we test it against several state-of-the-art trackers using the VIVID online evaluation program and report quantitative comparisons.
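To make the switching logic concrete, the following is a minimal Python sketch of the priority-based cue selection described above, including cross sampling on a switch. The function name, the 2D state representation, and the Gaussian re-seeding are illustrative assumptions; the paper's actual trackers, confidence tests, and dynamics are not reproduced here.

```python
import numpy as np

def fuse_cues(template_est, template_ok, particles, weights, color_ok, rng):
    """Pick the object state from one cue per the binary confidences.

    template_est: (2,) state estimate of the randomized template tracker
    template_ok:  binary confidence of the template cue (priority cue)
    particles:    (N, 2) states of the color-model particle filter
    weights:      (N,) normalized particle weights
    color_ok:     binary confidence of the color cue
    """
    if template_ok:
        # cross sampling: re-seed the color filter's particles around the
        # template estimate, so the cues interact whenever a switch occurs
        particles = template_est + rng.standard_normal(particles.shape) * 2.0
        return template_est, particles
    if color_ok:
        # template cue lost (e.g. occlusion): fall back on the color cue's
        # weighted-mean estimate; sequential resampling would be interrupted
        return (weights[:, None] * particles).sum(axis=0), particles
    # neither cue trusted: keep the particle set and its mean as a hold state
    return (weights[:, None] * particles).sum(axis=0), particles

# toy usage with a trusted template cue
rng = np.random.default_rng(0)
particles = rng.standard_normal((100, 2)) + np.array([50.0, 40.0])
weights = np.full(100, 1.0 / 100)
state, particles = fuse_cues(np.array([52.0, 41.0]), True,
                             particles, weights, False, rng)
```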
In this paper, we address the problem of object segmentation in multiple views or videos when two or more viewpoints of the same scene are available. We propose a new approach that propagates segmentation coherence information in both space and time, allowing evidence in one image to be shared across the complete set. To this end, the segmentation is cast as a single, efficient space-time labeling problem solved with graph cuts. In contrast to most existing multi-view segmentation methods, which rely on some form of dense reconstruction, ours requires only a sparse 3D sampling to propagate information between viewpoints. The approach is thoroughly evaluated on standard multi-view datasets as well as on videos. With static views, the results compete with state-of-the-art methods while using significantly fewer viewpoints. With multiple videos, we report results that demonstrate the benefit of propagating segmentation through temporal cues.
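As a rough illustration of how such a space-time labeling can be posed as a single graph cut, here is a minimal sketch using networkx for the min-cut. The node identifiers, capacity convention, and the idea that sparse 3D samples contribute inter-view coherence edges are assumptions for illustration, not the paper's exact energy.

```python
import networkx as nx

def segment_space_time(unaries, coherence_edges, lam=1.0):
    """unaries: {node: (cost_fg, cost_bg)} over all pixels/superpixels in all
    views and frames; coherence_edges: (a, b, w) pairs covering spatial
    smoothness, temporal links between frames, and inter-view links obtained
    through shared sparse 3D samples."""
    g = nx.DiGraph()
    for node, (cost_fg, cost_bg) in unaries.items():
        g.add_edge('src', node, capacity=cost_bg)  # paid if node labeled background
        g.add_edge(node, 'snk', capacity=cost_fg)  # paid if node labeled foreground
    for a, b, w in coherence_edges:                # pairwise coherence terms
        g.add_edge(a, b, capacity=lam * w)
        g.add_edge(b, a, capacity=lam * w)
    _, (src_side, _) = nx.minimum_cut(g, 'src', 'snk')
    return {n: n in src_side for n in unaries}     # True = foreground

# toy usage: a strong coherence edge pulls both nodes to the same label
labels = segment_space_time({'a': (0.1, 0.9), 'b': (0.8, 0.2)},
                            [('a', 'b', 5.0)])
```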
Human character animation is often critical in entertainment content production, including video games, virtual reality, and fiction films. In this domain, deep neural networks drive most recent advances, through both deep learning (DL) and deep reinforcement learning (DRL). In this article, we propose a comprehensive survey of state-of-the-art approaches based on either DL or DRL in skeleton-based human character animation. First, we introduce motion data representations, the most common human motion datasets, and how basic deep models can be enhanced to foster learning of spatial and temporal patterns in motion data. Second, we cover state-of-the-art approaches divided into three large families of applications in human animation pipelines: motion synthesis, character control, and motion editing. Finally, we discuss the limitations of current DL- and DRL-based methods in skeletal human character animation and possible directions for future research to alleviate these limitations and meet animators' needs.
Multi-view segmentation consists of segmenting objects simultaneously in several views. A key issue, compared to monocular settings, is ensuring that segmentation information propagates between views while minimizing complexity and computational cost. In this work, we first investigate the idea that examining measurements at the projections of a sparse set of 3D points is sufficient to achieve this goal. The proposed algorithm softly assigns each of these 3D samples to the scene background if it projects onto the background region in at least one view, and to the foreground if it projects onto the foreground region in all views. Second, we show how other modalities, such as depth, can be seamlessly integrated into the model to benefit the segmentation. The paper presents a detailed set of experiments validating the algorithm, with results comparable to the state of the art at reduced computational complexity. We also discuss the use of different modalities in specific situations, such as dealing with a low number of viewpoints or with color ambiguities between foreground and background.
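A minimal sketch of this soft assignment rule follows, assuming per-view foreground probability maps and hypothetical `project` callables for the calibrated cameras. The product across views softly encodes "foreground only if foreground in all views", so a single background vote drives the score toward zero.

```python
import numpy as np

def soft_assign(sample_3d, projections, fg_prob_maps):
    """Soft-assign one sparse 3D sample from its projections in all views.

    projections:  list of project(point) -> (u, v) callables, one per view
    fg_prob_maps: list of (H, W) per-view foreground probability images
    Returns a foreground score in [0, 1]; a background vote (low per-view
    probability in any view) drives it toward 0.
    """
    score = 1.0
    for project, prob in zip(projections, fg_prob_maps):
        u, v = project(sample_3d)
        h, w = prob.shape
        if not (0 <= int(v) < h and 0 <= int(u) < w):
            continue                           # sample falls outside this view
        score *= float(prob[int(v), int(u)])   # soft AND across the views
    return score

# toy usage: one view says background, so the sample scores near zero
maps = [np.full((4, 4), 0.9), np.full((4, 4), 0.05)]
projs = [lambda p: (1, 1), lambda p: (2, 2)]
print(soft_assign(np.zeros(3), projs, maps))   # ~0.045
```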
We present a new method to extract multiple segmentations of an object viewed by multiple cameras, given only the camera calibration. We introduce the n-tuple color model to express inter-view consistency when inferring, in each view, the foreground and background color models that permit the final segmentation. A color n-tuple is the set of pixel colors associated with the n projections of a 3D point. The first goal is to find the MAP estimate of the background/foreground color models based on an arbitrary sample set of such n-tuples, such that samples are consistently classified, in a soft way, as "empty" if they project onto the background in at least one view, or "occupied" if they project onto foreground pixels in all views. An expectation-maximization (EM) framework then alternates between estimating the color models and the soft classifications. In a final step, all views are segmented based on their attached color models. The approach is significantly simpler and faster than previous multi-view segmentation methods while providing results of equivalent or better quality.
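To illustrate the alternation, here is a minimal EM sketch over n-tuples, using a single isotropic Gaussian per view as a stand-in color model (the paper's actual models are richer). The product of per-view foreground responsibilities softly encodes "occupied iff foreground in all views".

```python
import numpy as np

def gauss(x, mu, var):
    # isotropic 3D Gaussian density over RGB colors
    d = ((x - mu) ** 2).sum(-1)
    return np.exp(-0.5 * d / var) / np.sqrt((2.0 * np.pi * var) ** 3)

def em_ntuples(tuples, iters=20, var=0.02):
    """tuples: (S, n, 3) array of the colors of S sampled 3D points in n views."""
    S, n, _ = tuples.shape
    occ = np.full(S, 0.5)                  # soft "occupied" probability per sample
    for _ in range(iters):
        # M-step: refit per-view foreground/background color means
        w = occ[:, None, None]
        mu_fg = (w * tuples).sum(0) / (w.sum(0) + 1e-12)             # (n, 3)
        mu_bg = ((1 - w) * tuples).sum(0) / ((1 - w).sum(0) + 1e-12)
        # E-step: per-view foreground responsibility, then a product across
        # views -- "occupied" iff the sample is foreground in all n views
        r = np.empty((S, n))
        for v in range(n):
            f = gauss(tuples[:, v], mu_fg[v], var)
            b = gauss(tuples[:, v], mu_bg[v], var)
            r[:, v] = f / (f + b + 1e-12)
        occ = r.prod(axis=1)
    return occ, mu_fg, mu_bg
```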