We review methods for kinematic tracking of the human body in video. The review is part of a projected book intended to cross-fertilize ideas about motion representation between the animation and computer vision communities. The review confines itself to the earlier stages of motion, focusing on tracking and motion synthesis; future material will cover activity representation and motion generation. In general, we take the position that tracking does not necessarily involve (as is usually thought) complex multimodal inference problems. Instead, there are two key problems, both easy to state. The first is lifting, where one must infer the configuration of the body in three dimensions from image data. Ambiguities in lifting can result in multimodal inference problems, and we review what little is known about the extent to which a lift is ambiguous. The second is data association, where one must determine which pixels in an image come from the body. We see a tracking-by-detection approach as the most productive, and review various human detection methods. Lifting, and a variety of other problems, can be simplified by observing temporal structure in motion, and we review the literature on data-driven human animation to expose what is known about this structure. Accurate generative models of human motion would be extremely useful in both animation and tracking, and we discuss the profound difficulties encountered in building such models. Discriminative methods, which should be able to tell whether an observed motion is human or not, do not work well yet, and we discuss why. There is an extensive discussion of open issues. In particular, we discuss the nature and extent of lifting ambiguities, which appear to be significant at short timescales and insignificant at longer timescales. This discussion suggests that the best tracking strategy is to track a 2D representation, and then lift it.
We point out some puzzling phenomena associated with the choice of human motion representation (joint angles vs. joint positions). Finally, we give a quick guide to resources.
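The short-timescale lifting ambiguity discussed above has a classic concrete form: under orthographic projection, a limb of known 3D length whose two endpoints are observed in the image admits exactly two depth solutions. The sketch below illustrates this; the projection model and the function name are illustrative assumptions, not details taken from the review.

```python
import math

def lift_limb(p1, p2, length):
    """Return the two candidate depth offsets between the endpoints of a
    limb of known 3D length, observed under orthographic projection.
    This two-fold choice per limb is the basic source of lifting ambiguity."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    d2 = dx * dx + dy * dy
    if d2 > length * length:
        raise ValueError("projected limb is longer than its 3D length")
    dz = math.sqrt(length * length - d2)
    # The limb may tilt toward or away from the camera.
    return dz, -dz
```

With one such binary choice per limb, a full-body lift has exponentially many candidate configurations at a single frame, which is why temporal structure (the subject of the data-driven animation literature reviewed here) is so useful for disambiguation.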
One way that artists create compelling character animations is by manipulating details of a character's motion. This process is expensive and repetitive. We show that we can make such motion editing more efficient by generalizing the edits an animator makes on short sequences of motion to other sequences. Our method predicts frames for the motion using Gaussian process models of kinematics and dynamics. These estimates are combined with probabilistic inference. Our method can be used to propagate edits from examples to an entire sequence for an existing character, and it can also be used to map a motion from a control character to a very different target character. The technique shows good generalization. For example, we show that an estimator, learned from a few seconds of edited example animation using our methods, generalizes well enough to edit minutes of character animation in a high-quality fashion. Learning is interactive: An animator who wants to improve the output can provide small, correcting examples and the system will produce improved estimates of motion. We make this interactive learning process efficient and natural with a fast, full-body IK system with novel features. Finally, we present data from interviews with professional character animators that indicate that generalizing and propagating animator edits can save artists significant time and work.
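To make the prediction machinery concrete, here is a minimal one-dimensional Gaussian process regression sketch with a squared-exponential kernel. The kernel choice, hyperparameters, and function names are assumptions for illustration only; the actual method models full-body kinematics and dynamics and combines its estimates with probabilistic inference.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential (RBF) kernel between two 1-D input arrays.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_predict(x_train, y_train, x_test, ell=1.0, noise=1e-4):
    """Posterior mean of a GP: predict y at x_test from training pairs.
    In the editing setting, x would encode the source pose and y the
    animator-edited pose (here both are scalars for clarity)."""
    K = rbf(x_train, x_train, ell) + noise * np.eye(len(x_train))
    Ks = rbf(x_test, x_train, ell)
    alpha = np.linalg.solve(K, y_train)  # K^{-1} y
    return Ks @ alpha
```

The appeal for interactive learning is that adding a small correcting example just appends a training pair; no retraining loop is required beyond refactoring the kernel matrix.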
Figure 1: This figure is a time-lapse shot of a transition synthesized in real time using our method. The character transitions in one second from walking to skipping in a seamless, natural way. We invite the reader to view this animation in the accompanying movie.

Abstract

We describe a discriminative method for distinguishing natural-looking from unnatural-looking motion. Our method is based on physical and data-driven features of motion to which humans seem sensitive. We demonstrate that our technique is significantly more accurate than current alternatives. We use this technique as the testing part of a hypothesize-and-test motion synthesis procedure. The mechanism we build using this procedure can quickly provide an application with a transition of user-specified duration from any frame in a motion collection to any other frame in the collection. During pre-processing, we search all possible 2-, 3-, and 4-way blends between representative samples of motion obtained using clustering. The blends are automatically evaluated, and the recipe (i.e., the representatives and the set of weighting functions) that created the best blend is cached. At run-time, we build a transition between motions by matching a future window of the source motion to a representative, matching the past of the target motion to a representative, and then applying the blend recipe recovered from the cache to the source and target motion. People seem sensitive to poor contact with the environment, such as sliding foot plants. We determine appropriate temporal and positional constraints for each foot plant using a novel technique, then apply an off-the-shelf inverse kinematics technique to enforce the constraints. This synthesis procedure yields good-looking transitions between distinct motions at very low online cost.
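A two-way blend of the kind evaluated during pre-processing can be sketched as weighted interpolation between overlapping windows of two clips. The smoothstep weighting function and the plain per-joint interpolation below are illustrative stand-ins for the cached multi-way blend recipes described above.

```python
def ease(t):
    # Smoothstep weighting function: 0 at t=0, 1 at t=1, zero slope at both ends.
    return t * t * (3 - 2 * t)

def blend(source, target, n):
    """Blend the last n frames of `source` into the first n frames of
    `target`. Frames are lists of joint values; a real system would
    interpolate joint rotations (e.g., quaternions) instead."""
    out = []
    for i in range(n):
        w = ease(i / (n - 1)) if n > 1 else 1.0
        a = source[len(source) - n + i]
        b = target[i]
        out.append([(1 - w) * x + w * y for x, y in zip(a, b)])
    return source[:len(source) - n] + out + target[n:]
```

The paper's recipes additionally cache which cluster representatives to blend and which weighting functions to use, so that only this cheap interpolation remains at run-time.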
Figure 1: Motion editing can produce significant footskate (Section 1). On the left is an edited motion capture sequence. We superimpose partially translucent renderings of frames spaced evenly in time. As a result, a slowly moving part of the body, like the skating foot plant in this image, shows up as a dark region with blurry outlines. We introduce a robust oracle for detecting foot plants. When coupled with an off-the-shelf footskate remover, our system behaves like a black box (center) that cleans up motion at interactive rates (right). The foot is now planted firmly, as one can see from the sharp outline around the toe. Notice that the mild blur at the heel on the right results from the way the heel and then the toes are planted.

Abstract

Footskate, where a character's foot slides on the ground when it should be planted firmly, is a common artifact resulting from almost any attempt to modify motion capture data. We describe an online method for fixing footskate that requires no manual clean-up. An important part of fixing footskate is determining when the feet should be planted. We introduce an oracle that can automatically detect when foot plants should occur. Our method is more accurate than baseline methods that check the height or speed of the feet. These baseline methods perform especially poorly on noisy or imperfect data, requiring manual fixing. Once trained, our oracle is robust and can be used without manual clean-up, making it suitable for large databases of motion. After the foot plants are detected, we use an off-the-shelf inverse-kinematics-based method to maintain ground contact during each foot plant. Our foot plant detection mechanism coupled with an IK-based fixer can be treated as a black box that produces natural-looking motion of the feet, making it suitable for interactive systems. We demonstrate several applications that would produce unrealistic motion without our method.
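The height/speed baselines that the learned oracle outperforms are simple to state: a frame is called a plant when the foot is both low and nearly stationary. The sketch below is such a baseline; the threshold values and function name are illustrative assumptions (the abstract gives no numbers), and the inputs would come from per-frame foot marker positions in motion capture data.

```python
def detect_foot_plants(heights, speeds, max_height=0.05, max_speed=0.02):
    """Baseline foot-plant detector: flag frames where the foot is both
    below max_height (meters above ground) and moving slower than
    max_speed. Noisy mocap makes fixed thresholds like these brittle,
    which is what motivates a trained oracle."""
    return [h <= max_height and s <= max_speed
            for h, s in zip(heights, speeds)]
```

Because mocap noise perturbs both height and velocity estimates, any fixed threshold either misses true plants or fires during swing, which is the failure mode the trained oracle is designed to avoid.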