Animated characters that move and gesticulate appropriately with spoken text are useful in a wide range of applications. Unfortunately, this class of movement is very difficult to generate, even more so when a unique, individual movement style is required. We present a system that, with a focus on arm gestures, produces full-body gesture animation for given input text in the style of a particular performer. Our process starts with video of a person whose gesturing style we wish to animate. A tool-assisted annotation process is performed on the video, from which a statistical model of the person's particular gesturing style is built. Using this model and input text tagged with theme, rheme and focus, our generation algorithm creates a gesture script. Rather than isolated singleton gestures, the gesture script specifies a continuous stream of gestures coordinated with speech. This script is passed to an animation system, which enhances the gesture description with additional detail and then generates either kinematic or physically simulated motion from it. The system generates gesture animations for novel text that are consistent with a given performer's style, as validated in an empirical user study.
The empirical investigation of human gesture stands at the center of multiple research disciplines, and various gesture annotation schemes exist, with varying degrees of precision and required annotation effort. We present a gesture annotation scheme designed specifically for automatically generating and animating character-specific hand/arm gestures, though it may be of more general value. We focus on how to capture temporal structure and locational information with relatively little annotation effort. The scheme is evaluated by re-creating the original gestures on an animated character from the annotated data and assessing how accurately they are captured. This paper presents our scheme in detail and compares it to other approaches.
Speech-synchronized facial animation that controls only the movement of the mouth is typically perceived as wooden and unnatural. We propose a method to generate additional facial expressions, such as movement of the head, the eyes, and the eyebrows, fully automatically from the input speech signal. This is achieved by extracting prosodic parameters such as pitch flow and power spectrum from the speech signal and using them to control facial animation parameters in accordance with results from paralinguistic research.
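As a rough illustration of the idea of driving a facial parameter from prosody, the following minimal sketch estimates per-frame power and pitch from raw samples and maps pitch to an eyebrow-raise value. It is not the method of the abstract above: the autocorrelation pitch estimator, the use of mean frame power in place of a full power spectrum, the linear mapping, and all names and constants (`frame_power`, `estimate_pitch`, `eyebrow_raise`, `baseline_hz`, `span_hz`) are assumptions made for illustration only.

```python
import math


def frame_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Autocorrelation pitch estimate in Hz (0.0 if no pitch is found).

    Searches lags corresponding to the assumed range fmin..fmax.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    if max_lag <= min_lag:
        return 0.0
    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag if best_lag else 0.0


def eyebrow_raise(pitch_hz, baseline_hz=120.0, span_hz=120.0):
    """Hypothetical mapping: pitch above a baseline raises the eyebrows.

    Returns a value clamped to [0, 1]; the constants are illustrative.
    """
    return max(0.0, min(1.0, (pitch_hz - baseline_hz) / span_hz))


# Synthetic 200 Hz tone, 50 ms at 8 kHz, standing in for a voiced frame.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]
silence = [0.0] * 400

tone_pitch = estimate_pitch(tone, SAMPLE_RATE)
tone_raise = eyebrow_raise(tone_pitch)
silence_raise = eyebrow_raise(estimate_pitch(silence, SAMPLE_RATE))
```

In a real system the per-frame values would be smoothed over time before driving animation parameters, and the mapping would be chosen from the paralinguistic findings the abstract refers to rather than the fixed linear rule assumed here.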