The growing use of virtual humans demands generating increasingly realistic behavior for them while minimizing cost and time. Gestures are a key ingredient for realistic and engaging virtual agents, and consequently automated gesture generation has been a popular area of research. So far, good gesture generation has relied on explicit formulation of if-then rules and probabilistic modelling of annotated features. Machine learning approaches have yielded only marginal success, indicating the high complexity of the speech-to-motion learning task. In this work, we explore the use of transfer learning, building on previous motion-modelling research, to improve learning outcomes for gesture generation from speech. We use a recurrent network with an encoder-decoder structure that takes in prosodic speech features and generates a short sequence of gesture motion. We pre-train the network on a motion-modelling task. For the purpose of this work, we recorded a large multimodal database of conversational speech.
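The encoder-decoder idea in the abstract above can be illustrated with a toy NumPy sketch: an encoder RNN folds a prosodic feature sequence into a hidden state, and a decoder RNN autoregressively emits motion frames from it, with transfer shown as reusing a pre-trained decoder. All dimensions, class and parameter names, and the weight-copy transfer step are illustrative assumptions, not the paper's actual model or training procedure.

```python
import numpy as np

# Hypothetical dimensions, chosen only for the sketch.
AUDIO_DIM, MOTION_DIM, HIDDEN = 4, 6, 8

def rnn_cell(x, h, Wx, Wh, b):
    """One step of a vanilla tanh RNN cell."""
    return np.tanh(x @ Wx + h @ Wh + b)

class Seq2Seq:
    """Minimal encoder-decoder: input sequence in, fixed-length output sequence out."""
    def __init__(self, in_dim, out_dim, hidden, seed=0):
        r, s = np.random.default_rng(seed), 0.1
        # Encoder parameters (read the input sequence).
        self.enc_Wx = s * r.standard_normal((in_dim, hidden))
        self.enc_Wh = s * r.standard_normal((hidden, hidden))
        self.enc_b = np.zeros(hidden)
        # Decoder parameters (emit output frames autoregressively).
        self.dec_Wx = s * r.standard_normal((out_dim, hidden))
        self.dec_Wh = s * r.standard_normal((hidden, hidden))
        self.dec_b = np.zeros(hidden)
        self.out_W = s * r.standard_normal((hidden, out_dim))
        self.out_b = np.zeros(out_dim)

    def generate(self, in_seq, n_frames):
        # Encoder: fold the whole input sequence into one hidden state.
        h = np.zeros(self.enc_Wh.shape[0])
        for x in in_seq:
            h = rnn_cell(x, h, self.enc_Wx, self.enc_Wh, self.enc_b)
        # Decoder: feed each emitted frame back in to produce the next one.
        frame, frames = np.zeros(self.out_b.shape[0]), []
        for _ in range(n_frames):
            h = rnn_cell(frame, h, self.dec_Wx, self.dec_Wh, self.dec_b)
            frame = h @ self.out_W + self.out_b
            frames.append(frame)
        return np.stack(frames)

# Pre-training task: motion modelling (motion in, motion out).
motion_model = Seq2Seq(MOTION_DIM, MOTION_DIM, HIDDEN)
# ... pre-training would fit motion_model's weights here ...

# Target task: speech in, motion out; reuse the pre-trained decoder.
speech_model = Seq2Seq(AUDIO_DIM, MOTION_DIM, HIDDEN, seed=1)
for name in ("dec_Wx", "dec_Wh", "dec_b", "out_W", "out_b"):
    setattr(speech_model, name, getattr(motion_model, name))

speech = np.random.default_rng(2).standard_normal((10, AUDIO_DIM))
motion = speech_model.generate(speech, n_frames=5)
print(motion.shape)  # (5, 6): five motion frames of MOTION_DIM values each
```

Copying the decoder weights stands in for the paper's pre-training step: the decoder's notion of plausible motion dynamics is learned on motion alone, then reused when learning the speech-to-motion mapping.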
Although the perception of emotion in individuals is an important social skill, very little is known about how emotion is determined from a crowd of individuals. We investigated the perception of emotion in scenes of crowds populated by dynamic characters each expressing an emotion. Facial expressions were masked in these characters and emotion was conveyed using body motion and posture only. We systematically varied the proportion of characters in each scene depicting one of two emotions and participants were required to categorise the overall emotion of the crowd. In Experiment 1, we found that the perception of emotions in a crowd is efficient even with relatively brief exposures of the crowd stimuli. Furthermore, the emotion of a crowd was generally determined by the relative proportions of characters conveying it, although we also found that some emotions dominated perception. In Experiment 2, we found that an increase in crowd size was not associated with a relative decrease in the efficiency with which the emotion was categorised. Our findings suggest that body motion is an important social cue in perceiving the emotion of crowds and have implications for our understanding of how we perceive social information from groups.
When simulating large crowds, it is inevitable that the models and motions of many virtual characters will be cloned. However, the perceptual impact of this trade-off has never been studied. In this paper, we consider the ways in which an impression of variety can be created and the perceptual consequences of certain design choices. In a series of experiments designed to test people's perception of variety in crowds, we found that clones of appearance are far easier to detect than motion clones. Furthermore, we established that cloned models can be masked by color variation, random orientation, and motion. Conversely, the perception of cloned motions remains unaffected by the model on which they are displayed. Other factors that influence the ability to detect clones were examined, such as proximity, model type and characteristic motion. Our results provide novel insights and useful thresholds that will assist in creating more realistic, heterogeneous crowds.
Figure 1: Male avatar rendered in different visual styles, ranging from realistic to abstract, based on the results in Section 5.
Abstract: The realistic depiction of lifelike virtual humans has been the goal of many filmmakers in the last decade. Recently, films such as Tron: Legacy and The Curious Case of Benjamin Button have produced highly realistic characters. In the real-time domain, there is also a need to deliver realistic virtual characters, with the increase in popularity of interactive drama video games (such as L.A. Noire™ or Heavy Rain™). There have been mixed reactions from audiences to lifelike characters used in movies and games, with some saying that the increased realism highlights subtle imperfections, which can be disturbing. Some developers opt for a stylized rendering (such as cartoon-shading) to avoid a negative reaction [Thompson 2004]. In this paper, we investigate some of the consequences of choosing realistic or stylized rendering in order to provide guidelines for developers creating appealing virtual characters. We conducted a series of psychophysical experiments to determine whether render style affects how virtual humans are perceived. Motion capture with synchronized eye-tracked data was used throughout to animate custom-made virtual model replicas of the captured actors.
Virtual characters that appear almost photo-realistic have been shown to induce negative responses from viewers in traditional media, such as film and video games. This effect, described as the uncanny valley, is the reason why realism is often avoided when the aim is to create an appealing virtual character. In Virtual Reality, there have been few attempts to investigate this phenomenon and the implications of rendering virtual characters with high levels of realism on user enjoyment. In this paper, we conducted a large-scale experiment on over one thousand members of the public in order to gather information on how virtual characters are perceived in interactive virtual reality games. We were particularly interested in whether different render styles (realistic, cartoon, etc.) would directly influence appeal, or if a character's personality was the most important indicator of appeal. We used a number of perceptual metrics such as subjective ratings, proximity, and attribution bias in order to test our hypothesis. Our main result shows that affinity towards virtual characters is a complex interaction between the character's appearance and personality, and that realism is in fact a positive choice for virtual characters in virtual reality.