The important role of arousal in determining vocal parameters in the expression of emotion is well established. There is less evidence for the contribution of emotion dimensions such as valence and potency/control to vocal emotion expression. Here, an acoustic analysis of the newly developed Geneva Multimodal Emotional Portrayals corpus is presented to examine the role of dimensions other than arousal. This corpus contains twelve emotions that systematically vary with respect to valence, arousal, and potency/control. The emotions were portrayed by professional actors coached by a stage director. The extracted acoustic parameters were first compared with those obtained from a similar corpus [Banse and Scherer (1996). J. Pers. Soc. Psychol. 70, 614-636] and shown to largely replicate the earlier findings. Based on a principal component analysis, seven composite scores were calculated and used to determine the relative contribution of the respective vocal parameters to the emotional dimensions arousal, valence, and potency/control. The results show that although arousal dominates for many vocal parameters, it is possible to identify parameters, in particular spectral balance and spectral noise, that are specifically related to valence and potency/control.
This study investigates to what extent the amount of variation in a visual scene causes speakers to mention the attribute color in their definite target descriptions, focusing on scenes in which this attribute is not needed for identification of the target. The results of our three experiments show that speakers are more likely to redundantly include a color attribute when the scene variation is high as compared with when this variation is low (even if this leads to overspecified descriptions). We argue that these findings are problematic for existing algorithms that aim to automatically generate psychologically realistic target descriptions, such as the Incremental Algorithm, as these algorithms make use of a fixed preference order per domain and do not take visual scene variation into account.
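To make the criticism concrete, here is a minimal sketch of an Incremental-Algorithm-style attribute selection. It is a simplification written for illustration, not the implementation evaluated in the study: the function name, the dictionary representation of objects, the example scenes, and the preference order are all assumptions, and the special handling of the type attribute found in published versions of the algorithm is omitted.

```python
def incremental_algorithm(target, distractors, preference_order):
    """Select attributes for a distinguishing description of `target`.

    Simplified sketch: attributes are tried in a fixed preference order,
    and an attribute is kept only if it rules out at least one remaining
    distractor. Scene variation as such plays no role in the decision.
    """
    description = []
    remaining = list(distractors)
    for attribute in preference_order:
        value = target.get(attribute)
        if value is None:
            continue
        # Distractors that share this value are NOT ruled out.
        still_confusable = [d for d in remaining if d.get(attribute) == value]
        if len(still_confusable) < len(remaining):
            description.append((attribute, value))
            remaining = still_confusable
        if not remaining:
            break  # target is uniquely identified
    return description


# Hypothetical scenes: in both, type alone identifies the target.
ORDER = ["type", "color", "size"]  # assumed fixed preference order
target = {"type": "chair", "color": "red", "size": "large"}
low_variation = [
    {"type": "desk", "color": "red", "size": "large"},
    {"type": "fan", "color": "red", "size": "large"},
]
high_variation = [
    {"type": "desk", "color": "blue", "size": "small"},
    {"type": "fan", "color": "green", "size": "large"},
]

print(incremental_algorithm(target, low_variation, ORDER))
print(incremental_algorithm(target, high_variation, ORDER))
```

Under these assumptions both calls select only the type attribute, so the sketch never mentions color redundantly, whether the scene is uniform or varied. That is the behavior the abstract contrasts with human speakers.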
Learning to recognize the contrasts of a language-specific phonemic repertoire can be viewed as forming categories in a multidimensional psychophysical space. Research on the learning of distributionally defined visual categories has shown that categories defined over 1 dimension are easy to learn and that learning multidimensional categories is more difficult but tractable under specific task conditions. In 2 experiments, adult participants learned either a unidimensional or a multidimensional category distinction with or without supervision (feedback) during learning. The unidimensional distinctions were readily learned and supervision proved beneficial, especially in maintaining category learning beyond the learning phase. Learning the multidimensional category distinction proved to be much more difficult and supervision was not nearly as beneficial as with unidimensionally defined categories. Maintaining a learned multidimensional category distinction was only possible when the distributional information that identified the categories remained present throughout the testing phase. We conclude that listeners are sensitive to both trial-by-trial feedback and the distributional information in the stimuli. Even given limited exposure, listeners learned to use 2 relevant dimensions, albeit with considerable difficulty.
In dialogue, repeated references contain fewer words (which are also acoustically reduced) and fewer gestures than initial ones. In this paper, we describe three experiments studying to what extent gesture reduction is comparable to other forms of linguistic reduction. Since previous studies showed conflicting findings for gesture rate, we systematically compare two measures of gesture rate: gesture rate per word and per semantic attribute (Experiment I). In addition, we ask whether repetition impacts the form of gestures, by manual annotation of a number of features (Experiment I), by studying gradient differences using a judgment test (Experiment II), and by investigating how effective initial and repeated gestures are at communicating information (Experiment III). The results revealed no reduction in terms of gesture rate per word, but a U-shaped reduction pattern for gesture rate per attribute. Gesture annotation showed no reliable effects of repetition on gesture form, yet participants judged gestures from repeated references as less precise than those from initial ones. Despite this gradient reduction, gestures from initial and repeated references were equally successful in communicating information. Besides effects of repetition, we found systematic effects of visibility on gesture production, with more, longer, larger and more communicative gestures when participants could see each other. We discuss the implications of our findings for gesture research and for models of speech and gesture production.
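The two gesture-rate measures can diverge because they normalize the same gesture count by different units. The sketch below illustrates this with made-up counts; the function name and all numbers are hypothetical and are not taken from the experiments.

```python
def gesture_rate(n_gestures, n_units):
    """Gestures per unit, where a unit is a word or a semantic attribute."""
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    return n_gestures / n_units


# Hypothetical counts for one initial and one repeated reference.
initial = {"gestures": 4, "words": 20, "attributes": 5}
repeated = {"gestures": 2, "words": 10, "attributes": 4}

for label, ref in (("initial", initial), ("repeated", repeated)):
    per_word = gesture_rate(ref["gestures"], ref["words"])
    per_attribute = gesture_rate(ref["gestures"], ref["attributes"])
    print(label, per_word, per_attribute)
```

With these invented counts the rate per word is identical for both references, while the rate per attribute drops for the repeated one. This happens whenever words are shed faster than attributes.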
Hand gestures are tightly coupled with speech and with action. Hence, recent accounts have emphasised the idea that simulations of spatio-motoric imagery underlie the production of co-speech gestures. In this study, we suggest that action simulations directly influence the iconic strategies used by speakers to translate aspects of their mental representations into gesture. Using a classic referential paradigm, we investigate how speakers respond gesturally to the affordances of objects, by comparing the effects of describing objects that afford action performance (such as tools) and those that do not, on gesture production. Our results suggest that affordances play a key role in determining the amount of representational (but not non-representational) gestures produced by speakers, and the techniques chosen to depict such objects. To our knowledge, this is the first study to systematically show a connection between object characteristics and representation techniques in spontaneous gesture production during the depiction of static referents.
Recent judgment studies have shown that people are able to attribute emotional states to others' bodily expressions fairly accurately. It is, however, not clear which movement qualities are salient, and how this applies to emotional gesture during speech-based interaction. In this study we investigated how the expression of emotions that vary on three major emotion dimensions (arousal, valence, and potency) affects the perception of dynamic arm gestures. Ten professional actors enacted 12 emotions in a scenario-based social interaction setting. Participants (N = 43) rated all emotional expressions with muted sound and blurred faces on six spatiotemporal characteristics of gestural arm movement that were found to be related to emotion in previous research (amount of movement, movement speed, force, fluency, size, and height/vertical position). Arousal and potency were found to be strong determinants of the perception of gestural dynamics, whereas the differences between positive and negative emotions were less pronounced. These results confirm the importance of arm movement in communicating major emotion dimensions and show that gesture forms an integrated part of multimodal nonverbal emotion communication.