Purpose: Word recognition in quiet and in background noise has been thoroughly investigated in previous research to establish segmental speech recognition performance as a function of stimulus characteristics (e.g., audibility). Comparable investigations of recognition performance for suprasegmental information (e.g., the acoustic cues used to judge talker age, sex, or emotional state) have not been conducted. In this work, we directly compared emotion and word recognition performance across levels of background noise to identify psychoacoustic properties of emotion recognition (globally and for specific emotion categories) relative to word recognition.

Method: Twenty young adult listeners with normal hearing listened to sentences and either reported a target word in each sentence or selected the emotion of the talker from a list of options (angry, calm, happy, and sad) at four signal-to-noise ratios in a background of white noise. Psychometric functions were fit to the recognition data and used to estimate thresholds (midway points on the function) and slopes for word and emotion recognition.

Results: Thresholds for emotion recognition were approximately 10 dB better than word recognition thresholds, and slopes for emotion recognition were half of those measured for word recognition. Low-arousal emotions had poorer thresholds and shallower slopes than high-arousal emotions, suggesting greater confusion when distinguishing low-arousal emotional speech content.

Conclusions: A talker's emotional state remains perceptible to listeners in competitive listening environments, even after words are rendered inaudible. The arousal of emotional speech affects listeners' ability to discriminate between emotion categories.
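To illustrate the psychometric-function analysis described in the Method, the sketch below fits a logistic function to proportion-correct scores and reads off a threshold (midpoint) and slope. The logistic form, the chance floor, the SNR values, and the data points are illustrative assumptions, not the authors' exact procedure or results.

```python
# Minimal sketch (assumed approach): fit a logistic psychometric function
# to proportion-correct recognition scores measured at several SNRs, then
# estimate the threshold (midpoint) and slope of the fitted function.
import numpy as np
from scipy.optimize import curve_fit

def psychometric(snr, threshold, slope, chance):
    """Logistic function rising from a chance floor toward 1.0.
    threshold = SNR at the midpoint between chance and perfect performance;
    slope controls how steeply performance grows with SNR."""
    return chance + (1.0 - chance) / (1.0 + np.exp(-slope * (snr - threshold)))

# Hypothetical data: proportion correct at four SNRs (dB) for one condition.
snrs = np.array([-15.0, -10.0, -5.0, 0.0])
prop_correct = np.array([0.30, 0.55, 0.80, 0.95])
chance_rate = 0.25  # 4-alternative emotion task; near 0 for open-set words

params, _ = curve_fit(
    lambda x, t, s: psychometric(x, t, s, chance_rate),
    snrs, prop_correct, p0=[-10.0, 0.5],
)
print(f"threshold = {params[0]:.1f} dB SNR, slope = {params[1]:.2f} per dB")
```

With separate fits per task, the word and emotion recognition conditions can then be compared directly on their estimated thresholds and slopes.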
In the current study, an interactive approach is used to explore possible contributors to the misattributions listeners make about female talkers' expression of confidence. To do this, the expression and identification of confidence were examined by evaluating talker-specific factors (e.g., talker knowledge and affective acoustic modulation) and listener-specific factors (e.g., the interaction between talker acoustic cues and listener knowledge). Talker and listener contexts were manipulated by imposing a social constraint on talkers and withholding information from listeners. Results indicated that listeners were sensitive to acoustic information produced by the female talkers in this study. However, when world knowledge and acoustics competed, listeners' judgments of talker confidence were less accurate. In fact, listeners used acoustic cues to female talker confidence more accurately as a cue to perceived confidence when relevant world knowledge was missing. By targeting speech dynamics between female talkers and both female and male listeners, the current study provides a better understanding of how confidence is realized acoustically and, perhaps more importantly, how those cues may be interpreted or misinterpreted by listeners.
One's ability to express confidence is critical to achieving one's goals in a social context, such as commanding respect from others, establishing higher social status, and persuading others. How individuals perceive confidence may be shaped by the socio-indexical cues produced by the speaker. In the current production/perception study, we asked four speakers (two cisgender women and two cisgender men) to answer trivia questions under three speaking contexts: natural, overconfident, and underconfident (i.e., lacking confidence). An evaluation of the speakers' acoustics indicated that the speakers significantly varied their acoustic cues as a function of speaking context and that the women and men produced significantly different acoustic cues. The speakers' answers to the trivia questions in the three contexts (natural, overconfident, underconfident) were then presented to listeners (N = 26) in a social judgment task using a computer mouse-tracking paradigm. Listeners were sensitive to the speakers' acoustic modulations of confidence and interpreted these cues differently depending on the perceived gender of the speaker, thereby affecting listeners' cognition and social decision making. We then consider how listeners' social judgments about confidence were shaped by gender stereotypes about women and men arising from social, heuristic-based processes.
Purpose: Emotion classification for auditory stimuli typically employs 1 of 2 approaches (discrete categories or emotional dimensions). This work presents a new emotional speech set, compares these 2 classification methods for emotional speech stimuli, and emphasizes the need to consider the entire communication model (i.e., the talker, message, and listener) when studying auditory emotion portrayal and perception.

Method: Emotional speech from male and female talkers was evaluated using both categorical and dimensional rating methods. Ten young adult listeners (ages 19–28 years) evaluated stimuli recorded in 4 emotional speaking styles (Angry, Calm, Happy, and Sad). Talker and listener factors were examined for potential influences on emotional ratings under both rating methods. Listeners rated stimuli by selecting an emotion category, rating activation and pleasantness, and indicating goodness of category fit.

Results: Discrete ratings were generally consistent with dimensional ratings for speech, with accuracy for emotion recognition well above chance. As stimuli approached the dimensional extremes of activation and pleasantness, listeners were more confident in their category selection, indicative of a hybrid approach to emotion classification. Female talkers were rated as more activated than male talkers, and female listeners gave higher ratings of activation than male listeners, confirming gender differences in emotion perception.

Conclusion: A hybrid model for auditory emotion classification is supported by the data. Talker and listener factors, such as gender, were found to affect ratings of emotional speech and must be considered alongside stimulus factors in the design of future studies of emotion.