Two experiments investigated the effect of reverberation on listeners' ability to perceptually segregate two competing voices. Culling et al. [Speech Commun. 14, 71-96 (1994)] found that for competing synthetic vowels, masked identification thresholds were increased by reverberation only when combined with modulation of fundamental frequency (F0). The present investigation extended this finding to running speech. Speech reception thresholds (SRTs) were measured for a male voice against a single interfering female voice within a virtual room with controlled reverberation. The two voices were either (1) co-located in virtual space at 0 degrees azimuth or (2) separately located at ±60 degrees azimuth. In experiment 1, target and interfering voices were either normally intonated or resynthesized with a fixed F0. In anechoic conditions, SRTs were lower for normally intonated and for spatially separated sources, while, in reverberant conditions, the SRTs were all the same. In experiment 2, additional conditions employed inverted F0 contours. Inverted F0 contours yielded higher SRTs in all conditions, regardless of reverberation. The results suggest that reverberation can seriously impair listeners' ability to exploit differences in F0 and spatial location between competing voices. The levels of reverberation employed had no effect on speech intelligibility in quiet.
Perceptual separation of speech from interfering noise using binaural cues and fundamental frequency (F0) differences is disrupted by reverberation [Plomp, Acustica 34, 200–211 (1976); Culling et al., Speech Commun. 14, 71–96 (1994)]. Culling et al. found that the effect of F0 differences on vowel identification was robust in reverberation unless combined with even subtle F0 modulation. In the current study, speech reception thresholds (SRTs) were measured against a single competing voice. Both voices were either monotonized or normally intonated. Each came from recordings of the same voice, but interfering sentences were feminized (F0 increased by 80%; vocal-tract length reduced by 20%). The voices were presented from either the same or different locations within the anechoic and reverberant virtual rooms of Culling et al. In anechoic conditions, SRTs were lower when the voices were spatially separated and/or intonated, indicating that intonated speech is more intelligible than monotonous speech. In reverberant conditions (T60 = 400 ms), SRTs were higher, with no differences between the conditions. A follow-up experiment introduced sentences with inverted F0 contours. Although intelligible in quiet, these sentences gave higher SRTs in all conditions. It appears that reverberant conditions leave intonated speech intelligible but inseparable, while monotonous speech remains separable but is unintelligible.
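The feminization described above amounts to two simple scalings of the source voice. A minimal sketch of that arithmetic (this is an illustration only, not the authors' resynthesis pipeline; the starting values for a typical male voice are assumed, and the function name is hypothetical):

```python
def feminize_params(f0_hz: float, vtl_cm: float) -> tuple[float, float]:
    """Apply the abstract's feminization scalings:
    F0 raised by 80%, vocal-tract length reduced by 20%."""
    return f0_hz * 1.8, vtl_cm * 0.8

# Assumed typical male values: F0 ~110 Hz, vocal-tract length ~17 cm.
new_f0, new_vtl = feminize_params(110.0, 17.0)
print(f"F0: {new_f0:.1f} Hz, VTL: {new_vtl:.1f} cm")  # F0: 198.0 Hz, VTL: 13.6 cm
```

The resulting values land in the typical female range, which is the point of the manipulation: the interferer differs from the target in both F0 and apparent vocal-tract length while originating from the same recorded talker.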