Accounts of the identification of words and talkers commonly rely on different acoustic properties. To identify a word, a perceiver discards acoustic aspects of an utterance that are talker specific, forming an abstract representation of the linguistic message with which to probe a mental lexicon. To identify a talker, a perceiver discards acoustic aspects of an utterance specific to particular phonemes, creating a representation of voice quality with which to search for familiar talkers in long-term memory. In 3 experiments, sinewave replicas of natural speech sampled from 10 talkers eliminated natural voice quality while preserving idiosyncratic phonetic variation. Listeners identified the sinewave talkers without recourse to acoustic attributes of natural voice quality. This finding supports a revised description of speech perception in which the phonetic properties of utterances serve to identify both words and talkers.
In 5 experiments, the authors investigated how listeners learn to recognize unfamiliar talkers and how experience with specific utterances generalizes to novel instances. Listeners were trained over several days to identify 10 talkers from natural, sinewave, or reversed speech sentences. The sinewave signals preserved phonetic and some suprasegmental properties while eliminating natural vocal quality. In contrast, the reversed speech signals preserved vocal quality while distorting temporally based phonetic properties. The training results indicate that listeners learned to identify talkers even from acoustic signals lacking natural vocal quality. Generalization performance varied across the different signals and depended on the salience of phonetic information. The results suggest similarities in the phonetic attributes underlying talker recognition and phonetic perception.
When a talker produces an utterance, the listener simultaneously apprehends the linguistic form of the message as well as the nonlinguistic attributes of the talker's unique vocal anatomy and pronunciation habits. Anatomical and stylistic differences in articulation convey an array of personal or indexical qualities, such as personal identity, sex, approximate age, ethnicity, personality, intentions or emotional state, level of alcohol intoxication, and facial expression (see Bricker
The personal attributes of a talker perceived via acoustic properties of speech are commonly considered to be an extralinguistic message of an utterance. Accordingly, accounts of the perception of talker attributes have emphasized a causal role of aspects of the fundamental frequency and coarse-grain acoustic spectra distinct from the detailed acoustic correlates of phonemes. In testing this view, in four experiments, we estimated the ability of listeners to ascertain the sex or the identity of 5 male and 5 female talkers from sinusoidal replicas of natural utterances, which lack fundamental frequency and natural vocal spectra. Given such radically reduced signals, listeners appeared to identify a talker's sex according to the central spectral tendencies of the sinusoidal constituents. Under acoustic conditions that prevented listeners from determining the sex of a talker, individual identification from sinewave signals was often successful. These results reveal that the perception of a talker's sex and identity are not contingent and that fine-grain aspects of a talker's phonetic production can elicit individual identification under conditions that block the perception of voice quality.
What can a listener perceive in the speech of an unfamiliar talker? Even a brief utterance can convey a linguistic message and something about the talker who produced it. Although the perception of personal attributes has commonly been explained by an account separate from the perception of linguistic properties, a recent study has shown that phonetic details can also be used to identify talkers and to distinguish them from one another (Remez, Fellowes, & Rubin, 1997). Surprisingly, when acoustic test materials forced performance to depend on phonetic attributes, listeners occasionally mistook male talkers for female talkers, and vice versa.
The present report describes a series of experiments intended to clarify the interpretation of this counterintuitive finding, posing these questions: (1) Is the sex of a talker identifiable in a sine wave utterance replica? (2) Are differences across talkers in the central spectral tendency of the sinusoidal constituents responsible for differing impressions of the sex of a sine wave talker? (3) Are individuals identifiable under acoustic conditions that preclude the identification of sex?
Many studies of talker recognition by ear, by automatic classification, or by visual inspection of spectrograms have sought to tie variation across individu...
This research was supported by Grants DC00308 (to R.E.R.) and HD01994 (to Haskins Laboratories) from the National Institutes of Health. The authors gratefully acknowledge the meticulous assistance and trenchant advice of Chris Darwin, Steve Goldinger, Harry Levitt, Jennifer Lipton, Larry Rosenblum, Jim Sawusch, Dalia Shoretz, Saskia Smith, Steve Stroessner, Doug Whalen, and Fay Xing. Correspondence should be addressed to R. E. Remez, Department of Psychology, Barnard College, 3009 Broadway, New York, NY 10027-6598 (e-mail: remez@paradise.barnard.columbia.edu).
Theoretical and practical motives alike have prompted recent investigations of multimodal speech perception. Theoretically, multimodal studies have extended the conceptualization of perceptual organization beyond the familiar modality-bound accounts deriving from Gestalt psychology. Practically, such investigations have been driven by a need to understand the proficiency of multimodal speech perception using an electrocochlear prosthesis for hearing. In each domain, studies have shown that perceptual organization of speech can occur even when the perceiver's auditory experience departs from natural speech qualities. Accordingly, our research examined auditory-visual multimodal integration of videotaped faces and selected acoustic constituents of speech signals, each realized as a single sinewave tone accompanying a video image of an articulating face. The single tone reproduced the frequency and amplitude of the phonatory cycle or of one of the lower three oral formants. Our results showed a distinct advantage for the condition pairing the video image of the face with a sinewave replicating the second formant, despite its unnatural timbre and its presentation in acoustic isolation from the rest of the speech signal. Perceptual coherence of multimodal speech in these circumstances is established when the two modalities concurrently specify the same underlying phonetic attributes.
A listener who recognizes a talker notices characteristic attributes of the talker's speech despite the novelty of each utterance. Accounts of talker perception have often presumed that consistent aspects of an individual's speech, termed indexical properties, are ascribable to a talker's unique anatomy or consistent vocal posture distinct from acoustic correlates of phonetic contrasts. Accordingly, the perception of a talker is acknowledged to occur independently of the perception of a linguistic message. Alternatively, some studies suggest that attention to attributes of a talker includes indexical linguistic attributes conveyed in the articulation of consonants and vowels. This investigation sought direct evidence of attention to phonetic attributes of speech in perceiving talkers. Natural samples and sine-wave replicas derived from them were used in three experiments assessing the perceptual properties of natural and sine-wave sentences; of temporally veridical and reversed natural and sine-wave sentences; and of an acoustic correlate of vocal tract scale to judgments of sine-wave talker similarity. The results revealed that the subjective similarity of individual talkers is preserved in the absence of natural vocal quality, and that local phonetic segmental attributes as well as global characteristics of speech can be exploited when listeners notice characteristics of talkers.