In this article the author proposes an episodic theory of spoken word representation, perception, and production. By most theories, idiosyncratic aspects of speech (voice details, ambient noise, etc.) are considered noise and are filtered in perception. However, episodic theories suggest that perceptual details are stored in memory and are integral to later perception. In this research the author tested an episodic model (MINERVA 2; D. L. Hintzman, 1986) against speech production data from a word-shadowing task. The model predicted the shadowing-response-time patterns, and it correctly predicted a tendency for shadowers to spontaneously imitate the acoustic patterns of words and nonwords. It also correctly predicted imitation strength as a function of "abstract" stimulus properties, such as word frequency. Taken together, the data and theory suggest that detailed episodes constitute the basic substrate of the mental lexicon.
Most theories of spoken word identification assume that variable speech signals are matched to canonical representations in memory. To achieve this, idiosyncratic voice details are first normalized, allowing direct comparison of the input to the lexicon. This investigation assessed both explicit and implicit memory for spoken words as a function of speakers' voices, delays between study and test, and levels of processing. In 2 experiments, voice attributes of spoken words were clearly retained in memory. Moreover, listeners were sensitive to fine-grained similarity between 1st and 2nd presentations of different-voice words, but only when words were initially encoded at relatively shallow levels of processing. The results suggest that episodic memory traces of spoken words retain the surface details typically considered as noise in perceptual systems.
Recognition memory for spoken words was investigated with a continuous recognition memory task. Independent variables were number of intervening words (lag) between initial and subsequent presentations of a word, total number of talkers in the stimulus set, and whether words were repeated in the same voice or a different voice. In Experiment 1, recognition judgments were based on word identity alone. Same-voice repetitions were recognized more quickly and accurately than different-voice repetitions at all values of lag and at all levels of talker variability. In Experiment 2, recognition judgments were based on both word identity and voice identity. Subjects recognized repeated voices quite accurately. Gender of the talker affected voice recognition but not item recognition. These results suggest that detailed information about a talker's voice is retained in long-term episodic memory representations of spoken words.
Two experiments employing an auditory priming paradigm were conducted to test predictions of the Neighborhood Activation Model of spoken word recognition (Luce & Pisoni, 1989, manuscript under review). Acoustic-phonetic similarity, neighborhood densities, and frequencies of prime and target words were manipulated. In Experiment 1, priming with low frequency, phonetically related spoken words inhibited target recognition, as predicted by the Neighborhood Activation Model. In Experiment 2, the same prime-target pairs were presented with a longer inter-stimulus interval and the effects of priming were eliminated. In both experiments, predictions derived from the Neighborhood Activation Model regarding the effects of neighborhood density and word frequency were supported. The results are discussed in terms of competing activation of lexical neighbors and the dissociation of activation and frequency in spoken word recognition.
Perception is described within a complex systems framework that includes several constructs: resonance, attractors, subsymbols, and design principles. This framework was anticipated in J. J. Gibson's ecological approach (M. T. Turvey & C. Carello, 1981), but it is extended to cognitive phenomena by assuming experiential realism instead of ecological realism. The framework is applied in this article to explain phonologic mediation in reading and a complex array of published naming and lexical decision data. The full account requires only two design principles: covariant learning and self-consistency. Nonetheless, it organizes and explains a vast empirical literature on printed word perception.
In a recent study, Martin, Mullennix, Pisoni, and Summers (1989) reported that subjects' accuracy in recalling lists of spoken words was better for words in early list positions when the words were spoken by a single talker than when they were spoken by multiple talkers. The present study was conducted to examine the nature of these effects in further detail. Accuracy of serial-ordered recall was examined for lists of words spoken by either a single talker or by multiple talkers. Half the lists contained easily recognizable words, and half contained more difficult words, according to a combined metric of word frequency, lexical neighborhood density, and neighborhood frequency. Rate of presentation was manipulated to assess the effects of both variables on rehearsal and perceptual encoding. A strong interaction was obtained between talker variability and rate of presentation. Recall of multiple-talker lists was affected much more than single-talker lists by changes in presentation rate. At slow presentation rates, words in early serial positions produced by multiple talkers were actually recalled more accurately than words produced by a single talker. No interaction was observed for word confusability and rate of presentation. The data provide support for the proposal that talker variability affects the accuracy of recall of spoken words not only by increasing the processing demands for early perceptual encoding of the words, but also by affecting the efficiency of the rehearsal process itself.
The perception of spoken language is a skill that requires the listener to extract stable linguistic content from a physical signal that is notoriously unstable. The acoustic realization of speech is simultaneously modulated by numerous variable phonetic, prosodic, and semantic characteristics inherent to the particular message. The speech signal is further modulated by variable source characteristics.
Indeed, it has frequently been suggested that if the perceptual system is required to extract canonical units of meaning from spoken language, various idiosyncrasies related to different talkers must be somehow "normalized" during an early stage of processing the sensory input (e.g., Joos, 1948). Such talker-specific sources of variability include differing individual dialects, vocal tract sizes and shapes, and speaking rates (Mullennix, Pisoni, & Martin, 1988; Verbrugge, Strange, Shankweiler, & Edman, 1976). Sources of variability in speech, whether from phonetic context, spoken stress, or talker variations, have typically been considered "perceptual problems" to be solved by listeners, just as they must be solved for the design of adequate speech-recognition systems (e.g., Gerstman, 1968). Although the communicative importance of talker-specific information has long been recognized (e.g., Ladefoged & Broadbent, 1957), only recently have questions regarding the encoding of talker-specific information been considered alongside questions regarding the perception of the phonetic content of speech. By assessing the effects ...
The own-race bias (ORB) is a well-known finding wherein people are better able to recognize and discriminate own-race faces, relative to cross-race faces. In 2 experiments, participants viewed Asian and Caucasian faces, in preparation for recognition memory tests, while their eye movements and pupil diameters were continuously monitored. In Experiment 1 (with Caucasian participants), systematic differences emerged in both measures as a function of depicted race: While encoding cross-race faces, participants made fewer (and longer) fixations, they preferentially attended to different sets of features, and their pupils were more dilated, all relative to own-race faces. Also, in both measures, a pattern emerged wherein some participants reduced their apparent encoding effort to cross-race faces over trials. In Experiment 2 (with Asian participants), the authors observed the same patterns, although the ORB favored the opposite set of faces. Taken together, the results suggest that the ORB appears during initial perceptual encoding. Relative to own-race face encoding, cross-race encoding requires greater effort, which may reduce vigilance in some participants.
Phonological priming of spoken words refers to improved recognition of targets preceded by primes that share at least one of their constituent phonemes (e.g., BULL-BEER). Phonetic priming refers to reduced recognition of targets preceded by primes that share no phonemes with targets but are phonetically similar to targets (e.g., BULL-VEER). Five experiments were conducted to investigate the role of bias in phonological priming. Performance was compared across conditions of phonological and phonetic priming under a variety of procedural manipulations. Ss in phonological priming conditions systematically modified their responses on unrelated priming trials in perceptual identification, and they were slower and more errorful on unrelated trials in lexical decision than were Ss in phonetic priming conditions. Phonetic and phonological priming effects display different time courses and also different interactions with changes in proportion of related priming trials. Phonological priming involves bias; phonetic priming appears to reflect basic properties of activation and competition in spoken word recognition.