Recent evidence (Maye, Werker & Gerken, 2002) suggests that statistical learning may be an important mechanism for the acquisition of phonetic categories in the infant's native language. We examined the sufficiency of this hypothesis and its implications for development by implementing a statistical learning mechanism in a computational model based on a Mixture of Gaussians (MOG) architecture. Statistical learning alone was found to be insufficient for phonetic category learning: an additional competition mechanism was required to successfully learn the categories in the input. When competition was added to the MOG architecture, this class of models successfully accounted for developmental enhancement and loss of sensitivity to phonetic contrasts. Moreover, the MOG-with-competition model was used to explore a potentially important distributional property of early speech categories, sparseness, in which portions of the space between phonetic categories are unmapped. Sparseness was found in all successful models and emerged quickly during development, even when the initial parameters favored continuous representations with no gaps. The implications of these models for phonetic category learning in infants are discussed.

Infants face a difficult problem in acquiring their native language because the acoustic/phonetic variability in the input far exceeds the limited number of distinctive differences that define language-specific phonemes. How do infants attend to the relevant information that distinguishes words? Recent evidence suggests that phonemic categories may be induced, in whole or in part, by a rapid statistical learning mechanism that is sensitive to the distributional properties of phonetic input (Maye, Werker & Gerken, 2002; Maye, Weiss & Aslin, 2008).
This evidence suggests that the detailed frequency of occurrence of tokens along continuous speech dimensions plays a crucial role in the formation and modification of phonemic categories.

The present paper describes a computational model of statistical speech category learning that examines the necessary and sufficient mechanisms needed to account for known empirical data from infants, and the implications of those mechanisms for early speech categories. We demonstrate that statistical learning alone is insufficient: competition is also required. However, once this feature is added to the model, it can account for a number of developmental trajectories in speech category learning. Finally, we examine the possibility ...

Statistical Learning and Development

The classic view of speech perception in both adults (cf. Liberman, Harris, Hoffman & Griffith, 1957) and infants (cf. Eimas, Siqueland, Jusczyk & Vigorito, 1971; see Jusczyk, 1997) is that stop consonants are perceived categorically. However, more recent evidence confirms within-category sensitivity in both adults (Pisoni & Tash, 1974; Carney, Widen & Viemeister, 1979; Miller, 1997) and infants (Miller & Eimas, 1996; McMurray & Aslin, 2005). Nevertheless, adults and infants have a bias t...
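The core idea of a mixture of Gaussians with competition can be sketched as a small simulation. The code below is a minimal illustration under assumed parameters (synthetic one-dimensional "VOT-like" input, power-sharpened responsibilities as the competition rule), not the authors' implementation: the model starts with more Gaussian components than categories, and competition drives superfluous components toward negligible mixing weight, leaving survivors centered on the two input categories and yielding the kind of sparse coverage described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bimodal input: two phonetic categories along one acoustic dimension.
data = np.concatenate([rng.normal(0.0, 5.0, 500), rng.normal(40.0, 5.0, 500)])

# Start with more Gaussians than categories; competition should prune the extras.
K = 5
mu = rng.uniform(data.min(), data.max(), K)
sigma = np.full(K, 10.0)
pi = np.full(K, 1.0 / K)

for step in range(200):
    # E-step: posterior responsibility of each Gaussian for each token.
    dens = pi * np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
    # Competition (an assumed rule for illustration): raise responsibilities to a
    # power > 1 and renormalize, so the best-fitting Gaussian suppresses rivals.
    sharp = resp ** 3.0
    sharp /= sharp.sum(axis=1, keepdims=True) + 1e-300
    # M-step: re-estimate parameters from the sharpened responsibilities.
    n_k = sharp.sum(axis=0) + 1e-12
    pi = n_k / n_k.sum()
    mu = (sharp * data[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((sharp * (data[:, None] - mu) ** 2).sum(axis=0) / n_k) + 1e-6

survivors = pi > 0.05  # components that kept appreciable mixing weight
print(sorted(np.round(mu[survivors], 1)))
```

Components that lose the competition end up with near-zero mixing weight, so regions of the dimension between the two categories are covered by no strongly weighted Gaussian, which is one way the sparseness property can be made concrete.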
During speech perception, listeners make judgments about the phonological category of sounds by taking advantage of multiple acoustic cues for each phonological contrast. Perceptual experiments have shown that listeners weight these cues differently. How do listeners weight and combine acoustic cues to arrive at an overall estimate of the category for a speech sound? Here, we present several simulations using mixture of Gaussians models that learn cue weights and combine cues on the basis of their distributional statistics. We show that a cue-weighting metric in which cues receive weight as a function of their reliability at distinguishing phonological categories provides a good fit to the perceptual data obtained from human listeners, but only when these weights emerge through the dynamics of learning. These results suggest that cue weights can be readily extracted from the speech signal through unsupervised learning processes.
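The reliability-based weighting metric described above can be illustrated with a short sketch. The cue values below are hypothetical (a well-separating "VOT-like" cue and a weakly separating "f0-like" cue for a voicing contrast), and the d-prime-style separability measure is one plausible instantiation of "reliability at distinguishing phonological categories," not the paper's exact metric:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-cue tokens for a voicing contrast: cue 1 separates the
# categories well; cue 2 separates them only weakly (assumed values).
n = 1000
voiced = np.column_stack([rng.normal(0, 5, n), rng.normal(100, 20, n)])
voiceless = np.column_stack([rng.normal(40, 5, n), rng.normal(120, 20, n)])

def reliability(a, b):
    """Separability of two category distributions along one cue (d-prime-like):
    distance between category means in units of pooled standard deviation."""
    pooled_sd = np.sqrt((a.var() + b.var()) / 2)
    return abs(a.mean() - b.mean()) / pooled_sd

d = np.array([reliability(voiced[:, i], voiceless[:, i]) for i in range(2)])
weights = d / d.sum()  # normalized cue weights
print(np.round(weights, 2))
```

With these assumed distributions the first cue receives most of the weight, mirroring the finding that more reliable cues dominate category judgments; in the model itself such weights emerge gradually through learning rather than being computed in one step.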
Face masks are an important tool for preventing the spread of COVID-19. However, it is unclear how different types of masks affect speech recognition at different levels of background noise. To address this, we investigated the effects of four masks (a surgical mask, N95 respirator, and two cloth masks) on recognition of spoken sentences in multi-talker babble. In low levels of background noise, masks had little to no effect, with no more than a 5.5% decrease in mean accuracy compared to a no-mask condition. In high levels of noise, mean accuracy was 2.8-18.2% lower than the no-mask condition, but the surgical mask continued to show no significant difference. The results demonstrate that different types of masks generally yield similar accuracy in low levels of background noise, but differences between masks become more apparent in high levels of noise.