The mechanisms underlying the acquisition of speech-production ability in human infancy are not well understood. We tracked 4-12-mo-old English-learning infants' and adults' eye gaze while they watched and listened to a female reciting a monologue either in their native (English) or nonnative (Spanish) language. We found that infants shifted their attention from the eyes to the mouth between 4 and 8 mo of age regardless of language and then began a shift back to the eyes at 12 mo in response to native but not nonnative speech. We posit that the first shift enables infants to gain access to redundant audiovisual speech cues that enable them to learn their native speech forms and that the second shift reflects growing native-language expertise that frees them to shift attention to the eyes to gain access to social cues. On this account, 12-mo-old infants do not shift attention to the eyes when exposed to nonnative speech because increasing native-language expertise and perceptual narrowing make it more difficult to process nonnative speech and require them to continue to access redundant audiovisual cues. Overall, the current findings demonstrate that the development of speech production capacity relies on changes in selective audiovisual attention and that this depends critically on early experience.

human infants | multisensory perception | speech acquisition | cognitive development
Several theories have stressed the importance of intersensory integration for development but have not identified specific underlying integration mechanisms. The author reviews and synthesizes current knowledge about the development of intersensory temporal perception and offers a theoretical model based on epigenetic systems theory, proposing that responsiveness to 4 basic features of multimodal temporal experience (temporal synchrony, duration, temporal rate, and rhythm) emerges in a sequential, hierarchical fashion. The model postulates that initial developmental limitations make intersensory synchrony the basis for the integration of intersensory temporal relations and that responsiveness to the other, increasingly more complex, temporal relations emerges in a hierarchical, sequential fashion by building on previously acquired intersensory temporal processing skills.
Using a habituation/test procedure, the author investigated adults' and infants' perception of auditory-visual temporal synchrony. Participants were familiarized with a bouncing green disk and a sound that occurred each time the disk bounced. Then, they were given a series of asynchrony test trials in which the sound occurred either before or after the disk bounced. The magnitude of the auditory-visual temporal asynchrony threshold differed markedly in adults and infants. The threshold for the detection of asynchrony created by a sound preceding a visible event was 65 ms in adults and 350 ms in infants, and the threshold for the detection of asynchrony created by a sound following a visible event was 112 ms in adults and 450 ms in infants. Also, infants did not respond to asynchronies that exceeded the intervals that yielded reliable discrimination. These results suggest that infants' perception of auditory-visual temporal unity is guided by a synchrony and an asynchrony window, both of which become narrower in development.
The conventional view is that perceptual/cognitive development is an incremental process of acquisition. Several striking findings have revealed, however, that the sensitivity to non-native languages, faces, vocalizations, and music that is present early in life declines as infants acquire experience with native perceptual inputs. In the language domain, the decline in sensitivity is reflected in a process of perceptual narrowing that is thought to play a critical role during the acquisition of a native-language phonological system. Here, we provide evidence that such a decline also occurs in infant response to multisensory speech. We found that infant intersensory response to a non-native phonetic contrast narrows between 6 and 11 months of age, suggesting that the perceptual system becomes increasingly more tuned to key native-language audiovisual correspondences. Our findings lend support to the notion that perceptual narrowing is a domain-general as well as a pan-sensory developmental process.

audiovisual speech | infants | perceptual narrowing
Bilingual infants succeed at learning their first two languages. What adaptive processes enable them to master the more complex nature of bilingual input? One possibility is that bilingual infants take greater advantage of the redundancy of the audiovisual speech that they usually experience during social interactions. Thus, we investigated whether bilinguals' need to keep languages apart increases their attention to the mouth as a source of redundant and reliable speech cues. We measured selective attention to talking faces in 4-, 8-, and 12-month-old Catalan- and Spanish-learning monolingual and bilingual infants. Monolingual data paralleled previous findings, whereas bilingual data suggested an emerging move away from the eyes beginning earlier in development, followed by increasing attention to the mouth from 8 to 12 months of age. Thus, bilingual infants exploit the greater perceptual salience of redundant audiovisual speech cues earlier and longer than monolinguals do, supporting their dual language acquisition processes.
Between 6 and 10 months of age, infants become better at discriminating among native voices and human faces and worse at discriminating among nonnative voices and other species' faces. We tested whether these unisensory perceptual narrowing effects reflect a general ontogenetic feature of perceptual systems by testing across sensory modalities. We showed pairs of monkey faces producing two different vocalizations to 4-, 6-, 8-, and 10-month-old infants and asked whether they would prefer to look at the corresponding face when they heard one of the two vocalizations. Only the two youngest groups exhibited intersensory matching, indicating that perceptual narrowing is pan-sensory and a fundamental feature of perceptual development.

crossmodal | face processing | multisensory

From the moment of birth, infants find themselves in a socially rich environment where they see and hear other people. In order for them to have veridical and meaningful social experiences with such people, infants must be able to integrate particular faces and voices by detecting their correspondences. Indeed, a number of studies have shown that, beginning as early as 2 months of age, infants begin to exhibit the ability to perceive face-voice correspondences (1-8). Despite this fact, however, the developmental process underlying intersensory integration of faces and voices, as well as more general intersensory processes, remains poorly understood.
Current theoretical views assume either that basic intersensory perceptual abilities are present at birth and become increasingly differentiated and refined over age (9) or that such abilities are not present at birth and only emerge gradually during the first years of life as a result of the child's active exploration of the world (10, 11). Most empirical evidence supports the former (differentiation) view in showing that basic intersensory perceptual abilities are already present in infancy and that as infants grow these abilities change and improve in significant ways (12, 13). For example, young human infants can perceive lower-order intersensory relations based on such attributes as intensity (14), temporal synchrony (15, 16), and duration (17), but do not integrate auditory and visual spatial cues (18). In contrast, older infants can perceive higher-order intersensory relations based on such attributes as affect (6) and gender (3), become capable of learning arbitrary intersensory associations (19), and can integrate auditory and visual spatial cues (18). Findings from studies of the underlying neural mechanisms of intersensory integration in cats and rhesus monkeys show a similar pattern. Whereas multisensory cells in the superior colliculus of adult cats and rhesus monkeys integrate auditory and visual cues in spatial localization tasks, these cells do not integrate them in neonatal cats and monkeys (20, 21). Together, extant findings suggest a general developmental pattern consisting of the initial emergence of low-level intersensory abilities, a subsequent age-dependent refinement and improvement ...