The ability to statistically segment a continuous auditory stream is one of the most important preparations for language learning. Behavioral measurements show that this ability is available to human infants at 8 months of age. However, behavioral studies alone cannot determine how early it becomes available. A recent study using event-related potential (ERP) measurements revealed that neonates can detect statistical boundaries within auditory streams of speech syllables. Extending this line of research will help clarify the cognitive preparation for language acquisition that is available to neonates. The aim of the present study was to examine the domain-generality of such statistical segmentation. For two 5-minute sessions, neonates were presented with nonlinguistic tone sequences composed of four tritone units, each consisting of three semitones extracted from one octave. The first tone of each unit, and only that tone, evoked a significant positivity in the frontal area; this effect appeared in the second session but not in the first. This result suggests that the general ability to distinguish units in an auditory stream by statistical information is activated at birth and is probably innately prepared in humans.
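The statistical cue behind this kind of segmentation can be illustrated with a short sketch. It is not the stimulus code from the study: the specific tones assigned to each unit, the rule that a unit never repeats immediately, and the names `UNITS`, `make_stream`, and `transitional_probabilities` are all assumptions made for illustration. Under those assumptions, the transitional probability from one tone to the next is 1.0 inside a unit and roughly 1/3 across a unit boundary, which is the contrast a listener could use to find the first tone of each unit.

```python
import random
from collections import Counter

# Hypothetical units: four three-tone units that together use each of the
# 12 semitones of one octave exactly once (the real stimuli are not given).
UNITS = [("C", "E", "G#"), ("D", "F#", "A#"), ("C#", "F", "A"), ("D#", "G", "B")]


def make_stream(n_units, seed=0):
    """Concatenate randomly ordered units into one continuous tone stream."""
    rng = random.Random(seed)
    stream, prev = [], None
    for _ in range(n_units):
        # Assumption: a unit is never immediately repeated.
        unit = rng.choice([u for u in UNITS if u != prev])
        stream.extend(unit)
        prev = unit
    return stream


def transitional_probabilities(stream):
    """Return P(next tone = b | current tone = a) for every observed pair."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


stream = make_stream(400)
tp = transitional_probabilities(stream)

# Split the observed transitions into within-unit and across-boundary pairs.
within_pairs = {(u[i], u[i + 1]) for u in UNITS for i in range(2)}
within = [p for pair, p in tp.items() if pair in within_pairs]
between = [p for pair, p in tp.items() if pair not in within_pairs]

print("lowest within-unit probability:", min(within))
print("highest across-boundary probability:", max(between))
```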
This paper describes an application of machine learning techniques to identify expiratory and inspiratory phases in audio recordings of human baby cries. Crying episodes were recorded from 14 infants, spanning four vocalization contexts in their first 12 months of life; recordings from three individuals were manually annotated to identify expiratory and inspiratory sounds and used as training examples to automatically segment the recordings of the other 11 individuals. The proposed algorithm uses a hidden Markov model architecture, in which state likelihoods are estimated either with Gaussian mixture models or by converting the classification decisions of a support vector machine. The algorithm yields up to 95% classification precision (86% on average), and it generalizes across different babies, ages, and vocalization contexts. The technique offers an opportunity to quantify expiration duration, crying rate, and other time-related characteristics of baby crying for screening, diagnosis, and research purposes over large populations of infants.
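The general idea of decoding cry phases with a hidden Markov model can be sketched as follows. This is a toy illustration, not the authors' implementation: the three states, the single one-dimensional feature per frame, the means and standard deviation, and the transition probabilities are all invented for the example, and one Gaussian per state stands in for the Gaussian mixture models named in the abstract. The sketch shows how per-frame state likelihoods combine with "sticky" transitions in Viterbi decoding, so that an isolated ambiguous frame is absorbed into the surrounding phase.

```python
import math
from itertools import groupby

# Hypothetical state set; the abstract only names expiration and inspiration.
STATES = ["expiration", "inspiration", "silence"]
MEANS = [1.0, 0.4, 0.0]  # assumed mean of a 1-D feature for each state
STD = 0.15               # assumed shared standard deviation


def gaussian_loglik(x, mean, std):
    """Log-density of one Gaussian (stand-in for a per-state GMM)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(log_emis, log_trans, log_init):
    """Return the most likely state index for each frame."""
    n = len(log_init)
    score = [log_init[s] + log_emis[0][s] for s in range(n)]
    back = []
    for frame in log_emis[1:]:
        new_score, ptr = [], []
        for s in range(n):
            # Best predecessor for state s at this frame.
            best = max(range(n), key=lambda p: score[p] + log_trans[p][s])
            ptr.append(best)
            new_score.append(score[best] + log_trans[best][s] + frame[s])
        score = new_score
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Assumed transitions: each state tends to persist from frame to frame.
STAY, SWITCH = 0.9, 0.05
log_trans = [[math.log(STAY if i == j else SWITCH) for j in range(3)] for i in range(3)]
log_init = [math.log(1 / 3)] * 3

# Toy feature sequence; the 0.6 frame is ambiguous between two phases.
frames = [0.0, 0.0, 1.0, 1.0, 0.6, 1.0, 0.4, 0.4, 0.0, 0.0]
log_emis = [[gaussian_loglik(x, m, STD) for m in MEANS] for x in frames]

path = viterbi(log_emis, log_trans, log_init)
labels = [STATES[s] for s in path]

# Collapse the frame labels into (phase, number_of_frames) runs.
runs = [(label, sum(1 for _ in group)) for label, group in groupby(labels)]
print(runs)
```

Multiplying each run length by the frame hop of the feature extractor (not specified here) would give phase durations such as expiration length, and counting the expiration runs per unit time would give a crying rate.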