Infants' spontaneous movements mirror the integrity of their brain networks and thus also predict the future development of higher cognitive functions. Early recognition of infants with compromised motor development holds promise for guiding early therapies that improve lifelong neurocognitive outcomes. It has been challenging, however, to assess motor performance in ways that are both objective and quantitative. Novel wearable technology has shown promise for offering efficient, scalable and automated methods of movement assessment. Here, we describe the development of an infant wearable, a multi-sensor smart jumpsuit that allows mobile data collection during independent movements. A deep learning algorithm, based on convolutional neural networks (CNNs), was then trained using multiple human annotations that incorporate the substantial inherent ambiguity in movement classification. We also quantify this ambiguity for a human observer, allowing it to be transferred to improve the automated classifier. Comparison of different sensor configurations and classifier designs shows that four-limb recording combined with an end-to-end CNN classifier architecture yields the best movement classification. Our results show that quantitative tracking of independent movement activities is possible with human-equivalent accuracy, i.e. it meets human inter-rater agreement levels in infant posture and movement classification.
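As a rough illustration of the kind of classifier described in this abstract, the following is a minimal sketch of an end-to-end 1D CNN that maps windowed multi-sensor (four-limb) motion data to posture/movement classes. The channel count, window length, class count and layer sizes are assumptions chosen for demonstration; this does not reproduce the authors' actual architecture or training setup.

```python
# Minimal sketch (assumptions, not the published model): a 1D CNN that
# classifies fixed-length windows of four-limb IMU data into movement classes.
import torch
import torch.nn as nn

class MovementCNN(nn.Module):
    def __init__(self, n_channels=24, n_classes=7):
        # n_channels: e.g. 4 limbs x (3-axis accelerometer + 3-axis gyroscope)
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # pool over time -> one vector per window
        )
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, x):
        # x: (batch, channels, samples_per_window)
        h = self.features(x).squeeze(-1)
        return self.classifier(h)      # unnormalized class scores (logits)

# Example: classify a batch of 2-second windows sampled at 50 Hz (100 samples).
model = MovementCNN()
windows = torch.randn(8, 24, 100)
logits = model(windows)                # shape: (8, n_classes)
```

Training such a model against soft labels from multiple annotators (rather than a single hard label per window) is one way the inherent ambiguity of human movement classification could be carried into the classifier.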
Syllables are often considered to be central to infant and adult speech perception. Many theories and behavioral studies on early language acquisition are also based on syllable-level representations of spoken language. There is little clarity, however, on what sort of pre-linguistic "syllable" would actually be accessible to an infant with no phonological or lexical knowledge. Anchored by the notion that syllables are organized around particularly sonorous (audible) speech sounds, the present study investigates the feasibility of speech segmentation into syllable-like chunks without any a priori linguistic knowledge. We first operationalize sonority as a measurable property of the acoustic input, and then use sonority variation across time, or speech rhythm, as the basis for segmentation. The entire process from acoustic input to chunks of syllable-like acoustic segments is implemented as a computational model inspired by the oscillatory entrainment of the brain to speech rhythm. We analyze the output of the segmentation process in three different languages, showing that the sonority fluctuation in speech is highly informative of syllable and word boundaries in all three cases without any language-specific tuning of the model. These findings support the widely held assumption that syllable-like structure is accessible to infants even when they are only beginning to learn the properties of their native language.
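To make the segmentation idea in this abstract concrete, the sketch below cuts a waveform into syllable-like chunks at dips of a sonority-like amplitude envelope. This is a simplified stand-in for the oscillator-based model described in the paper: the envelope proxy, smoothing cutoff and minimum gap between boundaries are assumptions for illustration only.

```python
# Simplified sketch (assumptions, not the published model): segment speech by
# tracking a sonority-like envelope and placing boundaries at its local minima.
import numpy as np
from scipy.signal import butter, filtfilt, argrelextrema

def syllable_boundaries(waveform, sr, min_gap_s=0.08):
    # Crude sonority proxy: low-pass-filtered signal magnitude (amplitude envelope).
    envelope = np.abs(waveform)
    b, a = butter(2, 10.0 / (sr / 2), btype="low")  # keep syllable-rate (<10 Hz) fluctuation
    envelope = filtfilt(b, a, envelope)
    # Candidate boundaries: local minima of the envelope (sonority dips),
    # at least min_gap_s apart.
    minima = argrelextrema(envelope, np.less, order=int(min_gap_s * sr))[0]
    return minima / sr                               # boundary times in seconds

# Example with a synthetic amplitude-modulated tone (energy bursts standing in
# for syllables); real input would be recorded speech.
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
signal = (np.sin(2 * np.pi * 3 * t) ** 2) * np.sin(2 * np.pi * 200 * t)
print(syllable_boundaries(signal, sr))
```

The design choice mirrors the abstract's premise: no language-specific knowledge enters the procedure, only the slow fluctuation of audibility over time.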
Human infants learn meanings for spoken words in complex interactions with other people, but the exact learning mechanisms are unknown. One widely studied learning mechanism is cross-situational learning (XSL). In XSL, word meanings are learned as learners accumulate statistical information between spoken words and co-occurring objects or events, allowing the learner to overcome referential uncertainty after sufficient experience with individually ambiguous scenarios. Existing models in this area have mainly assumed that the learner can segment words from speech before grounding them in their referential meanings, while segmentation itself has been treated relatively independently of meaning acquisition. In this article, we argue that XSL is not just a mechanism for word-to-meaning mapping, but that it provides strong cues for proto-lexical word segmentation. If a learner directly solves the correspondence problem between continuous speech input and the contextual referents being talked about, segmentation of the input into word-like units emerges as a by-product of the learning. We present a theoretical model for joint acquisition of proto-lexical segments and their meanings without assuming a priori knowledge of the language. We also investigate the behavior of the model using a computational implementation, making use of transition probability-based statistical learning. Results from simulations show that the model is not only capable of replicating behavioral data on word learning in artificial languages, but also shows effective learning of word segments and their meanings from continuous speech. Moreover, when augmented with a simple familiarity preference during learning, the model shows a good fit to human behavioral data in XSL tasks. These results support the idea of simultaneous segmentation and meaning acquisition and show that comprehensive models of early word segmentation should take referential word meanings into account.
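The core bookkeeping behind cross-situational learning can be illustrated with a few lines of code. The sketch below accumulates word-referent co-occurrence counts across individually ambiguous scenes and reads meanings off the counts; unlike the model in the article, it assumes pre-segmented word tokens rather than continuous speech, and the scene contents are made up for the example.

```python
# Illustrative sketch (assumptions, not the published model): co-occurrence
# counting for cross-situational word learning over ambiguous scenes.
from collections import defaultdict

class XSLLearner:
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, words, referents):
        # Every word in an utterance co-occurs with every referent in the scene.
        for w in words:
            for r in referents:
                self.counts[w][r] += 1

    def meaning(self, word):
        # The most frequently co-occurring referent wins once evidence accumulates.
        refs = self.counts[word]
        return max(refs, key=refs.get) if refs else None

learner = XSLLearner()
scenes = [
    (["dog", "ball"], {"DOG", "BALL"}),
    (["dog", "cup"],  {"DOG", "CUP"}),
    (["ball"],        {"BALL", "CUP"}),
]
for words, referents in scenes:
    learner.observe(words, referents)
print(learner.meaning("dog"))   # -> "DOG" after enough disambiguating scenes
```

The article's point is that when the "words" themselves are not given in advance, the same correspondence problem can drive segmentation: speech chunks that reliably predict referents become proto-lexical units.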