Successful language acquisition hinges on organizing individual words into grammatical categories and learning the relationships between them, but how children accomplish this task has been debated in the literature. One proposal is that learners use the shared distributional contexts in which words appear as a cue to their underlying category structure. Indeed, recent research using artificial languages has demonstrated that learners can acquire grammatical categories from this type of distributional information. However, artificial languages are typically composed of a small number of equally frequent words, whereas words in natural languages vary widely in frequency, complicating the distributional information needed to determine categorization. In a series of three experiments, we demonstrate that distributional learning is preserved in an artificial language composed of words that vary in frequency as they do in natural language, along a Zipfian distribution. Rather than the absolute frequency of words and their contexts, the conditional probabilities that words will occur in certain contexts (given their base frequency) are a better basis for assigning words to categories, and this appears to be the type of statistic that human learners utilize.
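To make the contrast between absolute frequency and frequency-normalized statistics concrete, the short sketch below (a minimal illustration with invented toy counts and nonce words, not the authors' analysis code) compares the raw count of a word-context pair with the conditional probability of the context given the word, which factors out Zipfian differences in base frequency:

    from collections import Counter

    # Toy corpus of (word, context) pairs. Base frequencies are
    # deliberately skewed: "blit" occurs ten times as often as "dax".
    pairs = (
        [("blit", "CTX_A")] * 80 + [("blit", "CTX_B")] * 20 +
        [("dax", "CTX_A")] * 8 + [("dax", "CTX_B")] * 2
    )

    pair_counts = Counter(pairs)
    word_counts = Counter(word for word, _ in pairs)

    for (word, ctx), joint in sorted(pair_counts.items()):
        cond = joint / word_counts[word]  # P(context | word)
        print(f"{word} in {ctx}: count = {joint:3d}, P(ctx|word) = {cond:.2f}")

Although the absolute counts for the two words differ tenfold, their conditional probabilities are identical (0.80 for CTX_A, 0.20 for CTX_B), so a learner tracking conditionals would place both words in the same category despite their very different frequencies.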
During language acquisition, children must learn when to generalize a pattern, applying it broadly and to new words ('add -ed' in English), and when to restrict generalization, storing the pattern only with specific lexical items. But what governs when children will form productive rules? How do they determine when a pattern is widespread enough to generalize to novel words, and when a pattern should not extend beyond the cases they have observed in their input? One effort to quantify the conditions for generalization, the Tolerance Principle (Yang, 2016), has been shown to accurately predict children's generalization behavior in dozens of corpus-based studies. The Tolerance Principle hypothesizes that a general rule will be formed when it is computationally more efficient than storing lexical forms individually. Here we test the Tolerance Principle in two artificial language experiments with children. In both experiments, we exposed children to a language with 9 novel nouns, some of which followed a regular pattern to form the plural (-ka) and some of which were exceptions to this rule. As predicted by the Tolerance Principle, in Experiment 1 we found that children exposed to 5 regular forms and 4 exceptions generalized, applying the regular form to 100% of novel test words. Children exposed to 3 regular forms and 6 exceptions did not extend the rule, even though the regular form was still the majority of tokens in this condition. In Experiment 2, we found that children continued to behave categorically: they either formed a productive rule (applying the regular form on all test trials) or used the regular form no more often than predicted by chance. We found that the Tolerance Principle can predict whether children will form a productive generalization based on each child's individual vocabulary size. The Tolerance Principle appears to capture something fundamental about the way children form productive generalizations during language acquisition.
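The Tolerance Principle makes this prediction through a concrete threshold: a rule over N items is productive only if the number of exceptions e satisfies e <= N / ln N (Yang, 2016). A quick worked check of the two conditions described above (our own arithmetic, not the authors' materials) shows why 4 exceptions out of 9 nouns permit generalization while 6 do not:

    import math

    def tolerance_threshold(n: int) -> float:
        """Maximum number of exceptions a productive rule over n
        items can tolerate: theta_n = n / ln(n) (Yang, 2016)."""
        return n / math.log(n)

    N = 9  # novel nouns in the artificial language
    theta = tolerance_threshold(N)  # 9 / ln(9) ~= 4.10

    for exceptions in (4, 6):
        verdict = "productive" if exceptions <= theta else "not productive"
        print(f"{exceptions} exceptions vs. threshold {theta:.2f}: rule {verdict}")

With 4 exceptions the threshold of about 4.10 is not exceeded, so the -ka rule should generalize (Experiment 1); with 6 exceptions it is exceeded and no rule should form, which is consistent with children failing to extend the rule even when regular forms were the majority of tokens.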
Artificial language learning methods—in which learners are taught miniature constructed languages in a controlled laboratory setting—have become a valuable experimental tool for research on language development. These methods offer a complement to natural language acquisition data, allowing researchers to control both the input to learning and the learning environment. A large proportion of artificial language learning studies has aimed to understand the mechanisms of learning in infants. This review focuses instead on investigations into the nature of early linguistic representations and how they are influenced by both the structure of the input and the cognitive features of the learner. Looking not only at young infants but also at children beyond infancy, we discuss evidence for early abstraction, conditions on generalization, the acquisition of grammatical categories and dependencies, and recent work connecting the cognitive biases of learners to language typology. We end by outlining important areas for future research.
Language learners must place unfamiliar words into categories, often with few explicit indicators of when and how a word can be used grammatically. Reeder, Newport, and Aslin (2013) showed that college students can learn grammatical form classes from an artificial language by relying solely on distributional information (i.e., contextual cues in the input). Here, two experiments revealed that healthy older adults also show such statistical learning, though they are poorer than young adults at distinguishing grammatical from ungrammatical strings. This finding expands knowledge of which aspects of learning vary with aging, with potential implications for second language learning in late adulthood.
When linguistic input contains inconsistent use of grammatical forms, children produce these forms more consistently, a process called "regularization." Deaf children learning American Sign Language from parents who are non-native users of the language regularize their parents' inconsistent usages. In studies of artificial languages containing inconsistently used morphemes, children, but not adults, regularized these forms. However, little is known about the precise circumstances in which such regularization occurs. In three experiments, we investigate how the type of input variation and the age of learners affect regularization. Overall, our results suggest that while adults tend to reproduce the inconsistencies found in their input, young children introduce regularity: they learn varying forms whose occurrence is conditioned and systematic, but they alter inconsistent variation to be more regular. Older children perform more like adults, suggesting that regularization changes with maturation and cognitive capacities.

Learning a language is a daunting task. With little to no explicit instruction, young children must extract the relevant parts of the speech stream and learn the allowable combinations of sounds, words, and sentences. A central question is what biases and abilities allow children to learn a language so successfully. One mechanism hypothesized as part of this task is sensitivity to the distributional statistics of the language environment. By tracking various statistics concerning the frequencies and co-occurrences of linguistic elements, young language learners can extract regularities at many levels, including phonology.