Cognitive science applies diverse tools and perspectives to study human language. Recently, an exciting body of work has examined linguistic phenomena through the lens of efficiency in usage: what otherwise puzzling features of language find explanation in formal accounts of how language might be optimized for communication and learning? Here, we review studies that deploy formal tools from probability and information theory to understand how and why language works the way that it does, focusing on phenomena ranging from the lexicon through syntax. These studies show how a pervasive pressure for efficiency guides the forms of natural language and indicate that a rich future for language research lies in connecting linguistics to cognitive psychology and mathematical theories of communication and inference.
Recent evidence suggests that cognitive pressures associated with language acquisition and use could affect the organization of the lexicon. On the one hand, consistent with noisy channel models of language (e.g., Levy, 2008), the phonological distance between wordforms should be maximized to avoid perceptual confusability (a pressure for dispersion). On the other hand, a lexicon with high phonological regularity would be simpler to learn, remember, and produce (e.g., Monaghan et al., 2011) (a pressure for clumpiness). Here we investigate wordform similarity in the lexicon, using measures of word distance (e.g., phonological neighborhood density) to ask whether there is evidence for dispersion or clumpiness of wordforms in the lexicon. We develop a novel method to compare lexicons to phonotactically controlled baselines that provide a null hypothesis for how clumpy or sparse wordforms would be as a result of phonotactics alone. Results for four languages, Dutch, English, German, and French, show that the space of monomorphemic wordforms is clumpier than would be expected under the best chance model according to a wide variety of measures: minimal pairs, average Levenshtein distance, and several network properties. This suggests a fundamental drive for regularity in the lexicon that conflicts with the pressure for words to be as phonologically distinct as possible.
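The baseline comparison described above can be sketched as follows: train a simple bigram phonotactic model on a wordlist, sample matched null lexicons from it, and ask whether the real lexicon's mean pairwise Levenshtein distance is lower (clumpier) than that of the baselines. The toy wordlist, the bigram model, and the test statistic below are illustrative simplifications, not the study's actual method.

```python
import random
from collections import defaultdict

def levenshtein(a, b):
    """Edit distance between two wordforms, by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def mean_pairwise_distance(words):
    pairs = [(w1, w2) for i, w1 in enumerate(words) for w2 in words[i + 1:]]
    return sum(levenshtein(a, b) for a, b in pairs) / len(pairs)

def count_minimal_pairs(words):
    """Equal-length word pairs differing in exactly one segment."""
    return sum(1 for i, w1 in enumerate(words) for w2 in words[i + 1:]
               if len(w1) == len(w2) and levenshtein(w1, w2) == 1)

def train_bigrams(words):
    """Segment-to-segment transition counts, with ^ and $ as word edges."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        for a, b in zip("^" + w, w + "$"):
            counts[a][b] += 1
    return counts

def sample_baseline(counts, n_words, rng, max_len=12):
    """A matched null lexicon sampled from the bigram phonotactics."""
    lexicon = []
    while len(lexicon) < n_words:
        w, cur = "", "^"
        while len(w) < max_len:
            symbols, weights = zip(*counts[cur].items())
            cur = rng.choices(symbols, weights=weights)[0]
            if cur == "$":
                break
            w += cur
        if w:
            lexicon.append(w)
    return lexicon

# Toy "real" lexicon (illustrative only).
real = ["dog", "tog", "log", "dig", "cat", "bat", "mat", "pin"]
rng = random.Random(0)
counts = train_bigrams(real)
baseline_means = [mean_pairwise_distance(sample_baseline(counts, len(real), rng))
                  for _ in range(200)]
# The fraction of baselines at least as clumpy (small mean distance) as the
# real lexicon acts as a one-sided p-value: small values indicate the real
# lexicon is unusually clumpy given its phonotactics.
frac_below = sum(m <= mean_pairwise_distance(real)
                 for m in baseline_means) / len(baseline_means)
```

The same baseline lexicons can be scored with `count_minimal_pairs` or any network property, so one set of null samples supports the full battery of measures.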
Novel words (like tog) that sound like well-known words (dog) are hard for toddlers to learn, even though children can hear the difference between them (Swingley & Aslin, 2002, 2007). One possibility is that phonological competition alone is the problem. Another is that a broader set of probabilistic considerations is responsible: toddlers may resist considering tog as a novel object label because its neighbor dog is also an object. In three experiments, French 18-month-olds were taught novel words whose word forms were phonologically similar to familiar nouns (noun-neighbors), to familiar verbs (verb-neighbors) or to nothing (no-neighbors). Toddlers successfully learned the no-neighbors and verb-neighbors but failed to learn the noun-neighbors, although both novel neighbors had a familiar phonological neighbor in the toddlers’ lexicon. We conclude that when creating a novel lexical entry, toddlers’ evaluation of similarity in the lexicon is multidimensional, incorporating both phonological and semantic or syntactic features.
Upon hearing a novel word, language learners must identify its correct meaning from a diverse set of situationally relevant options. Such referential ambiguity could be reduced through repeated exposure to the novel word across diverging learning situations, a learning mechanism referred to as cross-situational learning. Previous research has focused on the amount of information learners carry over from one learning instance to the next. In the present article, we investigate how context can modulate the learning strategy and its efficiency. Results from four cross-situational learning experiments with adults suggest the following: (a) Learners encode more than the specific hypotheses they form about the meaning of a word, providing evidence against the recent view referred to as single-hypothesis testing. (b) Learning is faster when learning situations consistently contain members from a given group, regardless of whether this group is semantically coherent (e.g., animals) or induced through repetition (objects repeatedly presented together, just as a fork and a door may co-occur in a kitchen). (c) Learners are subject to memory illusions, in a way that suggests that the learning situation itself is encoded in memory during learning. Overall, our findings demonstrate that realistic contexts (such as the situation in which a given word has occurred, e.g., in the zoo or in the kitchen) help learners retrieve or discard potential referents for a word, because such contexts can be memorized and associated with a to-be-learned word.
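A minimal sketch of the co-occurrence-tracking account (as opposed to single-hypothesis testing): the learner keeps counts over all word-referent pairings seen so far, not just one guess per word, so individually ambiguous situations disambiguate in the aggregate. The novel words and referent labels below are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

def cross_situational_learner(situations):
    """Accumulate word-referent co-occurrence counts across situations,
    then map each word to its most frequently co-occurring referent."""
    counts = defaultdict(lambda: defaultdict(int))
    for words, referents in situations:
        for w in words:
            for r in referents:
                counts[w][r] += 1
    return {w: max(refs, key=refs.get) for w, refs in counts.items()}

# Hypothetical ground truth: each novel word labels one referent.
truth = {"wug": "WUG", "dax": "DAX", "fep": "FEP", "tiv": "TIV"}

# Each situation presents two words alongside their two referents, so any
# single trial is ambiguous; only aggregation across trials resolves it.
situations = [([w1, w2], [truth[w1], truth[w2]])
              for w1, w2 in combinations(truth, 2)]

learned = cross_situational_learner(situations)
```

Here each word co-occurs with its true referent in every trial but with any given distractor only once, so the count-based learner recovers the full mapping.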
Although the mapping between form and meaning is often regarded as arbitrary, there are in fact well-known constraints on words that result from functional pressures associated with language use and its acquisition. In particular, languages have been shown to encode meaning distinctions in their sound properties, which may be important for language learning. Here, we investigate the relationship between semantic distance and phonological distance in the large-scale structure of the lexicon. We show evidence in 100 languages from a diverse array of language families that more semantically similar word pairs are also more phonologically similar.
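One way to probe such a relationship, sketched here with a three-word toy lexicon and made-up semantic features: compute pairwise phonological distance (Levenshtein) and semantic distance (Jaccard over feature sets), then correlate the two. Large-scale studies of distance matrices use permutation procedures such as Mantel tests; the plain Pearson r below is only illustrative.

```python
from math import sqrt
from itertools import combinations

def levenshtein(a, b):
    """Edit distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def jaccard_distance(f1, f2):
    """1 - feature overlap: higher means less semantically similar."""
    return 1 - len(f1 & f2) / len(f1 | f2)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy lexicon with hypothetical semantic features.
lexicon = {
    "cat": {"animal", "pet"},
    "bat": {"animal", "wing"},
    "rug": {"object", "floor"},
}

phon, sem = [], []
for w1, w2 in combinations(lexicon, 2):
    phon.append(levenshtein(w1, w2))
    sem.append(jaccard_distance(lexicon[w1], lexicon[w2]))

# Positive r: phonologically closer pairs are also semantically closer.
r = pearson(phon, sem)
```

In this toy lexicon the phonologically close pair (cat/bat) is also the semantically close one, so the correlation comes out positive, mirroring the direction of the reported cross-linguistic result.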
This study examined whether phrasal prosody can impact toddlers' syntactic analysis. French noun-verb homophones were used to create locally ambiguous test sentences (e.g., using the homophone as a noun: [le bébé souris] [a bien mangé] - [the baby mouse] [ate well] or using it as a verb: [le bébé] [sourit à sa maman] - [the baby] [smiles to his mother], where brackets indicate prosodic phrase boundaries). Although both sentences start with the same words (le-bébé-/suʁi/), they can be disambiguated by the prosodic boundary that either directly precedes the critical word /suʁi/ when it is a verb, or directly follows it when it is a noun. Across two experiments using an intermodal preferential looking procedure, 28-month-olds (Exp. 1 and 2) and 20-month-olds (Exp. 2) listened to the beginnings of these test sentences while watching two images displayed side-by-side on a TV screen: one associated with the noun interpretation of the ambiguous word (e.g., a mouse) and the other with the verb interpretation (e.g., a baby smiling). The results show that upon hearing the first words of these sentences, toddlers were able to correctly exploit prosodic information to access the syntactic structure of sentences, which in turn helped them to determine the syntactic category of the ambiguous word and to correctly identify its intended meaning: participants switched their eye-gaze toward the correct image based on the prosodic condition in which they heard the ambiguous target word. This provides evidence that during the first steps of language acquisition, toddlers are already able to exploit the prosodic structure of sentences to recover their syntactic structure and predict the syntactic category of upcoming words, an ability which would be extremely useful to discover the meaning of novel words.
Even though ambiguous words are common in languages, children find it hard to learn homophones, where a single label applies to several distinct meanings (e.g., Mazzocco, 1997). The present work addresses this apparent discrepancy between learning abilities and typological pattern, with respect to homophony in the lexicon. In a series of five experiments, 20-month-old French children easily learnt a pair of homophones if the two meanings associated with the phonological form belonged to different syntactic categories, or to different semantic categories. However, toddlers failed to learn homophones when the two meanings were distinguished only by different grammatical genders. In parallel, we analyzed the lexicon of four languages, Dutch, English, French and German, and observed that homophones are distributed non-arbitrarily in the lexicon, such that easily learnable homophones are more frequent than hard-to-learn ones: pairs of homophones are preferentially distributed across syntactic and semantic categories, but not across grammatical gender. We show that learning homophones is easier than previously thought, at least when the meanings of the same phonological form are made sufficiently distinct by their syntactic or semantic context. Following this, we propose that this learnability advantage translates into the overall structure of the lexicon, i.e., the kinds of homophones present in languages exhibit the properties that make them learnable by toddlers, thus allowing them to remain in languages.
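The lexicon analysis above can be illustrated with a toy count over homophone pairs: group senses by wordform, then classify each pair as cross-category, gender-only, or same-category. "souris" (mouse/smiles) is the study's own French example; the other entries and glosses are hypothetical.

```python
from collections import defaultdict, Counter
from itertools import combinations

# Toy entries: (wordform, syntactic category, grammatical gender, gloss).
# "souris" is from the study's French materials; the rest are illustrative.
entries = [
    ("souris", "noun", "f", "mouse"),
    ("souris", "verb", None, "smile"),
    ("mousse", "noun", "f", "foam"),
    ("mousse", "noun", "m", "cabin boy"),
    ("porte", "noun", "f", "door"),
]

by_form = defaultdict(list)
for form, cat, gender, gloss in entries:
    by_form[form].append((cat, gender))

def classify(s1, s2):
    (c1, g1), (c2, g2) = s1, s2
    if c1 != c2:
        return "cross-category"   # easy to learn, per the experiments
    if g1 != g2:
        return "gender-only"      # hard to learn
    return "same-category"

pair_types = Counter(classify(s1, s2)
                     for senses in by_form.values()
                     for s1, s2 in combinations(senses, 2))
```

Run over a full dictionary, the reported result corresponds to `pair_types["cross-category"]` being overrepresented and `pair_types["gender-only"]` underrepresented relative to chance pairings.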