Random Free Word Order baseline dependency lengths, observed dependency lengths, and optimal dependency lengths for sentences of length 1-50. The blue line shows observed dependency length, the red line shows average dependency length for the random Free Word Order baseline, and the green line shows average dependency length for the optimal baseline. The density of observed dependency lengths is shown in black. The lines in this figure are fit using a generalized additive model. We also give the slopes of dependency length as a function of squared sentence length, as estimated from a mixed-effects regression model. rand is the slope of the random baseline. obs is the slope of the observed dependency lengths. opt is the slope of the optimal baseline. Due to varying sizes of the corpora, some languages (such as Telugu) do not have attested sentences at all sentence lengths.
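As a rough illustration of the quantities compared in this caption, the sketch below computes the total dependency length of one sentence and the average over randomly reordered versions of it. The function names, the head-index encoding, and the toy parse are hypothetical, and the random baseline here is an unconstrained shuffle of word positions, which is an assumption: the caption does not say how the random Free Word Order linearizations were generated.

```python
import random

def total_dependency_length(heads, positions=None):
    """Sum of |position(word) - position(head)| over all non-root words.

    heads[i] is the index of the head of word i, or None for the root.
    positions[i] is the linear position of word i (defaults to i).
    """
    if positions is None:
        positions = list(range(len(heads)))
    return sum(
        abs(positions[i] - positions[h])
        for i, h in enumerate(heads)
        if h is not None
    )

def random_order_baseline(heads, samples=1000, seed=0):
    """Average dependency length over random word orders.

    Assumption: orders are unconstrained shuffles, a simplification
    that may differ from the baseline used for the figure.
    """
    rng = random.Random(seed)
    positions = list(range(len(heads)))
    total = 0
    for _ in range(samples):
        rng.shuffle(positions)
        total += total_dependency_length(heads, positions)
    return total / samples

# Toy parse of "the dog chased a cat" (hypothetical example):
# the -> dog, dog -> chased, chased = root, a -> cat, cat -> chased
heads = [1, 2, None, 4, 2]
observed = total_dependency_length(heads)
baseline = random_order_baseline(heads)
print(observed, baseline)
```

In this toy case the observed order is shorter than the shuffled average, which is the direction of the gap between the observed and random lines described in the caption.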
What determines how languages categorize colors? We analyzed results of the World Color Survey (WCS) of 110 languages to show that despite gross differences across languages, communication of chromatic chips is always better for warm colors (yellows/reds) than cool colors (blues/greens). We present an analysis of color statistics in a large databank of natural images curated by human observers for salient objects and show that objects tend to have warm rather than cool colors. These results suggest that the cross-linguistic similarity in color-naming efficiency reflects colors of universal usefulness and provide an account of a principle (color use) that governs how color categories come about. We show that potential methodological issues with the WCS do not corrupt information-theoretic analyses, by collecting original data using two extreme versions of the color-naming task, in three groups: the Tsimane', a remote Amazonian hunter-gatherer isolate; Bolivian-Spanish speakers; and English speakers. These data also enabled us to test another prediction of the color-usefulness hypothesis: that differences in color categorization between languages are caused by differences in overall usefulness of color to a culture. In support, we found that color naming among Tsimane' had relatively low communicative efficiency, and the Tsimane' were less likely to use color terms when describing familiar objects. Color naming among Tsimane' was boosted when naming artificially colored objects compared with natural objects, suggesting that industrialization promotes color usefulness.
color categorization | information theory | color cognition | Whorfian hypothesis | basic color terms
We performed an exhaustive meta-analysis of 73 peer-reviewed journal articles from the seminal Bock (1986) paper through 2013. Extracting the effect size for each experiment and condition, where the effect size is the log odds ratio of the frequency of the primed structure X to the frequency of the unprimed structure Y, we found a robust effect of syntactic priming with an average weighted odds ratio of 1.67 when there is no lexical overlap and 3.26 when there is. That is, a construction X which occurs 50% of the time in the absence of priming would occur 63% of the time if primed without lexical repetition and 77% of the time if primed with lexical repetition. The syntactic priming effect is robust across several different construction types and languages, and we found strong effects of lexical overlap on the size of the priming effect as well as interactions between lexical repetition and temporal lag and between lexical repetition and whether the priming occurred within or across languages. We also analyzed the distribution of p-values across experiments in order to estimate the average statistical power of experiments in our sample and to assess publication bias. Analyzing a subset of experiments in which the primary result of interest is whether a particular structure showed a priming effect, we did not find evidence of major p-hacking, and the studies appear to have acceptable statistical power: 82%. However, analyzing a subset of experiments that focus not just on whether syntactic priming exists but on how syntactic priming is moderated by other variables (such as repetition of words in prime and target, the location of the testing room, the memory of the speaker, etc.), we found that such studies are, on average, underpowered, with an estimated average power of 53%. Using a subset of 45 papers from our sample for which we received raw data, we estimated subject and item variation and give recommendations for appropriate sample size for future syntactic priming studies.
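The step from the odds ratios to the percentages quoted above can be checked with a short calculation. This is a minimal sketch of the standard odds-to-probability conversion, not code from the meta-analysis; the function name `primed_probability` is hypothetical.

```python
def primed_probability(baseline_prob, odds_ratio):
    """Probability of a structure after priming, given its baseline
    probability and the odds ratio associated with priming."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    primed_odds = baseline_odds * odds_ratio
    return primed_odds / (1 + primed_odds)

# Baseline of 50% corresponds to odds of 1, so the primed odds
# equal the odds ratio itself.
no_overlap = primed_probability(0.5, 1.67)    # no lexical repetition
with_overlap = primed_probability(0.5, 3.26)  # with lexical repetition

print(f"{no_overlap:.0%}")    # 63%
print(f"{with_overlap:.0%}")  # 77%
```

Because the conversion depends on the baseline, the same odds ratio implies a different percentage-point change for constructions that are rarer or more frequent than 50% when unprimed.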
Cognitive science applies diverse tools and perspectives to study human language. Recently, an exciting body of work has examined linguistic phenomena through the lens of efficiency in usage: what otherwise puzzling features of language find explanation in formal accounts of how language might be optimized for communication and learning? Here, we review studies that deploy formal tools from probability and information theory to understand how and why language works the way that it does, focusing on phenomena ranging from the lexicon through syntax. These studies show how a pervasive pressure for efficiency guides the forms of natural language and indicate that a rich future for language research lies in connecting linguistics to cognitive psychology and mathematical theories of communication and inference.
Language comprehension recruits an extended set of regions in the human brain. Is syntactic processing localized to a particular region or regions within this system, or is it distributed across the entire ensemble of brain regions that support high-level linguistic processing? Evidence from aphasic patients is more consistent with the latter possibility: damage to many different language regions and to white-matter tracts connecting them has been shown to lead to similar syntactic comprehension deficits. However, brain imaging investigations of syntactic processing continue to focus on particular regions within the language system, often parts of Broca’s area and regions in the posterior temporal cortex. We hypothesized that, whereas the entire language system is in fact sensitive to syntactic complexity, the effects in some regions may be difficult to detect because of the overall lower response to language stimuli. Using an individual-subjects approach to localizing the language system, shown in prior work to be more sensitive than traditional group analyses, we indeed find responses to syntactic complexity throughout this system, consistent with the findings from the neuropsychological patient literature. We speculate that such distributed nature of syntactic processing could perhaps imply that syntax is inseparable from other aspects of language comprehension (e.g., lexico-semantic processing), in line with current linguistic and psycholinguistic theories and evidence. Neuroimaging investigations of syntactic processing thus need to expand their scope to include the entire system of high-level language processing regions in order to fully understand how syntax is instantiated in the human brain.