A central part of knowing a language is the ability to combine basic linguistic units to form complex representations. While our neurobiological understanding of how words combine into larger structures has significantly advanced in recent years, the combinatory operations that build words themselves remain unknown. Are complex words such as tombstone and starlet built with the same mechanisms that construct phrases from words, such as grey stone or bright star? Here we addressed this with two magnetoencephalography (MEG) experiments, which simultaneously varied demands associated with phrasal composition and with the processing of morphological complexity in compound and suffixed nouns. Replicating previous findings, we show that portions of the left anterior temporal lobe (LATL) are engaged in the combination of modifiers and monomorphemic nouns in phrases (e.g., brown rabbit). As regards compounding, we show that semantically transparent compounds (e.g., tombstone) also engage left anterior temporal cortex, though the spatiotemporal details of this effect differed from phrasal composition. Further, when a phrase was constructed from a modifier and a transparent compound (e.g., granite tombstone), the typical LATL phrasal composition response appeared at a delayed latency, as expected if an initial within-word operation (tomb + stone) must take place before the combination of the compound with the preceding modifier (granite + tombstone). In contrast to compounding, suffixation (i.e., star + let) did not engage the LATL in any consistent way, suggesting a distinct processing route. Finally, our results suggest an intriguing generalization: morpho-orthographic complexity that does not recruit the LATL may block the engagement of the LATL in subsequent phrase building. In sum, our findings offer a detailed spatiotemporal characterization of the lowest-level combinatory operations that ultimately feed the composition of full sentences.
The reliability of acceptability judgments made by individual linguists has often been called into question. Recent large-scale replication studies conducted in response to this criticism have shown that the majority of published English acceptability judgments are robust. We make two observations about these replication studies. First, we raise the concern that English acceptability judgments may be more reliable than judgments in other languages. Second, we argue that it is unnecessary to replicate judgments that illustrate uncontroversial descriptive facts; rather, candidates for replication can emerge during formal or informal peer review. We present two experiments motivated by these arguments. Published Hebrew and Japanese acceptability contrasts considered questionable by the authors of the present paper were rated for acceptability by a large sample of naive participants. Approximately half of the contrasts did not replicate. We suggest that the reliability of acceptability judgments, especially in languages other than English, can be improved using a simple open review system, and that formal experiments are only necessary in controversial cases.
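The replication logic described above can be sketched in a few lines. This is a minimal illustration with invented Likert ratings, not the paper's actual analysis or data: a published contrast counts as replicated if naive participants rate the reportedly acceptable sentence clearly higher than its reportedly unacceptable counterpart. The function name, data, and threshold are all hypothetical.

```python
import statistics

def contrast_replicates(good_ratings, bad_ratings, min_diff=0.5):
    """Crude replication check for one published contrast: the sentence
    reported as acceptable should be rated higher, on average, than its
    reportedly unacceptable counterpart by at least min_diff points."""
    return statistics.mean(good_ratings) - statistics.mean(bad_ratings) >= min_diff

# Hypothetical 1-7 ratings from naive participants for two contrasts.
clear_contrast = contrast_replicates([6, 7, 5, 6, 7], [2, 1, 3, 2, 2])   # replicates
weak_contrast = contrast_replicates([4, 5, 4, 3, 4], [4, 4, 5, 3, 4])    # does not
```

In practice one would also want a statistical test and a principled threshold; the point here is only the shape of the per-contrast decision.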
In computational psycholinguistics, various language models have been evaluated against human reading behavior (e.g., eye movements) to build human-like computational models. However, most previous efforts have focused almost exclusively on English, despite the recent trend towards linguistic universality within the general community. To fill this gap, this paper investigates whether established results in computational psycholinguistics generalize across languages. Specifically, we re-examine an established generalization (the lower the perplexity of a language model, the more human-like the language model is) in Japanese, a language with typologically different structures from English. Our experiments demonstrate that this established generalization exhibits a surprising lack of universality; namely, lower perplexity is not always human-like. Moreover, this discrepancy between English and Japanese is further explored from the perspective of (non-)uniform information density. Overall, our results suggest that cross-lingual evaluation will be necessary to construct human-like computational models.
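For readers unfamiliar with the quantity under discussion, perplexity is just the exponential of the average per-token surprisal. The sketch below shows the computation on invented log-probabilities (not the paper's models or data); the abstract's point is that the model with lower perplexity need not be the one whose surprisals best predict human reading times.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-probability per token
    (natural log here; base is a convention and does not affect rankings)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token log-probabilities from two language models
# scoring the same text; model_b assigns higher probabilities overall.
model_a = [-3.2, -1.1, -4.0, -2.5]
model_b = [-2.0, -1.0, -2.5, -1.5]
assert perplexity(model_b) < perplexity(model_a)
```

Evaluating human-likeness then requires a second, separate step: regressing reading-time measures on per-token surprisal, which is where the English and Japanese results diverge.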
Eye-tracking data from reading represent an important resource for both linguistics and natural language processing. The ability to accurately model gaze features is crucial to advance our understanding of language processing. This paper describes the Shared Task on Eye-Tracking Data Prediction, jointly organized with the eleventh edition of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2021). The goal of the task is to predict five different token-level eye-tracking metrics from the Zurich Cognitive Language Processing Corpus (ZuCo). Eye-tracking data were recorded during natural reading of English sentences. In total, we received submissions from 13 registered teams, whose systems included boosting algorithms with handcrafted features, neural models leveraging transformer language models, and hybrid approaches. The winning system used a range of linguistic and psychometric features in a gradient boosting framework.
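The task setup can be illustrated with a toy baseline. This is not the winning gradient-boosting system or the shared task's actual data, just a sketch of the prediction problem: map a handcrafted token feature (here, word length) to a gaze metric and compare against a constant baseline with mean absolute error. All numbers are hypothetical.

```python
def fit_line(lengths, durations):
    """Least-squares line relating word length to a gaze metric."""
    n = len(lengths)
    mx, my = sum(lengths) / n, sum(durations) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(lengths, durations))
             / sum((x - mx) ** 2 for x in lengths))
    return slope, my - slope * mx

def mae(preds, gold):
    """Mean absolute error between predicted and observed metrics."""
    return sum(abs(p - g) for p, g in zip(preds, gold)) / len(gold)

# Hypothetical token lengths and first-fixation durations (ms).
lengths = [3, 5, 9, 4, 11]
durations = [180, 200, 260, 190, 300]

slope, intercept = fit_line(lengths, durations)
length_preds = [slope * x + intercept for x in lengths]
baseline = [sum(durations) / len(durations)] * len(durations)

# Even one crude feature beats a constant baseline on these toy data.
assert mae(length_preds, durations) < mae(baseline, durations)
```

Real submissions predicted five metrics per token and used far richer features (frequency, surprisal, part of speech), but the evaluation loop has this same structure.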