2022 Conference on Cognitive Computational Neuroscience
DOI: 10.32470/ccn.2022.1255-0

Reconstructing the cascade of language processing in the brain using the internal computations of transformer language models

Cited by 14 publications (13 citation statements)
References 0 publications
“…Probing the nature of these fits further, they find the best model-to-brain match obtains with the middle layers of the LLMs (e.g., layers 8-9 of a 12-layer feed-forward transformer network) rather than input or output layers. Kumar et al (2022) report similar results for fMRI data collected while participants listen to narratives.…”
Section: Predictability Processing and LLMs (supporting)
confidence: 67%
“…Kumar et al. (2022) report similar results for fMRI data collected while participants listen to narratives.…”
Section: Introduction (mentioning)
confidence: 59%
“…In such a mapping, early language areas will be better modeled by embeddings extracted from early layers of DLMs, whereas higher-order areas will be better modeled by embeddings extracted from later layers of DLMs. Interestingly, studies that examined the layer-by-layer match between DLM embeddings and brain activity using fMRI have observed that intermediate layers tend to provide the best fit across many language ROIs (3, 15, 30, 31). These findings do not support the hypothesis that DLMs capture the processing sequence of words in natural language in the human brain.…”
Section: Introduction (mentioning)
confidence: 99%
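
The citation statements above all describe the same layer-wise encoding analysis: embeddings are extracted from each layer of a transformer language model, each layer's embeddings are regressed against brain activity, and the model-to-brain fit is reported to peak at intermediate layers (e.g., layers 8-9 of a 12-layer network). Below is a minimal sketch of that analysis, assuming GPT-2 accessed through the HuggingFace transformers library, ridge regression as the encoding model, and simulated brain responses standing in for real fMRI data; the cited papers' actual stimuli, preprocessing, and cross-validation pipelines are not reproduced here.

```python
# Layer-wise encoding analysis (sketch): extract per-layer embeddings from a
# 12-layer transformer and fit a ridge-regression encoding model per layer.
# The "brain" data below are random placeholders, not real fMRI recordings.
import numpy as np
import torch
from transformers import GPT2Tokenizer, GPT2Model
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

text = "After the lecture, she walked home through the quiet streets."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of (n_layers + 1) tensors, each (1, n_tokens, 768);
# index 0 is the input embedding layer, indices 1..12 are transformer blocks.
hidden_states = outputs.hidden_states
n_tokens = inputs["input_ids"].shape[1]

# Simulated brain responses: one response vector per token (placeholder data).
rng = np.random.default_rng(0)
n_voxels = 50
brain = rng.standard_normal((n_tokens, n_voxels))

def encoding_score(features, targets):
    """Mean held-out correlation of a ridge encoding model."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, targets, test_size=0.5, random_state=0)
    pred = Ridge(alpha=1.0).fit(X_tr, y_tr).predict(X_te)
    r = [np.corrcoef(pred[:, v], y_te[:, v])[0, 1]
         for v in range(targets.shape[1])]
    return float(np.nanmean(r))

scores = []
for layer in range(1, len(hidden_states)):       # transformer blocks 1..12
    feats = hidden_states[layer][0].numpy()      # (n_tokens, 768)
    scores.append(encoding_score(feats, brain))

best = int(np.argmax(scores)) + 1
print(f"best-fitting layer: {best}")
```

With the random placeholder data the best-fitting layer is arbitrary; the finding discussed in the citation statements is that, on real narrative fMRI data, scores of this kind peak at intermediate layers rather than at the input or output layers.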