2023
DOI: 10.1101/2023.04.16.537080
Preprint

Driving and suppressing the human language network using large language models

Abstract: Transformer language models are today's most accurate models of language processing in the brain. Here, using fMRI-measured brain responses to 1,000 diverse sentences, we develop a GPT-based encoding model to identify new sentences that are predicted to drive or suppress responses in the human language network. We demonstrate that these model-selected 'out-of-distribution' sentences indeed drive and suppress activity of human language areas in new individuals (85.7% increase and 97.5% decrease relative to the …
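The abstract outlines a two-step workflow: fit an encoding model that maps GPT-derived sentence representations to fMRI responses in the language network, then rank novel sentences by their predicted response to select candidate 'drive' and 'suppress' stimuli. The sketch below illustrates that general idea only; the ridge regression, the placeholder arrays (embeddings, bold), and the ranking step are assumptions for illustration, not the authors' actual pipeline.

```python
# Minimal sketch of a GPT-based encoding model in the spirit of the abstract.
# This is NOT the authors' pipeline: array names, shapes, and the choice of
# cross-validated ridge regression are illustrative assumptions.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Assumed inputs:
#   embeddings: (n_sentences, n_features) sentence representations from a GPT-style LM
#   bold:       (n_sentences, n_rois) fMRI responses in language fROIs
embeddings = rng.standard_normal((1000, 768))  # placeholder features
bold = embeddings @ rng.standard_normal((768, 5)) * 0.1 + rng.standard_normal((1000, 5))

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, bold, test_size=0.2, random_state=0
)

# Fit one cross-validated ridge regression mapping LM features -> brain responses.
encoder = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X_train, y_train)

# Predictivity: correlation between predicted and held-out responses, per fROI.
pred = encoder.predict(X_test)
r = [np.corrcoef(pred[:, i], y_test[:, i])[0, 1] for i in range(y_test.shape[1])]
print("held-out r per fROI:", np.round(r, 3))

# Rank novel candidate sentences by their predicted mean language-network response;
# the extremes of this ranking would be the candidate "drive" / "suppress" sets.
candidates = rng.standard_normal((500, 768))   # embeddings of new sentences
scores = encoder.predict(candidates).mean(axis=1)
drive_idx = np.argsort(scores)[-10:]           # predicted to drive responses
suppress_idx = np.argsort(scores)[:10]         # predicted to suppress responses
```

Per the abstract, the selected sentences were then presented to new individuals to test whether the predicted drive and suppress effects held; the sketch stops at the selection step.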

Cited by 9 publications (11 citation statements)
References 250 publications

“…The overall predictivity was lower in the RH than the LH language fROIs (p << 0.0001 for all models in Experiment 1 and all checkpoints in Experiment 2), in line with past findings (e.g., Schrimpf et al, 2021; Tuckute et al, 2023).…”
Section: Results (supporting)
confidence: 90%

“…The fROIs are identified with an independent localizer, as described in Methods. The general pattern of results (presented in main Figure 2) holds across hemispheres (although predictivity is higher in the LH, in line with other work; e.g., Schrimpf et al, 2021; Tuckute et al, 2023) and fROIs.…”
Section: Supplementary Figure (supporting)
confidence: 88%

“…It will also be important to use additional means of model evaluation, such as model-matched stimuli 25, 27, 57 , stimuli optimized for a model’s predicted response 110, 130,131,132 , directly substituting brain responses into models 112 , or recently proposed alternative methods to measure representational similarity 111 . These additional types of evaluations could help address some of the limitations discussed in the previous section.…”
Section: Discussion (mentioning)
confidence: 99%

“…While the naturalistic nature of these stimuli means that we did not necessarily have repeated presentation of the same word(s) across stories, we can use natural language processing (NLP) techniques to group words into clusters of semantically related words and use the clusters to help understand why concrete representations are more reliable, even when generalizing over individual words and concepts. Numerous recent studies have demonstrated parallels in language representation between NLP models and human neural processing 13, 64–67. Here, we used a word-embedding NLP model (GloVe) 52 to understand how the semantic relationships among concrete and abstract words relate to the reliability of their neural representations.…”
Section: Stable Clusters Of Concrete Words Drive Reliability Of Repre... (mentioning)
confidence: 99%
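The statement above describes grouping words into clusters of semantically related words using GloVe embeddings. A minimal sketch of that kind of analysis follows, assuming a locally downloaded glove.6B.300d.txt file, an arbitrary example word list, and k-means with two clusters; none of these choices are taken from the cited study.

```python
# Minimal sketch of GloVe-based semantic clustering; the file path, word list,
# and number of clusters are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def load_glove(path, vocab):
    """Read GloVe vectors (word + space-separated floats per line) for a word list."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if parts[0] in vocab:
                vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

words = ["hammer", "table", "dog", "justice", "freedom", "idea"]  # example concrete/abstract words
glove = load_glove("glove.6B.300d.txt", set(words))               # assumed local GloVe file

present = [w for w in words if w in glove]
X = np.stack([glove[w] for w in present])

# Group words into clusters of semantically related items.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for w, lab in zip(present, labels):
    print(f"{w}: cluster {lab}")
```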