2022
DOI: 10.1038/s41593-022-01026-4

Shared computational principles for language processing in humans and deep language models

Abstract: Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate appropriate linguistic responses in a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share th…
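The self-supervised next-word prediction task mentioned in the abstract can be illustrated with the minimal sketch below: given a context, an autoregressive DLM outputs a probability distribution over possible next tokens. The use of the public GPT-2 checkpoint via the Hugging Face transformers library, and the example context sentence, are illustrative assumptions and are not details taken from the paper.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load a small, publicly available autoregressive language model (assumption:
# GPT-2 stands in for the DLMs discussed in the paper).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# An arbitrary context; the model assigns a probability to every possible
# next token given the words seen so far.
context = "The participants listened to a thirty-minute"
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence_length, vocab_size)

# Distribution over the vocabulary for the upcoming token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)
for p, i in zip(top_probs, top_ids):
    print(f"{tokenizer.decode([i.item()])!r:>12}  p = {p.item():.3f}")
```

In the paper's setup, the same kind of next-word probabilities (and the model's internal activations) are compared against neural responses recorded while participants listen to the podcast.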

Cited by 209 publications (295 citation statements). References 63 publications.
“…Given that we were inspired by work in cognitive psychology on human use of explanations (e.g., Ahn et al., 1992; Lombrozo and Carey, 2006), and given the accumulating evidence that language models predict language processing in the human brain to a surprising degree (Goldstein et al., 2022; Schrimpf et al., 2021), it is natural to ask whether there are cognitive implications of our experiments. However, as we noted above, the fact that both language models and humans benefit from explanations does not imply that they necessarily benefit through the same mechanisms.…”
Section: How Do Explanations Relate To Task Instructions? (mentioning)
confidence: 99%
“…Unlike convolutional neural networks, whose architectural design principles are roughly inspired by biological vision [Lindsay, 2021], the design of current neural network language models is largely uninformed by psycholinguistics and neuroscience. And yet, there is an ongoing effort to adopt and adapt neural network language models to serve as computational hypotheses of how humans process language, making use of a variety of different architectures, training corpora, and training tasks [e.g., Wehbe et al., 2014, Toneva and Wehbe, 2019, Heilbron et al., 2020, Jain et al., 2020, Lyu et al., 2021, Schrimpf et al., 2021, Wilcox et al., 2021, Goldstein et al., 2022, Caucheteux and King, 2022]. We found that recurrent neural networks make markedly human-inconsistent predictions once pitted against transformer-based neural networks.…”
Section: Implications For Artificial Neural Network Language Models A... (mentioning)
confidence: 89%
“…Theory of Mind is a central facet of human intelligence [9][10][11][18][19][20]. Inspired by the success of DL in understanding biological vision [1][2][3][4][5] and language processing [6][7][8], a challenge has emerged in recent years to develop DL agents that can mimic aspects of ToM.…”
Section: Discussion (mentioning)
confidence: 99%
“…Rapid advances in deep learning (DL) have led to human-level performance on certain visual recognition and natural language processing tasks. Moreover, research has revealed shared computational principles in humans and DL models for vision [1][2][3][4][5] and language processing [6][7][8]. These findings do not imply that DL has fully captured how these processes operate in the human brain, but DL has definitely contributed to better characterizing the computational principles underlying them.…”
Section: Introduction (mentioning)
confidence: 99%