2021
DOI: 10.48550/arxiv.2101.02258
Preprint

Can RNNs learn Recursive Nested Subject-Verb Agreements?

Yair Lakretz, Théo Desbordes, Jean-Rémi King, et al.

Abstract: One of the fundamental principles of contemporary linguistics states that language processing requires the ability to extract recursively nested tree structures. However, it remains unclear whether and how this code could be implemented in neural circuits. Recent advances in Recurrent Neural Networks (RNNs), which achieve near-human performance in some language tasks, provide a compelling model to address such questions. Here, we present a new framework to study recursive processing in RNNs, using subject-verb…
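The truncated abstract describes a behavioural framework built on nested subject-verb agreement. To make the task concrete, here is a minimal sketch (hypothetical vocabulary and templates, not the paper's actual stimuli) that generates centre-embedded sentences in which the main verb must agree in number with the main subject across an intervening embedded clause; an RNN would then be scored on whether it assigns higher probability to the grammatical verb form than to its number-mismatched counterpart.

```python
# Minimal sketch of nested (centre-embedded) subject-verb agreement
# stimuli. Vocabulary and templates are hypothetical illustrations.

SING = {"noun": "boy", "verb": "smiles", "embedded_verb": "admires"}
PLUR = {"noun": "boys", "verb": "smile", "embedded_verb": "admire"}

def nested_sentence(outer, inner):
    """Centre-embedded sentence with two agreement dependencies, e.g.
    'The boy that the boys admire smiles.'  The main verb (last word)
    must agree with the *outer* noun, across the embedded clause."""
    return (f"The {outer['noun']} that the {inner['noun']} "
            f"{inner['embedded_verb']} {outer['verb']}.")

def agreement_pair(outer, inner):
    """(grammatical, ungrammatical) pair differing only in the number
    marking of the main verb: the minimal contrast a model is scored on."""
    wrong = PLUR if outer is SING else SING
    good = nested_sentence(outer, inner)
    bad = good.rsplit(" ", 1)[0] + f" {wrong['verb']}."
    return good, bad

if __name__ == "__main__":
    for outer in (SING, PLUR):
        for inner in (SING, PLUR):
            good, bad = agreement_pair(outer, inner)
            print(f"OK : {good}\nBAD: {bad}\n")
```

In practice the model's probabilities for the two verb forms are compared at the main-verb position; error rates on the outer, long-distance dependency are what diagnose recursive processing.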

Cited by 6 publications (11 citation statements)
References 19 publications
“…However, what these features actually represent remains largely unknown. Previous studies have shown that language transformers explicitly represent syntactic 14,52 and semantic features 14. Similarly, Manning et al. showed that syntactic trees appear to be encoded by the distances between contextualized word embeddings 52.…”
Section: Discussion (mentioning)
Confidence: 99%
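The quoted claim, that parse-tree distances are mirrored by distances between contextualized embeddings, can be made concrete with a toy check. The sketch below is not Manning et al.'s actual probe: it builds embeddings directly from a hand-written dependency tree (a best-case stand-in for real contextual embeddings), just to show the two distance matrices that a structural probe compares.

```python
import itertools
import numpy as np

# Hand-written dependency tree (parent pointers over word indices) for
# "The(0) boy(1) that(2) the(3) boys(4) admire(5) smiles(6)";
# the root (smiles) has parent -1. Illustrative only.
PARENT = {0: 1, 1: 6, 2: 5, 3: 4, 4: 5, 5: 1, 6: -1}

def root_path_edges(node):
    """Set of (child, parent) edges on the path from `node` to the root."""
    edges = set()
    while PARENT[node] != -1:
        edges.add((node, PARENT[node]))
        node = PARENT[node]
    return edges

def tree_distance(u, v):
    """Edges on the tree path between u and v: the symmetric difference
    of the two root paths."""
    return len(root_path_edges(u) ^ root_path_edges(v))

# "Embeddings" as binary indicators over tree edges. By construction,
# squared Euclidean distance equals tree distance exactly, which is the
# relationship a structural probe tries to recover from real embeddings.
all_edges = sorted(set().union(*(root_path_edges(w) for w in PARENT)))
emb = np.array([[1.0 if e in root_path_edges(w) else 0.0 for e in all_edges]
                for w in sorted(PARENT)])

for u, v in itertools.combinations(sorted(PARENT), 2):
    assert tree_distance(u, v) == np.sum((emb[u] - emb[v]) ** 2)
print("Squared embedding distances reproduce tree path distances.")
```

With real contextualized embeddings this identity would not hold; instead one fits a linear transform and reports how well the probed distances correlate with gold tree distances, which is the kind of evidence behind the quoted statement.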
“…This limit is expected: several studies demonstrate that current deep language models fail to capture several aspects critical to comprehension 16,19: they (i) often fail to generalize beyond the training distribution 56, (ii) do not perfectly capture deep syntactic structures 14,52, and (iii) remain relatively poor at summarizing texts, generating stories, and answering questions 20–22. Furthermore, GPT-2 is only trained with textual data and does not situate objects in a grounded environment that would capture their real-world interactions 18,57.…”
Section: Discussion (mentioning)
Confidence: 99%
“…Yet, a major gap remains between humans and these algorithms: current language models are still poor at story generation and summarization, as well as dialogue and question answering (10–14); they fail to capture many syntactic constructs and semantic properties (15–19), and their linguistic understanding is often superficial (16, 18–20).…”
Type: mentioning
Confidence: 99%