2021
DOI: 10.48550/arxiv.2101.02258
Preprint

Can RNNs learn Recursive Nested Subject-Verb Agreements?

Yair Lakretz, Théo Desbordes, Jean-Rémi King, et al.

Abstract: One of the fundamental principles of contemporary linguistics states that language processing requires the ability to extract recursively nested tree structures. However, it remains unclear whether and how this code could be implemented in neural circuits. Recent advances in Recurrent Neural Networks (RNNs), which achieve near-human performance in some language tasks, provide a compelling model to address such questions. Here, we present a new framework to study recursive processing in RNNs, using subject-verb…
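The truncated abstract describes a behavioural framework built on nested subject-verb agreement. To make the task concrete, here is a minimal sketch (hypothetical vocabulary and templates, not the paper's actual stimuli) that generates centre-embedded sentences in which the main verb must agree in number with the main subject across an intervening embedded clause; an RNN would then be scored on whether it assigns higher probability to the grammatical verb form than to its number-mismatched counterpart.

```python
# Minimal sketch of nested (centre-embedded) subject-verb agreement
# stimuli. Vocabulary and templates are hypothetical illustrations.

SING = {"noun": "boy", "verb": "smiles", "embedded_verb": "admires"}
PLUR = {"noun": "boys", "verb": "smile", "embedded_verb": "admire"}

def nested_sentence(outer, inner):
    """Centre-embedded sentence with two agreement dependencies, e.g.
    'The boy that the boys admire smiles.'  The main verb (last word)
    must agree with the *outer* noun, across the embedded clause."""
    return (f"The {outer['noun']} that the {inner['noun']} "
            f"{inner['embedded_verb']} {outer['verb']}.")

def agreement_pair(outer, inner):
    """(grammatical, ungrammatical) pair differing only in the number
    marking of the main verb: the minimal contrast a model is scored on."""
    wrong = PLUR if outer is SING else SING
    good = nested_sentence(outer, inner)
    bad = good.rsplit(" ", 1)[0] + f" {wrong['verb']}."
    return good, bad

if __name__ == "__main__":
    for outer in (SING, PLUR):
        for inner in (SING, PLUR):
            good, bad = agreement_pair(outer, inner)
            print(f"OK : {good}\nBAD: {bad}\n")
```

In practice the model's probabilities for the two verb forms are compared at the main-verb position; error rates on the outer, long-distance dependency are what diagnose recursive processing.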

Cited by 6 publications (11 citation statements)
References 19 publications
“…However, what these features actually represent remains largely unknown. Previous studies have shown that language transformers explicitly represent syntactic 14,52 and semantic features 14. Similarly, Manning et al. showed that syntactic trees appear to be encoded by the distances between contextualized word embeddings 52.…”
Section: Discussion (mentioning)
Confidence: 99%
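The quoted claim, that parse-tree distances are mirrored by distances between contextualized embeddings, can be made concrete with a toy check. The sketch below is not Manning et al.'s actual probe: it builds embeddings directly from a hand-written dependency tree (a best-case stand-in for real contextual embeddings), just to show the two distance matrices that a structural probe compares.

```python
import itertools
import numpy as np

# Hand-written dependency tree (parent pointers over word indices) for
# "The(0) boy(1) that(2) the(3) boys(4) admire(5) smiles(6)";
# the root (smiles) has parent -1. Illustrative only.
PARENT = {0: 1, 1: 6, 2: 5, 3: 4, 4: 5, 5: 1, 6: -1}

def root_path_edges(node):
    """Set of (child, parent) edges on the path from `node` to the root."""
    edges = set()
    while PARENT[node] != -1:
        edges.add((node, PARENT[node]))
        node = PARENT[node]
    return edges

def tree_distance(u, v):
    """Edges on the tree path between u and v: the symmetric difference
    of the two root paths."""
    return len(root_path_edges(u) ^ root_path_edges(v))

# "Embeddings" as binary indicators over tree edges. By construction,
# squared Euclidean distance equals tree distance exactly, which is the
# relationship a structural probe tries to recover from real embeddings.
all_edges = sorted(set().union(*(root_path_edges(w) for w in PARENT)))
emb = np.array([[1.0 if e in root_path_edges(w) else 0.0 for e in all_edges]
                for w in sorted(PARENT)])

for u, v in itertools.combinations(sorted(PARENT), 2):
    assert tree_distance(u, v) == np.sum((emb[u] - emb[v]) ** 2)
print("Squared embedding distances reproduce tree path distances.")
```

With real contextualized embeddings this identity would not hold; instead one fits a linear transform and reports how well the probed distances correlate with gold tree distances, which is the kind of evidence behind the quoted statement.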
“…This limit is expected: several studies demonstrate that current deep language models fail to capture several aspects critical to comprehension 16,19: they (i) often fail to generalize beyond the training distribution 56, (ii) do not perfectly capture deep syntactic structures 14,52, and (iii) remain relatively poor at summarizing texts, generating stories, and answering questions 20–22. Furthermore, GPT-2 is only trained with textual data and does not situate objects in a grounded environment that would capture their real-world interactions 18,57.…”
Section: Discussion (mentioning)
Confidence: 99%
“…Yet, a major gap remains between humans and these algorithms: current language models are still poor at story generation and summarization, as well as dialogue and question answering (10–14); they fail to capture many syntactic constructs and semantic properties (15–19), and their linguistic understanding is often superficial (16, 18–20).…”
Type: mentioning
Confidence: 99%