Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, 2018
DOI: 10.18653/v1/w18-5426
Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information

Abstract: How do neural language models keep track of number agreement between subject and verb? We show that 'diagnostic classifiers', trained to predict number from the internal states of a language model, provide a detailed understanding of how, when, and where this information is represented. Moreover, they give us insight into when and where number information is corrupted in cases where the language model ends up making agreement errors. To demonstrate the causal role played by the representations we find, we then…
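The diagnostic-classifier approach the abstract describes amounts to training a simple supervised probe on the language model's internal states. Below is a minimal sketch of that idea, not the authors' implementation: the `hidden_states` and `number_labels` arrays are hypothetical stand-ins for activations extracted from an LSTM language model and for the grammatical number of each sentence's subject, and the probe is an ordinary scikit-learn logistic regression.

```python
# Minimal sketch of a diagnostic classifier (linear probe).
# `hidden_states` and `number_labels` are placeholders: in a real setup they
# would be LM activations and subject-number annotations, not random data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 650))    # stand-in for LM hidden states
number_labels = rng.integers(0, 2, size=1000)   # 0 = singular, 1 = plural

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, number_labels, test_size=0.2, random_state=0
)

# The diagnostic classifier itself: a linear model trained to predict
# subject number from the internal states of the language model.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```

If such a probe decodes number well above chance, the corresponding states can be said to carry agreement information; the paper goes further and uses these probes to examine where that information is corrupted in sentences where the model makes agreement errors.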

Cited by 147 publications (128 citation statements) | References 11 publications
“…Ettinger et al. (2016, 2017); Zhu et al. (2018), i.a., use a task-based approach similar to ours, where tasks that require a specific subset of linguistic knowledge are used to perform qualitative evaluation. Gulordava et al. (2018), Giulianelli et al. (2018), Rønning et al. (2018), and Jumelet and Hupkes (2018) make a focused contribution towards a particular linguistic phenomenon (agreement, ellipsis, negative polarity). Using recast NLI, Poliak et al. (2018a) probe for semantic phenomena in neural machine translation encoders.…”
Section: Related Work
mentioning confidence: 99%
“…For example, using linear classifiers at each time point during sentence processing, information represented by various units can be decoded, and thus provide evidence about their processing function. Using such ‘Diagnostic Classifiers’ [38], Giulianelli et al. [39] explored whether grammatical-number information can be decoded from the transient state of a neural network. This approach proved beneficial: in particular, it revealed that the representation of grammatical number, as in (5), is mostly stored by the highest layer of the network, and that it can be robustly maintained in network activity.…”
Section: Understanding Capacity Limitation In Light Of Neural Lang…
mentioning confidence: 99%
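The excerpt above highlights decoding at each time point during sentence processing. A sketch of that per-timestep analysis, again with hypothetical placeholder arrays (`states_per_timestep`, `labels`) rather than real activations, could look like this:

```python
# Sketch of per-timestep decoding: one probe per sentence position, whose
# cross-validated accuracy traces where number information is available.
# `states_per_timestep` (sentences x steps x hidden_dim) and `labels` are
# placeholders, not data from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
states_per_timestep = rng.normal(size=(500, 10, 650))
labels = rng.integers(0, 2, size=500)

accuracy_per_step = []
for t in range(states_per_timestep.shape[1]):
    probe = LogisticRegression(max_iter=1000)
    scores = cross_val_score(probe, states_per_timestep[:, t, :], labels, cv=5)
    accuracy_per_step.append(scores.mean())

print(accuracy_per_step)  # decoding accuracy at each sentence position
```

Plotting the resulting accuracies over positions (and repeating the procedure per layer) is one way to see where in the sentence, and in which layer, number information is maintained, which is the kind of analysis the excerpt attributes to Giulianelli et al.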
“…The strong performance of recurrent neural networks (RNNs) in applied natural language processing tasks has motivated an array of studies that have investigated their ability to acquire natural language syntax without syntactic annotations; these studies have identified both strengths (Linzen et al., 2016; Giulianelli et al., 2018; Gulordava et al., 2018; Kuncoro et al., 2018; van Schijndel and Linzen, 2018) and limitations (Chowdhury and Zamparelli, 2018; Marvin and Linzen, 2018).…”
Section: Introduction
mentioning confidence: 99%