Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL) 2019
DOI: 10.18653/v1/k19-1019
Diversify Your Datasets: Analyzing Generalization via Controlled Variance in Adversarial Datasets

Abstract: Phenomenon-specific "adversarial" datasets have recently been designed to perform targeted stress-tests for particular inference types. Recent work (Liu et al., 2019a) proposed that such datasets can be used to train NLI and other types of models, often allowing them to learn the phenomenon in focus and improve on the challenge dataset, indicating a "blind spot" in the original training data. Yet, although a model can improve through such training, it might still be vulnerable to other challenge datase…

Cited by 33 publications (31 citation statements)
References 18 publications
“…A related line of work has been analyzing the mathematical reasoning abilities of neural models over text (Wallace et al., 2019; Rozen et al., 2019; Ravichander et al., 2019), and on arithmetic problems (Saxton et al., 2019; Amini et al., 2019; Lample and Charton, 2020).…”
Section: Related Work
confidence: 99%
“…Regarding assessment of the behavior of modern language models, Linzen et al. (2016) and Goldberg (2019) investigated their syntactic capabilities by testing such models on subject-verb agreement tasks. Many studies of NLI tasks (Liu et al., 2019; Glockner et al., 2018; Poliak et al., 2018; Tsuchiya, 2018; McCoy et al., 2019; Rozen et al., 2019; Ross and Pavlick, 2019) have provided evaluation methodologies and found that current NLI models often fail on particular inference types, or that they learn undesired heuristics from the training set. In particular, recent works (Yanaka et al., 2019a,b; Richardson et al., 2020) have evaluated models on monotonicity, but did not focus on the ability to generalize to unseen combinations of patterns.…”
Section: Related Work
confidence: 99%
“…To mitigate these problems, Liu et al. (2019a) introduced a systematic, task-agnostic method for analyzing datasets. Rozen et al. (2019) further explain how to improve challenge datasets and why diversity matters. Geva et al. (2019) suggest that training and test data should come from disjoint sets of annotators to avoid annotator bias.…”
Section: Related Work
confidence: 99%