Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)
DOI: 10.18653/v1/2021.acl-long.417

Alignment Rationale for Natural Language Inference

Abstract: Deep learning models have achieved great success on the task of Natural Language Inference (NLI), though only a few attempts try to explain their behaviors. Existing explanation methods usually pick prominent features such as words or phrases from the input text. However, for NLI, alignments among words or phrases are more enlightening clues to explain the model. To this end, this paper presents AREC, a post-hoc approach to generate alignment rationale explanations for co-attention based models in NLI. The exp…

Cited by 12 publications (7 citation statements)
References 39 publications
“…Ideally, a model should learn rational (Jiang et al., 2021; Lu et al., 2022) features for robust generalization. Take sentiment classification for example.…”
Section: Features
confidence: 99%
“…where ŷ is the predicted label, N is the number of examples, p(ŷ|x_i^{(k)}) is the probability on the predicted class, and x_i^{(k)} is the modified sample. A higher AOPC is better, meaning that the features chosen by the attribution scores are more important (Feng et al.). Besides these works, many others (Shrikumar et al., 2017; Chen et al., 2019; Nguyen, 2018; DeYoung et al., 2020; Hao et al., 2020; Jiang et al., 2021) use similar metrics to perform evaluation and comparison. The main difference between the evaluation metrics in these works is the modification strategy.…”
Section: Part II Evaluation 2: Evaluation Based on Meaningful Perturbation
confidence: 99%
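The AOPC metric described in the statement above can be sketched as follows. This is a minimal illustration, assuming the standard formulation of AOPC (Area Over the Perturbation Curve): average, over examples and perturbation steps, the drop in the predicted-class probability after removing the top-k attributed features. The function name and input layout are illustrative, not from the cited works.

```python
import numpy as np

def aopc(probs: np.ndarray) -> float:
    """Sketch of the AOPC metric (illustrative, not the cited papers' code).

    probs: array of shape (N, K+1), where probs[i, k] = p(ŷ | x_i^{(k)}),
    the model's probability on the originally predicted class after the
    top-k features (by attribution score) are removed; column 0 holds the
    unperturbed input (k = 0).
    """
    # Drop in confidence relative to the unperturbed prediction, for each
    # example i and each perturbation step k.
    drops = probs[:, [0]] - probs  # p(ŷ|x_i) - p(ŷ|x_i^{(k)})
    # Average over the N examples and the K+1 perturbation steps.
    return float(drops.mean())

# Two examples, two perturbation steps each: larger probability drops
# (better attributions) yield a larger AOPC.
scores = np.array([[0.9, 0.6, 0.3],
                   [0.8, 0.8, 0.2]])
print(aopc(scores))  # → 0.25
```

Because each row is compared against its own unperturbed probability, attribution methods that remove genuinely important features first produce steeper drops and hence higher AOPC, which is exactly the comparison the perturbation-based evaluations above rely on.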
“…Recently, many large-scale standard datasets have been released, such as SciTail [Khot et al. 2018], SNLI [Bowman et al. 2015], and Multi-NLI [Williams et al. 2018]. These datasets greatly facilitate the study of NLI, and some state-of-the-art neural models have achieved very competitive performance on them [Belinkov et al. 2019; Chen et al. 2021b; Jiang et al. 2021; Meissner et al. 2021; Zhou and Bansal 2020]. From the definition of NLI we can see that it is based on (and assumes) a common human understanding of language as well as common background knowledge; thus it has been considered by many as an important evaluation measure for language understanding [Bowman et al. 2015; Dagan et al. 2006; Williams et al. 2018; Zylberajch et al. 2021].…”
Section: Related Work
confidence: 99%