Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP 2021
DOI: 10.18653/v1/2021.blackboxnlp-1.3
Does External Knowledge Help Explainable Natural Language Inference? Automatic Evaluation vs. Human Ratings

Abstract: Natural language inference (NLI) requires models to learn and apply commonsense knowledge. These reasoning abilities are particularly important for explainable NLI systems that generate a natural language explanation in addition to their label prediction. The integration of external knowledge has been shown to improve NLI systems; here we investigate whether it can also improve their explanation capabilities. For this, we investigate different sources of external knowledge and evaluate the performance of our m…

Cited by 7 publications (2 citation statements)
References: 31 publications
“…WordNet, ConceptNet, etc.) in order to improve the explainability of Natural Language Inference (NLI) models [22]. Here we follow a similar approach, carrying out an initial study on the use of language models, particularly GPT-3, as an external source to generate explanations of musical decisions.…”
Section: Transformer-based Approaches in Explainable AI
confidence: 99%
“…Both candidates receive identical BLEU-2 scores; however, from a human perspective, sentence (a) seems to reflect the original German sentence much better. Similarly, automatic evaluation measures used in other NLP tasks face the same problem (Callison-Burch et al. 2006; Liu et al. 2016; Mathur et al. 2020; Schuff et al. 2020, 2021; Iskender et al. 2020; Clinciu et al. 2021). Therefore, human evaluation has begun to gain more and more attention in the NLP community (especially in the context of natural language generation tasks, including machine translation; Belz and Reiter 2006; Novikova, Dusek, and Rieser 2018; van der Lee et al. 2019).…”
Section: Introduction
confidence: 99%