Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.313
Mining Knowledge for Natural Language Inference from Wikipedia Categories

Abstract: Accurate lexical entailment (LE) and natural language inference (NLI) often require large quantities of costly annotations. To alleviate the need for labeled data, we introduce WIKINLI: a resource for improving model performance on NLI and LE tasks. It contains 428,899 pairs of phrases constructed from naturally annotated category hierarchies in Wikipedia. We show that we can improve strong baselines such as BERT (Devlin et al., 2019) and RoBERTa (Liu et al., 2019) by pretraining them on WIKINLI and transfe…
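The construction described in the abstract can be illustrated with a toy sketch. The category names and the `make_pairs` helper below are hypothetical stand-ins, not taken from the paper; WIKINLI itself is built from Wikipedia's actual category graph:

```python
# Hypothetical sketch: WIKINLI derives phrase pairs from Wikipedia's
# category hierarchy; here a small dict stands in for that graph.
toy_hierarchy = {
    "Animals": ["Mammals", "Birds"],  # parent category -> child categories
    "Mammals": ["Dogs", "Cats"],
    "Birds": ["Parrots"],
}

def make_pairs(hierarchy):
    """Return (child, parent) phrase pairs; the parent category acts as a
    hypernym-like phrase for entailment-style pretraining."""
    return [(child, parent)
            for parent, children in hierarchy.items()
            for child in children]

pairs = make_pairs(toy_hierarchy)
print(len(pairs))  # 5 pairs from the toy hierarchy
```

Each pair such as `("Dogs", "Mammals")` plays the role of a naturally annotated child/parent phrase pair; the real resource contains 428,899 such pairs.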

Cited by 6 publications (6 citation statements)
References 41 publications
“…In future work, we want to combine efficiency with other highly desirable properties of evaluation metrics such as robustness (Vu et al., 2022; Chen and Eger, 2023; Rony et al., 2022) and explainability (Kaster et al., 2021; Sai et al., 2021; Fomicheva et al., 2021; Leiter et al., 2022) to induce metrics that jointly satisfy these criteria.…”
Section: Discussion
confidence: 99%
“…Evaluation metrics: Recent transformer-based metrics utilize BERT-based models, as in BERTScore (Zhang et al., 2020) and MoverScore (Zhao et al., 2019). Extensions include BARTScore (Yuan et al., 2021), which reads off probability estimates as metric scores directly from text generation systems, and MENLI (Chen and Eger, 2023), which uses probabilities from models fine-tuned on the Natural Language Inference task. These metrics are reference-based (comparing the MT output to a human reference), like BERTScore and MoverScore, or reference-free (comparing the MT output to the source text), like XMoverScore (Zhao et al., 2020) and SentSim (Song et al., 2021); some are trained (fine-tuned on human scores), like COMET (Rei et al., 2020), while others are untrained, like BERTScore.…”
Section: Related Work
confidence: 99%
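The NLI-as-metric idea in the statement above can be sketched in a few lines. Here `nli_probs` is a hypothetical stand-in (a crude word-overlap toy) for a real fine-tuned NLI model, and the bidirectional averaging is one illustrative symmetrization choice, not necessarily MENLI's exact formulation:

```python
def nli_probs(premise, hypothesis):
    # Stand-in for a real NLI model: word overlap pretends to be entailment.
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    p_entail = len(p & h) / len(h) if h else 0.0
    return {"entailment": p_entail, "neutral": 1.0 - p_entail}

def nli_metric(reference, candidate):
    """Turn NLI predictions into a single metric score by averaging the
    entailment probability in both directions (reference->candidate and
    candidate->reference)."""
    fwd = nli_probs(reference, candidate)["entailment"]
    bwd = nli_probs(candidate, reference)["entailment"]
    return 0.5 * (fwd + bwd)

score = nli_metric("the cat sat on the mat", "a cat sat on a mat")
```

A reference-free variant would simply pass the source text instead of the human reference as the first argument.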
“…It considers contextual information and semantic similarity, providing a more nuanced and accurate evaluation of summary quality (Zhang et al., 2019). Chen and Eger (2023) introduce a novel approach, advocating the direct use of pretrained Natural Language Inference (NLI) models as evaluation metrics. Furthermore, they developed a novel preference-based adversarial test suite for machine translation and summarization metrics.…”
Section: Related Work
confidence: 99%
“…finds that utilizing ConceptNet as an external knowledge source can benefit entailment models in the scientific domain. Chen et al. (2020b) propose WIKINLI, a large-scale, naturally annotated dataset constructed from the Wikipedia category graph, and show that models pretrained on this dataset achieve better performance on downstream natural language entailment tasks.…”
Section: Modeling External Knowledge in NLP
confidence: 99%