Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
DOI: 10.18653/v1/n18-1074
FEVER: a Large-scale Dataset for Fact Extraction and VERification

Abstract: In this paper we introduce a new publicly available dataset for verification against textual sources, FEVER: Fact Extraction and VERification. It consists of 185,445 claims generated by altering sentences extracted from Wikipedia and subsequently verified without knowledge of the sentence they were derived from. The claims are classified as SUPPORTED, REFUTED or NOTENOUGHINFO by annotators achieving 0.6841 in Fleiss κ. For the first two classes, the annotators also recorded the sentence(s) forming the necessary evidence for their judgment.

Cited by 807 publications (1,111 citation statements); references 16 publications.
“…Some use additional Twitter-specific features (Enayet and El-Beltagy, 2017). More involved methods that take evidence documents into account, often trained on larger datasets, consist of evidence identification and ranking, followed by a neural model that measures the compatibility between claim and evidence (Thorne et al., 2018; Mihaylova et al., 2018; Yin and Roth, 2018).…”
Section: Methods
confidence: 99%
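To make the two-stage design concrete, here is a minimal sketch of a neural compatibility model, written in PyTorch. It is not the architecture of any cited system: the shared bag-of-words encoder, the feature combination, and all sizes are illustrative assumptions.

```python
# A minimal sketch (not any cited system's architecture) of a model that
# scores claim-evidence compatibility. All hyperparameters are assumptions.
import torch
import torch.nn as nn

class CompatibilityScorer(nn.Module):
    def __init__(self, vocab_size=30000, dim=128):
        super().__init__()
        # Shared bag-of-words encoder for both claim and evidence.
        self.embed = nn.EmbeddingBag(vocab_size, dim)
        # MLP over [claim; evidence; |claim - evidence|; claim * evidence].
        self.mlp = nn.Sequential(
            nn.Linear(4 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, claim_ids, evidence_ids):
        c = self.embed(claim_ids)
        e = self.embed(evidence_ids)
        feats = torch.cat([c, e, (c - e).abs(), c * e], dim=-1)
        return self.mlp(feats).squeeze(-1)  # higher = more compatible

scorer = CompatibilityScorer()
claim = torch.randint(0, 30000, (1, 12))     # one claim, 12 token ids
evidence = torch.randint(0, 30000, (1, 40))  # one candidate evidence sentence
print(scorer(claim, evidence))
```

In such pipelines, the scorer is typically applied to each retrieved evidence candidate and the top-scoring sentences are passed on to the verdict classifier.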
“…In the FEVER benchmark (Thorne et al., 2018), the DrQA (Chen et al., 2017) retrieval component is considered the baseline. They choose the k-nearest documents based on the cosine similarity of TF-IDF feature vectors.…”
Section: Document Retrieval
confidence: 99%
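A minimal sketch of this baseline follows, using scikit-learn rather than DrQA's own implementation (which hashes bigrams); the toy corpus, claim, and value of k are illustrative assumptions.

```python
# Sketch of TF-IDF document retrieval: rank documents by cosine similarity
# between claim and document TF-IDF vectors, keep the k nearest.
# The corpus and claim below are toy data, not from FEVER.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "FEVER is a dataset for fact extraction and verification.",
    "Wikipedia is a free online encyclopedia.",
    "TF-IDF weighs terms by frequency and rarity across documents.",
]
claim = "FEVER is a fact verification dataset."

vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(docs)  # (n_docs, vocab_size), sparse
claim_vec = vectorizer.transform([claim])  # (1, vocab_size)

k = 2
scores = cosine_similarity(claim_vec, doc_vecs)[0]
top_k = scores.argsort()[::-1][:k]         # indices of the k nearest docs
print([(docs[i], round(scores[i], 3)) for i in top_k])
```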
“…To extract evidence sentences, Thorne et al. (2018) use a TF-IDF approach similar to their document retrieval. The UCL team (Yoneda et al., 2018) trains a logistic regression model on a heuristically defined set of features.…”
Section: Sentence Retrieval
confidence: 99%
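As a sketch of the logistic-regression approach, the toy example below ranks candidate sentences with a few hand-picked features (token overlap, sentence length, position). The feature set and training pairs are illustrative assumptions, not the UCL team's actual features.

```python
# Sketch of sentence retrieval as binary classification: logistic regression
# over simple hand-crafted claim-sentence features. Features and data are
# assumptions for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(claim, sentence, position):
    c, s = set(claim.lower().split()), set(sentence.lower().split())
    overlap = len(c & s) / max(len(c), 1)  # fraction of claim tokens covered
    return [overlap, len(s), position]

# Toy pairs: (claim, candidate sentence, position in article, is_evidence)
pairs = [
    ("FEVER has 185,445 claims", "FEVER contains 185,445 claims.", 0, 1),
    ("FEVER has 185,445 claims", "The weather was sunny today.", 5, 0),
    ("FEVER has 185,445 claims", "Claims were derived from Wikipedia.", 1, 1),
    ("FEVER has 185,445 claims", "Football is a popular sport.", 7, 0),
]
X = np.array([features(c, s, p) for c, s, p, _ in pairs])
y = np.array([label for *_, label in pairs])

clf = LogisticRegression().fit(X, y)
# Rank candidate sentences by predicted probability of being evidence.
print(clf.predict_proba(X)[:, 1])
```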
“…The rise of social media has enabled the phenomenon of "fake news," which can target specific individuals and be used for deceptive purposes (Lazer et al., 2018; Vosoughi et al., 2018). As manual fact-checking is a time-consuming and tedious process, computational approaches have been proposed as a possible alternative (Popat et al., 2017; Wang, 2017; Mihaylova et al., 2018), based on information sources such as social media (Ma et al., 2017), Wikipedia (Thorne et al., 2018), and knowledge bases (Huynh and Papotti, 2018). Fact-checking is a multi-step process (Vlachos and Riedel, 2014): (i) checking the reliability of media sources, (ii) retrieving potentially relevant documents from reliable sources as evidence for each target claim, (iii) predicting the stance of each document with respect to the target claim, and finally (iv) making a decision based on the stances from (iii) for all documents from (ii).…”
Section: Introduction
confidence: 99%
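The four steps compose naturally as a pipeline. The skeleton below stubs each stage with assumed placeholder logic (an allowlist, a constant stance, majority voting) purely to show how the stages fit together; none of it reflects a real system's implementation.

```python
# Skeleton of the four-step fact-checking process summarized above.
# Every function body here is a placeholder assumption.
from typing import List, Tuple

def is_reliable(source: str) -> bool:
    # (i) Source reliability check, stubbed with an assumed allowlist.
    return source in {"wikipedia.org"}

def retrieve(claim: str, sources: List[str]) -> List[Tuple[str, str]]:
    # (ii) Retrieve candidate evidence documents from reliable sources only.
    return [(s, f"document about: {claim}") for s in sources if is_reliable(s)]

def stance(claim: str, document: str) -> str:
    # (iii) Predict each document's stance; stubbed to always agree.
    return "agree"

def verdict(stances: List[str]) -> str:
    # (iv) Aggregate stances into a final label, here by majority vote.
    return "SUPPORTED" if stances.count("agree") > len(stances) / 2 else "REFUTED"

claim = "FEVER has 185,445 claims"
docs = retrieve(claim, ["wikipedia.org", "example.com"])
print(verdict([stance(claim, d) for _, d in docs]))
```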