Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.669

CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

Abstract: We present CODEX, a set of knowledge graph COmpletion Datasets EXtracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CODEX comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false. To characterize CODEX, we contribute thorough empirical analyses and benchmarking…

Cited by 59 publications (47 citation statements)
References 48 publications
“…By contrast, ComplEx is trained with binary cross-entropy loss, the same loss that we use to calibrate models in the validation stage (Kadlec et al., 2017; Safavi and Koutra, 2020). We conclude that for relation prediction under the CWA, vector scaling provides the best trade-off between calibration, accuracy, and efficiency, as it consistently improves accuracy and calibration with only O(k) extra parameters.…”
Section: Results
Confidence: 79%
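The vector scaling referenced in the statement above can be sketched in a few lines: a per-class scale and bias (O(k) parameters for k classes) are fit on held-out logits to minimize cross-entropy. This is a minimal NumPy sketch under assumed inputs, not the cited work's implementation; the function names and the toy data are hypothetical.

```python
import numpy as np

def softmax(z):
    # numerically stable row-wise softmax
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_vector_scaling(logits, labels, lr=0.05, steps=500):
    """Learn a per-class scale w and bias b (O(k) parameters) by
    full-batch gradient descent on the cross-entropy of
    softmax(w * logits + b) over held-out validation data."""
    n, k = logits.shape
    w = np.ones(k)
    b = np.zeros(k)
    onehot = np.eye(k)[labels]
    for _ in range(steps):
        p = softmax(logits * w + b)
        grad = (p - onehot) / n              # gradient w.r.t. calibrated logits
        w -= lr * (grad * logits).sum(axis=0)
        b -= lr * grad.sum(axis=0)
    return w, b

# toy usage: synthetic, overly peaked logits for a 3-class problem
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=200)
logits = 5.0 * np.eye(3)[labels] + rng.normal(size=(200, 3))
w, b = fit_vector_scaling(logits, labels)
```

Because the calibrated logits stay linear in the original scores, accuracy can improve along with calibration, which is the trade-off the quoted statement highlights.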
“…By contrast, at evaluation time we make an open-world assumption. Higher-quality validation negatives may alleviate this problem; indeed, recent works have raised this issue and constructed new datasets toward this direction, albeit for the task of triple classification (Pezeshkpour et al., 2020; Safavi and Koutra, 2020).…”
Section: Results
Confidence: 99%