2021
DOI: 10.1145/3484828

Assessing the Quality of Sources in Wikidata Across Languages: A Hybrid Approach

Abstract: Wikidata is one of the most important sources of structured data on the web, built by a worldwide community of volunteers. As a secondary source, its contents must be backed by credible references; this is particularly important, as Wikidata explicitly encourages editors to add claims for which there is no broad consensus, as long as they are corroborated by references. Nevertheless, despite this essential link between content and references, Wikidata's ability to systematically assess and assure the quality o…

Citations: cited by 11 publications (12 citation statements)
References: 53 publications (46 reference statements)
“…Wikidata is a secondary source for data and so, though it is rapidly growing, it will never be complete. This means that some level of inconsistency and incompleteness in its contents is currently inevitable [8, 29-31]. There is thorough coverage of some items, such as protein classes [32], human genes [14], cell types [26, 33], and metabolic pathways [34].…”
Section: Tip 10: Mind the Gaps: What Data Is Currently Missing? (mentioning)
confidence: 99%
“…While reviewing the literature, it was found that the majority of existing research focuses on either technological aspects or ontological aspects of the platform. More specifically, review of the literature illustrated that researchers use Wikidata to conduct new types of research (Amaral et al, 2021; Colla et al, 2021; Ferradji & Benchikha, 2021; Good et al, 2016; Kaffee, 2016; Konieczny & Klein, 2018; Lemus-Rojas & Odell, 2018; Li et al, 2022; Meier, 2022; Mietchen et al, 2015; Morshed, 2021; Neelam et al, 2022; Rasberry & Mietchen, 2021; Shenoy et al, 2022; Taveekarn et al, 2019; Waagmeester et al, 2020, 2021; Zhang et al, 2022). Researchers also use Wikidata to conduct new types of academic analysis in a variety of disciplines (Arnaout et al, 2021; Burgstaller-Muehlbacher et al, 2016; Kaffee et al, 2017; Klein et al, 2016; Lemus-Rojas, n.d.; Pfundner et al, 2015; Putman et al, 2017; Rutz et al, 2021; Scharpf et al, 2021a, b; Turki et al, 2019, 2022a.…”
Section: Wikidata Users and Early Adopters (mentioning)
confidence: 99%
“…The classification task explores single-class (binary) and a novel multi-class (stacking) methods. Several stream-based binary classification algorithms were selected according to performance in similar problems [17, 30, 31] and availability in scikit-multiflow, the adopted ML package for streaming data. They include single and ensemble methods:…”
Section: Classification (mentioning)
confidence: 99%