Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021
DOI: 10.1145/3404835.3463253

Wiki-Reliability: A Large Scale Dataset for Content Reliability on Wikipedia

Abstract: Wikipedia is the largest online encyclopedia, used by algorithms and web users as a central hub of reliable information on the web. The quality and reliability of Wikipedia content is maintained by a community of volunteer editors. Machine learning and information retrieval algorithms could help scale up editors' manual efforts around Wikipedia content reliability. However, there is a lack of large-scale data to support the development of such research. To fill this gap, in this paper, we propose Wiki-Reliabil…

Citations: cited by 5 publications (4 citation statements)
References: 11 publications (16 reference statements)
“…Previous studies on quality estimation in Wikipedia have mainly focused on the article-level (Mola-Velasco 2011; Bykau et al 2015; Wong, Redi, and Saez-Trumper 2021; Asthana et al 2021). They are aimed at estimating the quality of revisions and articles.…”
Section: Related Work (mentioning)
confidence: 99%
“…According to the literature, the classification of wiki edits encompasses the detection of paid [20], puffery [21], reverted [22], [23], [24], toxic [25], [26] and vandal [9], [10], [12], [13], [17], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38] reviews. Similarly, prediction focuses on review quality [39], [40], [41] as well as on editor and article quality [18], [42], [43], [44], [45].…”
Section: B Analysis Of Reviews (mentioning)
confidence: 99%
“…In the case of article drafts, ORES returns the probability of being spam, vandalism, an attack, and OK. These scores are used as input features by many of the surveyed works, e.g., [9], [34], [36], [37], [41], to classify reviews. Moreover, ORES is currently used on wiki platforms to help volunteers reduce the burden of manually screening content.…”
Section: Vandalism Detection (mentioning)
confidence: 99%