2019
DOI: 10.1002/asi.24210
Assessing the quality of information on Wikipedia: A deep‐learning approach

Abstract: Currently, web document repositories have been collaboratively created and edited. One of these repositories, Wikipedia, is facing an important problem: assessing the quality of Wikipedia. Existing approaches exploit techniques such as statistical models or machine learning algorithms to assess Wikipedia article quality. However, existing models do not provide satisfactory results. Furthermore, these models fail to adopt a comprehensive feature framework. In this article, we conduct an extensive survey of previ…

Cited by 22 publications (14 citation statements: 0 supporting, 14 mentioning, 0 contrasting).
References 54 publications.

“…Contrary to what we expect, the CNN performs the worst. In most cases, the CNN has high performance in learning relevant features and ruling out irrelevant features [54,62]. Moreover, after comparison of basic LSTM and CNN-LSTM, we find that the CNN degrades the model performance.…”
Section: Methods (mentioning)
confidence: 97%
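
The comparison quoted above hinges on how a CNN-LSTM differs from a basic LSTM. Below is a minimal sketch of the two architectures in Keras, not the cited paper's actual code; the vocabulary size, layer widths, and the six-class output (Wikipedia's FA/GA/B/C/Start/Stub quality scale) are illustrative assumptions.

```python
# Minimal sketch (not the cited paper's code): a plain LSTM versus a CNN-LSTM
# for article-text classification. All sizes below are illustrative assumptions.
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000  # assumed vocabulary size
NUM_CLASSES = 6     # e.g., Wikipedia's FA, GA, B, C, Start, Stub classes

def build_lstm():
    # Baseline: embeddings feed directly into the recurrent layer.
    return models.Sequential([
        layers.Embedding(VOCAB_SIZE, 128),
        layers.LSTM(64),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

def build_cnn_lstm():
    # Variant: a 1-D convolution plus pooling extracts local n-gram features
    # and shortens the sequence before the LSTM models longer-range order.
    return models.Sequential([
        layers.Embedding(VOCAB_SIZE, 128),
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(pool_size=4),
        layers.LSTM(64),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

for model in (build_lstm(), build_cnn_lstm()):
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
```

The quoted finding is that the convolutional front end, despite usually helping with feature selection, can discard order information the LSTM needs, so the plain LSTM wins.
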
“…Few studies have analysed and summarised the existing work. In this section, we perform an extensive review of the existing feature frameworks [1,2,5,6,12,23–27,35,42,44,47–54] and propose a comprehensive feature framework as a representation of Wikipedia articles. Text statistics are indicators that measure basic article statistics [1,23], including word count and character count.…”
Section: Representation of Wikipedia Articles (mentioning)
confidence: 99%
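
The "text statistics" feature family quoted above is simple to compute. The sketch below shows one plausible helper; the exact feature set and this function are illustrative assumptions, not the surveyed papers' code.

```python
# Minimal sketch of count-based "text statistics" features for one article.
# The feature set here is an assumption for illustration only.
import re

def text_statistics(article_text: str) -> dict:
    words = re.findall(r"\w+", article_text)
    sentences = [s for s in re.split(r"[.!?]+", article_text) if s.strip()]
    return {
        "char_count": len(article_text),
        "word_count": len(words),
        "sentence_count": len(sentences),
        # Average word length, guarded against empty articles.
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
    }

print(text_statistics("Wikipedia is a free online encyclopedia. Anyone can edit it."))
```
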
“…Importantly, such mapping should consider disciplinary differences in citations from Wikipedia, as well as books (5.3 million citations by our estimates) and nonscientific sources such as news outlets and other online media (21.5 million citations), which make up the largest share of Wikipedia citations. Answering these questions is critical to inform the community work on improving Wikipedia by finding and filling knowledge gaps and biases, at the same time guaranteeing the quality and diversity of the sources Wikipedia relies upon (Hube, 2017; Mesgari et al., 2015; Piscopo, Kaffee et al., 2017; Piscopo & Simperl, 2019; Wang & Li, 2020). Link prediction in general, and citation recommendation in particular, have been explored for Wikipedia for some time (Fetahu et al., 2016; Paranjape, West et al., 2016; Wulczyn, West et al., 2016).…”
Section: Map of Wikipedia Sources (mentioning)
confidence: 99%
“…To solve that problem, machine learning algorithms have been applied to Wikipedia article quality assessment. They combined machine learning models with hand‐crafted features to assess the quality of Wikipedia articles (Shen et al., 2017; Zhang et al., 2018; Ferschke et al., 2012; Khairova et al., 2017; Wang & Li, 2020; Wang et al., 2019). However, semantic features from the article content are often ignored in these approaches.…”
Section: Related Work (mentioning)
confidence: 99%
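
The pipeline this passage describes, hand-crafted features fed to a conventional classifier, can be sketched as below. The toy data, feature choice, and random-forest model are illustrative assumptions, not a reconstruction of any single cited paper.

```python
# Minimal sketch of quality assessment from hand-crafted features.
# Toy data and model choice are assumptions for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Toy feature matrix: one row per article, columns standing in for hand-crafted
# features such as word count, reference count, image count, section count.
X = rng.random((200, 4))
# Toy labels: quality class index (e.g., 0 = Stub ... 5 = FA).
y = rng.integers(0, 6, size=200)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("mean CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```

As the quoted passage notes, such a pipeline sees only surface counts; semantic features of the article text are what the deep-learning approach under review is meant to add.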