2019
DOI: 10.1177/2053951719843310

Big Data and quality data for fake news and misinformation detection

Abstract: Fake news has become an important topic of research in a variety of disciplines including linguistics and computer science. In this paper, we explain how the problem is approached from the perspective of natural language processing, with the goal of building a system to automatically detect misinformation in news. The main challenge in this line of research is collecting quality data, i.e., instances of fake and real news articles on a balanced distribution of topics. We review available datasets and introduce…

Cited by 83 publications (59 citation statements)
References 37 publications
“…Algorithms could be analysed through laboratory testing and reverse engineering (Diakopoulos, 2015), which means reconstructing the algorithm to identify its functional principles. Additionally, more cooperation with independent researchers would be helpful in order to research the dissemination of misinformation (Lazer et al., 2018; Torabi Asr & Taboada, 2019).…”
Section: Concepts of Digital Media Ethics and Responsibility (mentioning)
confidence: 99%
“…The size of the dataset plays an important role in ensuring high accuracy in the fake news detection process. In particular, if the dataset is used to train a fake news detection method based on machine learning, it is fundamental to have a large dataset, because the performance of this kind of method improves as the training dataset size increases (Torabi & Taboada, 2019). The drawback is that manual annotation of very large datasets is less reliable, owing to the time it consumes and to misclassification (Ghiassi & Lee, 2018).…”
Section: Survey Methodology (mentioning)
confidence: 99%
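The scaling claim in the statement above lends itself to a quick learning-curve check. The sketch below is not from the cited papers; the file news.csv and its text/label columns are hypothetical stand-ins for any labelled corpus of fake and real articles.

```python
# Minimal learning-curve sketch: does validation accuracy grow with the
# number of training articles? Assumes a hypothetical "news.csv" with
# columns "text" (article body) and "label" (0 = real, 1 = fake).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline

df = pd.read_csv("news.csv")  # hypothetical labelled dataset

model = make_pipeline(
    TfidfVectorizer(max_features=50_000, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)

# Cross-validated accuracy at increasing fractions of the training data.
sizes, _, val_scores = learning_curve(
    model, df["text"], df["label"],
    train_sizes=[0.1, 0.25, 0.5, 0.75, 1.0],
    cv=5, scoring="accuracy",
)

for n, score in zip(sizes, val_scores.mean(axis=1)):
    print(f"{n:>6} training articles -> accuracy {score:.3f}")
```

If the citing survey's point holds for the corpus at hand, the printed accuracies should rise, with diminishing returns, as the training size grows.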
“…A big dataset is fundamental for achieving a highly accurate fake news detection process, especially for detection methods based on deep neural network models, whose performance improves as the training dataset size increases. Torabi & Taboada (2019) discussed the necessity of using big data for fake news detection and encouraged researchers in this field to share their datasets and to work together towards a standardized, large-scale fake news benchmark dataset.…”
Section: Survey Methodology (mentioning)
confidence: 99%
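As a hedged illustration of the dataset-sharing point above, the sketch below merges several labelled corpora into one balanced binary-labelled collection, the kind of normalization a shared benchmark would need. Every file name, column name, and label vocabulary here is an assumption, not part of any published benchmark.

```python
# Merge hypothetical fake-news corpora into one corpus with a shared
# binary label, then balance the classes by downsampling the majority.
import pandas as pd

SOURCES = {
    "dataset_a.csv": {"fake": 1, "real": 0},   # hypothetical label scheme
    "dataset_b.csv": {"false": 1, "true": 0},  # hypothetical label scheme
}

frames = []
for path, label_map in SOURCES.items():
    df = pd.read_csv(path)  # assumes "text" and "label" columns
    df["label"] = df["label"].str.lower().map(label_map)
    frames.append(df.dropna(subset=["label"])[["text", "label"]])

corpus = pd.concat(frames, ignore_index=True)

# Downsample so fake and real articles appear in equal numbers.
n = int(corpus["label"].value_counts().min())
balanced = corpus.groupby("label").sample(n=n, random_state=0)
print(balanced["label"].value_counts())
```

Balancing by downsampling is the simplest choice; class weights or stratified sampling over topics would preserve more data at the cost of a more involved pipeline.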
“…FakeNewsNet contains two comprehensive datasets with diverse features covering news content, social context, and spatiotemporal information. Asr et al. [33] reviewed the available misinformation detection datasets and introduced the "MisInfoText" repository to address the lack of datasets with reliable labels. The MisInfoText repository contains three data categories: links to all publicly available textual fake news datasets, tools to collect data directly from fact-checking websites, and datasets originally published in [30]. In summary, many existing works focus on building misleading-information detection systems.…”
Section: Related Work (mentioning)
confidence: 99%
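One of the three MisInfoText categories named above is tooling for collecting data directly from fact-checking websites. The sketch below shows what such a collection step can look like; the URL and CSS selectors are hypothetical placeholders, since each fact-checking site has its own page structure (and its own terms of use and rate limits).

```python
# Sketch of pulling claim, verdict, and body text from one fact-checking
# page. The URL and the ".claim"/".verdict" selectors are hypothetical.
import requests
from bs4 import BeautifulSoup

def text_of(soup: BeautifulSoup, selector: str) -> str | None:
    """Stripped text of the first node matching selector, if present."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node else None

def fetch_fact_check(url: str) -> dict:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return {
        "claim": text_of(soup, ".claim"),      # hypothetical selector
        "verdict": text_of(soup, ".verdict"),  # hypothetical selector
        "body": " ".join(p.get_text(strip=True) for p in soup.select("article p")),
    }

record = fetch_fact_check("https://example.org/fact-check/some-claim")  # placeholder URL
print(record)
```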