2017
DOI: 10.1002/spy2.9

Detecting opinion spams and fake news using text classification

Abstract: In recent years, deceptive content such as fake news and fake reviews, also known as opinion spams, have increasingly become a dangerous prospect for online users. Fake reviews have affected consumers and stores alike. Furthermore, the problem of fake news has gained attention in 2016, especially in the aftermath of the last U.S. presidential elections. Fake reviews and fake news are a closely related phenomenon as both consist of writing and spreading false information or beliefs. The opinion spam problem was…

Cited by 322 publications (166 citation statements)
References 20 publications
“…User reviews are usually short text, and fake review detection is a binary classification problem [7]. The goal of this task is to determine whether a review is a fake review.…”
Section: Related Work a Identify Fake Reviews From The Perspecti… (citation type: mentioning)
confidence: 99%
“…The LIAR dataset consists of 12,836 manually labeled short statements from politifact.com ranked as barely true, false, half true, mostly true, or pants on fire [52]. Other well known datasets includes the ISOT dataset, which consists of 21,417 real news articles and 23,481 “fake news” articles [53, 54] and a dataset with 1000 news articles, evenly split between fake and legitimate news [55]. The presence of multiple datasets with misinformation content in multiple formats is useful, since the types of “fake news” experienced amidst the COVID-19 infodemic are broad and span multiple formats.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Some researchers evaluated how different feature extraction methods affect the results. Ahmed et al [3] compared 2 different features extraction techniques namely, term frequency (TF) and term frequency-inverted document frequency (TF-IDF) and 6 n-gram machine learning classification models including SGD, SVM, LSVM, LR, KNN, and DT on two datasets. They saw that an increase in the n-gram size would cause a decrease in the accuracy.…”
Section: Literature Review (citation type: mentioning)
confidence: 99%
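
To make the two feature types named in the last statement concrete, here is a minimal sketch of TF and TF-IDF weighting over word n-grams. It is an illustration under stated assumptions, not the cited authors' pipeline: the function names, the smoothed IDF formula, and the two toy reviews are all hypothetical, and no classifier is trained.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Split text on whitespace and return its word n-grams as strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_vectors(docs, n):
    """TF: relative frequency of each n-gram within its own document."""
    vectors = []
    for doc in docs:
        grams = word_ngrams(doc, n)
        total = len(grams) or 1  # avoid division by zero on short texts
        vectors.append({g: c / total for g, c in Counter(grams).items()})
    return vectors


def tfidf_vectors(docs, n):
    """TF-IDF: TF scaled by a smoothed inverse document frequency.

    Assumed IDF variant: log((1 + N) / (1 + df)) + 1, chosen for illustration.
    """
    tf = tf_vectors(docs, n)
    df = Counter(g for vec in tf for g in vec)  # documents containing each n-gram
    n_docs = len(docs)
    idf = {g: math.log((1 + n_docs) / (1 + d)) + 1.0 for g, d in df.items()}
    return [{g: w * idf[g] for g, w in vec.items()} for vec in tf]


# Hypothetical toy reviews, invented for this example only.
docs = [
    "great hotel great staff",
    "terrible hotel rude staff",
]

for n in (1, 2):
    tf = tf_vectors(docs, n)
    tfidf = tfidf_vectors(docs, n)
    vocab = {g for vec in tf for g in vec}
    shared = set(tf[0]) & set(tf[1])
    print(f"n={n}: vocabulary={len(vocab)}, n-grams shared by both docs={len(shared)}")
```

On this toy input the bigram vocabulary is larger than the unigram one and the two documents no longer share any features. That kind of sparsity is one plausible factor when accuracy changes with n-gram size, though the sketch does not test the claim in the statement above.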