Detecting opinion spams and fake news using text classification

Ahmed, Hadeer; Traoré, Issa; Saad, Sherif

doi:10.1002/spy2.9

Cited by 322 publications

(166 citation statements)

References 20 publications

Supporting

Mentioning

161

Contrasting

Unclassified

“…User reviews are usually short text, and fake review detection is a binary classification problem [7]. The goal of this task is to determine whether a review is a fake review.…”

Section: Related Work a Identify Fake Reviews From The Perspecti… (mentioning)

confidence: 99%

Fake Review Detection Based on Multiple Feature Fusion and Rolling Collaborative Training

et al. 2020

Fake reviews may mislead consumers, and a large number of them can even cause huge property losses and public opinion crises. It is therefore necessary to detect and filter fake reviews. However, most existing methods have lower accuracy in detecting fake reviews because they use only single features and lack labeled experimental data. To solve this problem, we propose a novel method to detect fake reviews based on multiple feature fusion and rolling collaborative training. First, the method requires an initial index system with multiple features, such as text features, sentiment features of reviews, and behavior features of reviewers. Second, the method needs an initial training sample set. Thus, we designed related algorithms to extract all the features of a review; the classification of the review is then labeled manually. Finally, the method uses the initial sample set to train 7 classifiers, and the most accurate classifier is selected to classify new reviews. The novelty of the method lies in that the features and the classification labels of the new reviews are added to the initial sample set as new samples, so the size of the sample set increases automatically. Experimental results on reviews from the Yelp shopping website show that the accuracy of the proposed method for detecting fake reviews is 84.45%, which is 3.5% higher than the baseline methods. Compared with the latest deep learning model, its baseline precision has increased by 5.3%. According to the Friedman test, the support vector machine (SVM) and random forest (RF) classifiers have been shown by statistical means to be the best. This means that our method, which uses multiple features, has higher accuracy than the baseline models. Meanwhile, it also resolves the problem of lacking labeled training samples in fake review detection.

INDEX TERMS: Fake review detection, machine learning, multiple feature fusion, feature extraction, rolling collaborative training
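The rolling idea described in this abstract (train several classifiers, keep the most accurate one, label new reviews with it, and fold those reviews back into the sample set) can be illustrated with a minimal sketch. This is not the authors' implementation: the two toy classifiers, the two-dimensional feature vectors, the validation split, and every name below (`CentroidClassifier`, `NearestNeighbor`, `rolling_train`) are hypothetical stand-ins, chosen so that the example runs with the Python standard library only.

```python
from math import dist


class CentroidClassifier:
    """Toy classifier: predicts the label of the nearest class centroid."""

    def fit(self, X, y):
        groups = {}
        for features, label in zip(X, y):
            groups.setdefault(label, []).append(features)
        self.centroids = {
            label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in groups.items()
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda label: dist(x, self.centroids[label]))


class NearestNeighbor:
    """Toy classifier: predicts the label of the single closest training point."""

    def fit(self, X, y):
        self.points = list(zip(X, y))  # snapshot of the current sample set
        return self

    def predict(self, x):
        return min(self.points, key=lambda p: dist(x, p[0]))[1]


def accuracy(model, X, y):
    return sum(model.predict(x) == label for x, label in zip(X, y)) / len(y)


def rolling_train(candidates, X, y, X_val, y_val, new_batches):
    """Each round: refit every candidate, keep the most accurate one on the
    validation set, label the next batch with it, and grow the sample set."""
    X, y = list(X), list(y)
    best = None
    for batch in new_batches:
        fitted = [cls().fit(X, y) for cls in candidates]
        best = max(fitted, key=lambda m: accuracy(m, X_val, y_val))
        for x in batch:
            X.append(x)
            y.append(best.predict(x))  # predicted label becomes a new sample
    return best, X, y


# Hypothetical feature vectors (for example a sentiment score and a
# reviewer-behavior score); the values are invented for illustration.
X_seed = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
y_seed = ["genuine", "genuine", "fake", "fake"]
X_val = [(0.0, 0.0), (1.0, 1.0)]
y_val = ["genuine", "fake"]
batches = [[(0.15, 0.15), (0.85, 0.85)], [(0.3, 0.2)]]

model, X_all, y_all = rolling_train(
    [CentroidClassifier, NearestNeighbor], X_seed, y_seed, X_val, y_val, batches
)
print(len(X_all), y_all[4:])
```

In this sketch the sample set grows from four to seven items without further manual labeling. A practical system would presumably also need to guard against mislabeled samples accumulating over rounds, which the sketch does not attempt.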

“…The LIAR dataset consists of 12,836 manually labeled short statements from politifact.com ranked as barely true, false, half true, mostly true, or pants on fire [52]. Other well known datasets includes the ISOT dataset, which consists of 21,417 real news articles and 23,481 “fake news” articles [53, 54] and a dataset with 1000 news articles, evenly split between fake and legitimate news [55]. The presence of multiple datasets with misinformation content in multiple formats is useful, since the types of “fake news” experienced amidst the COVID-19 infodemic are broad and span multiple formats.…”

Section: Introduction (mentioning)

confidence: 99%

CoVerifi: A COVID-19 news verification system

Kolluri, Murthy 2021

Online Social Networks and Media

“…Some researchers evaluated how different feature extraction methods affect the results. Ahmed et al [3] compared 2 different features extraction techniques namely, term frequency (TF) and term frequency-inverted document frequency (TF-IDF) and 6 n-gram machine learning classification models including SGD, SVM, LSVM, LR, KNN, and DT on two datasets. They saw that an increase in the n-gram size would cause a decrease in the accuracy.…”

Section: Literature Review (mentioning)

confidence: 99%

A Novel Stacking Approach for Accurate Detection of Fake News

et al. 2021

With the increasing popularity of social media, people have changed the way they access news. Online news has become the major source of information for people. However, much information appearing on the Internet is dubious and even intended to mislead. Some fake news is so similar to real news that it is difficult for humans to identify it. Therefore, automated fake news detection tools such as machine learning and deep learning models have become an essential requirement. In this paper, we evaluated the performance of five machine learning models and three deep learning models on two fake and real news datasets of different sizes with hold-out cross-validation. We also used term frequency, term frequency-inverse document frequency, and embedding techniques to obtain text representations for machine learning and deep learning models respectively. To evaluate the models' performance, we used accuracy, precision, recall, and F1-score as the evaluation metrics and a corrected version of McNemar's test to determine whether the models' performance is significantly different. Then, we proposed our novel stacking model, which achieved testing accuracy of 99.94% and 96.05% respectively on the ISOT dataset and KDnugget dataset. Furthermore, the performance of our proposed method is high compared to baseline methods. Thus, we highly recommend it for fake news detection.
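The citation statement above contrasts term frequency (TF) with term frequency-inverse document frequency (TF-IDF) over word n-grams. A minimal sketch of the two weightings follows, using only the Python standard library. The tokenizer (lowercasing and whitespace splitting), the unsmoothed IDF formula log(N / df), the toy corpus, and the function names are all assumptions made for illustration; they are not taken from Ahmed et al. or from the citing paper, and libraries such as scikit-learn apply different smoothing and normalization by default.

```python
from collections import Counter
from math import log


def ngrams(text, n):
    """Word n-grams from a lowercased, whitespace-tokenized string."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf(text, n=1):
    """Relative frequency of each n-gram within one document."""
    counts = Counter(ngrams(text, n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def tf_idf(corpus, n=1):
    """TF weighted by log(N / document frequency), one dict per document."""
    doc_tfs = [tf(doc, n) for doc in corpus]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(gram for doc_tf in doc_tfs for gram in doc_tf)
    num_docs = len(corpus)
    return [
        {gram: weight * log(num_docs / df[gram]) for gram, weight in doc_tf.items()}
        for doc_tf in doc_tfs
    ]


# Invented two-document corpus, for illustration only.
corpus = [
    "great product great price",
    "fake product avoid",
]
unigram_tf = tf(corpus[0])
unigram_tfidf = tf_idf(corpus)
bigram_tf = tf(corpus[0], n=2)
```

With this corpus, `product` occurs in both documents, so its TF-IDF weight is zero while its TF weight is not. Raising `n` also makes each n-gram rarer, which spreads the weight over more, sparser features.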
