2023
DOI: 10.17485/ijst/v16i29.1500

A Novel Hybrid Feature Extraction Technique and Spam Review Detection using Ensemble Machine Learning Algorithm by Web Scrapping

Abstract: Objectives: To develop a novel hybrid method for feature generation and a novel dataset for experimenting and extracting the features for numerical representation. Methods: In the pursuit of the best spam review detection model, a four-stage process was undertaken. Initially, a dataset 'Fake reviews' was collected from Flipkart, containing 9926 samples from the home and kitchen products domain. Next, the data underwent pre-processing using the Natural Language Toolkit (NLTK) library. A novel Hybrid Feature Gen…
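The pre-processing stage mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline and does not call NLTK; it only shows the kind of cleaning such a step typically performs (lowercasing, tokenising, stop-word removal). The function name `clean_review` and the tiny `STOP_WORDS` set are assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch;
# a library such as NLTK provides a much larger one).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "to", "of", "i"}


def clean_review(text: str) -> List[str]:
    """Lowercase the review, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


print(clean_review("This is the BEST mixer, I love it!!!"))
# prints ['best', 'mixer', 'love']
```

The cleaned token lists would then be passed to a feature-extraction step; how the paper's hybrid feature method does this is not recoverable from the truncated abstract.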

Citations: cited by 1 publication (1 citation statement)
References: 13 publications
“…Goyal et al. used three supervised machine learning methods: Gaussian Naïve Bayes (GNB), Multinomial Naïve Bayes, and Bernoulli Naïve Bayes to detect fake reviews. The GNB classifier outperforms other models in terms of accuracy and F1-score metric, as well as identifying deceptive reviews (7). The authors used the NLTK library to clean up the review data, which is a predefined functionality, so the pre-processing methodology is not novel.…”
Section: Introduction (mentioning)
Confidence: 94%