Comparison of Machine Learning and Sentiment Analysis in Detection of Suspicious Online Reviewers on Different Type of Data

Machová, Kristína; Mach, Marián; Vasilko, Matej

doi:10.3390/s22010155

Cited by 20 publications

(15 citation statements)

References 25 publications

(29 reference statements)

“…For collecting datasets for this research, we used publicly available troll online reviewer dataset developed and created by Machova et al [46]. This dataset have been collected from Reddit platform, and it concerned with online political discussion.…”

Section: Methods (mentioning)

confidence: 99%

Online Troll Reviewer Detection Using Deep Learning Techniques

Al-Adhaileh

Aldhyani

Alghamdi

2022

Applied Bionics and Biomechanics

This paper focuses on detecting trolls among reviewers and users in online discussions and link distribution on social news aggregators such as Reddit. Trolls, a subset of suspicious reviewers, are the focus of our attention. A troll reviewer is distinguished from an ordinary reviewer by using sentiment analysis and deep learning techniques to identify the sentiment of their posts. Machine learning and lexicon-based approaches can also be used for sentiment analysis. The novelty of the proposed system is that it applies a convolutional neural network integrated with a bidirectional long short-term memory (CNN-BiLSTM) model to detect troll reviewers in online discussions using a standard troll online reviewer dataset collected from the Reddit social media platform. We carried out two experiments: the first was based on text data (sentiment analysis), and the second was based on numerical data (10 attributes) extracted from the dataset. The CNN-BiLSTM model achieved 97% accuracy on text data and 100% accuracy on numerical data. In our analysis, the model outperformed the compared methods.

“…Based on our previous works [9, 13] we focused on the most successful methods of deep learning, namely convolutional as well as recurrent networks.…”

Section: Methods (mentioning)

confidence: 99%

“…There are two different approaches to detecting disinformation. The first is using machine learning methods to train models for the identification of authors of disinformation [9] or to focus on toxicity in texts of conversational content, such as in the article [10] which analyzes hate speech using a web interface with focus on the most popular social networks such as Twitter, YouTube, and Facebook. Then, it is important to decide whether to use strong methods of deep learning or to use ensemble learning, which can work effectively even with weak classifiers.…”

Section: Introduction (mentioning)

confidence: 99%

Deep Learning in the Detection of Disinformation about COVID-19 in Online Space

Machová

Mach

Porezaný

2022

Sensors

Self Cite

This article focuses on the problem of detecting disinformation about COVID-19 in online discussions. As the Internet expands, so does the amount of content on it. In addition to content based on facts, a large amount of content is being manipulated, which negatively affects the whole society. This effect is currently compounded by the ongoing COVID-19 pandemic, which caused people to spend even more time online and to get more invested in this fake content. This work gives a brief overview of what toxic information looks like, how it is spread, and how to potentially prevent its dissemination by early recognition of disinformation using deep learning. We investigated the overall suitability of deep learning for detecting disinformation in conversational content. We also compared architectures based on convolutional and recurrent principles. We trained three detection models based on three architectures using CNN (convolutional neural networks), LSTM (long short-term memory), and their combination. We achieved the best results using LSTM (F1 = 0.8741, Accuracy = 0.8628). However, the results of all three architectures were comparable; for example, the CNN+LSTM architecture achieved F1 = 0.8672 and Accuracy = 0.852. The paper finds that introducing a convolutional component does not bring significant improvement. In comparison with our previous works, we noted that of all forms of antisocial posts, disinformation is the most difficult to recognize, since disinformation has no distinctive language, unlike hate speech, toxic posts, etc.

“…The C4.5 algorithm is used in our work. The main limitation of decision trees is that they are prone to overfitting by creating overcomplicated models that view the feature of the training set as all data characteristics [43]. The random forest can avoid this problem.…”

Section: Other DA Techniques and Machine Learning Algorithms (mentioning)

confidence: 99%

A Novel Data Augmentation Method for Improving the Accuracy of Insulator Health Diagnosis

Song et al.

2022

Sensors

Performing ultrasonic nondestructive testing experiments on insulators and then using machine learning algorithms to classify and identify the signals is an important way to achieve an intelligent diagnosis of insulators. However, in most cases, we can obtain only a limited number of data from the experiments, which is insufficient to meet the requirements for training an effective classification and recognition model. In this paper, we start with an existing data augmentation method called DBA (for dynamic time warping barycenter averaging) and propose a new data enhancement method called AWDBA (adaptive weighting DBA). We first validated the proposed method by synthesizing new data from insulator sample datasets. The results show that the AWDBA proposed in this study has significant advantages relative to DBA in terms of data enhancement. Then, we used AWDBA and two other data augmentation methods to synthetically generate new data on the original dataset of insulators. Moreover, we compared the performance of different machine learning algorithms for insulator health diagnosis on the dataset with and without data augmentation. In the SVM algorithm especially, we propose a new parameter optimization method based on GA (genetic algorithm). The final results show that the use of the data augmentation method can significantly improve the accuracy of insulator defect identification.
