2017
DOI: 10.1002/cpe.4281

A comparative study of the class imbalance problem in Twitter spam detection

Abstract: Recently, online social networks (OSNs) such as Twitter have become an important and popular source for real-time information and news dissemination, and Twitter is inevitably a prime target of spammers. It has been shown that the security threats caused by Twitter spam can reach far beyond the social media platform itself. To mitigate the damage caused by Twitter spam, researchers and communities have employed machine learning classification algorithms to detect it. However, most of these …

Cited by 41 publications (20 citation statements)
References 40 publications (104 reference statements)

“…Most of the work [17][18][19][20][21][22][23] done in removing class imbalance from data streams is based on ensemble-based techniques. Oversampling based Online Bagging (OOB) and Undersampling based Online Bagging (UOB) [17] can solve the online class imbalance problem for a two-class dataset and are capable of monitoring changes in the class imbalance and adapting to the changing class distribution without outside intervention.…”
Section: Class Imbalance (mentioning)
confidence: 99%
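
The resampling idea behind OOB and UOB in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the class name `ImbalanceTracker`, the decay factor of 0.9, the binary labels 0/1, and the exact rate rules are assumptions made for the example. Each class size is tracked with a time-decayed average, and the Poisson rate used by online bagging is raised for the minority class (oversampling) or lowered for the majority class (undersampling).

```python
import math
import random


class ImbalanceTracker:
    """Tracks time-decayed class proportions for a binary stream (labels 0 and 1).

    Hypothetical helper for illustration; the names and decay value are assumptions.
    """

    def __init__(self, decay=0.9):
        self.decay = decay
        self.size = {0: 0.0, 1: 0.0}

    def update(self, label):
        # Exponentially decay old evidence so recent class frequencies dominate.
        for c in self.size:
            hit = 1.0 if c == label else 0.0
            self.size[c] = self.decay * self.size[c] + (1 - self.decay) * hit

    def oob_rate(self, label):
        # Oversampling: boost the rate only when this label is currently the minority.
        mine, theirs = self.size[label], self.size[1 - label]
        if 0 < mine < theirs:
            return theirs / mine
        return 1.0

    def uob_rate(self, label):
        # Undersampling: shrink the rate only when this label is currently the majority.
        mine, theirs = self.size[label], self.size[1 - label]
        if mine > theirs > 0:
            return theirs / mine
        return 1.0


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method (fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


# Simulate a stream with roughly 10% positives and inspect the rates.
tracker = ImbalanceTracker()
for _ in range(20):
    for label in [0] * 9 + [1]:
        tracker.update(label)

rng = random.Random(0)
# Number of times one base learner would train on the next positive example.
repeats = poisson(tracker.oob_rate(1), rng)
```

In an online bagging ensemble, each base learner would draw its own `poisson(...)` count per incoming example and train on that example that many times. Because the rates are recomputed from the decayed class sizes, they follow a shifting class distribution without manual retuning.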
“…Chaoliang et al [19] have done work on the class imbalance problem in the context of Twitter spam detection. Due to class imbalance, the spam detectors have not been able to perform up to their potential.…”
Section: Class Imbalance (mentioning)
confidence: 99%
“…Features related to this phenomenon were utilised in training machine learning classifiers. Li and Liu [29] analysed how the effect of unbalance datasets can be mitigated in detection tasks.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Unbalanced datasets often affect the performance of detection systems [29], including ours. To mitigate that, we applied the SMOTE resampling technique [36] to balance the data by upscaling the minority class.…”
Section: Dataset (mentioning)
confidence: 99%
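
The SMOTE resampling mentioned in the statement above can be sketched in a few lines. This is a simplified illustration of the interpolation idea only, not the citing authors' pipeline or a library implementation: the function name `smote_like_oversample`, the parameter defaults, and the toy data are assumptions. Each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic minority points (hypothetical helper, SMOTE-style).

    minority: list of equal-length numeric tuples from the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(minority, n_new=6)
```

The synthetic points would be appended to the training set with the minority label until the classes are roughly balanced. In practice a maintained implementation would normally be preferred over a hand-rolled one.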
“…Ay Karakuş et al [7] introduced a binary classification study for the Turkish language on movie reviews. In addition, text categorization has multiple applications, such as web page classification [8,9], spam detection [10,11], author identification [12,13], and customer relationship management [14]. TC is such a task to categorize unlabelled natural language texts into a predefined set of thematic classes based on their content.…”
Section: Introduction (mentioning)
confidence: 99%