2020
DOI: 10.48550/arxiv.2009.01571
Preprint
MixBoost: Synthetic Oversampling with Boosted Mixup for Handling Extreme Imbalance

Abstract: Training a classification model on a dataset where the instances of one class outnumber those of the other class is a challenging problem. Such imbalanced datasets are standard in real-world situations such as fraud detection, medical diagnosis, and computational advertising. We propose an iterative data augmentation method, MixBoost, which intelligently selects (Boost) and then combines (Mix) instances from the majority and minority classes to generate synthetic hybrid instances that have characteristics of …
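The "Mix" step described in the abstract follows the standard Mixup interpolation of inputs and labels. A minimal sketch of that interpolation (the `mixup` helper and the `alpha=0.2` value are illustrative, not the paper's exact selection-and-mixing procedure):

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x_a, y_a, x_b, y_b, alpha=0.2):
    """Convex combination of two instances and their one-hot labels (standard Mixup)."""
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    x = lam * x_a + (1 - lam) * x_b       # hybrid input
    y = lam * y_a + (1 - lam) * y_b       # soft label
    return x, y

# Mix a majority-class instance with a minority-class one
x_maj, y_maj = np.array([1.0, 2.0]), np.array([1.0, 0.0])
x_min, y_min = np.array([3.0, 0.0]), np.array([0.0, 1.0])
x_new, y_new = mixup(x_maj, y_maj, x_min, y_min)
```

The resulting hybrid lies on the line segment between the two inputs, and its soft label still sums to one.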

Cited by 4 publications (4 citation statements)
References 22 publications
“…), as data augmentation methods, have not only achieved notable success in a wide range of machine learning problems such as supervised learning [8], semi-supervised learning [54,55], and adversarial learning [56], but have also been adapted to different data forms such as images [57], texts [58,59], graphs [60], and speech [61]. Notably, to alleviate the problem of class imbalance in the dataset, a series of methods [9,10,62] employ Mixup to augment the data. Despite this, there has not been any research on using Mixup to solve the class imbalance problem in hierarchical multi-label classification.…”
Section: Mixup (mentioning)
confidence: 99%
“…but also adapted to different data forms such as images [5], texts [35,38], graphs [32], and speech [44]. Notably, to alleviate the problem of class imbalance in the dataset, a series of methods [6,7,9,16,20] employ Mixup to augment the data. Despite this, there has not been any research on using MixUp to solve the class imbalance problem in hierarchical multi-label classification…”
Section: Mixup (mentioning)
confidence: 99%
“…ReMix [6] mixes the inputs while keeping the minority class label, instead of mixing the labels. Similarly, MixBoost [14] attempts to combine active learning with Mixup to select which training samples to mix from each category, adding an extra layer of complexity to the sampling process. Another popular technique that is related to our approach is SMOTE [5].…”
Section: Introduction (mentioning)
confidence: 99%
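The ReMix variant quoted above interpolates the inputs as in Mixup but assigns the minority-class label outright rather than mixing labels. A hedged sketch of that idea (the `remix` function name and `alpha` value are illustrative, not ReMix's published configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

def remix(x_maj, x_min, y_min, alpha=0.2):
    """Mix inputs as in Mixup, but keep the minority label (ReMix-style)."""
    lam = rng.beta(alpha, alpha)
    x = lam * x_maj + (1 - lam) * x_min
    return x, y_min  # label is NOT interpolated

x_mixed, y = remix(np.array([1.0, 2.0]), np.array([3.0, 0.0]), y_min=1)
```

Keeping the hard minority label biases the synthetic instances toward the rare class, which is the point of using Mixup for oversampling rather than for general regularization.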