2021
DOI: 10.1109/access.2021.3071389
Deep Generative Models to Counter Class Imbalance: A Model-Metric Mapping With Proportion Calibration Methodology

Abstract: The most pervasive segment of techniques for managing class imbalance in machine learning is re-sampling-based methods. The emergence of deep generative models for augmenting the size of the under-represented class prompts one to revisit the question of how suitable the model chosen for data augmentation is with respect to the metric selected for the goodness of classification. This work defines this suitability by using newly-sampled data points from each generative model first to the degree of parity, and studying c…

Cited by 13 publications (8 citation statements)
References 50 publications
“…The core idea behind generative modeling in the context of tackling the class imbalance problem is to estimate the probability density function describing the data and generate new data instances in a random fashion [39] in order to balance the data distribution in an otherwise imbalanced dataset. Generative models typically construct a latent space that aims to capture the direct cause of the target variable.…”
Section: Generative Modeling
confidence: 99%
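The estimate-the-density-then-sample idea described in this excerpt can be sketched with a kernel density estimate standing in for a deep generative model. This is a minimal illustration on synthetic toy data; the `bandwidth` value and dataset shapes are illustrative choices, not taken from the paper:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Toy imbalanced 2-D dataset: 100 majority points, 10 minority points.
X_majority = rng.normal(loc=0.0, scale=1.0, size=(100, 2))
X_minority = rng.normal(loc=3.0, scale=0.5, size=(10, 2))

# Estimate the minority-class probability density, then draw new
# instances from it until the two classes are the same size.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_minority)
n_needed = len(X_majority) - len(X_minority)
X_synthetic = kde.sample(n_samples=n_needed, random_state=0)

X_balanced = np.vstack([X_minority, X_synthetic])
print(X_balanced.shape)  # (100, 2)
```

A deep generative model plays the same role as the KDE here, but with a learned latent space rather than a fixed kernel.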
“…Where most research has focused on modifying GAN architectures to achieve optimal results in class-imbalanced settings, Mirza et al [79] posed a distinct yet equally important question: given a desired evaluation metric to optimize, which data augmentation method and what proportion of synthetic-sample injection should be used? The resulting framework, termed the Model-Metric Mapping methodology, or MMM, offers a procedural and hierarchical approach that guides the practitioner toward proper model selection based on the desired evaluation metric.…”
Section: Other Disciplines
confidence: 99%
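The proportion question this excerpt raises — how much synthetic data to inject for a chosen metric — can be sketched as a simple sweep. This is a toy stand-in, not the paper's MMM procedure: one KDE generator, one classifier, F1 as the single metric, and a hypothetical grid of injection proportions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KernelDensity

# Imbalanced binary toy problem (~10% minority class).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Fit a generative model (here a KDE) to the minority class only.
X_min = X_tr[y_tr == 1]
kde = KernelDensity(bandwidth=0.5).fit(X_min)

best = (None, -1.0)
for prop in [0.0, 0.25, 0.5, 1.0]:  # fraction of minority size to inject
    n_new = int(prop * len(X_min))
    if n_new:
        X_aug = np.vstack([X_tr, kde.sample(n_new, random_state=0)])
        y_aug = np.concatenate([y_tr, np.ones(n_new)])
    else:
        X_aug, y_aug = X_tr, y_tr
    clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
    score = f1_score(y_te, clf.predict(X_te))
    if score > best[1]:
        best = (prop, score)

print("best injection proportion:", best[0])
```

MMM generalizes this pattern across multiple generative models and evaluation metrics; the sweep above shows only the calibration step for one model-metric pair.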
“…Borderline-SMOTE and ADASYN are both frequently cited as baseline methods, and consequently both [20] and [14] possess a high in-degree count. The use of conditional GANs for data generation based on class labels is addressed in [52,79], and [84] (all with five or six in-degrees). With five in-network citations, [85] offers advice on best practices for hyper-parameter tuning of GANs at the time of publication, though that work is mainly done with computer vision tasks in mind.…”
Section: Citation Network Analysis
confidence: 99%
“…The author identifies seven vital areas of research on this topic, covering the full spectrum of learning from imbalanced data: classification, regression, clustering, data streams, big data analytics, and applications. Fanny et al [11], Ming et al [12], Zhai et al [13], and Mirza et al [14] propose different deep learning approaches to address class imbalance. Fanny et al [11] proposed a method based on the Class Expert Generative Adversarial Network (CE-GAN). In this approach, a GAN is trained for each minority class, with the generator network conditioned on the class label.…”
Section: Introduction
confidence: 99%
“…This approach improves the performance of minority classes by increasing the diversity of the training data. Mirza et al [14] proposed deep generative models to counter class imbalance, taking two approaches: the first uses a variational autoencoder to generate synthetic samples for the minority class, while the second uses a generative adversarial network to the same end.…”
Section: Introduction
confidence: 99%
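The variational-autoencoder route described in this excerpt can be illustrated with a minimal sketch, assuming PyTorch and toy data; the architecture, data, and training budget here are hypothetical and do not reproduce the paper's models:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy minority-class data: 64 points in 4-D (a stand-in for the
# under-represented class; the paper's actual datasets differ).
X_min = torch.randn(64, 4) * 0.5 + 2.0

class VAE(nn.Module):
    def __init__(self, d_in=4, d_lat=2):
        super().__init__()
        self.enc = nn.Linear(d_in, 8)
        self.mu = nn.Linear(8, d_lat)
        self.logvar = nn.Linear(8, d_lat)
        self.dec = nn.Sequential(nn.Linear(d_lat, 8), nn.ReLU(), nn.Linear(8, d_in))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

vae = VAE()
opt = torch.optim.Adam(vae.parameters(), lr=1e-2)
for _ in range(200):
    recon, mu, logvar = vae(X_min)
    # Reconstruction term plus KL divergence to the latent prior.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = nn.functional.mse_loss(recon, X_min) + kl
    opt.zero_grad()
    loss.backward()
    opt.step()

# Generate synthetic minority samples by decoding draws from the prior.
with torch.no_grad():
    X_syn = vae.dec(torch.randn(100, 2))
print(tuple(X_syn.shape))  # (100, 4)
```

The GAN variant mentioned in the excerpt follows the same inject-synthetic-samples pattern, with the decoder replaced by a generator trained adversarially.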