2020
DOI: 10.1609/aaai.v34i04.6145
A Novel Model for Imbalanced Data Classification

Abstract: Recently, imbalanced data classification has received much attention due to its wide applications. In the literature, existing research has attempted to improve classification performance by considering various factors such as the imbalanced distribution, cost-sensitive learning, data-space improvement, and ensemble learning. Nevertheless, most existing methods focus on only part of these main aspects/factors. In this work, we propose a novel imbalanced data classification model that considers al…

Cited by 24 publications (8 citation statements)
References 20 publications
“…Solutions to tackle the imbalance problem of classification can be broadly classified into four major families [9]: sampling methods (including oversampling and undersampling) [34,35], cost-sensitive learning [29], distance metric learning [36], and ensemble learning [37], along with hybrid methods that integrate features from different families, such as AdaCost [38], RUSBoost [39] and DDAE [40].…”
Section: Figure 2 Class Distribution on the Dataset with IR=12
Citation type: mentioning (confidence: 99%)
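The first two families quoted above (sampling and cost-sensitive learning) can be illustrated with a minimal plain-Python sketch. This is a toy example for intuition only, not any cited method's implementation; all function names are hypothetical.

```python
import random

def random_oversample(X, y, minority_label, seed=0):
    """Naive sampling: duplicate minority samples until classes are balanced."""
    rng = random.Random(seed)
    minority = [(x, t) for x, t in zip(X, y) if t == minority_label]
    majority = [(x, t) for x, t in zip(X, y) if t != minority_label]
    # Draw extra minority copies to match the majority count
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = minority + majority + extra
    rng.shuffle(balanced)
    Xb, yb = zip(*balanced)
    return list(Xb), list(yb)

def class_weights(y):
    """Inverse-frequency weights, a common cost-sensitive heuristic:
    rarer classes get proportionally larger misclassification cost."""
    counts = {}
    for t in y:
        counts[t] = counts.get(t, 0) + 1
    n = len(y)
    return {label: n / (len(counts) * c) for label, c in counts.items()}
```

SMOTE-style methods differ from this sketch by interpolating synthetic minority points between nearest neighbors instead of duplicating existing ones.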
“…This paper focuses on comparing the performance of these algorithms on multiple datasets from several different domains, including healthcare, card playing, software development projects and hand-written digit recognition. We use these datasets to evaluate ten imbalanced classification algorithms, namely 1) sampling: SMOTE [35] and MWMOTE [42]; 2) cost-sensitive learning: MetaCost [43], CAdaMEC [44] and cost-sensitive decision tree [9]; 3) distance metric learning: Iterative Metric Learning (IML) [36]; 4) ensemble learning and hybrid methods: AdaBoost [45], RUSBoost [39], Self-Paced Ensemble classifier [11] and DDAE [40]. Our experiments not only analyze the performance of different models based on a general set of evaluation metrics on the same dataset, but also quantify the impact of key factors related to imbalanced learning, such as the size of the dataset and the imbalance ratio, as well as system performance in terms of learning time and memory usage.…”
Section: Figure 2 Class Distribution on the Dataset with IR=12
Citation type: mentioning (confidence: 99%)
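The "general set of evaluation metrics" mentioned in the quote typically includes measures that do not collapse under class imbalance, such as F1 and G-mean, rather than raw accuracy. A minimal sketch of computing them from confusion counts (the exact metric set used in the cited comparison is an assumption here):

```python
import math

def imbalance_metrics(y_true, y_pred, positive=1):
    """Confusion-count metrics commonly reported in imbalanced learning."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    g_mean = math.sqrt(recall * specificity)           # balances both classes
    return {"precision": precision, "recall": recall, "f1": f1, "g_mean": g_mean}
```

G-mean is the geometric mean of the per-class true rates, so a classifier that ignores the minority class scores zero even if its accuracy is high.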
“…Inspired by the remarkable achievements that deep learning has shown in a variety of domains, including computer vision [14] and natural language processing [15,16], it has also gained much attention for molecular property prediction. The molecular representation methods that have been introduced can be mainly summarized into two categories: sequence-based and graph-based approaches.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
“…It automatically learns to extract features without professional knowledge for feature extraction. In manufacturing, defect detection is a typical imbalanced data problem, since defect samples are usually far fewer than non-defective ones [15]. The imbalance ratio (IR) is usually used to describe the ratio of minority to majority samples.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
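The IR definition in the quote (minority over majority) can be computed in a few lines. Note that the convention varies: other papers, including the Figure 2 caption's IR=12, appear to use the inverted majority-to-minority ratio, so which direction you divide is an assumption to check against each source.

```python
from collections import Counter

def imbalance_ratio(labels, minority_over_majority=True):
    """Imbalance ratio of a label list; direction of the ratio is configurable
    because the literature uses both conventions."""
    counts = Counter(labels)
    minority = min(counts.values())
    majority = max(counts.values())
    if minority_over_majority:
        return minority / majority
    return majority / minority
```

For a dataset with 12 majority and 1 minority sample, the two conventions give 1/12 and 12 respectively.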