2014
DOI: 10.1007/s10115-014-0794-3
Class imbalance revisited: a new experimental setup to assess the performance of treatment methods

Cited by 154 publications (81 citation statements). References 27 publications.
“…A more global review on learning from skewed data was proposed by Branco [5] and concentrates on the more general issue of imbalanced predictive modeling. Among more specialized discussions of this topic, a thorough survey on ensemble learning by Galar et al. [17], an in-depth insight into imbalanced data characteristics by López et al. [36], and a discussion of new perspectives for evaluating classifiers on skewed datasets [42] deserve mentioning.…”
Section: Introduction
confidence: 99%
“…Thus, since the class distribution harms the learning process as it diverges extremely from the balanced one [27], it is immediate to use a distance/similarity function, d(ζ, e), between the empirical and balanced distributions, ζ and e, to summarise the degree of skewness of a classification problem K. Here, d stands for any chosen distance between vectors or divergence between probability distributions which can be found in the literature.…”
Section: Imbalance-degree
confidence: 99%
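The excerpt above defines the imbalance degree as a distance d(ζ, e) between the empirical class distribution ζ and the balanced distribution e = (1/K, …, 1/K). A minimal sketch of that idea, assuming Euclidean distance as an illustrative choice of d (the cited work allows any vector distance or probability divergence):

```python
from collections import Counter
from math import sqrt

def imbalance_distance(labels, dist=None):
    """Summarise the skewness of a labelled dataset as d(zeta, e),
    where zeta is the empirical class distribution and e is the
    balanced distribution (1/K, ..., 1/K) over the K observed classes.
    `dist` defaults to Euclidean distance; this is only one possible
    choice of d, not the one fixed by the cited paper."""
    counts = Counter(labels)
    k = len(counts)
    n = len(labels)
    zeta = [counts[c] / n for c in sorted(counts)]  # empirical distribution
    e = [1.0 / k] * k                               # balanced distribution
    if dist is None:
        dist = lambda p, q: sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
    return dist(zeta, e)

# A perfectly balanced problem has distance 0; skew increases it.
print(imbalance_distance(["a", "b", "a", "b"]))   # 0.0
print(imbalance_distance(["a"] * 9 + ["b"]))      # ~0.566
```

Because `dist` is a parameter, the same sketch accommodates other summaries of skewness (e.g. a Hellinger distance or KL divergence) without changing the surrounding logic.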
“…There, each row corresponds to a dataset and each column stands for a characteristic (name, features and number of classes) or a summary (empirical class distribution, number of occurrences, IR and IDs). Afterwards, each dataset is used to feed a representative learning algorithm from the traditional major learning paradigms [27]. Specifically, for each problem, a different classifier is learnt using 5 different popular supervised algorithms: C4.5 (Decision trees), RIPPER (Decision rules), Neural Networks (Connectionism), Naïve Bayes (Probabilistic), and SVM (Statistical learning).…”
Section: Study 2: Sensitivity and Validity of Imbalance-degree
confidence: 99%