2013
DOI: 10.9790/0661-1266673

Multi-Cluster Based Approach for skewed Data in Data Mining

Cited by 15 publications (11 citation statements)
References 17 publications
“…Finally, it combines the selected negative instances from the K clusters with all the instances in the minority class. A similar under-sampling method corresponds to the algorithm proposed by Longadge et al [36], which firstly clusters the majority class instances into K groups using the K-means algorithm and then selects |C+| × IR_i majority class instances from each cluster i, where IR_i denotes the imbalance ratio in the cluster i. Note that the aim of this method is not to obtain a perfectly balanced class distribution, but to reduce the disproportion between the size of the majority and minority classes.…”
Section: Clustering-based Algorithms (mentioning)
confidence: 99%
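
The procedure described in the statement above (cluster the majority class with K-means, then keep roughly |C+| × IR_i instances from each cluster i) can be sketched as follows. This is only an illustration under stated assumptions, not the implementation from the cited paper: the statement does not say how IR_i is computed, so the sketch takes it from a caller-supplied function. The names `kmeans_labels`, `cluster_undersample`, `ratio_of`, and `share_ratio` are hypothetical, and the toy data is synthetic.

```python
import math
import random


def kmeans_labels(points, k, iters=25, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return labels


def cluster_undersample(majority, n_minority, k, ratio_of, seed=0):
    """Keep about n_minority * IR_i majority points from each cluster i.

    ratio_of(members, majority) is an ASSUMED stand-in for IR_i; the
    quoted description does not define how that ratio is obtained.
    """
    labels = kmeans_labels(majority, k, seed=seed)
    rng = random.Random(seed)
    kept = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        quota = round(n_minority * ratio_of(members, majority))
        quota = max(0, min(len(members), quota))  # never exceed the cluster
        kept.extend(rng.sample(members, quota))   # sample without replacement
    return kept


def share_ratio(members, majority):
    """Illustrative ratio only: twice the cluster's share of the majority."""
    return 2.0 * len(members) / len(majority)


# Synthetic toy data: 60 majority points in two blobs, 10 minority points.
gen = random.Random(1)
majority = ([(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(30)]
            + [(gen.gauss(8, 1), gen.gauss(8, 1)) for _ in range(30)])
n_minority = 10
kept = cluster_undersample(majority, n_minority, k=2, ratio_of=share_ratio)
```

With `share_ratio`, the retained majority set is about twice the size of the minority class, which matches the stated goal of reducing the disproportion rather than balancing the classes exactly. A different definition of IR_i would change only the function passed as `ratio_of`.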
“…Regarding the differences between DBMIST-US and the existing clustering-based methods, it is worth pointing out that most under-sampling techniques rely upon the K-means and the fuzzy K-means algorithms [16,26,36-38,40,41,43,51]. However, it is well-known that K-means may not be sufficiently effective when applied to imbalanced data because it always generates clusters with similar sizes [52].…”
Section: Differences Between DBMIST-US and Related Work (mentioning)
confidence: 99%
“…Several studies reported the difficulty of clustering or classification of the Yeast data set. As Longadge et al (2013) reported, classification of the Yeast data set was done by several classification methods such as K-NN. The K-NN (K = 3) was able to classify the Yeast data set with 0.11% accuracy by F-measure after several epochs and times running the method.…”
Section: Data Sets From UCI Repository (mentioning)
confidence: 99%
“…The K-NN (K=3) was able to classify the Yeast dataset with 0.11% accuracy by F-measure after several epochs and times running the method. Also, Ahirwar [69] reported the K-means was able to classify the Yeast dataset with 65.00% accuracy by F-measure after several epochs.…”
Section: Yeast Dataset (mentioning)
confidence: 99%