2021
DOI: 10.1371/journal.pone.0259227

An oversampling method for multi-class imbalanced data based on composite weights

Abstract: To solve the oversampling problem of multi-class small samples and to improve their classification accuracy, we develop an oversampling method based on classification ranking and weight setting. The designed oversampling algorithm sorts the data within each class of the dataset according to the distance from the original data to the hyperplane. Furthermore, iterative sampling is performed within the class and inter-class sampling is adopted at the boundaries of adjacent classes according to the sampling weight compose…
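The abstract's first step, ranking the samples of a class by their distance to a hyperplane and then sampling within the class, can be illustrated with a minimal sketch. This is not the paper's algorithm: the hyperplane `(w, b)`, the function names `hyperplane_distance` and `oversample_class`, and the rule of interpolating between neighbours in the ranked order are all assumptions made for illustration, and the composite sampling weights mentioned in the abstract are not modelled here.

```python
import math
import random


def hyperplane_distance(x, w, b):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def oversample_class(points, w, b, n_new, rng):
    """Rank one class by distance to the hyperplane, then create n_new
    synthetic points by interpolating between neighbours in that ranking.

    Hypothetical helper: the interpolation rule is an assumption, not
    the weighting scheme described in the paper. Needs >= 2 points.
    """
    ranked = sorted(points, key=lambda p: hyperplane_distance(p, w, b))
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(ranked) - 1)   # index of an adjacent pair (i, i + 1)
        t = rng.random()                     # interpolation factor in [0, 1)
        lo, hi = ranked[i], ranked[i + 1]
        synthetic.append([a + t * (c - a) for a, c in zip(lo, hi)])
    return synthetic


rng = random.Random(0)
# Toy minority class: 10 two-dimensional points (made-up data).
minority = [[rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)] for _ in range(10)]
w, b = [1.0, 1.0], -3.0  # assumed hyperplane, not taken from the paper
new_points = oversample_class(minority, w, b, n_new=20, rng=rng)
```

Because every synthetic point is a convex combination of two existing points, the new samples stay inside the bounding box of the original class, which keeps the sketch from inventing points far from the observed data.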

Citations: Cited by 12 publications (10 citation statements)
References: 40 publications
“…The statistical level of significance was defined as α = 0.05 for all tests. In the implementation of the logistic regression the oversampling procedure was applied in some cases to improve the significance of imbalanced data [ 29 31 ]. Cases where oversampling was applied are marked with (by o.s.).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Sorting the classes according to a hyperplane that depicts relative relationships among points concerning the influence of space surrounding each point can estimate whether it is a target or an outlier [42]. All these methods are strongly affected by many factors: the number of extracted classes and their belonging clusters [43], the local density estimation and the local reachability among connected points, boundaries that separate clusters [44], and local outliers [45].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Imbalanced can cause problems in the classification task because the model can overfit the majority class and under-fit the minority class [25]. To solve that problem, in this step, the re-sampling technique is applied [26]. The re-sampling technique that is applied is oversampling technique.…”
Section: Imbalance Dataset
Citation type: mentioning
confidence: 99%