2021
DOI: 10.1007/978-981-16-6285-0_2
Effective Rate of Minority Class Over-Sampling for Maximizing the Imbalanced Dataset Model Performance

Cited by 5 publications (2 citation statements)
References 13 publications
“…In the literature [11], the authors use an over-sampling approach that replicates randomly selected samples from the minority class, which reduces both inter-class and intra-class imbalance, but such an approach can lead to overfitting. In the literature [12], the authors use an under-sampling method that randomly removes samples from the majority classes until all classes have the same number of samples; its significant disadvantage is that it discards a portion of the available data. Unlike data-level methods, algorithm-level methods modify the training algorithm or the network structure.…”
Section: Class Imbalance Machine Learning
confidence: 99%
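The random over-sampling described in the excerpt can be sketched in a few lines. This is a generic illustration, not the cited authors' implementation: minority-class samples are duplicated at random until every class matches the majority count (the function name and interface are assumptions for this sketch).

```python
import numpy as np

def random_oversample(X, y, rng=None):
    """Balance classes by replicating randomly chosen minority samples.

    Each minority class is padded with duplicates drawn with replacement
    until its count equals the majority-class count.
    """
    rng = np.random.default_rng(rng)
    X, y = np.asarray(X), np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()                      # majority-class size
    keep = [np.arange(len(y))]                 # keep all original rows
    for cls, count in zip(classes, counts):
        if count < target:
            idx = np.flatnonzero(y == cls)
            # duplicate randomly selected minority samples
            keep.append(rng.choice(idx, size=target - count, replace=True))
    order = np.concatenate(keep)
    return X[order], y[order]
```

Because the duplicates are exact copies, a model can memorize them, which is the overfitting risk the excerpt notes; random under-sampling avoids that but throws away majority-class data instead.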
“…7, the SVM model provided the highest values for the F1 score, accuracy, and precision metrics, while the RF model gave the best values for AUC and sensitivity. We prefer the F1 score over the other metrics because it is effective for imbalanced datasets [91], and our test data consists of 76% failed and 24% Not-failed conditions, which is an imbalanced dataset. After the F1 score, the highest AUC value indicated the best predictive model [92].…”
Section: Evaluating the Best-trained Models
confidence: 99%
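The excerpt's preference for F1 over accuracy on a 76%/24% split can be made concrete with a small worked example (the class labels and a trivial majority-class predictor here are assumptions for illustration, not the cited study's model):

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one class treated as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# 76% "failed" (label 1) vs 24% "Not-failed" (label 0), as in the excerpt.
y_true = [1] * 76 + [0] * 24
# A degenerate model that always predicts the majority class:
y_pred = [1] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# accuracy is 0.76 despite the model never detecting the minority class,
# while F1 for the minority class collapses to 0.0.
```

Accuracy rewards the majority-class guesser on imbalanced data; the minority-class F1 exposes it, which is why the excerpt reports F1 as the primary metric.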