2022
DOI: 10.48129/kjs.splml.19119

Oversampling based on generative adversarial networks to overcome imbalance data in predicting fraud insurance claim

Abstract: Health insurance fraud leads to cost overruns and, in the long term, a decline in the quality of health services. Machine learning is increasingly used to detect such fraud. However, one challenge in predicting health insurance fraud is data imbalance, which can bias many machine learning methods towards the majority class. Oversampling addresses data imbalance by generating new data from the existing minority class data. Recently, there has been gr…

Cited by 3 publications (3 citation statements)
References 18 publications
“…This study classified the levels of cognitive impairment associated with Parkinson's disease by applying oversampling techniques to three datasets with three different IR values and found that GAN-based oversampling techniques showed better AUC and F1-score values than traditional techniques. Nugraha et al [37] used insurance fraud imbalance data and proposed CTGAN as an oversampling method, showing that over the application of 17 classification models, CTGAN presented a better performance (AUC, F1-score, precision, etc.) than ROS, SMOTE, and ADASYN.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Real-world data, such as data related to fault detection [3], [4]; fraud detection [5], [6], and medical diagnosis [7]-[9], often have data imbalance problems. A dataset is called an imbalance if it does not represent the classified categories evenly [10].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…GAN generates additional data for minority classes by oversampling with the Conditional Tabular GAN (CTGAN) architecture. The generator adjusts the tabular data input and receives supplementary information to produce samples under the specified class conditions [6]. The experimental results show that the proposed method performs better than other oversampling methods on several evaluation metrics: Accuracy, Precision score, F1 score, and AUC.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
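
The citation statements above refer to datasets with different imbalance ratio (IR) values and name random oversampling (ROS) as one of the baselines that CTGAN was compared against. The following is a minimal sketch of those two ideas only, not the CTGAN method from the paper. It assumes IR is defined as the majority-class count divided by the minority-class count; the function names and the toy data are hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition of IR)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class matches the majority count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows to duplicate for this class, taken from the original data only.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy claims data: 8 legitimate (0) and 2 fraudulent (1) claims.
rows = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
```

ROS only repeats existing minority rows, whereas generative approaches such as the one described in the statements above synthesize new ones. In either case, oversampling is normally applied to the training split only, so that copied or synthetic rows do not leak into the evaluation data.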