Building prediction models and discovering important factors of health insurance fraud using machine learning methods

Nalluri, Venkateswarlu; Chang, Jing-Rong; Chen, Long-Sheng; Chen, Jiachuan

doi:10.1007/s12652-023-04633-6

Cited by 6 publications

(2 citation statements)

References 34 publications

“…The model helps in differentiating the fraudulent and legitimate clicks, thereby finding the fraudulent users among the legitimate ones. The authors of [22] employed two unpublished datasets that might unravel novel knowledge, and four machine learning methods, including Support Vector Machines (SVM), Decision Trees (DT), Random Forest (RF), and Multilayer Perceptron (MLP) to determine the ML models used for the detection of medical fraud.…”

Section: Related Work (mentioning)

confidence: 99%

Sustainable Financial Fraud Detection Using Garra Rufa Fish Optimization Algorithm with Ensemble Deep Learning

Maashi, Alabduallah, Kouki. 2023. Sustainability.

Sustainable financial fraud detection (FD) comprises the use of sustainable and ethical practices in the detection of fraudulent activities in the financial sector. Credit card fraud (CCF) has dramatically increased with the advances in communication technology and e-commerce systems. Recently, deep learning (DL) and machine learning (ML) algorithms have been employed in CCF detection due to their features’ capability of building a powerful tool to find fraudulent transactions. With this motivation, this article focuses on designing an intelligent credit card fraud detection and classification system using the Garra Rufa Fish optimization algorithm with an ensemble-learning (CCFDC-GRFOEL) model. The CCFDC-GRFOEL model determines the presence of fraudulent and non-fraudulent credit card transactions via feature subset selection and an ensemble-learning process. To achieve this, the presented CCFDC-GRFOEL method derives a new GRFO-based feature subset selection (GRFO-FSS) approach for selecting a set of features. An ensemble-learning process, comprising an extreme learning machine (ELM), bidirectional long short-term memory (BiLSTM), and autoencoder (AE), is used for the detection of fraud transactions. Finally, the pelican optimization algorithm (POA) is used for parameter tuning of the three classifiers. The design of the GRFO-based feature selection and POA-based hyperparameter tuning of the ensemble models demonstrates the novelty of the work. The simulation results of the CCFDC-GRFOEL technique are tested on the credit card transaction dataset from the Kaggle repository and the results demonstrate the superiority of the CCFDC-GRFOEL technique over other existing approaches.

“…Nalluri et al [8] employed a set of ML algorithms, including Support Vector Machines (SVM), Decision Trees (DT), Random Forest (RF), and Multilayer Perceptron (MLP), to address the critical issue of medical insurance fraud. The primary goal of the study was to identify the most effective machine learning method for this task.…”

Section: Related Work (mentioning)

confidence: 99%

Detection of Health Insurance Fraud using Bayesian Optimized XGBoost

Parthasarathy, Raj Lakshminarayanan, Abdul Azeez Khan et al. 2023. IJSSE.

The mounting prevalence of health insurance fraud, propelled by a myriad of socioeconomic factors, presents significant hurdles to insurers, healthcare institutions, and individuals. In an attempt to counter this, insurance companies have begun harnessing the power of advanced technology, utilizing Machine Learning models to distinguish legitimate from fraudulent claims within expansive datasets. The present study conducts an in-depth examination of a health insurance dataset comprising 517,737 records, employing the Extreme Gradient Boosting (XGBoost) model as a potent tool for the detection of deceptive claims. In a noteworthy development, the performance of the model is markedly amplified through the integration of Bayesian optimization techniques, culminating in the Bayesian Optimized XGBoost (BOXGBoost) Model. The BOXGBoost Model is meticulously evaluated against an array of algorithms, which include Naive Bayes, Logistic Regression, Random Forest, K-Nearest Neighbor, and AdaBoost. A comparative analysis, focusing on key performance metrics such as accuracy, precision, recall, F1-Score, and the Area Under the Curve (AUC), is undertaken to discern the most effective algorithm. Remarkably, the proposed BOXGBoost model emerges as the superior performer, achieving an impressive accuracy rate of 98% and an AUC of 0.994. Additionally, the model exhibits high precision (98%), recall (97%), and F1-Score (97.5%), highlighting its exceptional capability in the prediction of health insurance fraud.
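
The precision, recall, and F1-Score reported in this abstract can be cross-checked against one another, since F1 is the harmonic mean of precision and recall. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and the confusion-matrix counts in the second call are made-up numbers, not values from the paper.

```python
def f1_from_rates(precision, recall):
    """Return the F1 score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def rates_from_counts(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Precision and recall as stated in the abstract (98% and 97%).
reported_f1 = f1_from_rates(0.98, 0.97)
print(f"{reported_f1:.3f}")  # 0.975, consistent with the reported 97.5%

# Hypothetical counts, for illustration only (not from the paper).
example_precision, example_recall = rates_from_counts(tp=97, fp=2, fn=3)
example_f1 = f1_from_rates(example_precision, example_recall)
```

Accuracy and AUC cannot be recomputed this way, because they also depend on the true-negative count and on the ranking of scores, which the abstract does not give.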

Enhancing Medicare Fraud Detection Through Machine Learning: Addressing Class Imbalance With SMOTE-ENN

Bounab, Zarour, Guelib et al. 2024. IEEE Access.

The healthcare fraud detection field is constantly evolving and faces significant challenges, particularly when addressing imbalanced data issues. Previous studies mainly focused on traditional machine learning (ML) techniques, often struggling with imbalanced data. This problem arises in various aspects. It includes the risk of overfitting with Random Oversampling (ROS), noise introduction by the Synthetic Minority Oversampling Technique (SMOTE), and potential crucial information loss with Random Undersampling (RUS). Moreover, improving model performance, exploring hybrid resampling techniques, and enhancing evaluation metrics are crucial for achieving higher accuracy with imbalanced datasets. In this paper, we present a novel approach to tackle the issue of imbalanced datasets in healthcare fraud detection, with a specific focus on the Medicare Part B dataset. First, we carefully extract the categorical feature "Provider Type" from the dataset. This allows us to generate new, synthetic instances by randomly replicating existing types, thereby increasing the diversity within the minority class. Then, we apply a hybrid resampling method named SMOTE-ENN, which combines the Synthetic Minority Over-sampling Technique (SMOTE) with Edited Nearest Neighbors (ENN). This method aims to balance the dataset by generating synthetic samples and removing noisy data to improve the accuracy of the models. We use six machine learning (ML) models to categorize the instances. When evaluating performance, we rely on common metrics like accuracy, F1 score, recall, precision, and the AUC-ROC curve. We highlight the significance of the Area Under the Precision-Recall Curve (AUPRC) for assessing performance in imbalanced dataset scenarios. The experiments show that Decision Trees (DT) outperformed all the classifiers, achieving a score of 0.99 across all metrics.
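
The two stages of SMOTE-ENN described in this abstract can be sketched in a few lines of plain Python. This is a simplified illustration and not the authors' implementation: the helper names (`nearest`, `smote`, `enn`), the toy 2-D points, and the choices of `k` are all assumptions, and the "Provider Type" replication step is not modelled. SMOTE interpolates between a minority sample and one of its nearest minority neighbours; ENN then drops any sample whose label disagrees with the majority of its nearest neighbours.

```python
import math
import random
from collections import Counter


def nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(minority, n_new, k=2, rng=None):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(nearest(i, minority, k))
        gap = rng.random()  # position along the segment from point i to point j
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


def enn(points, labels, k=3):
    """Keep only samples whose label matches the majority of their k nearest neighbours."""
    keep = []
    for i in range(len(points)):
        votes = Counter(labels[j] for j in nearest(i, points, k))
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)
    return [points[i] for i in keep], [labels[i] for i in keep]


# Toy data (hypothetical): label 0 = majority, label 1 = minority.
# (5.1, 5.0) is a deliberately noisy majority point inside the minority cluster.
majority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.1, 5.0)]
minority = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]

synthetic = smote(minority, n_new=len(majority) - len(minority))
points = majority + minority + synthetic
labels = [0] * len(majority) + [1] * (len(minority) + len(synthetic))

clean_points, clean_labels = enn(points, labels)
```

In this toy run the oversampling step balances the two classes and the cleaning step removes the noisy majority point. A production pipeline would more likely use a maintained library implementation than hand-written neighbour searches.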
