2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC) 2020
DOI: 10.1109/ucc48980.2020.00067
Efficient Resampling for Fraud Detection During Anonymised Credit Card Transactions with Unbalanced Datasets

Abstract: The rapid growth of e-commerce and online shopping has resulted in an unprecedented increase in the amount of money that is annually lost to credit card fraudsters. In an attempt to address credit card fraud, researchers are leveraging various machine learning techniques to efficiently detect and prevent fraudulent credit card transactions. One of the prevalent issues around the analytics of credit card transactions is the highly unbalanced nature of the datasets, which is fre…

Cited by 18 publications (8 citation statements) · References 28 publications
“…This section compares the proposed method with well-performing methods in recent scholarly articles, including a weighted extreme learning machine (Weighted ELM) [91], an optimized [92], a deep neural network (DNN) based classifier [93], a cost-sensitive SVM [94], a neural network ensemble [95], a random forest-based genetic algorithm wrapper method (GA-RF) [42], a method that sequentially combines the C4.5 and naïve Bayes classifiers [96], a dynamic weighted ensemble technique using Markov Chain [97], a model developed using the random forest algorithm and SMOTE-based resampling (RF-SMOTE) [98], an XGBoost model with SMOTE-based resampling [99], an LSTM ensemble with SMOTE-ENN [9], a comparison of SMOTE- and ADASYN-based resampling with a DNN classifier [4], and an ANN model with random undersampling (RUS). The stacking-based DL ensemble obtained optimal performance in comparison with the other well-performing methods in Table 3, reflecting the proposed method's robustness. Meanwhile, it would be beneficial to observe how the proposed approach would perform using a different dataset.…”
Section: Comparison With State-of-the-art Methods (mentioning)
confidence: 99%
“…Several studies have explored imbalanced datasets in the context of different fraudulent cases, utilizing various resampling techniques and evaluation metrics (Rubaidi et al, 2022;Chen et al, 2021;Li et al, 2021;Mrozek et al, 2020;Bauder et al, 2018). Among the techniques used for handling imbalanced data were Random Undersampling (RUS), Random Oversampling (ROS), SMOTE, Borderline-SMOTE, Adaptive Synthetic Sampling (ADASYN), and cost-sensitive learning.…”
Section: Handling Of Imbalanced Data (mentioning)
confidence: 99%
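The resampling techniques named in the statement above (RUS, ROS, and SMOTE-style synthesis) can be sketched in a few lines. The following is a minimal NumPy illustration under simplifying assumptions (binary 0/1 labels, Euclidean distance); the function names are invented for this sketch and are not the imbalanced-learn implementations the cited studies used.

```python
# Minimal sketches of random undersampling (RUS), random oversampling (ROS),
# and SMOTE-style interpolation for a binary-labelled dataset.
import numpy as np

rng = np.random.default_rng(0)

def random_undersample(X, y):
    # Keep every minority row; subsample the majority class down to match.
    minority = np.bincount(y).argmin()
    n_min = (y == minority).sum()
    keep_maj = rng.choice(np.flatnonzero(y != minority), size=n_min, replace=False)
    idx = np.concatenate([np.flatnonzero(y == minority), keep_maj])
    return X[idx], y[idx]

def random_oversample(X, y):
    # Keep every majority row; duplicate minority rows (with replacement)
    # until both classes are the same size.
    minority = np.bincount(y).argmin()
    n_maj = (y != minority).sum()
    extra = rng.choice(np.flatnonzero(y == minority), size=n_maj, replace=True)
    idx = np.concatenate([np.flatnonzero(y != minority), extra])
    return X[idx], y[idx]

def smote_like(X_min, n_new, k=5):
    # SMOTE-style synthesis: for each new point, pick a random minority
    # sample and interpolate toward one of its k nearest minority neighbours.
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]        # skip self at column 0
    base = rng.integers(0, len(X_min), n_new)
    neigh = nn[base, rng.integers(0, k, n_new)]
    gap = rng.random((n_new, 1))                   # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])
```

RUS discards information but keeps the dataset small; ROS and SMOTE preserve all majority rows at the cost of duplicated or synthetic minority points, which is why the cited studies compare them empirically rather than picking one a priori.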
“…However, it is necessary to compare our approach with existing credit card fraud detection methods in the literature. The methods include the following: the sequential combination of the C4.5 decision tree and naïve Bayes (NB) [5], a light gradient boosting machine (LightGBM) with a Bayesian-based hyperparameter optimization algorithm [14], a cost-sensitive SVM (CS SVM) [6], an optimized random forest (RF) classifier [34], a deep neural network (DNN) [35], a random forest classifier with SMOTE data resampling [36], an improved AdaBoost classifier with principal component analysis (PCA) and the SMOTE method [37], a cost-sensitive neural network ensemble (CS-NNE) [38], a stochastic ensemble classifier operating in a discretized feature space [39], a model based on an overfitting-cautious heterogeneous ensemble (OCHE) [40], a dynamic weighted ensemble technique using Markov Chain (DWE-MC) [41], and an extreme gradient boosting (XGBoost) ensemble classifier with the SMOTE resampling technique [42].…”
Section: B Classifiers Performance After Data Resampling (mentioning)
confidence: 99%
“…
Reference             Method                          Sensitivity  Specificity  AUC
Kalid et al [5]       C4.5 + NB                       0.872        1.000        -
Taha et al [14]       LightGBM                        -            -            0.928
Makki et al [6]       CS SVM                          0.650        -            0.620
Khatri et al [34]     Optimized random forest         0.782        -            -
Alkhatib et al [35]   DNN                             0.955        -            0.990
Mrozek et al [36]     Random forest + SMOTE           0.829        -            0.910
Zhou et al [37]       AdaBoost + SMOTE + PCA          -            -            0.965
Yotsawat et al [38]   CS-NNE                          -            0.936        0.980
Carta et al [39]      Stochastic ensemble classifier  0.915        -            0.876
Xia et al [40]        OCHE                            -            -            0.937
Feng et al [41]       DWE-MC                          -            -            0.66
Xie et al [42]        XGBoost
…”
Section: Reference (mentioning)
confidence: 99%
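The sensitivity, specificity, and AUC columns in the table above are standard fraud-detection metrics and are cheap to compute directly. The sketch below is a plain-NumPy illustration (function names invented here); AUC is computed via the Mann-Whitney pairwise formulation, which on imbalanced data is often more informative than accuracy.

```python
# Confusion-matrix metrics and pairwise AUC for binary 0/1 labels.
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    # Sensitivity (recall on frauds) = TP / (TP + FN);
    # specificity (recall on legitimate) = TN / (TN + FP).
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    # AUC = probability a random positive outscores a random negative;
    # tied scores count as half a win.
    pos = scores[y_true == 1][:, None]
    neg = scores[y_true == 0][None, :]
    return (pos > neg).mean() + 0.5 * (pos == neg).mean()
```

Reporting sensitivity and specificity separately, as the cited table does, avoids the trap where a classifier predicting "legitimate" for every transaction still scores near-perfect accuracy on a heavily unbalanced dataset.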