2022
DOI: 10.1371/journal.pone.0278095

A novel customer churn prediction model for the telecommunication industry using data transformation methods and feature selection

Abstract: Customer churn is one of the most critical issues faced by the telecommunication industry (TCI). Researchers and analysts leverage customer relationship management (CRM) data, using various machine learning models and data transformation methods, to identify the customers who are likely to churn. While several studies have been conducted in the customer churn prediction (CCP) context in TCI, a review of the performance of the various models stemming from these studies shows clear room for improvement. …

Cited by 13 publications (7 citation statements)
References 34 publications
“…When training over the data which contains the class imbalance problem, the ML models will overclassify the majority class. Mostly, the classifiers focus on majority class rather than misclassifying or ignoring the minority class (Kaur, Pannu & Malhi, 2020; Sana et al, 2022; Saha et al, 2023). Therefore, for acquiring better and accurate results, we need to handle the class imbalance problem.…”
Section: Methods (mentioning)
confidence: 99%
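
The statement above notes that a model trained on imbalanced data tends to favor the majority class. One common way to handle this is random oversampling of the minority class. The sketch below is only an illustration of that general idea, not the method used in the cited papers; the function name `random_oversample` and the toy churn labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Append (target - n) random duplicates; zero for the largest class.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made up): 8 non-churners (label 0) and 2 churners (label 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.
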
“…However, in the case of class imbalance, the accuracy may be affected by the uneven distribution of categories. The F1 score, denoting the harmonic mean of precision and recall [39], is a comprehensive evaluation metric, especially well-suited for dealing with unbalanced datasets. A superior F1 score generally suggests that the model maintains a more effective equilibrium between recall and precision [35].…”
Section: Evaluation Measures (mentioning)
confidence: 99%
“…A superior F1 score generally suggests that the model maintains a more effective equilibrium between recall and precision [35]. A perfect model has an F1 score of 1 [39].…”
Section: Evaluation Measures (mentioning)
confidence: 99%
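
The two statements above describe the F1 score as the harmonic mean of precision and recall, with 1 as the score of a perfect model. A minimal sketch of that definition follows, assuming binary labels where 1 marks a churner; the helper name `precision_recall_f1` and the toy labels are hypothetical. It also shows why accuracy can mislead on imbalanced data: a model that always predicts the majority class is mostly "accurate" yet has an F1 of zero.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Define each ratio as 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels (made up): 8 non-churners (0) and 2 churners (1).
y_true = [0] * 8 + [1] * 2
always_majority = [0] * 10  # a model that never predicts churn

accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
print("accuracy:", accuracy)
print("F1 (always majority):", precision_recall_f1(y_true, always_majority)[2])
print("F1 (perfect model):", precision_recall_f1(y_true, y_true)[2])
```
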
“…In literature [10], a credit default prediction model was developed using GBDT and the K-means SMOTE oversampling method was used to address the imbalance in the data set, while the original hypothesis was rejected with a p-value < 0.001 using one-way analysis of variance, confirming the statistical significance of the improved performance of the proposed model. Literature [11] uses univariate techniques for feature selection in the customer churn domain and uses a grid search approach to select the optimal hyperparameters for the optimal model GDBT, demonstrating the benefits of applying data transformation methods and feature selection when training an optimized CCP model. Literature [12] proposes a default prediction model based on decision tree model using XGBoost model in integrated learning for accurate prediction of customer default in P2P lending, and also applies feature ranking based on learning model to P2P lending credit data with hyperparameter optimization for individual classifiers.…”
Section: Customer Churn Prediction (mentioning)
confidence: 99%
“…Although the above studies have contributed to customer churn prediction, most of the current studies on customer churn prediction have used ensemble learning methods to construct customer churn models, for example, in the literature [10][11][12][13][14][15][16], ensemble learning has been used to construct the corresponding models. The ensemble learning approach, as a black box model with high complexity, cannot justify the prediction results of the models used.…”
Section: Customer Churn Prediction (mentioning)
confidence: 99%