Nowadays, customer churn has been reflected as one of the main concerns in the processes of the telecom sector, as it affects the revenue directly. Telecom companies are looking to design novel methods to identify the potential customer to churn. Hence, it requires suitable systems to overcome the growing churn challenge. Recently, integrating different clustering and classification models to develop hybrid learners (ensembles) has gained wide acceptance. Ensembles are getting better approval in the domain of big data since they have supposedly achieved excellent predictions as compared to single classifiers. Therefore, in this study, we propose a customer churn prediction (CCP) based on ensemble system fully incorporating clustering and classification learning techniques. The proposed churn prediction model uses an ensemble of clustering and classification algorithms to improve CCP model performance. Initially, few clustering algorithms such as k-means, k-medoids, and Random are employed to test churn prediction datasets. Next, to enhance the results hybridization technique is applied using different ensemble algorithms to evaluate the performance of the proposed system. Above mentioned clustering algorithms integrated with different classifiers including Gradient Boosted Tree (GBT), Decision Tree (DT), Random Forest (RF), Deep Learning (DL), and Naive Bayes (NB) are evaluated on two standard telecom datasets which were acquired from Orange and Cell2Cell. The experimental result reveals that compared to the bagging ensemble technique, the stacking-based hybrid model (k-medoids-GBT-DT-DL) achieve the top accuracies of 96%, and 93.6% on the Orange and Cell2Cell dataset, respectively. The proposed method outperforms conventional state-of-the-art churn prediction algorithms.
Mobile communication has become a dominant medium of communication over the past two decades. New technologies and competitors are emerging rapidly and churn prediction has become a great concern for telecom companies. A customer churn prediction model can provide the accurate identification of potential churners so that a retention solution may be provided to them. The proposed churn prediction model is a hybrid model that is based on a combination of clustering and classification algorithms using an ensemble. First, different clustering algorithms (i.e. K-means, K-medoids, X-means and random clustering) were evaluated individually on two churn prediction datasets. Then hybrid models were introduced by combining the clusters with seven different classification algorithms individually and then evaluations were performed using ensembles. The proposed research was evaluated on two different benchmark telecom data sets obtained from GitHub and Bigml platforms. The analysis of results indicated that the proposed model attained the highest prediction accuracy of 94.7% on the GitHub dataset and 92.43% on the Bigml dataset. State of the art comparison was also performed using the proposed model. The proposed model performed significantly better than state of the art churn prediction models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.