2018
DOI: 10.14569/ijacsa.2018.091196

Predicting Potential Banking Customer Churn using Apache Spark ML and MLlib Packages: A Comparative Study

Abstract: This study was conducted on the assumption that the Spark ML package has much better performance and accuracy than the Spark MLlib package when dealing with big data. The dataset used in the comparison consists of bank customers' transactions. The decision tree algorithm was used with both packages to generate a model for predicting the churn probability of bank customers from their transaction data. Detailed comparison results were recorded, and it was concluded that the ML package and its new DataFrame-based APIs hav…

Cited by 20 publications (19 citation statements)
References 5 publications
“…The authors suggested Naïve Bayes is superior than other classifiers and then the size of the dataset could influence the output of the classifiers. Sayed and et al [11] discovered in their paper that the Spark ML has an attraction over Spark MLlib in the performance and accuracy of big data analytics problems. Al-Saqqa and et al [12] examined about sentiment classification of big data using Spark's MLlib.…”
Section: Related Work (mentioning)
confidence: 99%
“…This creates CNN more analogous to the Network of biological neurons and decreases the complexity of both the weight and the network model. A fully connected soft-max layer is utilized as the classification layer for sentence level sentiment classification in CNN proposed by Kim [11]. This classification layer, however, has become too simple for the task of classifying sentiments.…”
Section: Proposed CNN-SVM Using Spark DL (mentioning)
confidence: 99%
“…H. Sayedand et al, 2018 [19] in their paper compared the hypothesis that the Spark ML bundle has preference over the Spark MLlib bundle in the issues of performance and accurateness when dealing with big data. They discovered that the MLlib is better in the training time operation and vice versa in the evaluation time operation.…”
Section: Related Work (mentioning)
confidence: 99%
“…The customer churn analysis for the banking application is performed using Spark ML package by dealing with big data. The prediction of customer churn is done based on their transaction data [2]. The customer churn analysis can be related to customer relationship management where the bank customers are converted into competitors.…”
Section: A Literature Review (mentioning)
confidence: 99%