2017 IEEE International Conference on Big Data (Big Data) 2017
DOI: 10.1109/bigdata.2017.8258338

Big data machine learning using apache spark MLlib

Cited by 80 publications (43 citation statements)
References 36 publications
“…FlinkML library includes implementations of k-Means clustering algorithm, logistic regression, and Alternating Least Squares (ALS) for recommendation [11]. Spark has more efficient set of machine learning algorithms such as Spark MLlib [6] and MLI [51]. Spark MLlib is a scalable and fast library that is suitable for general needs and most areas of machine learning.…”
Section: Machine Learning Algorithms | mentioning
confidence: 99%
“…As a result of this implementation, they proved that their new Smart-MLlib library scaled well than Spark's MLlib for each evaluation. Applying machine learning on a large and complex dataset requires a considerable number of physical resources to process this data, in [25], the authors explored Apache Spark MLlib version 2.0 as an open-source, distributed, scalable, and platform independent Machine Learning library, and they performed different real-world machine learning experiments to evaluate the qualitative and quantitative attributes of the platform. Alternating direction method of multipliers (ADMM) [26], it is a method used to solve a generic convex problem for most machine learning algorithms, this solution helps to transform the problem to an iterative system of linear equations, the authors implemented ADMM in Apache Spark and they compared this solution with MLlib then they showed that ADMM solution is like an alternative to MLlib for big-data problems, this approach has the added advantage of machine learning algorithms.…”
Section: Performance Evaluation Of Apache Spark Through Machine Learn | mentioning
confidence: 99%
“…They found that the SVM is more accurate in the condition of total average. However, M. Assefi and et al, 2017 [22] explored some views for growing the form of the Apache Spark MLlib 2.0 as an open source, accessible and achieve many machine learning tests that related to the real world to inspect the attribute characteristics. Also presents a comparison among spark and Weka with proving the advantages of spark over the Weka in many sides like the performance and it is efficient dealing with a huge amount of data.…”
Section: Related Work | mentioning
confidence: 99%