Proceedings of the 25th ACM International on Conference on Information and Knowledge Management 2016
DOI: 10.1145/2983323.2983647

An Experimental Comparison of Iterative MapReduce Frameworks

Cited by 9 publications (2 citation statements)
References 12 publications

“…On the other hand, Apache Spark uses read-only cached version of objects (resilient distributed dataset) which can be reused in parallel operations, thus reducing the performance overhead during iterative computation. Lee et al [48] evaluated five systems including Hadoop and Spark over various workloads to compare against four iterative algorithms. The experimentation was performed on Amazon EC2 cloud.…”
Section: Machine Learning and Iterative Tasks Support (mentioning)
confidence: 99%
“…Unlike Hadoop, a widely used open-source implementation of MapReduce, RDD partitions are cached in memory or on disks of each worker in the cluster. Due to the in-memory caching, Spark shows a good performance for iterative computation [31,46] which is necessary for graph mining and machine learning tasks. However, Spark still requires disk I/O [38] since its typical operations with shuffling including join and groupBy operations need to access disks for external-sort.…”
Section: Mapreduce and Spark (mentioning)
confidence: 99%