2017
DOI: 10.1007/s10586-017-1167-y
An experimental analysis of limitations of MapReduce for iterative algorithms on Spark

Cited by 5 publications (2 citation statements)
References 17 publications
“…Given the disk-I/O-based operation and data-locality principles of MapReduce, we argue that any algorithm that involves intensive iteration, such as subtree generalization, can incur significant overheads at multiple points: disk I/O, network, and scheduling [40].…”
Section: Iteration (mentioning)
confidence: 99%
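The cited statement contrasts per-iteration disk I/O in chained MapReduce jobs with Spark's in-memory reuse. A minimal sketch of that pattern in Scala, assuming a hypothetical HDFS input path and a toy update rule; it only illustrates why caching removes the repeated disk reads the statement refers to:

import org.apache.spark.sql.SparkSession

object IterativeCachingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("iterative-caching").getOrCreate()
    val sc = spark.sparkContext

    // Load once and cache: later iterations read in-memory partitions
    // instead of recomputing the lineage back to HDFS on every pass.
    val values = sc.textFile("hdfs:///data/values.txt") // hypothetical path
      .map(_.toDouble)
      .cache()
    val n = values.count() // first action materializes the cache

    var estimate = 0.0
    for (_ <- 1 to 10) {
      // Each iteration launches a new Spark job over the cached RDD; a
      // chained-MapReduce design would write to and re-read from disk
      // between every pass, and re-schedule each job from scratch.
      val error = values.map(v => v - estimate).reduce(_ + _) / n
      estimate += 0.5 * error // toy damped update toward the mean
    }
    println(s"estimate = $estimate")
    spark.stop()
  }
}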
“…Multiple Spark jobs initiated by different threads may run concurrently within a Spark application, and each application gets its own executor processes. Spark uses long-running executor processes, which stay up for the entire duration of the application and execute tasks in multiple threads, avoiding the overhead of repeatedly launching tasks [9,10]. The allocation of executor resources on the cluster can be controlled from the Spark YARN client with the --num-executors option, which overrides Spark's built-in dynamic resource allocation (DRA) mechanism [18].…”
Section: Spark Architecture and Resilient Distributed Dataset (RDD) (mentioning)
confidence: 99%
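The statement's point about fixed executor allocation can be expressed in application code as well as on the spark-submit command line. A minimal sketch, assuming a YARN deployment and illustrative resource values; setting spark.executor.instances is the configuration equivalent of --num-executors, and an explicit executor count takes precedence over dynamic resource allocation:

import org.apache.spark.sql.SparkSession

object StaticAllocationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("static-allocation")
      // Equivalent of `--num-executors 4` on spark-submit; pinning an
      // explicit executor count overrides dynamic resource allocation.
      .config("spark.executor.instances", "4")
      .config("spark.executor.cores", "2")   // illustrative values
      .config("spark.executor.memory", "2g")
      .getOrCreate()

    // These long-running executors stay up for the whole application;
    // jobs submitted from different threads share them and run their
    // tasks concurrently, as the cited statement describes.

    spark.stop()
  }
}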