2013
DOI: 10.1016/j.jpdc.2012.12.012

MapReduce with communication overlap (MaRCO)

Abstract: MapReduce is a programming model from Google for cluster-based computing in domains such as search engines, machine learning, and data mining. MapReduce provides automatic data management and fault tolerance to improve programmability of clusters. MapReduce's execution model includes an all-map-to-all-reduce communication, called the shuffle, across the network bisection. Some MapReductions move large amounts of data (e.g., as much as the input data), stressing the bisection bandwidth and introducing significa…
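
To make the all-map-to-all-reduce shuffle described in the abstract concrete, here is a minimal single-process sketch in Python. It is not the paper's implementation and does not model MaRCO's overlap; the function names (`map_task`, `partition`, `shuffle`, `reduce_task`, `word_count`), the word-count workload, and the byte-sum partitioner are all assumptions chosen for illustration.

```python
from collections import defaultdict

def map_task(split):
    """Hypothetical map function: emit a (word, 1) pair for every word."""
    return [(word, 1) for word in split.split()]

def partition(key, num_reducers):
    """Assumed partitioner: a deterministic byte sum modulo the reducer count."""
    return sum(key.encode("utf-8")) % num_reducers

def shuffle(map_outputs, num_reducers):
    """Route every map output pair to the reducer that owns its key.

    Each map task may send data to each reducer, which is the
    all-map-to-all-reduce pattern; on a cluster this step crosses the network.
    """
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for output in map_outputs:
        for key, value in output:
            buckets[partition(key, num_reducers)][key].append(value)
    return buckets

def reduce_task(bucket):
    """Hypothetical reduce function: sum the values collected for each key."""
    return {key: sum(values) for key, values in bucket.items()}

def word_count(splits, num_reducers=2):
    """Run map, shuffle, and reduce one after another with no overlap."""
    map_outputs = [map_task(split) for split in splits]
    buckets = shuffle(map_outputs, num_reducers)
    result = {}
    for bucket in buckets:
        result.update(reduce_task(bucket))
    return result
```

In this sketch the three phases run strictly in sequence, so the shuffle volume equals the total map output; that is the situation in which, per the abstract, the shuffle can stress the bisection bandwidth.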

citations
Cited by 54 publications
(26 citation statements)
references
References 15 publications
“…Among the workloads, "Grep" and "Select" are I/O intensive workloads, and "Aggregation" and "Join" are communication-intensive workloads. In addition, "Inverted Index" has been added as one of highly communication-intensive workloads [11]. We use a mixture of these workloads, and generate a random submission distribution similar to Zaharia et al [19], which is based on the Facebook trace.…”
Section: Evaluation Methodology
Type: mentioning
confidence: 99%
“…The MapReduce with communication overlap scheduling algorithm initiate by F. Ahmad, S. Lee, et al MapReduce with communication overlap scheduling to instate nearly full overlap through the modern scheme of including the sort and reduce in the overlap [44]. The elementary Hadoop data flow was modified assent the operation of Reduce tasks on fractional data.…”
Section: MapReduce With Communication Overlap Scheduling Algorithm
Type: mentioning
confidence: 99%
“…Thus, overlapping the shuffle delay with mapper and reducer computations is performed to reduce the job execution time. (17) In summary, the above approaches tend to reduce the job execution time by finding the best practices for task assignment according to the type of job. Unfortunately, most of them assumed that the CNs are homogeneous and did not take the cloud heterogeneity and network dynamics into account.…”
Section: Performance Improvement Of MapReduce
Type: mentioning
confidence: 99%