2012 IEEE Fifth International Conference on Cloud Computing 2012
DOI: 10.1109/cloud.2012.92

Center-of-Gravity Reduce Task Scheduling to Lower MapReduce Network Traffic

Abstract: MapReduce is by far one of the most successful realizations of large-scale data-intensive cloud computing platforms. MapReduce automatically parallelizes computation by running multiple map and/or reduce tasks over distributed data across multiple machines. Hadoop is an open source implementation of MapReduce. When Hadoop schedules reduce tasks, it neither exploits data locality nor addresses partitioning skew present in some MapReduce applications. This might lead to increased cluster network traffic. In this …

Cited by 71 publications (37 citation statements)
References 23 publications
“…Our work is the first step in expanding on some of their ideas. Other research has focused on the reduce task scheduling [14,15] and the partitioning function [19]. As these works look to schedule tasks and partitions to improve shuffle traffic performance, we look to schedule the network to optimize application performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…As the MapReduce distributed computations were analyzed as a divisible load scheduling problem [21], several classes of algorithms were proposed and examined for scheduling divisible loads on a heterogeneous system with memory limits [22]. Some task scheduling algorithms are to release the data communication among remote slots, e.g., the center-of-gravity reduce scheduler is a locality-aware and skew-aware reduce task scheduler for saving MapReduce network traffic [23], and MaRCO employs eager reduce to process partial data from some map tasks while overlapping with other map tasks' communication [6].…”
Section: Related Work (mentioning)
confidence: 99%
“…The M. Hammoud, M. Rehman, et al developed Center-of-Gravity Reduce Scheduling Algorithm [45]. This technique to designs a locality-aware, skew-aware reduce task scheduler for stop MapReduce network traffic.…”
Section: Center-of-gravity Reduce Scheduling Algorithm (mentioning)
confidence: 99%