2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012)
DOI: 10.1109/ccgrid.2012.122

Maestro: Replica-Aware Map Scheduling for MapReduce

Cited by 87 publications (51 citation statements)
References 15 publications
“…Providing locality for tasks is crucial for performance of Hadoop in large clusters because network bisection bandwidth becomes a bottleneck [10] [16]. Besides, since most of the Hadoop usage is for small jobs (jobs with a small number of map tasks) [26], it is difficult for a small job to obtain slots on nodes with local data.…”
Section: Discussion On Hadoop Performance Under Failure (mentioning)
confidence: 99%
“…The Maestro is a replica aware scheduler algorithm introduced by S. Ibrahim, H. Jin, et al This technique defers the difficulty of non-local tasks that depend on replica of map tasks [43]. The Maestro keeps track of the chunks and duplicate locations, along with the number of other chunks hosted by each node.…”
Section: Maestro Scheduling Algorithm (mentioning)
confidence: 99%
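The bookkeeping described in the statement above (chunk locations, replica locations, and how many other chunks each node hosts) can be illustrated with a minimal sketch. This is not the Maestro implementation: the function names, the dictionary layout, and the sample cluster are all hypothetical, and the sketch covers only the tracking step, not the scheduling decisions built on it.

```python
from collections import defaultdict


def build_replica_index(chunk_replicas):
    """Invert a chunk -> replica-nodes map into node -> set of hosted chunks."""
    hosted = defaultdict(set)
    for chunk, nodes in chunk_replicas.items():
        for node in nodes:
            hosted[node].add(chunk)
    return dict(hosted)


def other_chunks_per_replica(chunk, chunk_replicas, hosted):
    """For each node holding a replica of `chunk`, count the other chunks it hosts."""
    return {node: len(hosted[node]) - 1 for node in chunk_replicas[chunk]}


# Hypothetical layout: three chunks replicated across three nodes.
chunk_replicas = {
    "c1": ["n1", "n2"],
    "c2": ["n2", "n3"],
    "c3": ["n2"],
}

hosted = build_replica_index(chunk_replicas)
counts = other_chunks_per_replica("c1", chunk_replicas, hosted)
print(counts)
```

In this made-up layout, node n1 hosts only c1 while n2 hosts all three chunks, so the count distinguishes a replica holder with no other local work from one with plenty.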
“…Replication has become an essential feature in storage systems and is extensively leveraged in cloud environments [3][4] [5]. It is the main reason behind several features such as fast accesses, enhanced performance, and high availability.…”
Section: Figure 1 Leveraging Geographically-distributed Data Replica (mentioning)
confidence: 99%