2012 SC Companion: High Performance Computing, Networking, Storage and Analysis
DOI: 10.1109/sc.companion.2012.151

Resource Management for Dynamic MapReduce Clusters in Multicluster Systems

Abstract: State-of-the-art MapReduce frameworks such as Hadoop can easily scale up to thousands of machines and to large numbers of users. Nevertheless, some users may require isolated environments to develop their applications and to process their data, which calls for multiple deployments of MR clusters within the same physical infrastructure. In this paper, we design and implement a resource management system to facilitate the on-demand isolated deployment of MapReduce clusters in multicluster systems. Deplo…


Cited by 12 publications (11 citation statements)
References 11 publications (9 reference statements)
“…For the experiments on Hadoop and YARN, we run 20 map tasks and 20 reduce tasks on the 20 computing nodes. Due to the settings used for Hadoop [33], the map phase will be completed in one wave; all the reduce tasks can also be finished in one wave, without any overlap with the map phase [38]. In Giraph, Stratosphere, and GraphLab, we set the parallelization degree to 20 tasks, also equal to the total number of computing nodes.…”
Section: A Basic Performance: Job Execution Time
confidence: 99%
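The one-wave condition in the excerpt above is simple arithmetic: a phase finishes in a single wave when its task count does not exceed the number of concurrently available task slots. A minimal sketch (the function name `waves` is illustrative, not from the cited papers):

```python
import math

def waves(num_tasks: int, num_slots: int) -> int:
    """Number of scheduling waves needed to run num_tasks
    on num_slots concurrently available task slots."""
    return math.ceil(num_tasks / num_slots)

# 20 map tasks and 20 reduce tasks on 20 nodes (one slot each):
# each phase completes in a single wave.
print(waves(20, 20))  # 1
# With more tasks than slots, extra waves are needed.
print(waves(45, 20))  # 3
```

With one slot per node, setting the task count equal to the node count (as in the experiment) guarantees exactly one wave per phase.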
“…In our previous work [12], we have found that the execution time of disk-intensive jobs increases with the ratio between transient and core nodes, while the performance of compute-intensive jobs is independent of the types of nodes.…”
Section: Node Types
confidence: 88%
“…Koala is a resource manager which co-allocates processors, possibly from multiple clusters, to various HPC applications and to isolated MapReduce [12] frameworks. When resources are available, each framework may receive additional resources from Koala, but it is their decision to accept or reject them.…”
Section: Related Work
confidence: 99%
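The accept-or-reject interaction described above is essentially a resource-offer protocol: the resource manager offers spare nodes, and each framework decides whether to take them. A hypothetical Python sketch (class names and the acceptance rule are assumptions for illustration, not Koala's actual implementation):

```python
from dataclasses import dataclass

@dataclass
class ResourceOffer:
    """An offer of additional nodes from the resource manager."""
    nodes: int

class Framework:
    """Toy framework that accepts extra nodes only while it has
    more pending tasks than nodes to run them on."""

    def __init__(self, pending_tasks: int, nodes: int = 0):
        self.pending_tasks = pending_tasks
        self.nodes = nodes

    def consider(self, offer: ResourceOffer) -> bool:
        # Accept only if the framework can actually use the nodes.
        if self.pending_tasks > self.nodes:
            self.nodes += offer.nodes
            return True
        return False

busy = Framework(pending_tasks=10, nodes=4)
idle = Framework(pending_tasks=0, nodes=4)
print(busy.consider(ResourceOffer(nodes=2)))  # True
print(idle.consider(ResourceOffer(nodes=2)))  # False
```

The key design point the excerpt attributes to Koala is that the decision lives in the framework, not the resource manager: the manager only makes offers, so an idle framework can decline resources it would waste.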
“…Omega [13] addresses resource allocations across applications and resolves conflicts by optimistic concurrency control. In contrast to above studies, Ghit et al [14] propose a resource management system to facilitate the deployment of MapReduce clusters in an on-demand fashion and with …”
Section: A Big Data Platforms
confidence: 99%