2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID)
DOI: 10.1109/ccgrid49817.2020.00-61
Salamander: a Holistic Scheduling of MapReduce Jobs on Ephemeral Cloud Resources

Abstract: Most cloud data centers are over-provisioned and underutilized, primarily to handle peak loads and sudden failures. This has motivated many researchers to reclaim the unused resources, which are by nature ephemeral, to run data-intensive applications at a lower cost. Hadoop MapReduce is one of those applications. However, it was designed on the assumption that resources are available as long as users pay for the service. In order to make it possible for Hadoop to run on unused (ephemeral) resources, we have de…

Cited by 4 publications (13 citation statements)
References 22 publications
“…We simulated the MapReduce task scheduler (Hadoop 2.7.0) to evaluate the proposed methodology on an Ubuntu server with a 12-core CPU (hyper-threaded), 64 GB of memory, 4 x 1 TB HDD storage, and a maximum disk bandwidth of 100 MB/s. We compared IDLACO with the classical fair scheduler and the Holistic scheduler [16] based on parameters such as the…”
Section: Results and Analysis
confidence: 99%
“…The proposed approach relies largely on data locality to minimize job latency, with frequent on-demand replication of data blocks. A holistic scheduler was designed by Mohamed Handaoui et al. in [16] to improve resource utilization and job latency. It consists of three components: predicting resource utilization, determining data-local executions, and minimizing interference from co-located workloads.…”
Section: Literature Survey
confidence: 99%
“…These predictions are used to reclaim unused resources so that they can be allocated to customers. The prediction shows good overall accuracy and was used in other studies for scheduling big data applications [8], [9]. We followed 2 steps in this section: 1) Using the predictions alongside the in-production traces, we analyzed the prediction errors at different granularities: datacenter, host, and resource metric.…”
Section: Motivation
confidence: 99%
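The error analysis described in that quote can be pictured as grouping absolute prediction errors at a chosen granularity. A minimal sketch, assuming hypothetical trace records of the form (datacenter, host, metric, predicted, actual); the function name and record layout are illustrative, not from the cited work:

```python
from collections import defaultdict

# Toy in-production trace: (datacenter, host, metric, predicted, actual).
records = [
    ("dc1", "h1", "cpu", 0.50, 0.55),
    ("dc1", "h2", "cpu", 0.70, 0.60),
    ("dc2", "h3", "mem", 0.40, 0.42),
]

def mean_abs_error(records, granularity):
    """Mean absolute prediction error, grouped at the chosen granularity
    ('datacenter', 'host', or 'metric'), mirroring the per-level analysis
    described in the quoted motivation section."""
    sums, counts = defaultdict(float), defaultdict(int)
    for dc, host, metric, pred, actual in records:
        key = {"datacenter": dc, "host": host, "metric": metric}[granularity]
        sums[key] += abs(pred - actual)
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}

print(mean_abs_error(records, "datacenter"))  # dc1 ≈ 0.075, dc2 ≈ 0.02
```

The same call with `"host"` or `"metric"` yields the finer-grained views mentioned in the quote.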
“…The safety margin may be a static value, that is, a fixed proportion of resources applied at all times, for all hosts and resource metrics. This strategy was used in Cuckoo [8] and Salamander [9], where fixed proportions were empirically tested to select the best one. Although this strategy does decrease potential SLA violations, a substantial amount of resources remained unused due to resource usage overestimations [9].…”
Section: Introduction
confidence: 99%
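The static safety margin described above amounts to reserving a fixed fraction of each host's capacity before offering the remainder as ephemeral resources. A minimal sketch under that assumption; the function name, the 10% margin, and the example numbers are illustrative, not taken from Cuckoo or Salamander:

```python
def reclaimable(capacity, predicted_usage, margin=0.10):
    """Capacity that can be offered as ephemeral resources after
    subtracting predicted usage and a static safety margin
    (a fixed proportion of total capacity, same for every host/metric)."""
    reserve = margin * capacity
    return max(0.0, capacity - predicted_usage - reserve)

# Example: a 64 GB host predicted to use 40 GB, with a 10% static margin,
# leaves roughly 17.6 GB reclaimable.
print(reclaimable(64.0, 40.0))
```

The quote's criticism follows directly from this shape: because `margin` is fixed regardless of how accurate the prediction is, overestimated `predicted_usage` plus the flat reserve leaves real capacity idle.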