2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2016
DOI: 10.1109/ccgrid.2016.10

OptEx: A Deadline-Aware Cost Optimization Model for Spark

Abstract: We present OptEx, a closed-form model of job execution on Apache Spark, a popular parallel processing engine. To the best of our knowledge, OptEx is the first work that analytically models job completion time on Spark. The model can be used to estimate the completion time of a given Spark job on a cloud with respect to the size of the input dataset, the number of iterations, and the number of nodes comprising the underlying cluster. Experimental results demonstrate that OptEx yields a mean relative error of 6% in…

Cited by 39 publications (34 citation statements)
References 10 publications

“…Some researchers have chosen to exclude workload allocation. Since Sidhanta et al [42] built their OptEx framework on the Spark framework, they relied on the existing built-in mechanisms for workload allocation. On the other hand, Mao et al [36] assumed prior knowledge regarding the fixed distribution of task allocations to each VM type.…”
Section: Literature Analysis (mentioning)
confidence: 99%
“…The majority of the existing work considers both constraints and objectives whilst making a scheduling decision. For instance, some researchers have focused on optimising cloud usage by minimising the monetary cost while satisfying the deadline constraint [18,20,21,26,27,29,33,37,39,40,41,42], which aims to achieve the desired performance with the minimal cost. Researchers [24,32,46] have addressed the problem of performance maximisation with a budget constraint with the objective of obtaining with the maximum performance within a budgetary constraint.…”
Section: Literature Analysis (mentioning)
confidence: 99%
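
The deadline-constrained cost minimisation described in the statement above can be illustrated with a minimal sketch. It is not the OptEx model: the closed form is not given on this page, so `estimate_completion_time` is a hypothetical placeholder with made-up coefficients, and `cheapest_cluster`, the per-node hourly price, and the `max_nodes` bound are likewise assumptions introduced only for illustration. The sketch takes a completion-time estimate as a function of input size, iteration count, and node count, and searches for the cheapest cluster size that still meets the deadline.

```python
from typing import Callable, Optional, Tuple


def estimate_completion_time(input_gb: float, iterations: int, nodes: int) -> float:
    """Hypothetical placeholder estimator (seconds). NOT the OptEx closed form."""
    startup_s = 30.0               # assumed fixed startup cost
    work_s_per_gb_iter = 12.0      # assumed work per GB per iteration
    coord_s_per_node_iter = 0.5    # assumed coordination cost per node per iteration
    parallel = work_s_per_gb_iter * input_gb * iterations / nodes
    coordination = coord_s_per_node_iter * nodes * iterations
    return startup_s + parallel + coordination


def cheapest_cluster(
    input_gb: float,
    iterations: int,
    deadline_s: float,
    price_per_node_hour: float,
    max_nodes: int = 64,
    estimator: Callable[[float, int, int], float] = estimate_completion_time,
) -> Optional[Tuple[int, float, float]]:
    """Return (nodes, cost, estimated_seconds) for the cheapest cluster size
    whose estimated completion time meets the deadline, or None if no size
    up to max_nodes is feasible."""
    best: Optional[Tuple[int, float, float]] = None
    for nodes in range(1, max_nodes + 1):
        seconds = estimator(input_gb, iterations, nodes)
        if seconds > deadline_s:
            continue  # this cluster size misses the deadline
        cost = nodes * price_per_node_hour * seconds / 3600.0
        if best is None or cost < best[1]:
            best = (nodes, cost, seconds)
    return best


if __name__ == "__main__":
    print(cheapest_cluster(input_gb=10, iterations=5,
                           deadline_s=300, price_per_node_hour=0.2))
```

An exhaustive scan over node counts is used because the search space is small and it makes no assumption about the shape of the estimator; any other completion-time model can be passed in through the `estimator` parameter.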