2016 IEEE 9th International Conference on Cloud Computing (CLOUD) 2016
DOI: 10.1109/cloud.2016.0034

Stage Aware Performance Modeling of DAG Based in Memory Analytic Platforms

Cited by 31 publications (16 citation statements)
References 18 publications
“…Cloud providers offer VMs of different capacity and cost. Given the complexity of virtualized systems and multiple bottleneck switches that occur during Big Data applications execution, very often the largest VM available is not the best choice from either the performance or performance/cost ratio perspective [49,19]. Through a search space exploration, our approach determines the optimal VM type and instances number, considering also specific Cloud provider pricing models (namely reserved and spot instances [1]).…”
Section: Introduction (mentioning)
confidence: 99%
“…Because of this, predicting the execution time of Hadoop jobs is usually done empirically through experimentation, requiring a costly setup [15]. An alternative is to develop models for predicting performance.…”
Section: Introduction (mentioning)
confidence: 99%
“…This idea of profiling is also used in [16,17]. Task durations are obtained from execution logs, and on average, each query runs 20 times.…”
Section: Composite Dag Model (mentioning)
confidence: 99%
“…Other more recent models, presented in this area are simulation-based models for which analysis is time consuming and less scalable [13,14]. Methods based on machine learning are good for interpolation, but suffer from low generality and insight [15,16,17]. Moreover, machine learning needs costly cluster setup to study historical logs of past executions.…”
Section: Introduction (mentioning)
confidence: 99%