2021 IEEE International Conference on Cluster Computing (CLUSTER) 2021
DOI: 10.1109/cluster48925.2021.00052

Bellamy: Reusing Performance Models for Distributed Dataflow Jobs Across Contexts

Abstract: There may be differences between this version and the published version. You are advised to consult the publisher's version if you wish to cite from it.

Cited by 17 publications
(19 citation statements)
References 20 publications
“…They often use runtime data to predict the scale-out and runtime behavior of jobs. The runtime data can be gained either from dedicated profiling with a sample of the dataset or previous full executions [19], [20], [23], [24], [26], [27].…”
Section: A Offline Runtime Prediction (mentioning)
confidence: 99%
“…Meanwhile, metrics from previous executions of a job are not always available. As a possible remedy for this issue, we previously proposed a system [27] that facilitates the global sharing of context aware runtime models, allowing for runtime prediction based on historical executions of a job by different users [23], [24]. Enel assumes a recurring job and thus, the initial profiling cost can be amortized over time.…” [Footnote 1 in the citing paper: https://github.com/dos-group/enel-experiments]
Section: A Offline Runtime Prediction (mentioning)
confidence: 99%
“…Other approaches use runtime data to predict the scale-out and runtime behavior of jobs. This data is gained either from dedicated profiling or previous full executions [3], [12], [15]-[20].…”
Section: B Performance Model-based (mentioning)
confidence: 99%
“…Many existing approaches iteratively search for suitable cluster configurations [7]-[11]. Several other approaches [3], [12]-[15] build runtime models, which are then used to evaluate possible configurations, including our previous work [16]-[20]. Here, training data for the runtime models is typically generated with dedicated profiling runs on reduced samples of the dataset, or historical runtime data is used.…”
Section: Introduction (mentioning)
confidence: 99%