Proceedings of the 16th Annual Middleware Conference 2015
DOI: 10.1145/2814576.2814730

Configuring Distributed Computations Using Response Surfaces

Abstract: Configuring large distributed computations is a challenging task. Efficiently executing distributed computations requires configuration tuning based on careful examination of application and hardware properties. Considering the large number of parameters and the impracticality of using trial and error in a production environment, programmers tend to make these decisions based on their experience and rules of thumb. Such configurations can lead to underutilized and costly clusters, and missed deadlines. In this pap…
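The abstract is truncated above, but the title and excerpt indicate the approach models job performance as a response surface over configuration parameters and uses the fitted surface to choose a configuration. A minimal sketch of that general idea in Python, assuming hypothetical parameters (num_workers, block_size_mb) and made-up timings rather than the paper's actual benchmarks or model:

```python
# Sketch: fit a quadratic response surface to a handful of benchmarked
# configurations, then pick the configuration the surface predicts as best.
# Parameter names and timings are illustrative, not taken from the paper.
import numpy as np

# (num_workers, block_size_mb) -> measured job completion time in seconds
samples = np.array([
    [4,   64, 310.0],
    [4,  128, 295.0],
    [4,  256, 280.0],
    [8,   64, 215.0],
    [8,  128, 200.0],
    [8,  256, 195.0],
    [16,  64, 180.0],
    [16, 128, 172.0],
    [16, 256, 178.0],
])
X, y = samples[:, :2], samples[:, 2]

def quad_features(X):
    """Second-order (response surface) design matrix: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

# Least-squares fit of the surface coefficients
coef, *_ = np.linalg.lstsq(quad_features(X), y, rcond=None)

# Evaluate the fitted surface over a candidate grid and pick the minimum
workers = np.arange(4, 21, 2)
blocks = np.array([64, 128, 192, 256])
grid = np.array([[w, b] for w in workers for b in blocks])
pred = quad_features(grid) @ coef
best = grid[np.argmin(pred)]
print(f"predicted best config: workers={int(best[0])}, block_size_mb={int(best[1])}")
```

A real deployment would need many more samples and parameters; the point is only to show the fit-then-optimize loop that a response-surface approach implies.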

Cited by 22 publications (8 citation statements) · References 39 publications
“…ML-based methods aim to gain the best of both worlds by using experiential data (as benchmarks do) to model application behaviour on different deployment setups. A prominent trend in such studies is to focus on data analytics and MapReduce-style applications [7], [8], [38], [39], due to their operational footprint and having a recurrent workload pattern which is relatively easy to model. However, a common overhead here is training: significant data and time (which translates to cloud costs) are needed to train a model.…”
Section: Related Work (mentioning)
confidence: 99%
“…Indeed, ML-aided DSS have been demonstrated to provide behavioural and performance insights about application and deployment setup necessary to make optimal decisions, e.g. [4], [7], [8], [9]. A traditional ML approach follows the general steps of: collecting data, generating a learning model, fitting the model on training data, and assessing its accuracy on test data.…”
Section: Introduction (mentioning)
confidence: 99%
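The excerpt above spells out the traditional ML steps: collect data, generate a learning model, fit it on training data, and assess accuracy on test data. A minimal sketch of that generic workflow with scikit-learn, using hypothetical deployment-setup features and synthetic runtime data (nothing here is taken from the cited systems):

```python
# Generic collect -> split -> fit -> evaluate loop for a performance model.
# Features and the runtime target are synthetic stand-ins; in practice each
# row would come from an actual benchmarked run on a deployment setup.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# "Collected" data: (cpu_cores, memory_gb, input_size_gb) -> runtime (s)
X = rng.uniform([2, 4, 1], [32, 128, 500], size=(200, 3))
y = 50 + 0.8 * X[:, 2] / X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 2, 200)

# Split into training and test data, fit the model, assess its accuracy
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
print(f"R^2 on held-out test data: {model.score(X_test, y_test):.2f}")
```

In the setting the excerpts describe, every training row corresponds to a real (and billed) cluster run, which is exactly the training overhead the related-work excerpt flags.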
“…NoSQL systems are also used to execute complex aggregation operations on the stored data using MapReduce techniques. Gencer et al. [15] observe that different hardware, system, and software configurations can have a significant impact on the time it takes to perform a MapReduce computation. Since the search space of possible configurations is very large, trying all the different combinations on a real system is infeasible, especially if hardware reconfigurations are required.…”
Section: Static Workload-based Optimization (mentioning)
confidence: 99%
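To see why the excerpt calls exhaustive search infeasible, a back-of-the-envelope count of a configuration space with made-up knobs and value counts (not taken from Gencer et al.):

```python
# Rough illustration of configuration-space growth; every knob and its
# number of candidate values below is invented for illustration.
from math import prod

candidate_values_per_knob = {
    "node_count": 8,
    "cores_per_node": 4,
    "memory_per_node": 4,
    "map_slots": 6,
    "reduce_slots": 6,
    "hdfs_block_size": 5,
    "compression_codec": 3,
}
total = prod(candidate_values_per_knob.values())
print(f"{total} distinct configurations to benchmark exhaustively")  # 69120
```

Even at a few minutes per benchmark run, exhausting such a space would consume months of cluster time, which is the gap response-surface and ML-based performance models aim to close.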
“…Indeed, ML-aided DSS have been demonstrated to provide behavioural and performance insights about application and deployment setup necessary to make optimal decisions, e.g. [41,9,20,1]. A traditional ML approach follows the general steps of: collecting data, generating a learning model, fitting the model on training data, and assessing its accuracy on test data.…”
Section: Introduction (mentioning)
confidence: 99%