Proceedings of the 16th Annual Middleware Conference 2015
DOI: 10.1145/2814576.2814730

Configuring Distributed Computations Using Response Surfaces

Abstract: Configuring large distributed computations is a challenging task. Efficiently executing distributed computations requires configuration tuning based on careful examination of application and hardware properties. Considering the large number of parameters and the impracticality of using trial and error in a production environment, programmers tend to make these decisions based on their experience and rules of thumb. Such configurations can lead to underutilized and costly clusters, and missed deadlines. In this pap…
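The abstract is truncated above, but the title and excerpt indicate the approach models job performance as a response surface over configuration parameters and uses the fitted surface to choose a configuration. A minimal sketch of that general idea in Python, assuming hypothetical parameters (num_workers, block_size_mb) and made-up timings rather than the paper's actual benchmarks or model:

```python
# Sketch: fit a quadratic response surface to a handful of benchmarked
# configurations, then pick the configuration the surface predicts as best.
# Parameter names and timings are illustrative, not taken from the paper.
import numpy as np

# (num_workers, block_size_mb) -> measured job completion time in seconds
samples = np.array([
    [4,   64, 310.0],
    [4,  128, 295.0],
    [4,  256, 280.0],
    [8,   64, 215.0],
    [8,  128, 200.0],
    [8,  256, 195.0],
    [16,  64, 180.0],
    [16, 128, 172.0],
    [16, 256, 178.0],
])
X, y = samples[:, :2], samples[:, 2]

def quad_features(X):
    """Second-order (response surface) design matrix: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

# Least-squares fit of the surface coefficients
coef, *_ = np.linalg.lstsq(quad_features(X), y, rcond=None)

# Evaluate the fitted surface over a candidate grid and pick the minimum
workers = np.arange(4, 21, 2)
blocks = np.array([64, 128, 192, 256])
grid = np.array([[w, b] for w in workers for b in blocks])
pred = quad_features(grid) @ coef
best = grid[np.argmin(pred)]
print(f"predicted best config: workers={int(best[0])}, block_size_mb={int(best[1])}")
```

A real deployment would need many more samples and parameters; the point is only to show the fit-then-optimize loop that a response-surface approach implies.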

Cited by 22 publications (8 citation statements) · References 39 publications
“…ML-based methods aim to gain the best of both worlds by using experiential data (as benchmarks do) to model application behaviour on different deployment setups. A prominent trend in such studies is to focus on data analytics and MapReduce-style applications [7], [8], [38], [39], due to their operational footprint and having a recurrent workload pattern which is relatively easy to model. However, a common overhead here is training: significant data and time (which translates to cloud costs) are needed to train a model.…”
Section: Related Work (mentioning)
confidence: 99%
“…Indeed, ML-aided DSS have been demonstrated to provide behavioural and performance insights about application and deployment setup necessary to make optimal decisions, e.g. [4], [7], [8], [9]. A traditional ML approach follows the general steps of: collecting data, generating a learning model, fitting the model on training data, and assessing its accuracy on test data.…”
Section: Introduction (mentioning)
confidence: 99%
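The excerpt above spells out the traditional ML steps: collect data, generate a learning model, fit it on training data, and assess accuracy on test data. A minimal sketch of that generic workflow with scikit-learn, using hypothetical deployment-setup features and synthetic runtime data (nothing here is taken from the cited systems):

```python
# Generic collect -> split -> fit -> evaluate loop for a performance model.
# Features and the runtime target are synthetic stand-ins; in practice each
# row would come from an actual benchmarked run on a deployment setup.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# "Collected" data: (cpu_cores, memory_gb, input_size_gb) -> runtime (s)
X = rng.uniform([2, 4, 1], [32, 128, 500], size=(200, 3))
y = 50 + 0.8 * X[:, 2] / X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 2, 200)

# Split into training and test data, fit the model, assess its accuracy
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
print(f"R^2 on held-out test data: {model.score(X_test, y_test):.2f}")
```

In the setting the excerpts describe, every training row corresponds to a real (and billed) cluster run, which is exactly the training overhead the related-work excerpt flags.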
“…NoSQL systems are also used to execute complex aggregation operations on the stored data using MapReduce techniques. Gencer et al. [15] observe that different hardware, system, and software configurations can have a significant impact on the time it takes to perform a MapReduce computation. Since the search space of possible configurations is very large, trying all the different combinations on a real system is infeasible, especially if hardware reconfigurations are required.…”
Section: Static Workload-based Optimization (mentioning)
confidence: 99%
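To see why the excerpt calls exhaustive search infeasible, a back-of-the-envelope count of a configuration space with made-up knobs and value counts (not taken from Gencer et al.):

```python
# Rough illustration of configuration-space growth; every knob and its
# number of candidate values below is invented for illustration.
from math import prod

candidate_values_per_knob = {
    "node_count": 8,
    "cores_per_node": 4,
    "memory_per_node": 4,
    "map_slots": 6,
    "reduce_slots": 6,
    "hdfs_block_size": 5,
    "compression_codec": 3,
}
total = prod(candidate_values_per_knob.values())
print(f"{total} distinct configurations to benchmark exhaustively")  # 69120
```

Even at a few minutes per benchmark run, exhausting such a space would consume months of cluster time, which is the gap response-surface and ML-based performance models aim to close.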
“…Indeed, ML-aided DSS have been demonstrated to provide behavioural and performance insights about application and deployment setup necessary to make optimal decisions, e.g. [41,9,20,1]. A traditional ML approach follows the general steps of: collecting data, generating a learning model, fitting the model on training data, and assessing its accuracy on test data.…”
Section: Introduction (mentioning)
confidence: 99%