2022
DOI: 10.48550/arxiv.2203.14889
Preprint

LOCAT: Low-Overhead Online Configuration Auto-Tuning of Spark SQL Applications

Abstract: Spark SQL has been widely deployed in industry but it is challenging to tune its performance. Recent studies try to employ machine learning (ML) to solve this problem. They however suffer from two drawbacks. First, it takes a long time (high overhead) to collect training samples. Second, the optimal configuration for one input data size of the same application might not be optimal for others. To address these issues, we propose a novel Bayesian Optimization (BO) based approach named LOCAT to automatically tune …

Citations: cited by 1 publication (5 citation statements)
References: 52 publications
“…CherryPick [1] directly performs BO on a discretized search space. LOCAT [38] further combines dynamic sensitivity analysis and datasize-aware Gaussian process (GP) to perform optimization on important parameters. Despite the competitive converged results, the aforementioned methods suffer from the re-optimization issue [23], which is, the performance model needs retraining and still requires a number of online configuration evaluations for each coming task.…”
Section: Related Work (mentioning)
confidence: 99%
“…Benchmarks. For end-to-end comparison, we follow LOCAT [38] and use three SQL-related tasks from the widely used Spark benchmark HiBench [16]: (1) 'Join' is a query that executes in two phases: if M ≠ ∅ then 3:…”
Section: Setups (mentioning)
confidence: 99%