2019
DOI: 10.14778/3352063.3352112

Speedup your analytics

Abstract: Database and big data analytics systems such as Hadoop and Spark have a large number of configuration parameters that control memory distribution, I/O optimization, parallelism, and compression. Improper parameter settings can cause significant performance degradation and stability issues. However, regular users and even expert administrators struggle to understand and tune them to achieve good performance. In this tutorial, we review existing approaches on automatic parameter tuning for databases, Hadoop, and…

Cited by 42 publications
(7 citation statements)
References 21 publications
“…In [67], Lee et al improved the locality of network and storage I/O operations on many-core systems running Big Data applications using Apache Hadoop MapReduce. In [68], Lu et al discussed the importance of proper parameter settings in high-performance database systems for Big Data. In [69], Zhang et al defined a new benchmark and a new set of tools for benchmarking database systems for Big Data applications.…”
Section: Enabling Technologies For Big Data (mentioning)
confidence: 99%
“…Spark is characterized by its in-memory computation and high expressiveness [91]. Based on these capabilities, Spark has become a natural choice to support two components of Big Data in iterative and reactive applications [92]. Regarding the speed, Spark has been reported having ten times faster than MR on disk-resident tasks and a hundred times faster for the memory-resident task [93].…”
Section: Bim-iot and Big-data Principle (mentioning)
confidence: 99%
“…With the native support from various cloud computing services, Spark-based big data analytics has been thriving in academia and industry. Nevertheless, managing the resources with proper configuration for Spark jobs remains challenging [21,37].…”
Section: Introduction (mentioning)
confidence: 99%