2016
DOI: 10.1007/978-3-319-49748-8_6

Performance Evaluation of Spark SQL Using BigBench

Cited by 8 publications (10 citation statements)
References 11 publications
“…Queries Q02 and Q30 achieve standard deviations varying between 7% and 16%, which will be explained in Section 5.4. All other queries have standard deviations around 10%, which indicates that SparkSQL is less stable than Hive, as reported in the work of Ivanov and Beer. We believe this is also due to execution noise in the cluster, which affects SparkSQL more because its execution times are generally much shorter than Hive's.…”
Section: Spark SQL
confidence: 76%
“…This work is a continuation of a series of benchmark experiments conducted at the Frankfurt Big Data Lab.…”
Section: Introduction
confidence: 99%
“…Apache Spark [25]: similar to Hive, Spark is another popular framework gaining momentum [12]. Spark is a processing engine that provides increased performance over the original MapReduce by leveraging in-memory computation.…”
Section: Background and Related Work
confidence: 99%
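The statement above attributes Spark's speedup over MapReduce to in-memory computation. As a minimal sketch of that idea (the input path and column names are hypothetical, not taken from the cited papers), the PySpark snippet below caches an intermediate DataFrame so later actions reuse it from executor memory instead of recomputing it from disk:

# Minimal PySpark sketch of in-memory reuse; path and columns are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("in-memory-sketch").getOrCreate()

# Hypothetical columnar input; any source read into a DataFrame works the same way.
sales = spark.read.parquet("/data/web_sales")

# Cache the filtered result in memory; subsequent actions reuse the cached data
# rather than re-reading and re-filtering the source.
recent = sales.filter(sales.ws_sold_date_sk > 2451000).cache()

recent.count()                                # first action materializes the cache
recent.groupBy("ws_item_sk").count().show()   # served from the in-memory copy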
“…We believe that BigBench [5] is the current reference benchmark for such systems. Regarding BigBench query results, to date only a handful of official submissions are available [20], along with a few publications with detailed per-query characterization [2,12]. More established benchmarks, e.g., TPC-H, have been analyzed much more thoroughly, including work on their query choke points, as in "TPC-H Analyzed" [1].…”
Section: Related Work
confidence: 99%
“…[51]:
• yarn.nodemanager.resource.memory-mb
• yarn.nodemanager.resource.cpu-vcores
• yarn.scheduler.maximum-allocation-mb
• yarn.scheduler.minimum-allocation-mb
• yarn.scheduler.maximum-allocation-vcores
• yarn.scheduler.minimum-allocation-vcores…”
confidence: 99%
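The YARN properties quoted above bound how much memory and how many vcores a NodeManager offers and how large a single container request may be. As a hedged sketch (the limit values and executor settings below are illustrative assumptions, not figures from the paper), a Spark SQL session on YARN would size its executors to fit inside those bounds:

# Illustrative PySpark sketch: executor requests must fit the YARN limits above,
# e.g. spark.executor.memory plus the memory overhead must not exceed
# yarn.scheduler.maximum-allocation-mb, and spark.executor.cores must not exceed
# yarn.scheduler.maximum-allocation-vcores. The numbers below are assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("yarn-sizing-sketch")
    .master("yarn")
    .config("spark.executor.memory", "12g")         # fits a hypothetical 16384 MB max allocation
    .config("spark.executor.memoryOverhead", "2g")  # counted against the same container limit
    .config("spark.executor.cores", "4")            # within a hypothetical 8-vcore max allocation
    .getOrCreate()
)

# Trivial Spark SQL query to confirm the session started on the YARN cluster.
spark.sql("SELECT 1 AS ok").show()

Requests that exceed the configured maximum allocation are rejected by YARN, so these scheduler properties directly cap executor sizing and, with it, the resources available to Spark SQL queries.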