2016 IEEE International Conference on Big Data (Big Data) 2016
DOI: 10.1109/bigdata.2016.7840751

The state of SQL-on-Hadoop in the cloud

Abstract: Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose use and ready to run, giving companies a quick entry point and on-demand deployment of ready-made SQL-like solutions for their big data needs. This study evaluates cloud services from an end-user perspective, comparing providers including Microsoft Azure, Amazon Web Services, Google Cloud, and Rackspace. The study …

Cited by 8 publications (11 citation statements)
References 13 publications
“…The reason for Hive in GCD being at least twice as slow is due to GCD not using Tez as the execution engine, and using the legacy Map/Reduce engine. This result is consistent with previous work [16].…”
Section: Power Tests From 1GB To 10TB (supporting)
confidence: 94%
“…We can also see a great variation of results in HDI with Spark, as with 16 streams is the slowest of the systems, but the fastest at 32 streams. This situation also highlights the variability of cloud results as we have studied previously in [16].…”
Section: Additional Experiments (supporting)
confidence: 71%
“…A recent work by Poggi et al evaluated the Hive on MapReduce and Hive on Tez (with default ORC format configuration) performance on multiple cloud providers using the TPC‐H benchmark. The results showed that the price‐to‐performance ratio for the best cloud configuration is within a 30% cost difference for the 1TB scale.…”
Section: Background and Related Work (mentioning)
confidence: 99%