Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data 2014
DOI: 10.1145/2588555.2595630
|View full text |Cite
|
Sign up to set email alerts
|

Major technical advancements in apache hive

Abstract: Apache Hive is a widely used data warehouse system for Apache Hadoop, and has been adopted by many organizations for various big data analytics applications. Closely working with many users and organizations, we have identified several shortcomings of Hive in its file formats, query planning, and query execution, which are key factors determining the performance of Hive. In order to make Hive continuously satisfy the requests and requirements of processing increasingly high volumes data in a scalable and effic… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1
1

Citation Types

0
48
0
1

Year Published

2015
2015
2022
2022

Publication Types

Select...
3
3
3

Relationship

0
9

Authors

Journals

citations
Cited by 124 publications
(51 citation statements)
references
References 29 publications
0
48
0
1
Order By: Relevance
“…In big data query systems, column-wise compression [26] [27] [28] is commonly seen. [26] uses a method that simply compresses column data into a series of self-contained blocks.…”
Section: Data Compressionmentioning
confidence: 99%
See 1 more Smart Citation
“…In big data query systems, column-wise compression [26] [27] [28] is commonly seen. [26] uses a method that simply compresses column data into a series of self-contained blocks.…”
Section: Data Compressionmentioning
confidence: 99%
“…Another scheme, Llama [27] allows groupedcolumn compression with different schemes best suitable for groups of records, and builds indices for compression blocks. Hive [28] is a data management system that allows column-wise compression. Data is organized like database tables in Optimized Record Columnar (ORC) format which is specific to Hive.…”
Section: Data Compressionmentioning
confidence: 99%
“…Its recent version has shifted from the classic MapReduce to Tez platform [40], which has made it practical for interactive analytics by accelerating it many times. ORC (Optimized Row Column) is used in Hive [41].…”
Section: Conceptual Big Data Architecture For the Oil And Gas Industrymentioning
confidence: 99%
“…Apache provides several database oriented frameworks like Pig [23], Hive [24] and HBase [25]. Apache Pig consist of its own data flow language: PigLatin.…”
Section: Pig Hive and Hbasementioning
confidence: 99%