Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data 2011
DOI: 10.1145/1989323.1989447

Efficient processing of data warehousing queries in a split execution environment

Abstract: Hadapt is a start-up company currently commercializing the Yale University research project called HadoopDB. The company focuses on building a platform for Big Data analytics in the cloud by introducing a storage layer optimized for structured data and by providing a framework for executing SQL queries efficiently. This work considers processing data warehousing queries over very large datasets. Our goal is to maximize performance while, at the same time, not giving up fault tolerance and scalability. We analyz…

Cited by 68 publications
(38 citation statements)
References 18 publications
“…Greenplum and Aster Data have added the ability to execute MapReduce-style functions over data stored in these systems. HadoopDB and split execution [6,10] explore on the architectural level exploiting hybrid MapReduce and relational database systems. Dremel, another project from Google [26], is worth mentioning as an example of a new generation of database systems that are massively distributed and run interactive queries on very large data sets.…”
Section: Related Work (mentioning)
confidence: 99%
“…Furthermore, MapReduce is accompanied by a plethora of free tools as well as having cluster availability and support. Hive [11], Pig [37], Scope [20], and HadoopDB [10,38] are projects that provide SQL abstractions on top of MapReduce platform to familiarize the programmers with complex queries. SQL/MapReduce [39] and Greenplum [21] …”
Section: Related Work (mentioning)
confidence: 99%
“…For instance, HadoopDB [1] (which forms the basis of its commercial version, Hadapt) uses relational databases to perform MapReduce tasks. Microsoft PolyBase [4] improves the scalability of SQL Server through "split query processing" [2], which transforms queries into MapReduce jobs. Sailfish [20] accelerates MapReduce by batching disk I/Os.…”
Section: Related Work (mentioning)
confidence: 99%