2016
DOI: 10.14778/3025111.3025123

Skipping-oriented partitioning for columnar layouts

Abstract: As data volumes continue to grow, modern database systems increasingly rely on data skipping mechanisms to improve performance by avoiding access to irrelevant data. Recent work [39] proposed a fine-grained partitioning scheme that was shown to improve the opportunities for data skipping in row-oriented systems. Modern analytics and big data systems increasingly adopt columnar storage schemes, and in such systems, a row-based approach misses important opportunities for further improving data skipping. The flex…

Cited by 41 publications (18 citation statements)
References: 33 publications
“…The results showed 3‐7x improvements in the query response time compared to the traditional range partitioning. In their latest work, Sun et al presented a novel hybrid data skipping framework that optimizes the overall query performance by automatically balancing skipping effectiveness and tuple‐reconstruction overhead. It allows both horizontal and vertical partitioning of the data, which maximizes the overall query performance.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…Partitioning a relation is NP-hard [72]. Data partitioning covers both the problem of partitioning a relation across multiple servers and within a single server [63,79,80]. Partitioning across both rows and columns is introduced by several systems to account for different read access patterns (e.g., on fact tables and dimension tables) [4,11,26].…”
Section: Related Work (mentioning)
confidence: 99%
“…In [17,18], different partitioning approaches are presented, which help in selective queries. In [17], data is divided into multiple horizontal partitions and in each partition, data is stored row-wise, rather than column-wise.…”
Section: Related Work (mentioning)
confidence: 99%
“…This eventually gives a feature-vector for every tuple, which is then used for filtering partitions. A similar vector is also used in [18], however this time it utilizes hybrid layouts with column grouping, instead of fixed row layouts. The latter helps for both selection and projection queries.…”
Section: Related Work (mentioning)
confidence: 99%