2023
DOI: 10.1145/3589263
|View full text |Cite
|
Sign up to set email alerts
|

BtrBlocks: Efficient Columnar Compression for Data Lakes

Abstract: Analytics is moving to the cloud and data is moving into data lakes. These reside on object storage services like S3 and enable seamless data sharing and system interoperability. To support this, many systems build on open storage formats like Apache Parquet. However, these formats are not optimized for remotely-accessed data lakes and today's high-throughput networks. Inefficient decompression makes scans CPU-bound and thus increases query time and cost. With this work we present BtrBlocks, an open columnar s… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
4
1

Relationship

0
5

Authors

Journals

citations
Cited by 7 publications
references
References 32 publications
(36 reference statements)
0
0
0
Order By: Relevance