2022
DOI: 10.1109/tkde.2020.3006446
|View full text |Cite
|
Sign up to set email alerts
|

Exploiting Data Skew for Improved Query Performance

Abstract: Analytic queries enable sophisticated large-scale data analysis within many commercial, scientific and medical domains today. Data skew is a ubiquitous feature of these real-world domains. In a retail database, some products are typically much more popular than others. In a text database, word frequencies follow a Zipf distribution with a small number of very common words, and a long tail of infrequent words. In a geographic database, some regions have much higher populations (and data measurements) than other… Show more

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
1
0

Year Published

2022
2022
2024
2024

Publication Types

Select...
3
2
1

Relationship

0
6

Authors

Journals

citations
Cited by 8 publications
(1 citation statement)
references
References 66 publications
0
1
0
Order By: Relevance
“…Their FastJoin system improved the performance in terms of latency and throughput. Zhang and Ross [12] presented an index structure to reorder data so that popular items were concentrated in the cache hierarchy. They analyzed the cache behavior and efficiently processed database queries in the presence of skew.…”
Section: Introductionmentioning
confidence: 99%
“…Their FastJoin system improved the performance in terms of latency and throughput. Zhang and Ross [12] presented an index structure to reorder data so that popular items were concentrated in the cache hierarchy. They analyzed the cache behavior and efficiently processed database queries in the presence of skew.…”
Section: Introductionmentioning
confidence: 99%