2020
DOI: 10.48550/arxiv.2006.13282
Preprint

Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads

Abstract: Filtering data based on predicates is one of the most fundamental operations for any modern data warehouse. Techniques to accelerate the execution of filter expressions include clustered indexes, specialized sort orders (e.g., Z-order), multi-dimensional indexes, and, for high selectivity queries, secondary indexes. However, these schemes are hard to tune and their performance is inconsistent. Recent work on learned multi-dimensional indexes has introduced the idea of automatically optimizing an index for a pa…

Cited by 4 publications (4 citation statements)
References 30 publications
“…Ma et al [42] used mixture density networks for AQP. Database indexing research recently has adopted neural networks to approximate cumulative density functions [9,10,30,49]. Query optimization and join ordering are also benefiting from neural networks [27,45].…”
Section: Related Work 61 Learned Database Systems (mentioning)
confidence: 99%
“…Cardinality/selectivity estimation, has improved considerably leveraging ML [17,70,77,78,84]. Likewise for query optimization [27,44,45], indexes [9,10,30,49], cost estimation [63,83], workload forecasting [85], DB tuning [34,68,81], synthetic data generation [7,54,76], etc.…”
Section: Introduction (mentioning)
confidence: 99%
“…This increase in performance is what Kraska et al hoped to achieve when they first introduced their work on Learned Index Structure (LIS) models [21]. Even though the concept of a LIS is still new, it has already led to a surge of inspiring results that leverage ideas from Machine Learning (ML), data structures, and database systems [7], [6], [31], [15], [1], [32], [5], [30], [23], [16], [11], [19], [8], [13], [27].…”
Section: Introduction (mentioning)
confidence: 99%
“…ML techniques enable automatic, fine-grained, and more accurate characterization of the problem space and benefit a variety of tasks in DBMS. Specifically, unsupervised ML techniques can model the data distribution for cardinality estimation (CardEst) [14,39,41,42,46] and indexing [6,7,18,27]; supervised ML models can replace the cost estimator (CostEst) [25,34,35] and execution scheduler [23,31]; and reinforcement learning methods solve decision making problems such as configuration tuning [1,20,44] and join order selection (JoinSel) [12,22,24,29,43].…”
Section: Introduction (mentioning)
confidence: 99%