Ryan Marcus scite author profile

Deep Reinforcement Learning for Join Order Enumeration

Join order selection plays a significant role in query performance. However, modern query optimizers typically employ static join enumeration algorithms that do not receive any feedback about the quality of the resulting plan. Hence, optimizers often repeatedly choose the same bad plan, as they do not have a mechanism for "learning from their mistakes". In this paper, we argue that existing deep reinforcement learning techniques can be applied to address this challenge. These techniques, powered by artificial neural networks, can automatically improve decision making by incorporating feedback from their successes and failures. Towards this goal, we present ReJOIN, a proof-of-concept join enumerator, and present preliminary results indicating that ReJOIN can match or outperform the PostgreSQL optimizer in terms of plan quality and join enumeration efficiency.

Benchmarking learned indexes

Marcus

Kipf

Renen

et al. 2020

Proc. VLDB Endow.

Recent advancements in learned index structures propose replacing existing index structures, like B-Trees, with approximate learned models. In this work, we present a unified benchmark that compares well-tuned implementations of three learned index structures against several state-of-the-art "traditional" baselines. Using four real-world datasets, we demonstrate that learned index structures can indeed outperform non-learned indexes in read-only in-memory workloads over a dense array. We investigate the impact of caching, pipelining, dataset size, and key size. We study the performance profile of learned index structures, and build an explanation for why learned models achieve such good performance. Finally, we investigate other important properties of learned index structures, such as their performance in multi-threaded systems and their build times.
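
As an illustration of what "replacing an index structure with an approximate learned model" can mean for a read-only, in-memory sorted array, here is a minimal sketch. It is not one of the benchmarked implementations: the class name `LinearLearnedIndex`, the single least-squares line, and the error-bounded binary search are all assumptions made for this example. The model predicts a key's position, and the largest prediction error seen at build time bounds the range that must still be searched.

```python
import bisect


class LinearLearnedIndex:
    """Hypothetical single-model learned index over a sorted, read-only array."""

    def __init__(self, keys):
        self.keys = list(keys)  # assumed to be sorted in ascending order
        n = len(self.keys)
        if n == 0:
            raise ValueError("keys must not be empty")

        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(self.keys))
        self.slope = cov / var if var > 0 else 0.0
        self.intercept = mean_pos - self.slope * mean_key

        # Largest gap between predicted and true position over the stored keys.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key, or None if key is not stored."""
        n = len(self.keys)
        guess = self._predict(key)
        # Only the window [guess - max_err, guess + max_err] can hold a stored key.
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None


keys = [2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
index = LinearLearnedIndex(keys)
positions = [index.lookup(k) for k in keys]
```

The closer the key distribution is to linear, the smaller `max_err` becomes and the fewer comparisons a lookup needs. On skewed data such as the keys above, a single line fits poorly and the search window widens.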

Plan-structured deep neural network models for query performance prediction

Marcus

Papaemmanouil

2019

Proc. VLDB Endow.

Query performance prediction, the task of predicting the latency of a query, is one of the most challenging problems in database management systems. Existing approaches rely on features and performance models engineered by human experts, but often fail to capture the complex interactions between query operators and input relations, and generally do not adapt naturally to workload characteristics and patterns in query execution plans. In this paper, we argue that deep learning can be applied to the query performance prediction problem, and we introduce a novel neural network architecture for the task: a plan-structured neural network. Our approach eliminates the need for human-crafted feature selection and automatically discovers complex performance models both at the operator and query plan level. Our novel neural network architecture can match the structure of any optimizer-selected query execution plan and predict its latency with high accuracy. We also propose a number of optimizations that reduce training overhead without sacrificing effectiveness. We evaluated our techniques on various workloads and we demonstrate that our plan-structured neural network can outperform the state-of-the-art in query performance prediction.

Bao: Making Learned Query Optimization Practical

Marcus

Negi

Mao

et al. 2021

Neo

et al. 2019

Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific workloads and datasets. Motivated by this shortcoming and inspired by recent advances in applying machine learning to data management challenges, we introduce Neo (Neural Optimizer), a novel learning-based query optimizer that relies on deep neural networks to generate query execution plans. Neo bootstraps its query optimization model from existing optimizers and continues to learn from incoming queries, building upon its successes and learning from its failures. Furthermore, Neo naturally adapts to underlying data patterns and is robust to estimation errors. Experimental results demonstrate that Neo, even when bootstrapped from a simple optimizer like PostgreSQL, can learn a model that offers similar performance to state-of-the-art commercial optimizers, and in some cases even surpass them.

CDFShop: Exploring and Optimizing Learned Index Structures

Marcus

Zhang

Kraska

2020

Indexes are a critical component of data management applications. While tree-like structures (e.g., B-Trees) have been employed to great success, recent work suggests that index structures powered by machine learning models (learned index structures) can achieve low lookup times with a reduced memory footprint. This demonstration showcases CDFShop, a tool to explore and optimize recursive model indexes (RMIs), a type of learned index structure. This demonstration allows audience members to (1) gain an intuition about various tuning parameters of RMIs and why learned index structures can greatly accelerate search, and (2) understand how automatic optimization techniques can be used to explore space/time tradeoffs within the space of RMIs.

Cost-Guided Cardinality Estimation: Focus Where it Matters

Negi

Marcus

Mao

et al. 2020

Steering Query Optimizers: A Practical Take on Big Data Workloads

Negi

Interlandi

Marcus

et al. 2021

