Guido Moerkotte scite author profile

Accurate cardinality estimates are essential for a successful query optimization. This is not only true for relational DBMSs but also for RDF stores. An RDF database consists of a set of triples and, hence, can be seen as a relational database with a single table with three attributes. This makes RDF rather special in that queries typically contain many self joins.We show that relational DBMSs are not well-prepared to perform cardinality estimation in this context. Further, there are hardly any special cardinality estimation methods for RDF databases. To overcome this lack of appropriate cardinality estimation methods, we introduce characteristic sets together with new cardinality estimation methods based upon them. We then show experimentally that the new methods are-in the RDF context-highly superior to the estimation methods employed by commercial DBMSs and by the open-source RDF store RDF-3X.

show abstract

Heuristic and randomized optimization for the join ordering problem

Steinbrunn

Moerkotte

Kemper

1997

The VLDB Journal The International Journal on Very Large Data B

276

189

View full text Add to dashboard Cite

Preventing bad plans by bounding the impact of cardinality estimation errors

2009

View full text Add to dashboard Cite

Query optimizers rely on accurate estimations of the sizes of intermediate results. Wrong size estimations can lead to overly expensive execution plans. We first define the q-error to measure deviations of size estimates from actual sizes. The q-error enables the derivation of two important results: (1) We provide bounds such that if the q-error is smaller than this bound, the query optimizer constructs an optimal plan. (2) If the q-error is bounded by a number q, we show that the cost of the produced plan is at most a factor of q 4 worse than the optimal plan. Motivated by these findings, we next show how to find the best approximation under the q-error. These techniques can then be used to build synopsis for size estimates. Finally, we give some experimental results where we apply the developed techniques.

show abstract

Anatomy of a native XML base management system

Fiebig

Helmer

Kanne³

et al. 2002

The VLDB Journal The International Journal on Very Large Data B

134

View full text Add to dashboard Cite

Several alternatives to manage large XML document collections exist, ranging from file systems over relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that management of XML data is just another application for traditional databases like relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management comprising recovery and multiuser synchronization as well as query processing for XML.

show abstract

The implementation and performance of compressed databases

et al. 2000

View full text Add to dashboard Cite

In this paper, we show how compression can be integrated into a relational database system. Specifically, we describe how the storage manager, the query execution engine, and the query optimizer of a database system can be extended to deal with compressed data. Our main result is that compression can significantly improve the response time of queries if very light-weight compression techniques are used. We will present such light-weight compression techniques and give the results of running the TPC-D benchmark on a so compressed database and a non-compressed database using the AODB database system, an experimental database system that was developed at the Universities of Mannheim and Passau. Our benchmark results demonstrate that compression indeed offers high performance gains (up to 50%) for IO-intensive queries and moderate gains for CPU-intensive queries. Compression can, however, also increase the running time of certain update operations. In all, we recommend to extend today's database systems with light-weight compression techniques and to make extensive use of this feature.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Guido Moerkotte

Characteristic sets: Accurate cardinality estimation for RDF queries with multiple joins

Heuristic and randomized optimization for the join ordering problem

Preventing bad plans by bounding the impact of cardinality estimation errors

Anatomy of a native XML base management system

The implementation and performance of compressed databases

Contact Info

Product

Resources

About