1989
DOI: 10.1145/76902.76907

On the effect of join operations on relation sizes

Abstract: We propose a generating function approach to the problem of evaluating the sizes of derived relations in a relational database framework. We present a model of relations and show how to use it to deduce probabilistic estimations of derived relation sizes. These are found to asymptotically follow normal distributions under a variety of assumptions.

Cited by 27 publications (4 citation statements)
References 18 publications (13 reference statements)
“…This formula assumes that the attributes values are uniformly distributed. For the attribute values of the skew distribution, Grady [27] and Haas [28] gave corresponding estimation formulas.…”
Section: B Sequential Optimization
Citation type: mentioning
confidence: 99%
“…Selectivity factors are statistical values stored in the data dictionary of the DBMS. Many research efforts tackled the problem of join size estimation [34][35][36]; Mannino et al give a survey on the suggested statistical values to store, how to maintain them, and how to use them to predict the result sizes of various database operations [37]. Most projects make the same simplifying assumptions as we do: uniformity of attribute values and independence of attribute values [38].…”
Section: Related Work On Completeness
Citation type: mentioning
confidence: 99%
“…The number of I/Os is a function of the access plan (e.g., nested loop join, merge join, hybrid join) chosen by the query optimizer of the DBMS to perform the select, of the existence of indexes and type of access method (e.g., b-tree, hashing) used in each table, the buffer size and buffer management policies (e.g., LRU), and of parameters such as page sizes, data and index page fill factors, and others. For relevant previous work on access plans, join processing, and query optimization see [2], [5], [29], [37], [38], [42], [44], [45]. Some of these papers describe the operation of access plans and others concentrate on estimating the resulting size of joins between relations, for various types of joins.…”
Section: Step 8-b: Db Modeling
Citation type: mentioning
confidence: 99%