1993
DOI: 10.1145/169725.169708

Optimal histograms for limiting worst-case error propagation in the size of join results

citations
Cited by 94 publications
(62 citation statements)
references
References 11 publications
“…The first results that led towards new types of histograms were derived in an effort to obtain statistics that would be optimal in minimizing/containing the propagation of errors in the size of join results [37]. The basic mathematical tools used were borrowed from majorization theory [55].…”
Section: Optimal Sort Parameter (mentioning)
confidence: 99%
“…- V-optimal [8,7,9]: Partition data such that $\sum_{j=1}^{\beta} \sum_{k=1}^{n_j} (f_j - f_{j,k})^2$ is minimized, where $\beta$ is the number of buckets, $n_j$ is the number of entries in the $j$th bucket, $f_j$ is the average frequency of the $j$th bucket, and $f_{j,k}$ is the $k$th frequency of the $j$th bucket.…”
Section: Existing Histogram Techniques (mentioning)
confidence: 99%
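
The V-optimal criterion quoted above can be illustrated with a small Python sketch. It is not taken from the cited papers: the function names `bucket_sse` and `v_optimal_partition` are hypothetical, and the sketch assumes that buckets are contiguous runs of the frequency list in the order given. Under that assumption, a simple dynamic program finds the partition into a fixed number of buckets that minimizes the summed squared deviation from each bucket's average frequency.

```python
from typing import List, Sequence, Tuple


def bucket_sse(freqs: Sequence[float]) -> float:
    """Sum over one bucket of (bucket average - frequency)^2."""
    mean = sum(freqs) / len(freqs)
    return sum((mean - f) ** 2 for f in freqs)


def v_optimal_partition(
    freqs: Sequence[float], num_buckets: int
) -> Tuple[float, List[List[float]]]:
    """Split freqs into num_buckets contiguous buckets minimizing total error.

    Assumption (not from the source): buckets are contiguous in list order.
    Returns (minimal total error, list of buckets).
    """
    n = len(freqs)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(freqs)")

    inf = float("inf")
    # cost[b][i]: minimal error for the first i frequencies using b buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    # cut[b][i]: start index of the last bucket in that optimal solution.
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0

    for b in range(1, num_buckets + 1):
        for i in range(b, n + 1):
            for s in range(b - 1, i):
                candidate = cost[b - 1][s] + bucket_sse(freqs[s:i])
                if candidate < cost[b][i]:
                    cost[b][i] = candidate
                    cut[b][i] = s

    # Walk the cut table backwards to recover the buckets.
    buckets: List[List[float]] = []
    i = n
    for b in range(num_buckets, 0, -1):
        s = cut[b][i]
        buckets.append(list(freqs[s:i]))
        i = s
    buckets.reverse()
    return cost[num_buckets][n], buckets


error, buckets = v_optimal_partition([1, 2, 10, 11], 2)
print(error, buckets)
```

For the illustrative frequencies `[1, 2, 10, 11]` and two buckets, the sketch groups the two small and the two large frequencies together, since any other split puts a small and a large frequency in the same bucket and inflates the squared deviation.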
“…It has been shown [12] that this technique outperformed those "conventional" histogram techniques [7,8,9,13,17]. To complement the work in [12], in this paper we will propose a novel optimization model for generating linear-spline based histograms.…”
Section: Introduction (mentioning)
confidence: 96%
“…Histogram H1 in the earlier table is not serial as frequencies 1 and 3 appear in one bucket and frequency 2 appears in the other, while histogram H2 is. Under various optimality criteria, serial histograms have been shown to be optimal for reducing the worst-case and the average error in equality selection and join queries [IC93, Ioa93, IP95]. Identifying the optimal histogram among all serial ones takes exponential time in the number of buckets. Moreover, since there is usually no order-correlation between attribute values and their frequencies, storage of serial histograms essentially requires a regular index that will lead to the approximate frequency of every individual attribute value.…”
Section: Size-distribution Estimator (mentioning)
confidence: 99%
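
The serial property described in the statement above can be checked mechanically. The following Python sketch is an illustration only: `is_serial` is a hypothetical helper, and it assumes the usual reading of "serial", namely that for any two buckets, every frequency in one is no larger than every frequency in the other. The table defining H1 and H2 is not reproduced here, so H1 is rebuilt from the quoted description and the serial grouping shown is an assumed stand-in rather than the actual H2.

```python
from typing import Sequence


def is_serial(buckets: Sequence[Sequence[float]]) -> bool:
    """Return True if the buckets' frequency ranges do not interleave."""
    # Represent each non-empty bucket by its (lowest, highest) frequency.
    ranges = sorted((min(b), max(b)) for b in buckets if b)
    # After sorting, each bucket must end no higher than the next one starts.
    return all(
        prev_high <= next_low
        for (_, prev_high), (next_low, _) in zip(ranges, ranges[1:])
    )


# H1 as described in the quote: frequencies 1 and 3 share a bucket, 2 is alone.
h1 = [[1, 3], [2]]
# Assumed serial grouping of the same frequencies (not the source's H2).
serial_example = [[1, 2], [3]]

print(is_serial(h1))
print(is_serial(serial_example))
```

The check sorts buckets by their lowest frequency, so it only has to compare neighbouring buckets rather than every pair.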