1977
DOI: 10.1145/359461.359475

Approximating block accesses in database organizations

Abstract: When data records are grouped into blocks in secondary storage, it is frequently necessary to estimate the number of blocks $X_D$ accessed for a given query. In a recent paper [1], Cardenas gave the expression $X_D = m\left(1 - (1 - 1/m)^k\right)$ (1), assuming that there are n records divided into m blocks and that the k records satisfying the query are distributed uniformly among the m blocks. The derivation of the expression was left to the reader as an exercise. Let us take a closer look at the expression. $(1 - 1/m)$ gives the probability …
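To make expression (1) concrete, here is a minimal Python sketch that evaluates it next to the exact expectation usually attributed to this paper. The abstract above is truncated before that exact form appears, so the version used here, $m\left(1 - \binom{n - n/m}{k} / \binom{n}{k}\right)$ for k records selected without replacement from equal-sized blocks, is the commonly cited statement and should be read as an assumption. The function names are illustrative only.

```python
from math import comb


def cardenas_blocks(m: int, k: int) -> float:
    """Expression (1): m * (1 - (1 - 1/m)**k).

    Treats each of the k qualifying records as landing in a block
    independently, i.e. as if records were selected with replacement.
    """
    return m * (1.0 - (1.0 - 1.0 / m) ** k)


def yao_blocks(n: int, m: int, k: int) -> float:
    """Exact expected number of blocks touched (assumed form, see lead-in).

    Assumes n records split evenly into m blocks and k distinct records
    chosen uniformly at random without replacement.
    """
    if n % m != 0:
        raise ValueError("this sketch assumes n is divisible by m")
    per_block = n // m
    # Probability that one particular block holds none of the k records.
    p_untouched = comb(n - per_block, k) / comb(n, k)
    return m * (1.0 - p_untouched)


if __name__ == "__main__":
    n, m = 100, 10
    for k in (1, 5, 10, 50):
        print(k, round(cardenas_blocks(m, k), 4), round(yao_blocks(n, m, k), 4))
```

With small blocks the two estimates can differ noticeably, because selecting without replacement spreads the k records over more blocks than independent draws would.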

citations
Cited by 316 publications
(96 citation statements)
references
References 5 publications
“…Next, let's preserve the clustering structure and distribute all documents randomly to these clusters. The average number of target clusters for this case is shown by n_tr and its value can be calculated without creating random clusters by the modified form [Can, Ozkarahan, 1990] of Yao's formula [Yao, 1977]; however, for the validity decision we need the distribution of the n_tr values. The case n_t > n_tr suggests that the tested clustering structure is invalid, since it is unsuccessful in placing the documents relevant to the same query into a fewer number of clusters than that of the average random case.…”
Section: Validation Of the Generated Clustering Structure
Type: mentioning
confidence: 99%
“…Cardenas [7], e.g., gives Equation 7 to estimate the distinct accessed records when accessing one of R.n records r times. Whilst challenged repeatedly for special cases [13], [34], [9], we found the equation yields virtually identical results to the equation from the original cost model while being much cheaper to compute.…”
Section: Extensions To the Generic Cost Model
Type: mentioning
confidence: 88%
“…This expression was shown by Palvia and March [6] to be an overall better estimator than the prevalently used approximation by Cardenas [2], and computationally more efficient than the exact expression by Yao [10].…”
Section: Sequential Files
Type: mentioning
confidence: 88%