1994
DOI: 10.1145/198429.198435

Reservoir-sampling algorithms of time complexity O(n(1 + log(N/n)))

Abstract: One-pass algorithms for sampling n records without replacement from a population of unknown size N are known as reservoir-sampling algorithms. In this article, Vitter's reservoir-sampling algorithm, algorithm Z, is modified to give a more efficient algorithm, algorithm K. Additionally, two new algorithms, algorithm L and algorithm M, are proposed. If the time for scanning the population is ignored, all four algorithms have expected CPU time O …

Citations: cited by 55 publications (22 citation statements)
References: 7 publications
“…To reduce the cost, we have integrated the new model to a sampling-based tool - Suggestion of Locality Optimization (SLO), developed by Beyls and D'Hollander at Ghent University [4]. SLO uses reservoir sampling [14], which has two distinct properties. First, it keeps a bounded number of samples in reservoir, so the collection rate drops as a program execution lengthens.…”
Section: Sampling Analysis (mentioning)
confidence: 99%
“…The first reservoir algorithms [26], [22] in the database context maintain an array RS with maximum size s (the target sample size), which is initially empty. The first s tuples are directly added into RS, after which the array becomes full.…”
Section: Reservoir Sampling (mentioning)
confidence: 99%
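
The quoted passage above breaks off right after the array RS fills up. As a minimal sketch of how the basic procedure usually continues, the Python below assumes the standard replacement rule (the t-th tuple replaces a uniformly chosen slot with probability s/t), which the excerpt itself does not state. The function name `reservoir_sample` and the `rng` parameter are illustrative, not taken from any cited paper.

```python
import random


def reservoir_sample(stream, s, rng=None):
    """Keep a uniform random sample of up to s items from a one-pass stream.

    Assumption: the replacement step is the textbook one; the quoted
    excerpt only describes how the first s tuples fill the array RS.
    """
    rng = rng or random.Random()
    rs = []  # the reservoir array RS, initially empty
    for t, item in enumerate(stream, start=1):
        if t <= s:
            # The first s tuples are added directly.
            rs.append(item)
        else:
            # Draw a slot index uniformly from [0, t); the new tuple is
            # kept only if the index falls inside the reservoir.
            j = rng.randrange(t)
            if j < s:
                rs[j] = item
    return rs


if __name__ == "__main__":
    print(reservoir_sample(range(1000), 5, random.Random(42)))
```

This version draws one random number per tuple, so its cost grows linearly with the stream length.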
“…This technique ensures that at every moment, the reservoir contains a random sample of the elements seen so far. Vitter and subsequent researchers proposed efficient algorithms to simulate this procedure that instead of checking every element skip over a number of them (see, for example, [21,16]). …”
Section: Estimating the Sampling Probability (mentioning)
confidence: 99%
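
The skipping idea mentioned in the statement above can be sketched as follows. The Python below follows the commonly presented form of algorithm L (one of the algorithms named in the abstract), not the article's own text, which is not reproduced on this page. The function name, the `rng` parameter, and the numerical guards are assumptions for illustration; the sketch is not hardened for extremely long streams.

```python
import math
import random
from itertools import islice

_MAX_W = 1.0 - 2.0 ** -53  # largest double below 1.0, keeps log1p(-w) finite


def reservoir_sample_skip(stream, n, rng=None):
    """Sample n items without replacement, skipping runs of items.

    Sketch in the style of algorithm L: instead of testing every item,
    draw how many items to skip before the next replacement.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    rng = rng or random.Random()

    def uniform_open():
        # Uniform draw in (0, 1); zero is rejected so log() is defined.
        u = rng.random()
        while u == 0.0:
            u = rng.random()
        return u

    it = iter(stream)
    reservoir = list(islice(it, n))  # fill with the first n items
    if len(reservoir) < n:
        return reservoir  # stream shorter than the sample size

    missing = object()  # sentinel marking an exhausted stream
    w = min(math.exp(math.log(uniform_open()) / n), _MAX_W)
    while True:
        # Number of items to pass over before the next replacement.
        skip = math.floor(math.log(uniform_open()) / math.log1p(-w))
        item = next(islice(it, skip, skip + 1), missing)
        if item is missing:
            break
        reservoir[rng.randrange(n)] = item
        w = min(w * math.exp(math.log(uniform_open()) / n), _MAX_W)
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample_skip(range(1_000_000), 5, random.Random(7)))
```

Random numbers are drawn only when an item enters the reservoir, which is where the savings over the per-item test come from; the stream itself is still scanned once.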