2010
DOI: 10.1109/tkde.2010.78

Scalable Probabilistic Similarity Ranking in Uncertain Databases

Abstract: This paper introduces a scalable approach for probabilistic top-k similarity ranking on uncertain vector data. Each uncertain object is represented by a set of vector instances that are assumed to be mutually exclusive. The objective is to rank the uncertain data according to their distance to a reference object. We propose a framework that incrementally computes, for each object instance and ranking position, the probability of the object falling at that ranking position. The resulting rank probabilit…

Citations: cited by 34 publications (24 citation statements)
References: 28 publications
“…For the verification step, we perform, for each remaining candidate B, a probabilistic nearest neighbor query using the algorithm proposed in [6] for probabilistic ranking queries (and setting k = 1). This algorithm takes Q, B and D \ B (in particular this set can be reduced, as shown in Appendix D) as input and returns P(NN_B(Q)), which is equivalent to P(RNN_Q(B)).…”
Section: Verification (mentioning)
confidence: 99%
“…These probabilities can be computed in a single database scan. We can process the p^t_j successively by means of the Poisson binomial recurrence [17], as proposed in [18]. Therefore, let P^t_{i,j} be the probability that, out of the j objects processed so far, exactly i objects have a higher score than q.…”
Section: Initial Computation (mentioning)
confidence: 99%
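The recurrence mentioned in the statement above can be illustrated with a minimal sketch. It assumes that each processed object j exceeds the score of q independently with a known probability p_j, and it uses the standard Poisson binomial update P[i][j] = P[i-1][j-1] * p_j + P[i][j-1] * (1 - p_j). The function name `poisson_binomial` and the list-based layout are hypothetical choices made for illustration; they are not taken from the cited papers.

```python
from typing import List


def poisson_binomial(probs: List[float]) -> List[float]:
    """Return dist, where dist[i] is the probability that exactly i of the
    independent events with success probabilities `probs` occur.

    Assumption (not from the source): events are mutually independent.
    """
    # Before any object is processed, zero objects score higher with certainty.
    dist = [1.0]
    for p in probs:
        # Processing one more object extends the possible count range by one.
        nxt = [0.0] * (len(dist) + 1)
        for i, mass in enumerate(dist):
            nxt[i] += mass * (1.0 - p)  # this object does not score higher
            nxt[i + 1] += mass * p      # this object scores higher
        dist = nxt
    return dist


if __name__ == "__main__":
    # Hypothetical probabilities that three objects score higher than q.
    ranks = poisson_binomial([0.2, 0.7, 0.9])
    for i, prob in enumerate(ranks):
        print(f"P(exactly {i} objects score higher) = {prob:.4f}")
```

Under these assumptions, `dist[i]` can be read as the probability that q falls at ranking position i + 1. Each object is processed once, so n objects cost O(n^2) time in total, or O(k * n) if the distribution is truncated after the first k entries for a top-k query.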
“…At the same time, uncertainty is inherent in many datasets due to various factors like noise [1], privacy protection strategy [2], incompleteness of data and delay or loss in data transfer [3]. In this paper, we connect top-k query and uncertain data model, and propose a novel top-(k1, k2) query in uncertain datasets.…”
Section: Introduction (mentioning)
confidence: 99%