A novel probabilistic pruning approach to speed up similarity queries in uncertain databases

Bernecker, Thomas; Emrich, Tobias; Kriegel, Hans‐Peter; Mamoulis, Nikos; Renz, Matthias; Züfle, Andreas

doi:10.1109/icde.2011.5767908

Cited by 36 publications

(21 citation statements)

References 28 publications


“…The idea is to derive for each database object O, a lower and an upper bound of the probability that O has a higher score than Q. Using these approximations, we can apply the concept of uncertain generating functions [19] in order to obtain an (initial) approximated result of a PIR query, which guarantees that the true result is bounded correctly. The problem at hand is to update these uncertain generating functions efficiently when an update is fetched from the stream.…”

Section: Discussion (mentioning)

confidence: 99%

Continuous Inverse Ranking Queries in Uncertain Streams

Bernecker, Kriegel, Mamoulis, et al. 2011

Lecture Notes in Computer Science

Self Cite


Abstract. This paper introduces a scalable approach for continuous inverse ranking on uncertain streams. An uncertain stream is a stream of object instances with confidences, e.g. observed positions of moving objects derived from a sensor. The confidence value assigned to each instance reflects the likelihood that the instance conforms with the current true object state. The inverse ranking query retrieves the rank of a given query object according to a given score function. In this paper we present a framework that is able to update the query result very efficiently, as the stream provides new observations of the objects. We will theoretically and experimentally show that the query update can be performed in linear time complexity. We conduct an experimental evaluation on synthetic and real-world data, which demonstrates the efficiency of our approach.
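The inverse ranking query described in the abstract can be illustrated with a small static computation. The sketch below assumes that objects are mutually independent, that each object is a list of `(value, confidence)` instances whose confidences sum to 1, and that a higher score means a better rank; the names `prob_higher` and `rank_distribution` are hypothetical. It recomputes the rank distribution from scratch and does not reproduce the paper's incremental update framework.

```python
def prob_higher(instances, q_score, score):
    """Probability that an uncertain object scores higher than the query.

    instances: list of (value, confidence) pairs (assumed to sum to 1).
    """
    return sum(conf for value, conf in instances if score(value) > q_score)


def rank_distribution(objects, q, score):
    """Return dist where dist[r] = P(exactly r objects outrank q).

    The rank of q is then r + 1. Objects are assumed independent, so the
    count of outranking objects follows a Poisson binomial distribution,
    built here by multiplying out one factor ((1 - p) + p*x) per object.
    """
    q_score = score(q)
    dist = [1.0]
    for instances in objects:
        p = prob_higher(instances, q_score, score)
        nxt = [0.0] * (len(dist) + 1)
        for r, mass in enumerate(dist):
            nxt[r] += mass * (1.0 - p)   # this object does not outrank q
            nxt[r + 1] += mass * p       # this object outranks q
        dist = nxt
    return dist


objects = [
    [(1.0, 0.5), (3.0, 0.5)],  # outranks q = 2.0 with probability 0.5
    [(5.0, 1.0)],              # always outranks q = 2.0
]
dist = rank_distribution(objects, 2.0, score=lambda v: v)
```

In this example `dist[1]` and `dist[2]` are both 0.5, so the query object has rank 2 or rank 3 with equal probability.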



“…As shown in [5], the computational complexity is linear in k, yielding a total of O(|DB|^2 × k) for the probabilistic pruning. Verification: The verification step can be performed analogously to the k = 1 case using the algorithm proposed in [6], which has been designed for k ≥ 1.…”

Section: Probabilistic RkNN Queries (mentioning)

confidence: 99%

“…In the first experiment (cf. Figure 6(a)), we varied the parameter k. It can be observed that the runtime scales slightly worse than linearly, which can be explained by the usage of uncertain generating functions that show a complexity of O(k^2) ([5]). This is notable, since naive approaches need to consider all N^k possible results.…”

Section: CPU-cost (mentioning)

confidence: 99%

“…Analogous to the k = 1 case, we can derive the following bounds for the probability P(A ≺_Q B) that A ∈ D \ B is closer to B than Q: a lower bound P_LB(A ≺_Q B) (using Lemma 2) and an upper bound P_UB(A ≺_Q B) (using Lemma 3). Given these bounds, we can apply the concept of uncertain generating functions [5] in order to compute for each 0 ≤ j < k a lower bound P_LB(#Pruners = j) and an upper bound P_UB(#Pruners = j) of the probability of the random event that for exactly j uncertain objects A_i, A_i ≺_Q B is true. A summary of the uncertain generating functions technique is given in Appendix C. Bounds for the probability of the event RkNN_Q(U_i) that U_i is a RkNN of Q can then be derived as follows:…”

Section: Probabilistic RkNN Queries (mentioning)

confidence: 99%
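The statements above use uncertain generating functions to turn per-object probability bounds into bounds on the number of objects for which an event holds. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the events are mutually independent and that each object contributes one factor that splits its probability mass into "surely true" (the lower bound), "surely false" (one minus the upper bound), and "undecided" (the gap between the bounds). The function names are hypothetical.

```python
def ugf_coefficients(bounds):
    """Expand the product of per-object factors.

    bounds: list of (lb, ub) pairs with 0 <= lb <= ub <= 1.
    Returns {(i, j): mass}, where i counts surely-true events and
    j counts undecided events.
    """
    coeffs = {(0, 0): 1.0}
    for lb, ub in bounds:
        nxt = {}
        for (i, j), mass in coeffs.items():
            for key, weight in (
                ((i + 1, j), lb),        # event surely holds
                ((i, j + 1), ub - lb),   # event undecided
                ((i, j), 1.0 - ub),      # event surely does not hold
            ):
                if weight > 0.0:
                    nxt[key] = nxt.get(key, 0.0) + mass * weight
        coeffs = nxt
    return coeffs


def exactly_k_bounds(bounds, k):
    """Lower and upper bound of P(exactly k events hold)."""
    coeffs = ugf_coefficients(bounds)
    lower = coeffs.get((k, 0), 0.0)  # k sure events, nothing undecided
    upper = sum(m for (i, j), m in coeffs.items() if i <= k <= i + j)
    return lower, upper


def fewer_than_k_bounds(bounds, k):
    """Lower and upper bound of P(fewer than k events hold)."""
    coeffs = ugf_coefficients(bounds)
    lower = sum(m for (i, j), m in coeffs.items() if i + j < k)
    upper = sum(m for (i, j), m in coeffs.items() if i < k)
    return lower, upper


# Hypothetical bounds for two pruner candidates.
pruner_bounds = [(0.2, 0.6), (0.5, 0.5)]
lo0, hi0 = exactly_k_bounds(pruner_bounds, 0)
lo_lt1, hi_lt1 = fewer_than_k_bounds(pruner_bounds, 1)
```

With these inputs the probability that no candidate prunes lies between 0.2 and 0.4. The sketch expands every term; it does not truncate the expansion at k, so it makes no claim about matching the complexity figures quoted in the statements.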


Efficient probabilistic reverse nearest neighbor query processing on uncertain data

et al. 2011

Self Cite


Given a query object q, a reverse nearest neighbor (RNN) query in a common certain database returns the objects having q as their nearest neighbor. A new challenge for databases is dealing with uncertain objects. In this paper we consider probabilistic reverse nearest neighbor (PRNN) queries, which return the uncertain objects having the query object as nearest neighbor with a sufficiently high probability. We propose an algorithm for efficiently answering PRNN queries using new pruning mechanisms taking distance dependencies into account. We compare our algorithm to state-of-the-art approaches recently proposed. Our experimental evaluation shows that our approach is able to significantly outperform previous approaches. In addition, we show how our approach can easily be extended to PRkNN (where k > 1) query processing for which there is currently no efficient solution.
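The RNN definition for a certain (non-uncertain) database in the abstract above can be sketched as a brute-force check. This is only an illustration of the definition under the assumption of Euclidean distance, with ties counted in favor of q; the function name is hypothetical, and the probabilistic pruning proposed in the paper is not reproduced here.

```python
import math


def reverse_nearest_neighbors(q, points):
    """Return indices of points whose nearest neighbor is q.

    A point p qualifies if no other database point is strictly closer
    to p than q is. Runs in O(n^2) distance computations.
    """
    result = []
    for i, p in enumerate(points):
        dist_to_q = math.dist(p, q)
        if all(dist_to_q <= math.dist(p, other)
               for j, other in enumerate(points) if j != i):
            result.append(i)
    return result


rnn = reverse_nearest_neighbors((0.0, 0.0), [(1.0, 0.0), (-1.0, 0.0), (5.0, 0.0)])
```

Here the first two points have q as their nearest neighbor, while the third is closer to the first point than to q.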


“…Since expensive integration operations are involved in this step, a number of efficient methods have been proposed. In [11], [36], efficient methods were proposed to generate answer objects' probability bounds without performing expensive integration operations.…”

Section: • For (mentioning)

confidence: 99%

Voronoi-based nearest neighbor search for multi-dimensional uncertain databases

Zhang


Abstract. In Voronoi-based nearest neighbor search, the Voronoi cell of every point p in a database can be used to check whether p is the closest to some query point q. We extend the notion of Voronoi cells to support uncertain objects, whose attribute values are inexact. Particularly, we propose the Possible Voronoi cell (or PV-cell). A PV-cell of a multi-dimensional uncertain object o is a region R, such that for any point p ∈ R, o may be the nearest neighbor of p. If the PV-cells of all objects in a database S are known, they can be used to identify objects that have a chance to be the nearest neighbor of q. However, there is no efficient algorithm for computing an exact PV-cell. We hence study how to derive an axis-parallel hyper-rectangle (called the Uncertain Bounding Rectangle, or UBR) that tightly contains a PV-cell. We further develop the PV-index, a structure that stores UBRs, to evaluate probabilistic nearest neighbor queries over uncertain data. An advantage of the PV-index is that upon updates on S, it can be incrementally updated. Extensive experiments on both synthetic and real datasets are carried out to validate the performance of the PV-index.
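The notion that an uncertain object "may be the nearest neighbor" of a point, which the PV-cell definition above builds on, can be checked directly for a single query point. The sketch below assumes that each object's uncertainty region is an axis-parallel rectangle given as `(lo, hi)` corner tuples and that an object can lie anywhere inside its region; both are assumptions for illustration. It is a brute-force test of the definition, not the PV-cell, UBR, or PV-index construction from the paper, and the function names are hypothetical.

```python
import math


def min_dist(q, lo, hi):
    """Smallest Euclidean distance from point q to the rectangle [lo, hi]."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))


def max_dist(q, lo, hi):
    """Largest Euclidean distance from point q to the rectangle [lo, hi]."""
    return math.sqrt(sum(max(abs(x - l), abs(x - h)) ** 2
                         for x, l, h in zip(q, lo, hi)))


def possible_nearest_neighbors(q, regions):
    """Indices of objects that have a chance to be the nearest neighbor of q.

    An object can be pruned when its minimum distance to q exceeds the
    smallest maximum distance of any object, because that other object
    is then closer to q in every possible world.
    """
    if not regions:
        return []
    threshold = min(max_dist(q, lo, hi) for lo, hi in regions)
    return [i for i, (lo, hi) in enumerate(regions)
            if min_dist(q, lo, hi) <= threshold]


regions = [
    ((1.0, 1.0), (2.0, 2.0)),
    ((5.0, 5.0), (6.0, 6.0)),
    ((0.5, 0.5), (3.0, 3.0)),
]
candidates = possible_nearest_neighbors((0.0, 0.0), regions)
```

In this example the second rectangle is pruned, since even its closest corner is farther from the query point than the farthest corner of the first rectangle.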

