2007
DOI: 10.1109/ssdbm.2007.17

MAMCost: Global and Local Estimates leading to Robust Cost Estimation of Similarity Queries

Abstract: This paper presents an effective cost model to estimate the number of disk accesses (I/O cost) and the number of distance calculations (CPU cost) to process similarity queries over data indexed by metric access methods. Two types of similarity queries were taken into consideration: range and k-nearest neighbor queries. The main point of the cost model is considering not only global parameters of the data set but also the local data distribution. The model takes advantage of the intrinsic dimension of the data …

Cited by 7 publications (15 citation statements)
References 15 publications

“…Baioco et al. [3] present a cost model to process similarity queries over data indexed by metric access methods called MAMCost. This cost is estimated by a histogram of the density of elements in the metric space.…”
Section: Previous Work (mentioning)
confidence: 99%
“…The proposed framework was incorporated into SIREN to allow it to optimize similarity queries using the similarity algebra proposed by Traina et al. [6] and Barioni et al. [4] and the cost model proposed by Baioco et al. [3].…”
Section: Previous Work (mentioning)
confidence: 99%
“…Accordingly, existing k-NN models estimate a threshold as the query radius by using the pairwise distance distribution within S. Such estimates follow a biased assumption: query elements are more likely posed in high-density areas of the search space. For instance, the model for multidimensional spaces in [Aly et al 2015] assumes the k-NN radii follow a uniform distribution regarding fixed intervals of k, while the models in [Ciaccia et al 1998] and [Baioco et al 2007] assume k-NN radii follow a binomial and an exponential distribution, respectively. The main drawback of such models is they disregard the 'locality' of each query, i.e.…”
Section: Introduction (mentioning)
confidence: 99%
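The statement above describes estimating a k-NN query radius from the pairwise distance distribution within S. A minimal sketch of that general idea follows; it is not the MAMCost model nor any of the cited models. It assumes an empirical distance distribution F built from all pairs and picks the smallest sampled radius r with (n - 1) * F(r) >= k. The function names `pairwise_distances` and `estimate_knn_radius` are hypothetical, chosen only for illustration.

```python
def pairwise_distances(points, dist):
    """Return the sorted distances between all distinct pairs of points."""
    ds = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            ds.append(dist(points[i], points[j]))
    ds.sort()
    return ds


def estimate_knn_radius(sorted_dists, n, k):
    """Global radius estimate (illustrative assumption, not a published model).

    Treats the fraction of pairwise distances <= r as F(r) and returns the
    smallest sampled r such that (n - 1) * F(r) >= k, i.e. a radius expected
    to cover about k of the other n - 1 elements around an average element.
    """
    m = len(sorted_dists)
    # Ceiling division: smallest count c with (n - 1) * c / m >= k.
    c = -(-k * m // (n - 1))
    idx = min(max(c - 1, 0), m - 1)
    return sorted_dists[idx]


# Toy data set: five points on a line, absolute difference as the metric.
points = [0, 1, 2, 3, 4]
dists = pairwise_distances(points, lambda a, b: abs(a - b))
radius_k1 = estimate_knn_radius(dists, len(points), 1)
radius_k2 = estimate_knn_radius(dists, len(points), 2)
```

Because the estimate uses a single distribution for the whole data set, it returns the same radius for every query position, which is the lack of 'locality' the statement points out.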
“…Also, ranking query plans do not need to materialize a query, making the query plan ranking much more efficient than the traditional ones, which can be prohibitively expensive. The work of Baioco et al [2007] presented a selectivity and cost model for similarity queries in the Slim-tree. A cost model to integrate multiple similarity-based image joins in a multimedia database using the R-tree index family was presented in Kosch [2010].…”
Section: Cost and Condition Selectivity Model (mentioning)
confidence: 99%