2002
DOI: 10.1007/3-540-45681-3_2

Fast Outlier Detection in High Dimensional Spaces

Abstract: In this paper we propose a new definition of distance-based outlier that considers for each point the sum of the distances from its k nearest neighbors, called weight. Outliers are those points having the largest values of weight. In order to compute these weights, we find the k nearest neighbors of each point in a fast and efficient way by linearizing the search space through the Hilbert space filling curve. The algorithm consists of two phases, the first provides an approximated solution, within a small fact…
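
The weight definition in the abstract can be illustrated with a minimal brute-force sketch. This is not the paper's algorithm: it skips the Hilbert-curve linearization and the two-phase search entirely and simply compares every pair of points, which costs O(n²) distance computations. The function names `knn_weights` and `top_outliers`, the Euclidean metric, and the sample points are assumptions made for illustration only.

```python
import math


def knn_weights(points, k):
    """Return, for each point, the sum of distances to its k nearest neighbors.

    Brute force (all pairs), assuming Euclidean distance. Illustrative only;
    the paper obtains the neighbors via a Hilbert space-filling curve instead.
    """
    weights = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        weights.append(sum(dists[:k]))
    return weights


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest weight."""
    weights = knn_weights(points, k)
    order = sorted(range(len(points)), key=lambda i: weights[i], reverse=True)
    return order[:n]


# Hypothetical data: four points on a unit square plus one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(top_outliers(points, k=2, n=1))  # prints [4]
```

Each corner of the square has two neighbors at distance 1, so its weight is 2.0, while the isolated point's two nearest neighbors are both more than 12 units away, which gives it the largest weight.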

Cited by 602 publications (367 citation statements)
References: 14 publications
“…In other words, distance can be seen as a radius around p, and if the percentage of points within this radius is smaller than (1 − perc), p is declared anomalous. A yet another definition, [2], assigns a weight to each point p, which is defined as the sum of the k-distance of all points within the k-distance neighborhood of p. Outliers are those points that have the biggest weight. There are some small differences between the three definitions given above.…”
Section: Outlier Score For Single Records (mentioning)
confidence: 99%
“…For the definition by [8] it may be hard to set the parameters appropriately. The definition of [2] overcomes these problems, but is computationally expensive. We used in our experiments this later definition of the scoring function.…”
Section: Outlier Score For Single Records (mentioning)
confidence: 99%
“…Such approaches do not work well in even moderately high dimensional spaces and finding the right model is often a difficult task in its own right. To overcome these limitations, researchers have turned to various non-parametric approaches that use a point's distance to its nearest neighbor as a measure of unusualness [1,10,11]. The following (among others [3]) is a popular definition of distance-based outliers:…”
Section: Introduction (mentioning)
confidence: 99%