2018
DOI: 10.3390/a11020019

Common Nearest Neighbor Clustering—A Benchmark

Abstract: Cluster analyses are often conducted with the goal to characterize an underlying probability density, for which the data-point density serves as an estimate for this probability density. We here test and benchmark the common nearest neighbor (CNN) cluster algorithm. This algorithm assigns a spherical neighborhood R to each data point and estimates the data-point density between two data points as the number of data points N in the overlapping region of their neighborhoods (step 1). The main principle in the CN…
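Step 1 of the abstract can be illustrated with a minimal sketch. It only covers the density estimate between two points (the count of data points inside both neighborhoods); the abstract is truncated before the later steps, so they are not shown. The function name `shared_neighbor_count`, the Euclidean metric, and the choice to exclude the two points themselves from the count are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def shared_neighbor_count(points, i, j, radius):
    """Count data points that lie within `radius` of both points[i] and points[j].

    Hypothetical helper: Euclidean distance and the exclusion of the two
    query points themselves are assumptions, not taken from the paper.
    """
    pts = np.asarray(points, dtype=float)
    d_i = np.linalg.norm(pts - pts[i], axis=1)  # distances to point i
    d_j = np.linalg.norm(pts - pts[j], axis=1)  # distances to point j
    in_both = (d_i <= radius) & (d_j <= radius)  # overlap of the two neighborhoods
    in_both[[i, j]] = False  # assumption: do not count the pair itself
    return int(in_both.sum())

if __name__ == "__main__":
    data = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.2], [0.5, -0.2], [5.0, 5.0]]
    print(shared_neighbor_count(data, 0, 1, radius=1.0))
```

A larger count suggests a higher data-point density between the two points; how that count is then turned into cluster assignments is described in the full paper.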

citations
Cited by 26 publications
(49 citation statements)
references
References 54 publications
(92 reference statements)
“…We then identified 22 core-sets in the space of the first six tICs using the CNN-cluster algorithm [39][40][41], and used them to construct a core-set Markov model. The implied time-scale test shows that the timescales of our core-set Markov model are independent of the lag time τ indicating a very small discretization error and thus a high-quality Markov model (Fig.…”
Section: A) (mentioning)
confidence: 99%
“…The work [27] presents the results of studies related to image scaling, which allows for image size reduction. It has been shown that the nearest neighbor method is often used to scale images [28]. However, the issues related to improving compression efficiency remained unresolved.…”
Section: Literature Review and Problem Statement (mentioning)
confidence: 99%
“…
• R = 10-100% * average distance to data center [2]
• R = Average pairwise distance of all data points [28]
• R = 90% * first peak in the pairwise distance histogram [17]
• R = 0.07 [26]
• k = 10 [18]
• k = 30 [12]
• k = 10-100 [27]
• k = 30-200 [5]
• k = √N [19]
• k = min{50, N/(2K)} where K is the number of clusters [this paper]
…”
Section: Neighbor-based Distance-based (mentioning)
confidence: 99%
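Three of the parameter heuristics quoted above are simple enough to write out. The sketch below is illustrative only: the function names are hypothetical, and the rounding rules, the lower bound of 1 on k, and the Euclidean metric are assumptions that the quoted statement does not specify.

```python
import math
import numpy as np

def k_sqrt(n_points):
    """k = sqrt(N). Rounding to the nearest integer is an assumption."""
    return max(1, round(math.sqrt(n_points)))

def k_capped(n_points, n_clusters, cap=50):
    """k = min{50, N/(2K)}. Integer floor and the floor of 1 are assumptions."""
    return max(1, min(cap, n_points // (2 * n_clusters)))

def r_mean_pairwise(points):
    """R = average pairwise distance of all data points (Euclidean assumed)."""
    pts = np.asarray(points, dtype=float)
    diffs = pts[:, None, :] - pts[None, :, :]   # all pairwise difference vectors
    dists = np.linalg.norm(diffs, axis=-1)      # full distance matrix
    upper = np.triu_indices(len(pts), k=1)      # each unordered pair once
    return float(dists[upper].mean())

if __name__ == "__main__":
    print(k_sqrt(100), k_capped(1000, 5), r_mean_pairwise([[0, 0], [3, 0], [0, 4]]))
```

The full distance matrix in `r_mean_pairwise` needs memory quadratic in the number of points, so for large data sets a subsample would be the more practical input.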
“…In general, automatic choice of the parameter may appear simple in the eye of theoretician but is hardly so in the eye of practitioners [29]. As a consequence, some methods leave the choice to the user [17], or assume that brute force manual optimization is performed [22].…”
Section: Neighbor-based Distance-based (mentioning)
confidence: 99%