2016
DOI: 10.1016/j.eswa.2016.03.008

DENDIS: A new density-based sampling for clustering algorithm

Abstract: To deal with large datasets, sampling can be used as a preprocessing step for clustering. In this paper, a hybrid sampling algorithm is proposed. It is density-based while managing distance concepts to ensure space coverage and fit cluster shapes. At each step a new item is added to the sample: it is chosen as the furthest from the representative in the most important group. A constraint on the hyper-volume induced by the samples avoids oversampling in high-density areas. The inner structure a…
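The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the DENDIS algorithm itself: it assumes that "the most important group" means the group with the most attached items, it uses a hypothetical `granularity` parameter as the stopping rule, and it leaves out the hyper-volume constraint, which the abstract does not specify. The function name `density_distance_sample` is likewise invented for illustration.

```python
import numpy as np

def density_distance_sample(X, granularity=0.05, seed=0):
    """Simplified illustration of a density/distance hybrid sampling loop.

    Assumptions (not taken from the paper): the "most important" group is
    the most populated one, and sampling stops once no group holds more
    than granularity * n items. The hyper-volume constraint is omitted.
    Returns the indices of the sampled items.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reps = [int(rng.integers(n))]                   # start from one random item
    dist = np.linalg.norm(X - X[reps[0]], axis=1)   # distance to nearest representative
    owner = np.zeros(n, dtype=int)                  # position in reps of that representative
    max_group = max(1, int(granularity * n))        # hypothetical stopping threshold

    while True:
        sizes = np.bincount(owner, minlength=len(reps))
        g = int(np.argmax(sizes))                   # most populated group
        if sizes[g] <= max_group:
            break
        members = np.flatnonzero(owner == g)
        new = int(members[np.argmax(dist[members])])  # furthest from its representative
        if dist[new] == 0.0:
            break                                   # group holds only duplicates, cannot split
        reps.append(new)
        # Reattach every item that is closer to the new representative.
        d_new = np.linalg.norm(X - X[new], axis=1)
        closer = d_new < dist
        owner[closer] = len(reps) - 1
        dist[closer] = d_new[closer]
    return np.array(reps)
```

Picking the furthest item favors space coverage, while always splitting the most populated group directs new samples toward dense regions; a constraint such as the one mentioned in the abstract would be needed to keep dense areas from being oversampled.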

Cited by 35 publications (14 citation statements)
References 32 publications (16 reference statements)
“…Several techniques are available for local density estimation based on kernel or neighborhood, either the number of nearest neighbors or the number of neighbors within a given hyper-volume. They are not studied in the work, the reader may refer to Ros & Guillaume (2016, 2017) for a recent survey.…”
Section: Local Density Estimation and Noise Labeling (mentioning)
confidence: 99%
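
The two neighborhood-based estimates named in this statement can be sketched in a few lines of Python. This is a generic illustration, not code from either the cited or the citing paper; the function names, the brute-force distance matrix, and the choice to report the k-th neighbor distance (smaller means denser) are all assumptions made for the example.

```python
import numpy as np

def pairwise_distances(X):
    """Brute-force Euclidean distance matrix (fine for small illustrative data)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def radius_density(X, radius):
    """Number of neighbors inside a ball of the given radius around each point."""
    D = pairwise_distances(X)
    return (D <= radius).sum(axis=1) - 1   # subtract 1 to exclude the point itself

def kth_neighbor_distance(X, k):
    """Distance to the k-th nearest neighbor; a smaller value means a denser area."""
    D = pairwise_distances(X)
    return np.sort(D, axis=1)[:, k]        # column 0 is the zero self-distance
```

The fixed-radius count fixes the volume and lets the neighbor count vary, whereas the k-th neighbor distance fixes the count and lets the volume vary.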
“…A hashing function is used when doing a biased sampling to map bins in space to a linear ordering. An incremental algorithm is introduced by Frédéric Ros et al. [13] to combine distance and density concepts. They manage distance concepts in order to ensure space coverage and fit cluster shapes by selecting representative points in every cluster.…”
Section: Related Work (mentioning)
confidence: 99%
“…Literature reports many clustering paradigms among which the most important can be categorized into Partitional Clustering [1–5] and its variants [6,7], Hierarchical Clustering [1,3,8–12], Density-based Clustering [1,3,13–16], Grid-based Clustering [1,3,17–20], Spectral Clustering [1,3,21–24], and Gravitational Clustering [25–29]. The literature review on clustering is given in the next section.…”
Section: Introduction (mentioning)
confidence: 99%