2019
DOI: 10.48550/arxiv.1908.09946
Preprint

An empirical comparison between stochastic and deterministic centroid initialisation for K-Means variations

Cited by 2 publications
(7 citation statements)
References 0 publications
“…We have also showed that, similar to unsupervised methods [15], semi-supervised algorithms can be affected by initialisation procedures and by the type of used constraints. From our results we observed that using only MUST-LINK constraints has a negative effect on the semi-supervised algorithms (however PCSKM could cope with the MUST-LINK constraints much better than MPCKM and PCKM).…”
Section: Discussion (mentioning)
confidence: 83%
“…Our benchmark includes the real world data sets fisheriris and ionosphere from the UCI repository [18] which have unknown feature quality. We also added two synthetic data sets that were generated based on the generator of [17] which we used in our previous study [15]. These synthetic data sets are consisting of informative and uninformative features without any noise injection.…”
Section: Results (mentioning)
confidence: 99%
“…We name this algorithm Pairwise Constrained Sparse K-Means (PCSKM) and we test its performance under different conditions such as different number and kind of constraints (CANNOT-LINK, MUST-LINK or both). In our previous study [14] we have shown that the deterministic initialisation method of Density K-Means++ (DKM++) [15] surpasses the average performance of stochastic methods thus we select this method for the initialisation of the algorithms along with the Seeding method proposed in the study of [16]. We have also included the initialisation methods of ROBIN [17] and Maximin [18] to strengthen our conclusions (results in appendix).…”
Section: Introduction (mentioning)
confidence: 95%