2019
DOI: 10.1109/tpami.2017.2780166
Kernel Clustering: Density Biases and Solutions

Abstract: Kernel methods are popular in clustering due to their generality and discriminating power. However, we show that many kernel clustering criteria have density biases, theoretically explaining some practically significant artifacts observed empirically in the past. For example, we provide conditions and formally prove the density-mode isolation bias in kernel K-means for a common class of kernels. We call it Breiman's bias due to its similarity to the histogram mode isolation previously discovered by Breiman […]
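To make the mode-isolation claim concrete, here is a minimal kernel K-means sketch in plain NumPy (the data, bandwidths, and cluster count are illustrative assumptions, not the paper's experiments). With a wide Gaussian bandwidth the two groups are recovered; with a narrow bandwidth one cluster tends to collapse onto the densest mode, which is the Breiman's bias the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)
# 1D data: a dense, narrow mode near 0 and a sparser, wider group near 4.
x = np.concatenate([rng.normal(0.0, 0.1, 300),
                    rng.normal(4.0, 1.0, 100)])[:, None]

def gaussian_kernel(X, sigma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def kernel_kmeans(K, k=2, iters=100, seed=1):
    labels = np.random.default_rng(seed).integers(k, size=K.shape[0])
    for _ in range(iters):
        D = np.empty((K.shape[0], k))
        for c in range(k):
            m = labels == c
            nc = max(m.sum(), 1)
            # squared feature-space distance to centroid c, up to the constant K_ii term
            D[:, c] = -2.0 * K[:, m].sum(1) / nc + K[np.ix_(m, m)].sum() / nc ** 2
        new = D.argmin(1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

for sigma in (2.0, 0.05):  # wide vs. narrow bandwidth
    sizes = np.bincount(kernel_kmeans(gaussian_kernel(x, sigma)), minlength=2)
    print(f"sigma={sigma}: cluster sizes {sorted(sizes)}")
```

With the narrow bandwidth, the smaller cluster typically shrinks toward the dense mode instead of splitting the data into the two intended groups.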

Cited by 50 publications (25 citation statements)
References 41 publications (117 reference statements)
“…We recall here that the equivalence was proved assuming the clusters are balanced. Also, notice that MI-ADM and DEPICT have approximately the same performance, confirming our earlier discussion: MI-ADM in (13) can be viewed as an approximation of DEPICT in (6). The additional entropy term in (13), H(Q), has almost no effect on the results.…”
Section: Evaluation of Clustering Algorithms (supporting)
confidence: 84%
“…Also, notice that MI-ADM and DEPICT have approximately the same performance, confirming our earlier discussion: MI-ADM in (13) can be viewed as an approximation of DEPICT in (6). The additional entropy term in (13), H(Q), has almost no effect on the results. Finally, notice the substantial difference in performance (11%) between our regularized and soft K-means and DCN [3], which is based on a hard K-means loss.…”
Section: Evaluation of Clustering Algorithms (supporting)
confidence: 84%
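In deep-clustering losses of this kind, an entropy term H(Q) is often the entropy of the marginal cluster distribution, which rewards balanced clusters; a minimal sketch under that assumption (the shapes and names are illustrative, not the citing paper's notation):

```python
import numpy as np

def marginal_entropy(Q, eps=1e-12):
    """H(Q) = -sum_c q_c log q_c over the mean soft assignment.

    Q: (n, k) row-stochastic soft cluster assignments. The term is
    maximized (log k) when clusters are perfectly balanced.
    """
    q = Q.mean(axis=0)
    return float(-(q * np.log(q + eps)).sum())

# Near-balanced soft assignments give near-maximal entropy log(k).
Q = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]])
print(marginal_entropy(Q), np.log(2))  # ~0.688 vs. ~0.693
```

This is consistent with the excerpt's observation: when clusters are already near-balanced, the term sits close to its maximum, so dropping it changes little.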
“…The kernel method maps a nonlinearly separable dataset into a higher-dimensional Hilbert space, where the dataset may become linearly separable. DBK clustering [15] proposes a density-equalization principle and, based on it, an adaptive kernel clustering algorithm. Multiple-kernel clustering algorithms [16][17][18][19] combine several kernel functions to enhance the performance of kernel clustering.…”
Section: Introduction (mentioning)
confidence: 99%
“…In the experiments, we discard from the M partitions some extreme partitions exhibiting Ncut's mode isolation (Breiman's bias [76]), namely partitions that contain clusters composed of only a few objects.…”
(mentioning)
confidence: 99%
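A hedged sketch of the filtering step this excerpt describes: drop partitions whose smallest non-empty cluster falls below a size threshold (the function name and threshold are illustrative assumptions, not the citing paper's code):

```python
import numpy as np

def drop_mode_isolated(partitions, min_cluster_size=5):
    """Keep only partitions in which every non-empty cluster has
    at least min_cluster_size objects."""
    kept = []
    for labels in partitions:
        sizes = np.bincount(np.asarray(labels))
        if sizes[sizes > 0].min() >= min_cluster_size:
            kept.append(labels)
    return kept

partitions = [np.array([0] * 50 + [1] * 50),  # balanced: kept
              np.array([0] * 98 + [1] * 2)]   # tiny isolated cluster: dropped
print(len(drop_mode_isolated(partitions)))    # 1
```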