2019
DOI: 10.1109/tpami.2017.2780166
Kernel Clustering: Density Biases and Solutions

Abstract: Kernel methods are popular in clustering due to their generality and discriminating power. However, we show that many kernel clustering criteria have density biases, theoretically explaining some practically significant artifacts observed empirically in the past. For example, we provide conditions and formally prove the density-mode isolation bias in kernel K-means for a common class of kernels. We call it Breiman's bias due to its similarity to the histogram mode isolation previously discovered by Breiman […]
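To make the mode-isolation claim concrete, here is a minimal kernel K-means sketch in plain NumPy (the data, bandwidths, and cluster count are illustrative assumptions, not the paper's experiments). With a wide Gaussian bandwidth the two groups are recovered; with a narrow bandwidth one cluster tends to collapse onto the densest mode, which is the Breiman's bias the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)
# 1D data: a dense, narrow mode near 0 and a sparser, wider group near 4.
x = np.concatenate([rng.normal(0.0, 0.1, 300),
                    rng.normal(4.0, 1.0, 100)])[:, None]

def gaussian_kernel(X, sigma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def kernel_kmeans(K, k=2, iters=100, seed=1):
    labels = np.random.default_rng(seed).integers(k, size=K.shape[0])
    for _ in range(iters):
        D = np.empty((K.shape[0], k))
        for c in range(k):
            m = labels == c
            nc = max(m.sum(), 1)
            # squared feature-space distance to centroid c, up to the constant K_ii term
            D[:, c] = -2.0 * K[:, m].sum(1) / nc + K[np.ix_(m, m)].sum() / nc ** 2
        new = D.argmin(1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

for sigma in (2.0, 0.05):  # wide vs. narrow bandwidth
    sizes = np.bincount(kernel_kmeans(gaussian_kernel(x, sigma)), minlength=2)
    print(f"sigma={sigma}: cluster sizes {sorted(sizes)}")
```

With the narrow bandwidth, the smaller cluster typically shrinks toward the dense mode instead of splitting the data into the two intended groups.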

Cited by 50 publications (25 citation statements)
References 41 publications (117 reference statements)
“…We recall here that the equivalence was proved assuming the clusters are balanced. Also, notice that MI-ADM and DEPICT have approximately the same performance, confirming our earlier discussion: MI-ADM in (13) can be viewed as an approximation of DEPICT in (6). The additional entropy term in (13), H(Q), has almost no effect on the results.…”
Section: Evaluation of Clustering Algorithms (supporting)
confidence: 84%
“…Also, notice that MI-ADM and DEPICT have approximately the same performance, confirming our earlier discussion: MI-ADM in (13) can be viewed as an approximation of DEPICT in (6). The additional entropy term in (13), H(Q), has almost no effect on the results. Finally, notice the substantial difference in performance (11%) between our regularized and soft K-means and DCN [3], which is based on a hard K-means loss.…”
Section: Evaluation of Clustering Algorithms (supporting)
confidence: 84%
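In deep-clustering losses of this kind, an entropy term H(Q) is often the entropy of the marginal cluster distribution, which rewards balanced clusters; a minimal sketch under that assumption (the shapes and names are illustrative, not the citing paper's notation):

```python
import numpy as np

def marginal_entropy(Q, eps=1e-12):
    """H(Q) = -sum_c q_c log q_c over the mean soft assignment.

    Q: (n, k) row-stochastic soft cluster assignments. The term is
    maximized (log k) when clusters are perfectly balanced.
    """
    q = Q.mean(axis=0)
    return float(-(q * np.log(q + eps)).sum())

# Near-balanced soft assignments give near-maximal entropy log(k).
Q = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]])
print(marginal_entropy(Q), np.log(2))  # ~0.688 vs. ~0.693
```

This is consistent with the excerpt's observation: when clusters are already near-balanced, the term sits close to its maximum, so dropping it changes little.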
“…The kernel method maps a nonlinearly separable dataset into a higher-dimensional Hilbert space, where the dataset may become linearly separable. DBK clustering [15] proposes a density-equalization principle and, based on it, an adaptive kernel clustering algorithm. Multiple-kernel clustering algorithms [16][17][18][19] combine several kernel functions to enhance the performance of kernel clustering.…”
Section: Introduction (mentioning)
confidence: 99%
“…In the experiments, we discard from the M partitions some extreme partitions exhibiting Ncut's mode isolation (Breiman's bias [76]), namely partitions that contain clusters composed of only a few objects.…”
(mentioning)
confidence: 99%
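A hedged sketch of the filtering step this excerpt describes: drop partitions whose smallest non-empty cluster falls below a size threshold (the function name and threshold are illustrative assumptions, not the citing paper's code):

```python
import numpy as np

def drop_mode_isolated(partitions, min_cluster_size=5):
    """Keep only partitions in which every non-empty cluster has
    at least min_cluster_size objects."""
    kept = []
    for labels in partitions:
        sizes = np.bincount(np.asarray(labels))
        if sizes[sizes > 0].min() >= min_cluster_size:
            kept.append(labels)
    return kept

partitions = [np.array([0] * 50 + [1] * 50),  # balanced: kept
              np.array([0] * 98 + [1] * 2)]   # tiny isolated cluster: dropped
print(len(drop_mode_isolated(partitions)))    # 1
```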