2014
DOI: 10.1016/j.patcog.2014.03.006

Cross-entropy clustering

Abstract: We build a general and highly applicable clustering theory, which we call cross-entropy clustering (CEC for short), which joins the advantages of classical k-means (easy implementation and speed) with those of EM (affine invariance and the ability to adapt to clusters of desired shapes). Moreover, contrary to k-means and EM, CEC finds the optimal number of clusters by automatically removing groups which carry no information. Although CEC, similarly to EM, can be built on an arbitrary family of densities, in the most impo…
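
As a rough orientation only, here is a minimal Python sketch (our own, not the authors' code) of the Gaussian-family CEC energy the abstract refers to: each cluster pays its weighted Gaussian entropy plus a -p ln p term, so small groups that carry no information become expensive and can be removed. Function names, the NumPy sample-covariance estimate, and the toy usage at the end are our assumptions.

import numpy as np

def gaussian_cluster_cost(points, n_total):
    # Weighted cost of one cluster under the Gaussian CEC criterion:
    # p * ( -ln p + dim/2 * ln(2*pi*e) + 1/2 * ln det(Sigma) ).
    n, dim = points.shape
    p = n / n_total  # relative cluster size
    # Sample covariance (np.cov divides by n-1; the paper's ML estimate divides by n).
    cov = np.atleast_2d(np.cov(points, rowvar=False))
    _, logdet = np.linalg.slogdet(cov)
    return p * (-np.log(p) + 0.5 * dim * np.log(2 * np.pi * np.e) + 0.5 * logdet)

def cec_energy(clusters):
    # Total energy of a clustering, given as a list of (n_i, dim) arrays; lower is better.
    # The -p*ln(p) term is what lets CEC drop uninformative groups during optimization.
    n_total = sum(len(c) for c in clusters)
    return sum(gaussian_cluster_cost(c, n_total) for c in clusters)

# Toy check: two well-separated blobs should be cheaper as two clusters than as one.
rng = np.random.default_rng(0)
a = rng.normal(size=(200, 2))
b = rng.normal(loc=8.0, size=(50, 2))
print(cec_energy([a, b]), cec_energy([np.vstack([a, b])]))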

Cited by 71 publications (54 citation statements)
References 32 publications (25 reference statements)

“…This is due to the so-called uniformity effect, which causes these algorithms to generate clusters of similar sizes. This is especially vivid in the case of centroid-based approaches [60], while density-based ones seem to display some robustness to it [52]. Clustering imbalanced data can be seen from various perspectives: as a process of group discovery in its own right, as a method for reducing the complexity of a given problem, or as a way to analyze the structure of the minority class.…”
Section: Semi-supervised and Unsupervised Learning From Imbalanced Data
Mentioning, confidence: 99%
“…First, we introduce the cost function which will be optimized by the algorithm. Our approach is based on CEC [18]. Therefore, we start with a short introduction to the method.…”
Section: Theoretical Background of UCEC
Mentioning, confidence: 99%
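
For context, the cost function these snippets refer to can be written, in our own notation reconstructed from the cited CEC paper [18] (not a quote from the citing work), as

\[
  E(X_1,\dots,X_k) \;=\; \sum_{i=1}^{k} p_i \Bigl( -\ln p_i + H^{\times}(X_i \,\|\, \mathcal{F}) \Bigr),
  \qquad p_i = \frac{|X_i|}{|X|},
\]

where \(H^{\times}(X_i \,\|\, \mathcal{F}) = \inf_{f \in \mathcal{F}} \bigl( -\tfrac{1}{|X_i|} \sum_{x \in X_i} \ln f(x) \bigr)\) is the cross-entropy of cluster \(X_i\) with respect to the chosen density family \(\mathcal{F}\); minimizing \(E\) over partitions (and over the number of clusters \(k\)) is the optimization the citing paper builds on.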
“…More precisely, we use a uniform pdf for independent variables, which is a product of univariate marginal pdfs, so the distribution will generally have rectangular support. Furthermore, a simpler optimization procedure known as Cross-Entropy Clustering (CEC) [18] is used instead of EM.…”
Section: Introduction
Mentioning, confidence: 99%
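
To illustrate the rectangular support mentioned above (our own worked example, not part of the citing paper): if the density family consists of products of univariate uniform pdfs, i.e. uniform densities on axis-aligned boxes \(R = \prod_{j}[a_j, b_j]\), then for a cluster \(X_i \subset R\)

\[
  H^{\times}(X_i \,\|\, U_R) \;=\; -\frac{1}{|X_i|} \sum_{x \in X_i} \ln \frac{1}{\operatorname{vol}(R)} \;=\; \sum_{j} \ln (b_j - a_j),
\]

so the optimal box is simply the coordinate-wise range (bounding box) of the cluster, and its cost is the log-volume of that box.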
“…Example 2 (Gaussian distribution [28]): For the multivariate Gaussian distribution, the entropy goes as the log determinant of the covariance; specifically, the differential entropy of an N-dimensional random variable with the density function…”
Section: Remark
Mentioning, confidence: 99%
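
The quoted sentence is cut off; the standard formula it leads up to is the differential entropy of an N-dimensional Gaussian with covariance matrix \(\Sigma\),

\[
  H(X) \;=\; \frac{N}{2}\ln(2\pi e) \;+\; \frac{1}{2}\ln \det \Sigma ,
\]

which indeed grows with the log determinant of the covariance.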