2019
DOI: 10.1007/978-981-15-1209-4_1

Estimating the Optimal Number of Clusters in Categorical Data Clustering by Silhouette Coefficient

Cited by 100 publications (64 citation statements)
References 14 publications
“…The number of clusters was chosen using the silhouette method. It enables finding the optimal number of clusters and interpreting and validating the consistency within the clusters of data [44][45][46]. The silhouette method combines two clustering criteria, namely, compactness and separation.…”
Section: Methods (mentioning)
confidence: 99%
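
The compactness and separation criteria mentioned in this statement can be made concrete with a small sketch. For each point i, a(i) is the mean distance to the other members of its own cluster (compactness) and b(i) is the smallest mean distance to any other cluster (separation); the silhouette value is s(i) = (b(i) - a(i)) / max(a(i), b(i)), and the score for a clustering is the mean of s(i). The code below is an illustration only: the simple-matching (Hamming) distance for categorical records and the function names `hamming` and `silhouette_score` are assumptions made here, not taken from the cited paper.

```python
def hamming(a, b):
    """Simple-matching distance: number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def silhouette_score(points, labels, dist=hamming):
    """Mean silhouette value over all points for a given labelling."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: s(i) = 0 for a singleton cluster
        # a(i): mean distance to the other members of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


records = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
print(silhouette_score(records, [0, 0, 1, 1]))  # well-separated labelling
print(silhouette_score(records, [0, 1, 0, 1]))  # labelling that mixes the groups
```

To pick the number of clusters with this score, one would typically run the clustering for several candidate values of k and keep the k with the highest mean silhouette.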
“…If the number of running states of rolling bearing contained in the dataset is known, the number of clusters is determined by the number of running states of rolling bearing. If the number of running states of rolling bearing contained in the dataset is unknown, the number of clusters can be dynamically determined by elbow method [36] or silhouette coefficient method [37]. The pheromone heuristic factor α indicates the relative importance of pheromone intensity, if the value of α is too large, the random search ability of the algorithm is easily weakened.…”
Section: Experiments, A. Experimental Setup (mentioning)
confidence: 99%
“…Recent methods for categorical data consider the cluster centers as the expectation of a random variable associated with the data, in the assumption that this variable follows a Gaussian distribution from the statistical point of view [13,14,17,18,46-48]. The goal is to find a method that can guarantee the consistency in the statistical interpretation of the cluster centers for categorical data as the mean for numerical data.…”
Section: Example (mentioning)
confidence: 99%
“…Finding the solution for the above two challenges in categorical data clustering is not an easy task. Many clustering algorithms for categorical data have been designed to remove the limitation, while keeping the advantages of k-means [10,14,17,18,28,31,34,46-48,51]. In general, they have the same scheme as k-means, except that they use different ways to define cluster centers (cluster representatives) and distance measures for categorical data.…”
Section: Introduction (mentioning)
confidence: 99%
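
The "same scheme as k-means" described in this statement can be sketched as follows. This is a minimal k-modes-style illustration under stated assumptions, not the method of any paper cited here: the cluster representative is taken to be the per-attribute mode, the distance is the simple-matching (Hamming) distance, and the initial centers are simply the first k distinct records. The names `hamming`, `mode_of`, and `k_modes` are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Simple-matching distance: number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_of(rows):
    """Cluster representative: the most frequent category of each attribute."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(points, k, max_iter=20):
    """Alternate assignment and center updates, as in k-means, until labels stop changing."""
    # Assumed initialisation: the first k distinct records (deterministic, not tuned).
    centers = []
    for p in points:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    if len(centers) < k:
        raise ValueError("fewer than k distinct records")

    labels = None
    for _ in range(max_iter):
        # Assignment step: each record goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: hamming(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute each center as the mode of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = mode_of(members)
    return labels, centers


records = [
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("red", "small", "square"),
    ("blue", "large", "round"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
]
labels, centers = k_modes(records, k=2)
print(labels)
print(centers)
```

Like k-means, this scheme is sensitive to the initial centers, so results on real data depend on the initialisation chosen.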