2007
DOI: 10.1109/tpami.2007.53

On the Impact of Dissimilarity Measure in k-Modes Clustering Algorithm

Abstract: This correspondence describes extensions to the k-modes algorithm for clustering categorical data. By modifying a simple matching dissimilarity measure for categorical objects, a heuristic approach was developed in [4], [12] which allows the use of the k-modes paradigm to obtain a cluster with strong intrasimilarity and to efficiently cluster large categorical data sets. The main aim of this paper is to rigorously derive the updating formula of the k-modes clustering algorithm with the new dissimilarity measur…

Citations: cited by 187 publications (108 citation statements)
References: 8 publications
“…The algorithm proposed in this paper is different from k-Median algorithm in reference [2], this paper uses k-Modes clustering algorithm [3] to label unlabeled data in the process of incrementally creating decision tree. Because k-Modes clustering algorithm is suitable for dealing with discrete attributes, we use discretization method [4] for dealing with continuous attributes.…”
Section: Semi-supervised Learning Methods Based On Kmodes Algorithm An
Citation type: mentioning
confidence: 99%
“…Dissimilarity based k-mode [18] is also the one of the K-Mode Algorithm in which a new dissimilarity measure is proposed in which the modes of clusters were updated in each iteration ad utilizes some theorems to update the mode of the cluster.…”
Section: Literature Survey
Citation type: mentioning
confidence: 99%
“…The smaller the number of mismatches, the more similar are two objects. This measure is also a kind of generalized Hamming distance (Ng et al, 2007).…”
Section: Definitions and Notations
Citation type: mentioning
confidence: 99%
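
The last statement above describes the simple matching dissimilarity as a count of mismatches between two categorical objects. A minimal sketch of that count, assuming each object is a fixed-length tuple of category labels, could look as follows. The function name and the sample objects are hypothetical and chosen only for illustration; this is the plain mismatch count, not the modified measure studied in the cited paper.

```python
from typing import Hashable, Sequence


def simple_matching_dissimilarity(x: Sequence[Hashable], y: Sequence[Hashable]) -> int:
    """Count the attributes on which two categorical objects differ.

    Hypothetical helper for illustration: a smaller count means the
    two objects are more similar.
    """
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    # Each position contributes 1 if the categories differ, 0 if they match.
    return sum(1 for a, b in zip(x, y) if a != b)


# Made-up objects with three categorical attributes each.
obj_a = ("red", "small", "round")
obj_b = ("red", "large", "round")
obj_c = ("blue", "large", "square")

print(simple_matching_dissimilarity(obj_a, obj_b))  # differ on one attribute
print(simple_matching_dissimilarity(obj_a, obj_c))  # differ on all three
```

Because every attribute contributes either 0 or 1, the result matches the Hamming distance when the objects are read as equal-length strings of symbols, which is the sense in which the quoted statement calls it a generalized Hamming distance.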