2008
DOI: 10.1016/j.inffus.2006.05.006

k-ANMI: A mutual information based clustering algorithm for categorical data

Abstract: Clustering categorical data is an integral part of data mining and has attracted much attention recently. In this paper, we present k-ANMI, a new efficient algorithm for clustering categorical data. The k-ANMI algorithm works in a way that is similar to the popular k-means algorithm, and the goodness of clustering in each step is evaluated using a mutual information based criterion (namely, average normalized mutual information, ANMI) borrowed from cluster ensemble. This algorithm is easy to implement, requirin…
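The ANMI criterion named in the abstract can be illustrated with a minimal sketch. It assumes, following the cluster-ensemble literature, that each categorical attribute is treated as its own partition of the records and that NMI is normalized by the geometric mean of the two entropies; the truncated abstract does not spell out these details, and the function names `entropy`, `mutual_information`, `nmi`, and `anmi` are hypothetical. The k-means-like search step of k-ANMI is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same records."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    return sum(
        (n_xy / n) * math.log(n_xy * n / (count_a[x] * count_b[y]))
        for (x, y), n_xy in count_ab.items()
    )


def nmi(a, b):
    """NMI with geometric-mean normalization (an assumption, see lead-in)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        return 0.0  # a constant labeling shares no information
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


def anmi(cluster_labels, records):
    """Average NMI between a clustering and each attribute's partition."""
    columns = list(zip(*records))  # one tuple of values per attribute
    return sum(nmi(cluster_labels, col) for col in columns) / len(columns)


# Toy data: two categorical attributes that induce the same two groups.
records = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
aligned = anmi([0, 0, 1, 1], records)    # clustering matches both attributes
unrelated = anmi([0, 1, 0, 1], records)  # clustering independent of both
```

In this toy case a clustering that matches both attributes gets the highest score and one that is independent of them gets the lowest, which is the sense in which ANMI can rank candidate clusterings at each step.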

Cited by 64 publications (27 citation statements)
References 29 publications
“…Some of the most widely used internal criteria are Calinski-Harabasz index [28], Davies-Bouldin index [29] and Dunn's index [30], whereas some external criteria are F-measure [31], purity [32], normalised mutual information [33] and a measure based on hungarian algorithm [34]. All the aforementioned criteria have been used in the proposed algorithm, some of them both for optimisation and evaluating the performance of the algorithm and some only for evaluation.…”
Section: Optimisation Of the Solutions
Citation type: mentioning
confidence: 99%
“…• Normalised mutual information [33] is also used as an external criterion for evaluating clustering algorithms, and is defined as:…”
Section: Optimisation Of the Solutions
Citation type: mentioning
confidence: 99%
“…However, its implementation has been applied in data mining to group similar data records [8]. Recently, Normalized Mutual Information (NMI) has been proposed for feature selection with the advantage of reduced complexity of features [9].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The issue of database clustering with categorical variables has received intensive attention ( [1]- [8]) along with other publications in the same issue. The categorical data come from different areas of research, both social and nature sciences; this type of variable does not present a natural ordering for their possible values, results in the difficulty in clustering process.…”
Section: Introduction
Citation type: mentioning
confidence: 99%