2005
DOI: 10.1007/11552253_5

Kernel K-Means for Categorical Data

citations
Cited by 17 publications
(14 citation statements)
references
References 9 publications
“…We chose a kernel, proposed in Couto (2005), based on the Hamming distance which measures the minimum number of substitutions required to change one observation into another one. Naturally, pgpEM and kernel k-means worked on the same kernel to have a fair [figure residue omitted; panel labels: "Harmonic 1", "Harmonic 2"]…”
Section: Clustering Of Categorical Data: the House-vote Dataset
type: mentioning
confidence: 99%
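
The Hamming distance mentioned in the statement above counts the positions at which two observations differ. A minimal sketch follows. The kernel form `exp(-gamma * distance)` and the names `hamming_distance`, `hamming_kernel`, and `gamma` are assumptions chosen for illustration; they are not taken from Couto (2005), whose exact kernel is not given in this report.

```python
import math

def hamming_distance(a, b):
    """Count the positions at which two equal-length observations differ."""
    if len(a) != len(b):
        raise ValueError("observations must have the same number of features")
    return sum(x != y for x, y in zip(a, b))

def hamming_kernel(a, b, gamma=0.5):
    # Assumed illustrative form: similarity decays exponentially with the
    # Hamming distance. This is a stand-in, not the kernel from Couto (2005).
    return math.exp(-gamma * hamming_distance(a, b))

# Two hypothetical voting records with four categorical features each.
votes_a = ["y", "n", "y", "?"]
votes_b = ["y", "y", "y", "n"]

print(hamming_distance(votes_a, votes_b))  # 2 (positions 2 and 4 differ)
print(hamming_kernel(votes_a, votes_b))
```

Identical observations have distance 0 and therefore kernel value 1 under this assumed form, and the value shrinks as more features disagree.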
“…It turns out that pgpEM is significantly better than kernel k-means to cluster this kind of data. To make pgpDA able to deal with such data, we built a combined kernel by mixing a kernel based on the Hamming distance (Couto 2005) for the categorical features and a RBF kernel for the quantitative data. We chose to combine both kernels simply as follows:…”
Section: Pca Function 2 (Percentage Of Variability 125)
type: mentioning
confidence: 99%
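
The statement above describes mixing a Hamming-based kernel for categorical features with an RBF kernel for quantitative features, but the quoted combination rule is truncated. The sketch below therefore uses a weighted sum purely as a hypothetical stand-in. The weight `alpha`, the parameters `gamma` and `sigma`, and all function names are assumptions, not the citing authors' formula.

```python
import math

def hamming_kernel(cat_a, cat_b, gamma=0.5):
    # Assumed form exp(-gamma * Hamming distance); illustrative only.
    distance = sum(x != y for x, y in zip(cat_a, cat_b))
    return math.exp(-gamma * distance)

def rbf_kernel(num_a, num_b, sigma=1.0):
    # Standard Gaussian (RBF) kernel on the numeric features.
    squared = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    return math.exp(-squared / (2.0 * sigma ** 2))

def combined_kernel(a, b, alpha=0.5):
    # Each observation is a (categorical_features, numeric_features) pair.
    # Hypothetical combination: a convex weighted sum of the two kernels.
    # The rule used in the quoted paper is elided in the source.
    cat_a, num_a = a
    cat_b, num_b = b
    return alpha * hamming_kernel(cat_a, cat_b) + (1.0 - alpha) * rbf_kernel(num_a, num_b)

# Two hypothetical mixed-type observations.
p = (["red", "small"], [1.0, 2.0])
q = (["red", "large"], [1.5, 2.0])
print(combined_kernel(p, q))
```

A non-negative weighted sum of two valid kernels is itself a valid kernel, which is why this form is a common default when no other rule is specified.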
“…For the numerical datasets, the Gaussian kernel is applied, while the Hamming kernel [7] is used for categorical datasets. For each of the three methods, we used a variety of parameter settings.…”
Section: B Quantitative Results
type: mentioning
confidence: 99%
“…Most remarkably, the kernel trick adaptation also allows these inner product reliant methods to be directly applied to non-numeric or mixed-type data, once appropriate kernels have been defined for these data types. As examples, here we can mention outlier detection techniques for categorical or mixed-attribute data such as [7] and [8].…”
Section: Introduction
type: mentioning
confidence: 99%
“…Amir et al [61] offered a cost function and distance measure for clustering datasets with mixed data (datasets with numerical and categorical data) based on co-occurrences of values. In [62], a kernel function based on "hamming distance" [62] was proposed for embedding categorical data. The kernel-k-means provides an add-on to the k-means clustering that is designed to find clusters in a feature space where distances are calculated via kernel functions.…”
Section: K-means Variants For Solving the Problem Of Data Issue
type: mentioning
confidence: 99%
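
The idea in the statement above, clustering in a feature space where distances come only from kernel evaluations, can be sketched with a small kernel k-means loop over a precomputed Gram matrix. This is a minimal illustration under stated assumptions: the kernel `exp(-gamma * Hamming distance)`, the toy records, the initial labels, and the names `gram_matrix` and `kernel_kmeans` are all hypothetical and not taken from the cited papers.

```python
import math

def gram_matrix(data, gamma=0.1):
    # Assumed kernel: exp(-gamma * Hamming distance); illustrative only.
    def k(a, b):
        return math.exp(-gamma * sum(x != y for x, y in zip(a, b)))
    return [[k(a, b) for b in data] for a in data]

def kernel_kmeans(K, labels, n_clusters, max_iter=100):
    """Reassign points to the nearest cluster mean in feature space.

    The squared distance from point i to the mean of cluster C is
    K[i][i] - 2/|C| * sum_j K[i][j] + 1/|C|^2 * sum_{j,l} K[j][l], j, l in C.
    """
    n = len(K)
    labels = list(labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        # Cluster-only term of the distance, computed once per cluster.
        within = [sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels

# Hypothetical categorical records forming two obvious groups.
data = [("a", "a", "a", "a"), ("a", "a", "a", "b"),
        ("c", "c", "c", "c"), ("c", "c", "c", "d")]
labels = kernel_kmeans(gram_matrix(data), labels=[0, 0, 0, 1], n_clusters=2)
print(labels)
```

Like ordinary k-means, this loop only finds a local optimum, so the result depends on the initial labels and on the kernel parameter.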