2017
DOI: 10.1007/978-3-319-59126-1_36

Spectral Clustering Using PCKID – A Probabilistic Cluster Kernel for Incomplete Data

Abstract: In this paper, we propose PCKID, a novel, robust kernel function for spectral clustering, specifically designed to handle incomplete data. By combining posterior distributions of Gaussian Mixture Models for incomplete data on different scales, we are able to learn a kernel for incomplete data that does not depend on any critical hyperparameters, unlike the commonly used RBF kernel. To evaluate our method, we perform experiments on two real datasets. PCKID outperforms the baseline methods for all fra…
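
The idea of combining posterior distributions from several mixture models into one kernel can be illustrated with a minimal sketch. It assumes the kernel is the averaged inner product of each pair of samples' posterior vectors over an ensemble of models. The function name `pck_kernel`, the toy posteriors, and the choice of normalizing by the number of models are illustrative assumptions, not details taken from the paper.

```python
def pck_kernel(posteriors):
    """Average, over all models, the inner product of posterior vectors.

    posteriors: list of models; each model is a list with one row per
    sample, and each row holds that sample's cluster posteriors.
    """
    n = len(posteriors[0])
    kernel = [[0.0] * n for _ in range(n)]
    for model in posteriors:
        for i in range(n):
            for j in range(n):
                # Samples that share probability mass on the same
                # components contribute a large inner product.
                kernel[i][j] += sum(a * b for a, b in zip(model[i], model[j]))
    z = len(posteriors)  # assumed normalization: number of models
    return [[value / z for value in row] for row in kernel]


# Two hypothetical mixture models (2 and 3 components) over three samples.
model_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
model_b = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]

K = pck_kernel([model_a, model_b])
for row in K:
    print(row)
```

Samples 0 and 1 are grouped together by both toy models to some degree, so their kernel value is high, while sample 2 never shares a component with sample 0 and gets a kernel value of zero. Because each model contributes a Gram matrix of posterior vectors, the resulting matrix is symmetric and positive semidefinite.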

Citations: cited by 6 publications (4 citation statements)
References: 20 publications

“…However, Souto et al [33] later argued that from their experiments, this superiority is non-existent, backing up their conclusion with the rationale that gene expression being highly correlated and characterized by very close values, imputing with a mean will have minimal effect on the shape of the data's distribution. Løkse et al [34] introduced a new kernel function which learns the similarities between data points from the data's fitted mixture models, inherently taking care of the missing value problem. They then use this kernel function for spectral clustering, performing kmeans clustering on the spectral clustering output.…”
Section: Multi-stage Clustering (mentioning)
confidence: 99%
“…The PCK has previously been used for semi-supervised learning [22] and spectral clustering [23]. Additionally, variations of the method for handling missing data have been proposed for both time series [38] and vectorial data [35].…”
Section: Probabilistic Cluster Kernel (mentioning)
confidence: 99%
“…We train PCK by fitting GMMs on a subset of 200 training samples using parameters Q = G = 30. These parameters are sufficiently large to ensure robust results [35]. Once trained, the GMM models are applied to the remaining data to calculate the whole kernel matrix.…”
Section: Experimental Setting (mentioning)
confidence: 99%
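
The step of applying already-fitted GMMs to the remaining data can be sketched as a posterior computation with fixed parameters. The sketch below assumes diagonal covariances, for which a missing feature can be marginalized out simply by skipping that dimension. The function name `gmm_posteriors`, the use of `None` for a missing value, and all parameter values are hypothetical and are not taken from the cited papers.

```python
import math


def gmm_posteriors(x, weights, means, variances):
    """Posterior component probabilities for one sample under a fitted
    diagonal-covariance GMM. Entries of x equal to None are treated as
    missing and skipped, which marginalizes them out for this model.
    """
    log_joint = []
    for w, mu, var in zip(weights, means, variances):
        lp = math.log(w)
        for xd, md, vd in zip(x, mu, var):
            if xd is None:
                continue  # missing feature: contributes nothing
            lp += -0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
        log_joint.append(lp)
    # Normalize in log space for numerical stability.
    peak = max(log_joint)
    unnorm = [math.exp(lp - peak) for lp in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical parameters of an already-fitted two-component model.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

complete = gmm_posteriors([0.1, -0.2], weights, means, variances)
partial = gmm_posteriors([3.9, None], weights, means, variances)
all_missing = gmm_posteriors([None, None], weights, means, variances)
print(complete, partial, all_missing)
```

With every feature missing, the posterior falls back to the mixture weights, since no evidence is left to update them. Posterior rows computed this way for held-out samples can then be combined into kernel entries in the same way as for the training subset.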
“…By doing so, the entire framework is grounded within the theoretically well understood kernel methods. Moreover, spectral clustering is considered a state-of-the-art clustering algorithm and has been successfully utilized in many applications [25,26,27].…”
Section: Introduction (mentioning)
confidence: 99%