2015
DOI: 10.1093/bioinformatics/btv244
Integrating different data types by regularized unsupervised multiple kernel learning with application to cancer subtype discovery

Abstract: Motivation: Despite ongoing cancer research, available therapies are still limited in quantity and effectiveness, and making treatment decisions for individual patients remains a hard problem. Established subtypes, which help guide these decisions, are mainly based on individual data types. However, the analysis of multidimensional patient data involving the measurements of various molecular features could reveal intrinsic characteristics of the tumor. Large-scale projects accumulate this kind of data for vari…
Citations: cited by 149 publications (100 citation statements: 2 supporting, 98 mentioning, 0 contrasting).
References: 14 publications cited (16 reference statements).
“…This distinction has implications on the fitting procedures and their scalability. Frequentist formulations of clustering models have generally been those that assume common cluster boundaries across data sources [Kormaksson et al., 2012, Speicher and Pfeifer, 2015, Shen et al., 2009]. Algorithms with similar aims that decompose aggregations of data matrices into lower dimension spaces have also been developed [Chalise and Fridley, 2017, Chalise et al., 2014, Chen et al., 2008].…”
Section: Introduction (mentioning, confidence: 99%)
“…We compare PAMOGK with eight other multi-omics methods. These include k-means [26], MCCA [54], LRACluster [55], rMKL-LPP [45], iClusterBayes [29], PINS [34], SNF [51], and finally Spectral Clustering [58]. These methods cover all methods that are included in a recent comparative benchmark study by Rappoport et al. [36], with the exception of multiNMF [24], which we are not able to run properly.…”
Section: Comparison With the State-of-the-Art Multi-omics Methods (mentioning, confidence: 99%)
“…Several generic multi-view kernel clustering methods (reviewed in [56]) have been developed, some of which have been applied to cancer subtyping. rMKL-LPP [45] extends the multi-view kernel framework of [23] to multi-omics clustering. A kernel matrix is computed from each omic data type, and a linear combination of kernels is sought for clustering the patients with kernel k-means.…”
Section: Introduction (mentioning, confidence: 99%)
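The excerpt above summarizes the generic multiple-kernel recipe that rMKL-LPP builds on: one kernel per data type, a weighted combination of those kernels, and clustering of the patients in the induced feature space with kernel k-means. The following Python sketch illustrates only that generic recipe on synthetic data with fixed, uniform kernel weights; it is not the rMKL-LPP algorithm itself (which optimizes the kernel weights and applies a regularized locality-preserving projection), and the data, weights, and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def kernel_kmeans(K, n_clusters, n_iter=50, seed=0):
    """Lloyd-style kernel k-means on a precomputed kernel matrix K."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(n_clusters, size=n)
    for _ in range(n_iter):
        dist = np.zeros((n, n_clusters))
        for c in range(n_clusters):
            mask = labels == c
            if not mask.any():                 # re-seed an empty cluster
                mask[rng.integers(n)] = True
            Kc = K[:, mask]
            # Squared distance to the cluster centroid in feature space.
            dist[:, c] = (np.diag(K)
                          - 2.0 * Kc.mean(axis=1)
                          + K[np.ix_(mask, mask)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

rng = np.random.default_rng(0)
n = 80
# Three toy "omics" views for the same n patients (random, purely illustrative).
views = [rng.random((n, 100)), rng.random((n, 60)), rng.random((n, 30))]

# One kernel per data type, then a convex combination. Uniform weights are used
# here; rMKL-LPP instead optimizes the weights, which this sketch does not do.
kernels = [rbf_kernel(v) for v in views]
weights = np.full(len(kernels), 1.0 / len(kernels))
K = sum(w * Km for w, Km in zip(weights, kernels))

print(kernel_kmeans(K, n_clusters=3)[:10])
```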
“…In the early-integration approach, also known as juxtaposition-based, the multi-omics datasets are first concatenated into one matrix. To deal with the high dimensionality of the joint dataset, these methods generally adopt matrix factorization (68,53,55,52), statistical (46,69,70,59,57,44,71,72,73,55), and machine-learning tools (74,73,55). Although the dimensionality-reduction procedure is necessary and may improve predictive performance, it can also cause the loss of key information (66).…”
Section: Background and Related Work (mentioning, confidence: 99%)
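As a concrete illustration of the early-integration workflow described above, the Python sketch below concatenates three synthetic omics matrices into one juxtaposed matrix, reduces its dimensionality, and clusters the patients. PCA and k-means are used only as stand-ins for the matrix-factorization, statistical, and machine-learning tools cited in the excerpt; the data, block scaling, and parameter choices are assumptions made for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_patients = 100
expression  = rng.normal(size=(n_patients, 500))   # toy omics blocks
methylation = rng.normal(size=(n_patients, 300))
cnv         = rng.normal(size=(n_patients, 200))

# Standardize each block so differing scales and feature counts do not let
# one data type dominate the joint matrix.
blocks = [StandardScaler().fit_transform(b)
          for b in (expression, methylation, cnv)]
joint = np.hstack(blocks)                           # patients x all features

# Dimensionality reduction on the juxtaposed matrix, then clustering.
embedding = PCA(n_components=10, random_state=0).fit_transform(joint)
subtypes = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedding)
print(subtypes[:10])
```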
“…Conversely, our INF method for omics data integration is an improvement of the popular Similarity Network Fusion (SNF) approach (5), which has inspired several studies in the scientific literature, specifically in cancer genomics (97,98,99,100,74,82,101). SNF maximizes the shared or correlated information between multiple datasets by combining data through inference of a joint network-based model, accounting for how informative each data type is to the observed similarity between samples.…”
Section: Integrative Network Fusion (mentioning, confidence: 99%)
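To make the SNF idea sketched above more tangible, here is a deliberately simplified Python sketch of similarity-network-fusion-style message passing: one patient-similarity network per data type, a sparse local (k-nearest-neighbour) affinity per network, iterative updates in which each network diffuses the average of the other networks, and spectral clustering on the fused network. This is a toy approximation, not the published SNF or INF algorithms; the kernels, neighbourhood size, iteration count, and synthetic data are all assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.cluster import SpectralClustering

def row_normalize(W):
    return W / W.sum(axis=1, keepdims=True)

def knn_affinity(W, k):
    """Keep only each patient's k strongest neighbours (local affinity)."""
    S = np.zeros_like(W)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return row_normalize(S)

def simple_snf(similarities, k=10, n_iter=20):
    """Very simplified SNF-style fusion of per-data-type similarity matrices."""
    P = [row_normalize(W) for W in similarities]   # full transition matrices
    S = [knn_affinity(W, k) for W in similarities] # sparse local affinities
    for _ in range(n_iter):
        new_P = []
        for v in range(len(P)):
            others = [P[u] for u in range(len(P)) if u != v]
            mean_other = sum(others) / len(others)
            # Each network absorbs information from the average of the others,
            # restricted to its own local neighbourhood structure.
            new_P.append(S[v] @ mean_other @ S[v].T)
        P = [row_normalize(Pv) for Pv in new_P]
    fused = sum(P) / len(P)
    return (fused + fused.T) / 2                   # symmetrise for clustering

rng = np.random.default_rng(0)
n = 60
views = [rng.random((n, 120)), rng.random((n, 80))]  # toy omics blocks
fused = simple_snf([rbf_kernel(v) for v in views])
labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                            random_state=0).fit_predict(fused)
print(labels[:10])
```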