2017
DOI: 10.1038/s41598-017-01064-0

Network-aided Bi-Clustering for discovering cancer subtypes

Abstract: Bi-clustering is a widely used data mining technique for analyzing gene expression data. It simultaneously groups the genes and samples of an input gene expression data matrix to discover bi-clusters in which relevant samples exhibit similar gene expression profiles over a subset of genes. The discovered bi-clusters bring insights for categorization of cancer subtypes, gene treatments and others. Most existing bi-clustering approaches can only enumerate bi-clusters with constant values. Gene interaction networks can h…

Citations: cited by 20 publications (14 citation statements)
References: 62 publications
“…KnowEnG makes this analytic process more rigorous by adapting its statistical tools to be directly guided by the vast data in such public repositories of gene annotations and interactions. In doing so, KnowEnG builds on a rich tradition of knowledge-guided analysis methods that have been previously reported for a variety of biological research tasks including (1) clustering of samples into cancer subtypes [11][12][13][14], (2) finding markers and drivers of disease [15][16][17][18][19][20], (3) prediction of patient survival [21,22] or cancer metastases [23], (4) characterization of experimental gene sets [24][25][26][27][28], and (5) prediction of gene functions [29][30][31]. KnowEnG also breaks the logistical barriers associated with utilizing large databases of prior knowledge, by co-locating its "knowledge-guided analysis" tools with a diverse knowledgebase compiled from numerous popular repositories.…”
Section: Knowledge Network-guided Analysis
Citation type: mentioning
confidence: 99%
“…By focusing on the disease-specific network, it was possible to identify survival-associated subtypes in uterine corpus endometrial carcinoma (UCEC) cancer that were not detected by the original NBS method. There are also similar approaches [102,103] that integrate network architecture information with gene expression profiles (as opposed to somatic mutation data like in NBS) to assign weights to genes in a gene by patient matrix which is then clustered to stratify patients into groups and discover cancer subtypes. Smoothing expression across the network emphasizes groups of related genes with similar expression patterns across samples, thereby identifying more stable signals within the expression data.…”
Section: Network Analysis Across Tumor Cohorts
Citation type: mentioning
confidence: 99%
“…Few methods exist that can utilize molecular interaction networks for patient stratification. Two integer linear programming methods were suggested (Yu et al, 2017; Liu et al, 2014), both of which rely on the GeneRank (Morrison et al, 2005) algorithm to incorporate network information. GeneRank depends on a parameter θ describing the influence of the network whose choice is not straightforward and was shown to have a notable impact on the results (Yu et al, 2017).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Two integer linear programming methods were suggested (Yu et al, 2017; Liu et al, 2014), both of which rely on the GeneRank (Morrison et al, 2005) algorithm to incorporate network information. GeneRank depends on a parameter θ describing the influence of the network whose choice is not straightforward and was shown to have a notable impact on the results (Yu et al, 2017). None of the above methods actively encourage connected subnetworks as solutions and are thus not suited for discovering disease modules with mechanistic interpretation.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
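
Illustrative note on the GeneRank parameter: the Introduction statements above say that GeneRank depends on a parameter θ that sets the influence of the network. The following is a minimal sketch of the commonly cited linear-system form of GeneRank (Morrison et al, 2005), not the implementation used in the cited paper. It assumes that θ corresponds to GeneRank's damping factor, that the network is given as a symmetric 0/1 adjacency matrix, and that expression evidence is a non-negative per-gene score; the function name `generank` and its arguments are hypothetical.

```python
import numpy as np


def generank(adjacency, expression, theta=0.5):
    """Sketch of GeneRank: solve (I - theta * W D^-1) r = (1 - theta) * ex.

    adjacency  : (n, n) symmetric gene-gene adjacency matrix W (assumed input)
    expression : length-n vector of per-gene expression evidence
    theta      : network influence in [0, 1); assumed to play the role of
                 GeneRank's damping factor
    """
    W = np.asarray(adjacency, dtype=float)
    ex = np.abs(np.asarray(expression, dtype=float))
    n = W.shape[0]
    if W.shape != (n, n) or ex.shape != (n,):
        raise ValueError("adjacency must be (n, n) and expression length n")
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")

    degree = W.sum(axis=0)
    degree[degree == 0] = 1.0  # isolated genes: avoid division by zero

    # Broadcasting divides column j by degree[j], giving W D^-1, so each
    # gene spreads its score evenly over its neighbours.
    system = np.eye(n) - theta * (W / degree)
    return np.linalg.solve(system, (1.0 - theta) * ex)


if __name__ == "__main__":
    # Toy three-gene path network: gene 0 - gene 1 - gene 2.
    W = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
    ex = np.array([1.0, 0.0, 0.0])
    for theta in (0.0, 0.5, 0.9):
        print(theta, generank(W, ex, theta))
```

With θ = 0 the ranking reduces to the expression evidence alone, and as θ approaches 1 the network structure dominates, which is one way to see why the choice of θ can noticeably change the results.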