2016
DOI: 10.1002/sim.6866

Integrative and regularized principal component analysis of multiple sources of data

Abstract: Integration of data of disparate types has become increasingly important to enhancing the power for new discoveries by combining complementary strengths of multiple types of data. One application is to uncover tumor subtypes in human cancer research, in which multiple types of genomic data are integrated, including gene expression, DNA copy number and DNA methylation data. In spite of their successes, existing approaches based on joint latent variable models require stringent distributional assumptions and may…

Cited by 13 publications (18 citation statements)
References 47 publications
“…For DUVR‐induced gene expression modulation and secreted protein analysis the means were compared using two‐tailed Student's t‐test, difference was considered significant at P < 0.05. For sunscreen ranking, principal component analysis (PCA) was conducted by normalizing the modulation ratio of all the genes to a notionally common scale and analysed by the principal of dimension reduction as previously described. Dimension reduction (factor analysis) was run with SPSS, Chicago, IL, U.S.A., components were extracted based on eigenvalues (λ1, λ2, … λn) > 1 and factor score of each component was got (F1, F2, … Fn).…”
Section: Methods
Citation type: mentioning
confidence: 99%
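
The extraction rule quoted above (keep components whose eigenvalue exceeds 1, then compute a score per component) can be sketched as follows. This is only an illustration in NumPy, not the SPSS procedure the citing authors ran: the function name `kaiser_pca` and the input layout (samples in rows, variables in columns) are assumptions, and SPSS factor scores may be scaled differently.

```python
import numpy as np

def kaiser_pca(data):
    """PCA on standardized variables, keeping components with eigenvalue > 1.

    data: array of shape (n_samples, n_variables). Hypothetical helper,
    written for illustration only.
    """
    # Put every variable on a common scale (zero mean, unit variance).
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Eigen-decompose the correlation matrix of the variables.
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns ascending eigenvalues; sort them in descending order.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Retain only components with eigenvalue greater than 1.
    keep = eigvals > 1.0
    # Component scores: projection of the standardized data.
    scores = z @ eigvecs[:, keep]
    return eigvals[keep], scores
```

Because the eigenvalues of a correlation matrix sum to the number of variables, at least one component passes the threshold unless the variables are exactly uncorrelated.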
“…To understand and demonstrate the crucial need for addressing the second problem of data type selection, we surveyed 58 integration methods for cancer subtyping proposed from 2009 to 2019, and the result is summarized in Fig 1 where gene expression is treated as the same as mRNA expression and miRNA expression is placed into the group of epigenome based on observations from [7]. We summarized part of these 58 integration methods with the omics data they used in Fig 1A, and we can see, the data combinations used in these methods [2, 8–24] are significantly inconsistent. For example, Fig 1B shows that while the mRNA expression data were used by 56 of the 58 methods, each of the other data types was only used by at most nearly half of these methods.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To capture joint variation, concatenated PCA assumes X_i = U_i V^T for each matrix X_i, that is, the scores are shared across matrices. The iCluster (Shen et al.) and irPCA (Liu et al.) approaches make this assumption for the integration of multisource biomedical data. Alternatively, more flexible approaches allow for structured variation that may be shared across matrices or specific to individual matrices.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
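
The shared-score assumption in the statement above (each source factors into its own loadings times one common score matrix) can be illustrated with a small sketch. It shows plain concatenated PCA via an SVD of the stacked matrices, not iCluster or irPCA themselves. The function name `concatenated_pca`, the layout (features in rows, the same samples in columns of every block), and the choice to fold the singular values into the loadings are all assumptions made for this example.

```python
import numpy as np

def concatenated_pca(blocks, rank):
    """Concatenated PCA with scores shared across data sources.

    blocks: list of arrays, each of shape (m_i, n) with the same n samples
    in the columns. Returns per-block loadings (m_i, rank) and the shared
    scores (n, rank). Hypothetical helper for illustration.
    """
    # Center each feature (row) so the SVD corresponds to PCA.
    centered = [x - x.mean(axis=1, keepdims=True) for x in blocks]
    # Stack all sources along the feature axis: shape (sum of m_i, n).
    stacked = np.vstack(centered)
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    # Shared scores: one row per sample, common to every block.
    shared_scores = vt[:rank].T
    # Loadings for all features, scaled by the singular values.
    all_loadings = u[:, :rank] * s[:rank]
    # Split the loadings back into one matrix per source.
    boundaries = np.cumsum([x.shape[0] for x in blocks])[:-1]
    loadings = np.split(all_loadings, boundaries, axis=0)
    return loadings, shared_scores
```

With this output, `loadings[i] @ shared_scores.T` is the rank-limited approximation of the centered i-th block, which is the sense in which every source reuses the same scores.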