2004
DOI: 10.1073/pnas.0406767101

Integrative analysis of genome-scale data by using pseudoinverse projection predicts novel correlation between DNA replication and RNA transcription

Abstract: We describe an integrative data-driven mathematical framework that formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. By using pseudoinverse projection, the molecular biological profiles of the data samples are least-squares-approximated as superpositions of the basis profiles. Reconstruction of the data in the basis simulates experimental observation of only th…
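The least-squares step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `pseudoinverse_projection`, the matrix shapes, and the synthetic random data are all assumptions made for illustration. Columns of `basis` stand in for basis profiles and columns of `data` for the profiles of the data samples.

```python
import numpy as np

def pseudoinverse_projection(basis, data):
    """Approximate each column of `data` as a superposition of the columns of `basis`.

    Hypothetical helper for illustration only.
    basis: (n_genes, n_basis) array, one basis profile per column.
    data:  (n_genes, n_samples) array, one data-sample profile per column.
    """
    # The Moore-Penrose pseudoinverse gives the least-squares coefficients.
    coeffs = np.linalg.pinv(basis) @ data
    # Reconstruction of the data restricted to the span of the basis profiles.
    reconstruction = basis @ coeffs
    return coeffs, reconstruction

# Synthetic example (assumed sizes): 50 genes, 3 basis profiles, 4 data samples.
rng = np.random.default_rng(0)
basis = rng.standard_normal((50, 3))
true_coeffs = rng.standard_normal((3, 4))
data = basis @ true_coeffs + 0.01 * rng.standard_normal((50, 4))

coeffs, reconstruction = pseudoinverse_projection(basis, data)
residual = data - reconstruction  # the part of the data outside the basis
```

The residual is orthogonal to every basis profile, which is the defining property of a least-squares fit. The same coefficients can be obtained with `np.linalg.lstsq(basis, data, rcond=None)`.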

Cited by 67 publications (68 citation statements)
References 17 publications
“…Integrating genome-scale proteins' DNA-binding data with cell cycle mRNA expression time course data from yeast, using pseudoinverse projection, we predicted a previously unknown correlation between DNA replication initiation and RNA transcription, which might be due to an undiscovered mechanism of regulation (21). Now we reveal a previously unknown physical principle by modeling DNA microarray data.…”
mentioning
confidence: 99%
“…The method was recently used in a landmark study to predict novel breast cancer subtypes with distinct clinical outcomes (9), and it was found that the joint clustering of copy number and gene expression profiles resolved the considerable heterogeneity of the expression-only subgroups. Other approaches on data integration that have emerged in recent years include generalized data decomposition methods (10,11) and nonparametric Bayesian models (12). However, two major challenges have not yet been fully addressed.…”
mentioning
confidence: 99%
“…One study found a chromosome-wide pattern of correlation between DNA binding of cohesin, a protein that holds together sister chromatids, and convergent transcription of genes, suggesting a mechanism for the relocation of cohesion during DNA replication (26). Another study discovered a genome-wide pattern of correlation between the activation of replication origins and minima or even shutdown of the transcription of adjacent genes during the cell-cycle stage G1, by using pseudoinverse projection to map DNA-binding of replication initiation proteins onto the SVD- and GSVD-reconstructed phase-spaces of yeast cell-cycle mRNA expression (6). This pattern might be explained by a previously unknown mechanism of regulation, which is in agreement with current understanding of replication initiation (27) and is supported by recent experimental results (28).…”
mentioning
confidence: 99%
“…The operations, such as data classification and reconstruction in subspaces of selected patterns, might simulate experimental observation of the correlations and possibly also causal coordination of these activities. Such models were recently created from DNA microarray data by using singular value decomposition (SVD) (4) and generalized SVD (GSVD) (5), and their ability to predict previously unknown biological as well as physical principles was demonstrated (6,7).…”
mentioning
confidence: 99%