2018
DOI: 10.1093/bib/bby027

Comparison and evaluation of integrative methods for the analysis of multilevel omics data: a study based on simulated and experimental cancer data

Abstract: Integrative analysis aims to identify the driving factors of a biological process by the joint exploration of data from multiple cellular levels. The volume of omics data produced is constantly increasing, and so too is the collection of tools for its analysis. Comparative studies assessing performance and the biological value of results, however, are rare but in great demand. We present a comprehensive comparison of three integrative analysis approaches, sparse canonical correlation analysis (sCCA), non-neg…

Citations: cited by 23 publications (26 citation statements)
References: 49 publications
“…To assess and compare the statistical properties of the global test with existing state-of-the-art methods, a simulation study was conducted similar to that of Pucher et al [12], which compared several integrative analysis methods for multi-omics data. Briefly, two data matrices were first simulated, from a normal distribution and a beta distribution respectively, using the same formulas as in Equations (1) and (2) of Pucher et al [12]. These two datasets represented two types of molecular profiles, for example, gene expression and DNA methylation levels, respectively. The dimensions of these data matrices were set at 1600 features (genes) × 200 samples and 2400 features (DNA methylation probes) × 200 samples, respectively.…”
Section: Simulation Study
Type: mentioning
confidence: 99%
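The setup quoted above can be sketched roughly as follows. This is a minimal illustration only: Equations (1) and (2) of Pucher et al are not reproduced in the excerpt, so the distribution parameters (standard normal, Beta(2, 2)) and the function name `simulate_pair` are assumptions chosen for the example, not the values used in either paper. Only the matrix dimensions come from the quoted text.

```python
import numpy as np

def simulate_pair(n_samples=200, n_genes=1600, n_probes=2400, seed=0):
    """Return one (expression, methylation) pair, each features x samples.

    Distribution parameters are placeholders (assumption), not the
    formulas from Pucher et al.
    """
    rng = np.random.default_rng(seed)
    # Expression-like block: normally distributed values.
    expression = rng.normal(loc=0.0, scale=1.0, size=(n_genes, n_samples))
    # Methylation-like block: beta-distributed values in [0, 1].
    methylation = rng.beta(a=2.0, b=2.0, size=(n_probes, n_samples))
    return expression, methylation

expression, methylation = simulate_pair()
print(expression.shape, methylation.shape)
```

Changing `seed` gives an independent pair, which is one simple way to generate repeated datasets for a simulation scenario.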
“…To simulate true positive pathways with differential expression in both the simulated gene expression and methylation datasets, 5 pathways were selected at random, and treatment effects were added to samples in the treated group for a selected subset of features (parameter p_path) within each pathway. There were a total of 16 simulation scenarios, corresponding to different percentages of true positive features within a selected pathway (p_path = {10%, 15%, 20%, 50%}) and different effect sizes added to these features (0.2, 0.3, 0.4 and 0.6, relative to the scaled standard deviation; see details in Pucher et al [12]). For each simulation scenario, a total of 100 pairs of datasets were simulated.…”
Section: Simulation Study
Type: mentioning
confidence: 99%
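The 16 scenarios are the 4 × 4 combinations of p_path and effect size. The sketch below enumerates that grid and shows one plausible way to add a treatment effect to a subset of pathway features. The helper `add_pathway_effect`, the pathway membership, the treated-group columns, and the reading of "scaled standard deviation" as each feature's sample standard deviation are all assumptions made for illustration; the cited papers define the actual procedure.

```python
import itertools
import numpy as np

P_PATH = (0.10, 0.15, 0.20, 0.50)   # share of true positive features per pathway
EFFECTS = (0.2, 0.3, 0.4, 0.6)      # effect sizes, in units of feature SD
scenarios = list(itertools.product(P_PATH, EFFECTS))  # 4 x 4 = 16 combinations

def add_pathway_effect(data, pathway_rows, treated_cols, p_path, effect, rng):
    """Shift a random subset of pathway features in the treated samples.

    data is features x samples. Returns a modified copy and the shifted rows.
    """
    out = data.copy()
    n_hit = max(1, int(round(p_path * len(pathway_rows))))
    hit_rows = rng.choice(pathway_rows, size=n_hit, replace=False)
    # Assumption: the effect is scaled by each feature's sample SD.
    sd = data[hit_rows].std(axis=1, ddof=1, keepdims=True)
    out[np.ix_(hit_rows, treated_cols)] += effect * sd
    return out, hit_rows

# Usage with hypothetical sizes: a 20-feature pathway, first 100 samples treated.
rng = np.random.default_rng(0)
demo = rng.normal(size=(1600, 200))
shifted, hit = add_pathway_effect(
    demo, np.arange(20), np.arange(100), p_path=0.5, effect=0.6, rng=rng
)
print(len(scenarios), len(hit))
```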
“…The lack of common methodologies and terminologies can turn this synergy into a further level of complexity in the process of data integration (51). As observed in (52,53), each omics layer has its own technological limits, noise levels and variability ranges, which confound the underlying biological signals. As a result, truly integrative analysis is still very rare, and different methods often discover different kinds of patterns, as evidenced by the lack of consistency in the published results, although efforts in this direction have started to appear (54,55).…”
Section: Background and Related Work
Type: mentioning
confidence: 99%
“…In the early-integration approach, also known as juxtaposition-based integration, the multi-omics datasets are first concatenated into one matrix. To deal with the high dimensionality of the joint dataset, these methods generally adopt matrix factorization (68,53,55,52), statistical (46,69,70,59,57,44,71,72,73,55), and machine learning tools (74,73,55). Although the dimensionality reduction procedure is necessary and may improve the predictive performance, it can also cause the loss of key information (66).…”
Section: Background and Related Work
Type: mentioning
confidence: 99%
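As a rough illustration of the early-integration idea described above, the sketch below standardizes each omics block, concatenates the blocks along the feature axis, and reduces the joint matrix with a truncated SVD. SVD is used here only as a simple stand-in for the matrix factorization methods the quoted text refers to; it is not the method of any cited paper, and the function name `early_integration` is hypothetical.

```python
import numpy as np

def early_integration(blocks, n_components=5):
    """Concatenate features x samples blocks and factorize the joint matrix."""
    scaled = []
    for block in blocks:
        # Standardize each feature so no single omics layer dominates by scale.
        centered = block - block.mean(axis=1, keepdims=True)
        sd = centered.std(axis=1, ddof=1, keepdims=True)
        scaled.append(centered / np.where(sd > 0, sd, 1.0))
    joint = np.vstack(scaled)  # (all features) x samples
    u, s, vt = np.linalg.svd(joint, full_matrices=False)
    sample_scores = (s[:n_components, None] * vt[:n_components]).T  # samples x k
    loadings = u[:, :n_components]                                  # features x k
    return joint, sample_scores, loadings

# Usage with small random blocks (hypothetical sizes).
rng = np.random.default_rng(0)
joint, scores, loadings = early_integration(
    [rng.normal(size=(30, 12)), rng.beta(2.0, 2.0, size=(40, 12))], n_components=3
)
print(joint.shape, scores.shape, loadings.shape)
```

Keeping only a few components is what reduces dimensionality, and it is also where the information loss mentioned in the quoted passage can occur: anything outside the retained components is discarded.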
“…Recently, Serra et al [53] proposed a framework for combining different data profiles of multi-view datasets by integrating several clustering results done on each profile through nonmatrix factorization. Pucher et al [60] provided a comprehensive review and comparative study of three integrative methods (viz., non-negative matrix factorization (NMF), sparse canonical correlation analysis (sCCA) and the logic data mining tool MicroArray Logic Analyzer (MALA)) on simulated data as well as real omics profiles. In addition, many deep learning techniques have also been developed to handle biological data.…”
Section: Machine Learning and Rule Mining Approaches For Gene Inactiv…
Type: mentioning
confidence: 99%