2020
DOI: 10.1093/bib/bbaa166

A powerful framework for an integrative study with heterogeneous omics data: from univariate statistics to multi-block analysis

Abstract: High-throughput data generated by new biotechnologies require specific and adapted statistical treatment in order to be efficiently used in biological studies. In this article, we propose a powerful framework to manage and analyse multi-omics heterogeneous data to carry out an integrative analysis. We have illustrated this using the mixOmics package for R software as it specifically addresses data integration issues. Our work also aims at applying the most recent functionalities of mixOmics to real datasets. A…

Cited by 13 publications (17 citation statements)
References 43 publications
“…We explored how sensitive our results are to different specifications of the design matrices and found that these resulted in similar predictive abilities, but selected different numbers of only partially overlapping PGSs and CpGs (Appendices C and D). In line with expectations, the predictive ability of the null design matrix was slightly better than for the empirical design matrix, reflecting that such a model focuses on selecting discriminatory variables (Singh et al 2019;Duruflé et al 2020). We expected that the model with the full design model sacrificed predictive accuracy to select discriminatory variables that are also highly correlated, but this model had the lowest classification error rate as compared to the models with an empirical or null design matrix in the test and clinical data.…”
Section: Discussion (supporting)
confidence: 62%
“…A 'null' design matrix denotes weak or no correlations among omics blocks by setting values close to or equal to zero. The full design matrix optimizes correlations among the omics blocks, while the null design matrix optimizes the discrimination between samples (Rohart et al 2017;Singh et al 2019;Duruflé et al 2020). We can also specify a design matrix with the empirical correlations among the omics blocks.…”
Section: Phase 3: Multi-omics Analyses (mentioning)
confidence: 99%
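
The three design-matrix choices described in the statement above (null, full, empirical) can be sketched as follows. This is a minimal illustration in plain Python, not the mixOmics API: the function names, the block names, and the pairwise correlation values are hypothetical placeholders, and the zero diagonal and the [0, 1] value range are assumed conventions for this kind of block-by-block matrix.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def constant_design(blocks: List[str], off_diagonal: float) -> Matrix:
    """Build a symmetric block-by-block matrix with a zero diagonal.

    off_diagonal = 0.0 gives a 'null' design (no link between blocks);
    off_diagonal = 1.0 gives a 'full' design (all blocks fully linked).
    """
    if not 0.0 <= off_diagonal <= 1.0:
        raise ValueError("off_diagonal must lie in [0, 1]")
    n = len(blocks)
    return [[0.0 if i == j else off_diagonal for j in range(n)] for i in range(n)]


def empirical_design(
    blocks: List[str], pairwise: Dict[Tuple[str, str], float]
) -> Matrix:
    """Fill the off-diagonal cells from user-supplied pairwise correlations.

    Pairs that are not supplied stay at 0.0; absolute values are used so
    that every entry stays within [0, 1].
    """
    index = {name: k for k, name in enumerate(blocks)}
    n = len(blocks)
    matrix = [[0.0] * n for _ in range(n)]
    for (a, b), value in pairwise.items():
        i, j = index[a], index[b]
        if i == j:
            raise ValueError("a block cannot be paired with itself")
        matrix[i][j] = matrix[j][i] = abs(value)
    return matrix


# Hypothetical block names and correlation values, for illustration only.
blocks = ["PGS", "CpG", "metabolites"]
null_design = constant_design(blocks, 0.0)
full_design = constant_design(blocks, 1.0)
empirical = empirical_design(
    blocks,
    {("PGS", "CpG"): 0.3, ("PGS", "metabolites"): 0.1, ("CpG", "metabolites"): 0.5},
)
```

Under these assumptions, a sensitivity analysis of the kind the citing authors describe would fit the same model once per matrix and compare the resulting error rates and selected variables.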
“…The most commonly used integration tools include “mixOmics,” “tRanslatome,” “R.JIVE,” and “iClusterPlus”. First, “mixOmics” was a powerful framework with four kinds of datasets (metabolomics, phenomics, cell wall proteomics, and transcriptomics) ( Durufle et al, 2021 ). Then, a deep neural network named “tRanslatome” was proposed which can predict the protein structure from input amino acid sequences but not for disordered proteins ( Du et al, 2021 ).…”
Section: Introduction (mentioning)
confidence: 99%
“…Associations and correlations across omics levels have been reported between the genome and epigenome (van Dongen et al, 2016; Min et al, 2021), the genome and the metabolome (Hagenbeek, Pool, et al, 2020) and the epigenome and the metabolome (Gomez-Alonso et al, 2021). To optimally analyze multi-omics data in association and etiological studies, dedicated statistical treatment of simultaneous omics influences is required (Durufle et al, 2020), with simultaneous modeling completing approaches in which separate analyses are applied to each omics level. Such innovative multi-omics analyses may result in novel insights and uncover new biological pathways underlying traits and diseases (Rajasundaram & Selbig, 2016).…”
Section: Introduction (mentioning)
confidence: 99%