2022
DOI: 10.1101/2022.10.19.507549
Preprint

Correcting batch effects in large-scale multiomic studies using a reference-material-based ratio method

Abstract: Batch effects are notorious technical variations that are common in multiomic data and may result in misleading outcomes. With the era of big data, tackling batch effects in multiomic integration is urgently needed. As part of the Quartet Project for quality control and data integration of multiomic profiling, we comprehensively assess the performances of seven batch-effect correction algorithms (BECAs) for mitigating the negative impact of batch effects in multiomic datasets, including transcriptomics, proteo…

Cited by 9 publications (8 citation statements)
References 46 publications (86 reference statements)
“…Four accompanying papers detailed the establishment of the DNA 66 , RNA 67 , protein 68 , and metabolite 69 reference materials, reference datasets, and QC methods for each type of omics profiling (genomics, transcriptomics, proteomics, and metabolomics, respectively) and their applications. Another paper 70 was dedicated to addressing the widespread problem of batch effects present in each and every type of omics data. We also developed the Quartet Data Portal (chinese-quartet.org) 71 for the community to conveniently access and share the Quartet multiomics resources according to the regulation of the Human Genetic Resources Administration of China (HGRAC).…”
Section: Results | Citation type: mentioning
confidence: 99%
“…A striking finding of our study is that the multiomics profiling data at the “absolute” level, such as FPKM in transcriptomics, FOT (fraction of total) in MS-based proteomics, and relative peak areas in metabolomics from a single sample, is inherently irreproducible across platforms, labs, or batches, leading to the notorious “batch effects”. Such batch effects, usually confounded with study factors of interests, hinder the discovery of reliable biomarkers either by mistaking batch differences as biological signals or by attenuating biological signals with the incorrect use of “batch-effect correction” methods (see details in an accompanying paper 70 ). The presence of batch effects makes the horizontal integration of diverse datasets from the same omics type impossible, as can be seen from the lack of capability of correctly clustering the Quartet samples ( Fig.…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…4b-4f) if one or more common reference materials are profiled across batches. Our companion work has found that "ratio" data by scaling the absolute feature values of study samples relative to those of concurrently measured reference sample(s) on a feature-by-feature basis could effectively mitigate the widespread problems of batch effects, in epigenomics, transcriptomics, proteomics, and metabolomics datasets 29,46 .…”
Section: Discussion | Citation type: mentioning
confidence: 99%
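
The last citation statement describes "ratio" data as absolute feature values of study samples scaled, feature by feature, against reference samples measured in the same batch. The following is a minimal sketch of that idea, not the authors' implementation. The function name `ratio_scale`, the use of the mean over reference replicates, and the toy numbers are all assumptions made for illustration; the source does not specify how replicates are summarized or whether values are log-transformed first.

```python
import numpy as np


def ratio_scale(values, batches, is_reference):
    """Scale each sample by the reference profile of its own batch.

    values       : (n_samples, n_features) array of positive abundances
    batches      : length n_samples array of batch labels
    is_reference : length n_samples boolean mask marking reference samples

    Hypothetical helper: averaging reference replicates is an assumption.
    """
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    is_reference = np.asarray(is_reference, dtype=bool)

    ratios = np.empty_like(values)
    for batch in np.unique(batches):
        in_batch = batches == batch
        reference_rows = values[in_batch & is_reference]
        if reference_rows.shape[0] == 0:
            raise ValueError(f"batch {batch!r} has no reference sample")
        # One reference value per feature, computed within this batch only.
        reference_profile = reference_rows.mean(axis=0)
        # Broadcasting divides every sample in the batch feature by feature.
        ratios[in_batch] = values[in_batch] / reference_profile
    return ratios


# Toy data (invented): batch B measures everything 3x higher than batch A.
values = [
    [10.0, 20.0],  # batch A, reference sample
    [20.0, 10.0],  # batch A, study sample
    [30.0, 60.0],  # batch B, reference sample
    [60.0, 30.0],  # batch B, study sample
]
batches = ["A", "A", "B", "B"]
is_reference = [True, False, True, False]

ratios = ratio_scale(values, batches, is_reference)
print(ratios)
```

In this toy case the batch effect is purely multiplicative, so the study sample has the same ratio profile in both batches after scaling. Real batch effects need not be this simple, and reference values of zero would need separate handling.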