2020
DOI: 10.1101/2020.01.13.904730
Preprint

ComBat-Seq: batch effect adjustment for RNA-Seq count data

Abstract: The benefit of integrating batches of genomic data to increase statistical power in differential expression is often hindered by batch effects, or unwanted variation in data caused by differences in technical factors across batches. It is therefore critical to effectively address batch effects in genomic data. Many existing methods for batch effect adjustment assume continuous, bell-shaped Gaussian distributions for data. However, in RNA-Seq studies where data are skewed, over-dispersed counts, this assumption i…

Cited by 54 publications (50 citation statements)
References 19 publications

“…The RNA sequencing (RNA-seq) data has 60,483 transcripts in total, and 13,339 transcripts were mapped to the ensemble identifiers of all the pathway information in the 1335 pathways. Since the RNA-seq data had different batches, a batch effect adjustment was performed with Combat-seq program [29]. After the adjustment, the RNA-seq data were normalized using the quantile normalization method.…”
Section: Results
Citation type: mentioning
confidence: 99%
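
The quantile normalization step mentioned in the statement above can be illustrated with a minimal sketch. The citing study does not say which implementation it used, so everything here is an assumption for illustration: the function name `quantile_normalize`, the genes-by-samples layout, the toy matrix, and the choice to break ties by position rather than averaging them.

```python
import numpy as np

def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Hypothetical helper for illustration; ties are broken by row order,
    not averaged as some implementations do.
    """
    matrix = np.asarray(matrix, dtype=float)
    # Rank of each value within its own column (0 = smallest).
    order = np.argsort(matrix, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = np.sort(matrix, axis=0).mean(axis=1)
    # Replace each value by the reference value at its rank.
    return reference[ranks]

# Toy batch-adjusted expression matrix: 4 genes (rows) x 3 samples (columns).
example = np.array([
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
])
normalized = quantile_normalize(example)
print(normalized)
```

After this transformation every sample column holds the same set of values, so the samples share one empirical distribution while each gene keeps its within-sample rank.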
“…RNAseq analysis was done on the untreated samples from two independent datasets where larvae from one sample set were reared in egg water (Gene Expression Omnibus (GEO) accession number GSE151291) and the other was reared in embryo medium (GEO accession number GSE156420). To adjust the batch effect, we adopted a method designed for RNA-seq count data implemented by Combat-seq [48]. The method keeps the negative binomial distribution of RNA-seq reads count and the integer nature of the data.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…We analyzed RNA-seq data from enrollment PaxGene tubes from a subset of 23 severely malnourished individuals with TB and 15 severely malnourished tuberculin skin test positive (TST ≥5 mm) household contacts as previously described [19]. The data were batch corrected using ComBat-Seq [47,48] (Supplementary Figure 1). Differential expression between TB and LTBI samples produced 6706 differentially expressed features using an adjusted p-value (FDR) cutoff of 0.01, including 4913 protein coding genes, 1052 lncRNAs, 135 T cell receptive elements, 19 immunoglobulin genes, and 13 miRNAs.…”
Section: RNA-Sequencing Data Generation and Processing
Citation type: mentioning
confidence: 99%