2014
DOI: 10.1038/nbt.3000

Detecting and correcting systematic variation in large-scale RNA sequencing data

Abstract: High-throughput RNA sequencing (RNA-seq) enables comprehensive scans of entire transcriptomes, but best practices for analyzing RNA-seq data have not been fully defined, particularly for data collected with multiple sequencing platforms or at multiple sites. Here we used standardized RNA samples with built-in controls to examine sources of error in large-scale RNA-seq studies and their impact on the detection of differentially expressed genes (DEGs). Analysis of variations in guanine-cytosine content, gene cov…

Cited by 172 publications (145 citation statements)
References: 59 publications
“…To fully exploit single-cell RNA-seq data, we have to account for the random noise inherent to such data sets [20] and, equally important, to account for different hidden factors that might result in gene expression heterogeneity. Although the importance of accounting for unobserved factors is well established in bulk RNA-seq studies [21][22][23], robust approaches to detect and account for confounding factors in single-cell RNA-seq studies remain to be developed. Here, we describe a computational approach that uses latent variable models to reconstruct such hidden factors from the observed data.…”
Section: Analysis
mentioning
confidence: 99%
“…In fact, the fraction of variation across laboratories correlates with the GC content of each gene (Fig. 3c), and recapitulates the known role of GC content with reproducibility of RNA-seq data [8, 41-43].…”
Section: Analysis of SEQC RNA-seq Dataset
mentioning
confidence: 99%
“…Both RNA-Seq and microarrays are affected by systematic variations (Park et al., 2003; Oshlack and Wakefield, 2009; Zheng et al., 2011; Li et al., 2014b). Therefore, genomewide expression results generated by either technique need to be normalized before analysis (Dillies et al., 2013; Li et al., 2015b).…”
mentioning
confidence: 99%