2010
DOI: 10.1534/genetics.110.114983

Statistical Design and Analysis of RNA Sequencing Data

Abstract: Next-generation sequencing technologies are quickly becoming the preferred approach for characterizing and quantifying entire genomes. Even though data produced from these technologies are proving to be the most informative of any thus far, very little attention has been paid to fundamental design aspects of data collection and analysis, namely sampling, randomization, replication, and blocking. We discuss these concepts in an RNA sequencing framework. Using simulations we demonstrate the benefits of collectin…

citations
Cited by 351 publications
(313 citation statements)
references
References 45 publications
“…Exhaustive enumeration of imprinted genes will require a large community-wide effort, including multiple replicates from multiple lines, with samples of different tissues and developmental time points. If the results are to be interpreted with confidence on the basis of RNA-seq data alone, a blocked and replicated design is essential (Auer and Doerge 2010). Our intention here was to apply RNA-seq in a simple, unreplicated design to serve as a means of nominating candidates for subsequent validation.…”
Section: Discussion (mentioning)
confidence: 99%
“…It is also important to examine biological replicates, ideally from individuals from different strains to test the possibility of strain-specific effects. A much larger study, with a well-replicated and blocked design of multiple RNA-seq runs (Auer and Doerge 2010) would be needed to generate a definitive count of the number of imprinted genes. From our data, 4.5% (251) of the 5527 genes, having sufficient data to perform the test, exhibit significant imprinting in the placenta.…”
Section: Discussion (mentioning)
confidence: 99%
“…Because of the still significant cost of sequencing, and the enormous amount of data generated, observational studies with no biological replication are common, but are clearly prone to misinterpretation. The reader is referred to [33] for a discussion of the best experimental designs for meaningful comparisons of RNAseq datasets; the principles are fundamentally similar to those described above for field and greenhouse plot design.…”
Section: Guidelines For Biochemical and Molecular Biological Experiments (mentioning)
confidence: 99%
“…Several R packages have been developed for statistical testing for DE using RNA-Seq data [38]. Those include edgeR [39], DESeq [40], DEGSeq [41], baySeq [42], BBSeq [43], TSMP [44], NBPSeq [45] and PoissonSeq [46]. Additionally, databases like SEQC (SEquencing Quality Control) have been established to assess the performance of the NGS technologies.…”
Section: Advantages and Challenges (mentioning)
confidence: 99%