2013
DOI: 10.1093/bioinformatics/btt350

Data-based filtering for replicated high-throughput transcriptome sequencing experiments

Abstract: Motivation: RNA sequencing is now widely performed to study differential expression among experimental conditions. As tests are performed on a large number of genes, stringent false-discovery rate control is required at the expense of detection power. Ad hoc filtering techniques are regularly used to moderate this correction by removing genes with low signal, with little attention paid to their impact on downstream analyses. Results: We propose a data-driven method based on the Jaccard similarity index to calcu…

Cited by 197 publications (197 citation statements)
References 28 publications
“…About 247 million read pairs were unambiguously mapped on the full M. truncatula genome sequence Mt20120830 (accessible at the SYMbiMICS Web site https://iant.toulouse.inra.fr/symbimics/; download section), with an average of 61.9 million read pairs per condition (Supplemental Table S1). Using a threshold statistically defined following Rau et al (2013), these data enabled the detection of 17,191 genes, which included MtEXP7, a gene specifically expressed in the epidermis. Complete results, including correspondence with Mt4.0 gene models and annotation data, are provided in Supplemental Table S2 or at the SYMbiMICS Web site, where epidermis and previous nodule RNAseq data can be easily queried using various requests (M. truncatula gene or Affymetrix oligonucleotide identifiers, keywords, and BLAST search).…”
Section: Lcm-rnaseq Analysis Of the Root Epidermal Response To Nf Tre…
Citation type: mentioning
confidence: 99%
“…We removed the genes with zero counts in all conditions, as well as genes whose maximum counts are <5 as recommended (Rau et al 2013). The description of parameters for these data sets is summarized in Table 1.…”
Section: Estimation Of Parameters In the Data Sets
Citation type: mentioning
confidence: 99%
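
The fixed rule quoted in the statement above (drop genes with zero counts in all conditions, and genes whose maximum count is below 5) can be sketched as follows. This is only an illustration of that quoted rule, not the data-driven Jaccard-based threshold mentioned in the abstract. The function name `filter_low_count_genes`, the dictionary input layout, and the toy counts are assumptions made for the example.

```python
from typing import Dict, List


def filter_low_count_genes(
    counts: Dict[str, List[int]], min_max_count: int = 5
) -> Dict[str, List[int]]:
    """Keep genes whose largest count across all samples is >= min_max_count.

    Hypothetical helper: `counts` maps a gene identifier to its raw read
    counts, one value per sample. A gene with zero counts everywhere has a
    maximum of 0, so the same test also removes all-zero genes.
    """
    return {
        gene: values
        for gene, values in counts.items()
        if values and max(values) >= min_max_count
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_counts = {
        "geneA": [0, 0, 0, 0],     # all zeros: removed
        "geneB": [1, 4, 2, 0],     # maximum is 4 (< 5): removed
        "geneC": [0, 5, 1, 0],     # maximum is 5: kept
        "geneD": [120, 98, 143, 110],
    }
    kept = filter_low_count_genes(toy_counts)
    print(sorted(kept))
```

Filtering on the maximum rather than the sum or mean keeps a gene that is expressed in only one condition, which is usually the behaviour wanted before a differential expression test.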
“…Aspects of transcriptomic experimental design and data analysis have been questioned in the past. Early transcriptomic studies were hampered by inappropriate statistical analyses, a problem that was resolved through the development of statistical methods specific to transcriptomic datasets (Storey and Tibshirani, 2003; Hackstadt and Hess, 2009; Rau et al, 2013). Analyzing long lists of differentially expressed genes also posed a challenge for early transcriptomic experiments (Gracey, 2007).…”
Section: Repeatedly Identifying the Same Genes
Citation type: mentioning
confidence: 99%