2015
DOI: 10.12688/f1000research.7563.1

Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences

Abstract: High-throughput sequencing of cDNA (RNA-seq) is used extensively to characterize the transcriptome of cells. Many transcriptomic studies aim at comparing either abundance levels or the transcriptome composition between given conditions, and as a first step, the sequencing reads must be used as the basis for abundance quantification of transcriptomic features of interest, such as genes or transcripts. Several different quantification approaches have been proposed, ranging from simple counting of reads that over…

Citations: cited by 2,615 publications (1,024 citation statements)
References: 34 publications
“…A set of quasitranscripts was built for the genome and the eGFP, zeocin resistance, and pUC-origin sequences from the plasmids added. Transcripts then were quantified with Salmon and tximport (64), and the edgeR (65) package (R 3.3.3 software) was used to import and normalize the counts across the data sets in R. This gave a transcripts per million (TPM) value for each gene which should, assuming an even mapping distribution across the genome, be equivalent to one copy for most genes and then can be compared against the levels of the plasmid genes to see if they are present at a higher copy number. More details about how we calculated copy number are given in Data Set S7 in the supplemental material.…”
Section: Methods (mentioning)
confidence: 99%
“…For Sailfish and Salmon, outputs were converted to a Sleuth-ready format using wasabi [55]. For Kallisto, Sailfish, Salmon, and BitSeq, transcript-level values were condensed to gene-level values using tximport prior to evaluating gene-level differential expression [56]. For all differential expression analyses performed at the transcript-level, significant transcripts were converted to the corresponding gene for performance evaluation, such that if a single transcript was called as differentially expressed, the corresponding gene was also called differentially expressed.…”
Section: Methods (mentioning)
confidence: 99%
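
The evaluation rule quoted above (a gene counts as differentially expressed as soon as any one of its transcripts does) can be illustrated with a minimal Python sketch. The function name, the identifiers, and the transcript-to-gene mapping below are hypothetical and are not taken from the cited study.

```python
def genes_called_de(significant_transcripts, tx2gene):
    """Return the genes with at least one significant transcript.

    significant_transcripts: iterable of transcript IDs called DE.
    tx2gene: dict mapping transcript ID -> gene ID (hypothetical annotation).
    """
    # A set collapses several significant transcripts of one gene
    # into a single gene-level call.
    return {tx2gene[tx] for tx in significant_transcripts if tx in tx2gene}


# Hypothetical example: tx1 and tx2 belong to geneA, tx3 to geneB.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
de_genes = genes_called_de(["tx1", "tx2"], tx2gene)
```

Transcripts missing from the mapping are skipped here; whether the cited study handled them that way is not stated in the quote.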
“…Remaining reads were pseudoaligned to the human transcriptome (as per GRCh38.84 GTF file downloaded from Ensembl on 08/03/2016) and transcript abundances quantified using Kallisto (v0.44.0) 41 . The tximport R package (v1.8.0) 42 was then used to summarise transcript abundances to the gene level.…”
Section: Methods (mentioning)
confidence: 99%
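
Summarising transcript abundances to the gene level, as mentioned in the statement above, amounts in its simplest form to summing the abundances of each gene's transcripts. The sketch below shows only that summation in plain Python, under the assumption that abundances are TPM values and that a transcript-to-gene mapping is available. It is not the tximport implementation, which is an R package and also handles quantities such as estimated counts and transcript lengths. All names and values are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(transcript_tpm, tx2gene):
    """Sum transcript-level TPM values into gene-level totals.

    transcript_tpm: dict mapping transcript ID -> TPM.
    tx2gene: dict mapping transcript ID -> gene ID (hypothetical annotation).
    """
    gene_tpm = defaultdict(float)
    for tx, tpm in transcript_tpm.items():
        gene = tx2gene.get(tx)
        if gene is None:
            continue  # skip transcripts without a gene annotation
        gene_tpm[gene] += tpm
    return dict(gene_tpm)


# Hypothetical example values.
transcript_tpm = {"tx1": 10.0, "tx2": 5.0, "tx3": 2.5}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
gene_tpm = summarize_to_gene(transcript_tpm, tx2gene)
```

Because TPM values are already normalised for transcript length, summing them within a gene gives a gene-level TPM directly; raw counts would need separate treatment.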