2018
DOI: 10.1101/352823
Preprint

Covering all your bases: incorporating intron signal from RNA-seq data

Abstract: RNA-seq datasets can contain millions of intron reads per sequenced library that are typically removed from downstream analysis. Only reads overlapping annotated exons are considered to be informative since mature mRNA is assumed to be the major component sequenced, especially when examining poly(A) RNA samples. By examining multiple datasets, we demonstrate that intron reads are informative and that signal is shared between exon and intron counts. The majority of expressed genes contain reads in both exon and…


Cited by 11 publications (16 citation statements)
References 53 publications
“…Current IR detection methods could be improved through integrating prior knowledge, selecting suitable thresholds for parameters, and so on. For prior knowledge, features, such as intron length, the distribution of the splicing regulatory elements, canonical or non-canonical status of splice sites, and splicing strength could be used as prior knowledge to improve IR detection (Mao et al, 2014; Cui et al, 2017; Kim et al, 2018; Zhang et al, 2018). For parameter thresholds, designing methods to incorporate sequence features and read coverage variations of introns to adaptively determine individual intron-specific optimal thresholds of parameters could be helpful for IR detection (Broseus and Ritchie, 2020).…”
Section: Results
mentioning
confidence: 99%
“…About 8,000 genes are detected only by exonic reads, ~8,000 by exonic and intronic reads, and 4,000 by intronic reads only (Figure 2B, Table S1). Intronic reads correlate well with exonic reads of the same gene in scRNA-seq [45] and bulk RNA-seq data sets [46] and intronic reads…”
mentioning
confidence: 88%
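
The exon/intron agreement described in the statement above can be checked on a gene-level count table. The following is a minimal sketch only, not the method of the preprint or of the citing paper: it assumes a hypothetical dictionary mapping each gene to an `(exon_count, intron_count)` pair, keeps genes with reads in both categories, and computes a Pearson correlation on log2(count + 1) values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def exon_intron_correlation(counts):
    """Correlate log2 exon and intron counts across genes.

    `counts` is a hypothetical input: gene -> (exon_count, intron_count).
    Genes lacking reads in either category are skipped (an assumption of
    this sketch, not a rule taken from the source).
    """
    kept = [(e, i) for e, i in counts.values() if e > 0 and i > 0]
    exon_log = [math.log2(e + 1) for e, _ in kept]
    intron_log = [math.log2(i + 1) for _, i in kept]
    return pearson(exon_log, intron_log)


# Toy, made-up counts for illustration only.
toy_counts = {
    "geneA": (10, 10),
    "geneB": (100, 100),
    "geneC": (1000, 1000),
    "geneD": (50, 0),  # exon-only gene, excluded from the correlation
}
r = exon_intron_correlation(toy_counts)
```

The log transform is a common choice for count data with a wide dynamic range; a rank-based correlation would be a reasonable alternative.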
“…Some mitigate challenge (1) by ignoring from consideration any intronic regions that overlap other features (KMA, IntEREst, iREAD), leaving biological blindspots in RI detection [37–39]. Some attempt to mitigate challenge (2) by recommending that a user provides poly(A)-selected data as their input [13, 37, 39, 40], assuming that poly(A) selected data represents fully processed, mature RNA. However, poly(A) selection during library preparation has been shown not to remove all immature post-transcriptionally spliced RNA molecules, and intronic sequences are commonly found in poly(A)-selected RNA-sequencing data [42, 43].…”
Section: Introduction
mentioning
confidence: 99%
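
The first mitigation mentioned in the statement above, dropping intronic regions that overlap other annotated features, can be illustrated as follows. This is a simplified sketch under stated assumptions and does not reproduce how KMA, IntEREst, or iREAD implement it: intervals are hypothetical `(chrom, start, end)` tuples with half-open coordinates, and strand is ignored.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def non_overlapping_introns(introns, features):
    """Keep only introns that overlap none of the given features.

    Quadratic scan for clarity; a sorted sweep or interval tree would be
    preferable for genome-scale annotations.
    """
    return [
        intron
        for intron in introns
        if not any(overlaps(intron, feature) for feature in features)
    ]


# Made-up coordinates for illustration only.
introns = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
features = [("chr1", 150, 180)]
kept = non_overlapping_introns(introns, features)
```

Here the first intron is discarded because a feature lies inside it, which is the kind of blind spot the citing authors describe.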