2020
DOI: 10.1101/2020.07.10.195636
Preprint

Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network

Abstract: Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of Transcription Start Sites (TSSs) in several species. Strikingly, ∼72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probed these unassigned TSSs and showed that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs).…

Cited by 4 publications (5 citation statements)
References 67 publications
“…pSTR gene-regulatory effects: pSTRs can affect variable phenotypes and disease susceptibility through their gene-regulatory effects [3,12]. Previous genome-wide searches have identified thousands of pSTRs associated with human gene expression [13][14][15][16][17][18], but our understanding of this landscape remains incomplete. Moreover, the direct exploration of pSTRs associated with gene posttranscriptional regulation has not been performed, although several single-gene studies have reported that pSTRs can modulate RNA structure and function [3].…”
Section: pSTR Functional Properties
mentioning
confidence: 99%
“…After the seminal discovery that expansions of CGG repeats in the FMR1 gene were linked to fragile X syndrome (FXS) in 1991 [7][8][9][10], researchers have identified approximately 60 STR loci implicated in a range of Mendelian diseases to date, including ataxias, amyotrophic lateral sclerosis, Huntington disease, frontotemporal dementia, and various neurological disorders [11,12]. More importantly, although our view remains incomplete, emerging evidence has shown that a significant number of polymorphic STRs (pSTRs) can modulate various molecular and cellular processes, such as DNA methylation [13], gene expression [13][14][15][16][17][18], and alternative splicing [19][20][21][22], suggesting that pSTRs may contribute to complex phenotypes [3,23].…”
mentioning
confidence: 99%
“…A salient feature of the CorrB non-TE RepSeqs is their predicted propensity to adopt a non-B DNA structure: (i) CorrB simple repeat sequences consist in alternating purine and pyrimidine, with only C (or G) on one strand when not made only of A and T, which has a predicted capacity to adopt a Z-DNA conformation (40,41); (ii) such A-rich and AT-rich low complexity sequences can form triplexes with RNAs via Hoogsteen pairing, in particular with the lncRNA KCNQ1OT1 (42,43) (Table 1). In addition, simple repeats are commonly transcribed and high GC skewing of CorrB non-TE RepSeq suggests that they are prone to forming R-loops upon transcription (44)(45)(46). Simple repeats are known to specifically bind a host of TFs (47).…”
Section: Identifying Cis-determinants Of A/b Partitioning: Proa and P...
mentioning
confidence: 99%
“…The CapTrap-seq protocol (Figure 1A) builds upon the previously established Cap-trapping approach [25][26][27][28], but with specific optimizations for long-read RNA sequencing. The protocol begins with the enrichment of polyadenylated transcripts using the anchored oligo(dT) method for cDNA synthesis (Anchored dT and PolyA+ in Figure 1A).…”
Section: CapTrap-seq and LyRic for Full-length Transcript Identification
mentioning
confidence: 99%
“…Here, we introduce CapTrap-seq, a method that combines the Cap-trapping strategy [25][26][27][28] with oligo(dT) priming to detect 5'capped full-length transcripts. We also present LyRic, a bioinformatics workflow for transcript identification using long-read RNA-seq data.…”
Section: Introduction
mentioning
confidence: 99%