Dmitry Shtokalo scite author profile

The accurate and thorough genome-wide detection of adenosine-to-inosine editing, a biologically indispensable process, has proven challenging. Here, we present a discovery pipeline in adult Drosophila, with 3,581 high-confidence editing sites identified with an estimated accuracy of 87%. The target genes and specific sites highlight global biological properties and functions of RNA editing, including hitherto-unknown editing in well-characterized classes of noncoding RNAs and 645 sites that cause amino acid substitutions, usually at conserved positions. The spectrum of functions that these gene targets encompass suggests that editing participates in a diverse set of cellular processes. Editing sites in Drosophila exhibit sequence-motif preferences and tend to be concentrated within a small subset of total RNAs. Finally, editing regulates expression levels of target mRNAs and strongly correlates with alternative splicing.

VlincRNAs controlled by retroviral elements are a hallmark of pluripotency and cancer

Laurent et al. 2013

Background: The function of the non-coding portion of the human genome remains one of the most important questions of our time. Its vast complexity is exemplified by the recent identification of an unusual and notable component of the transcriptome - very long intergenic non-coding RNAs, termed vlincRNAs. Results: Here we identify 2,147 vlincRNAs covering 10 percent of our genome. We show they are present not only in cancerous cells, but also in primary cells and normal human tissues, and are controlled by canonical promoters. Furthermore, vlincRNA promoters frequently originate from within endogenous retroviral sequences. Strikingly, the number of vlincRNAs expressed from endogenous retroviral promoters strongly correlates with pluripotency or the degree of malignant transformation. These results suggest a previously unknown connection between the pluripotent state and cancer via retroviral repeat-driven expression of vlincRNAs. Finally, we show that vlincRNAs can be syntenically conserved in humans and mouse and their depletion using RNAi can cause apoptosis in cancerous cells. Conclusions: These intriguing observations suggest that vlincRNAs could create a framework that combines many existing short ESTs and lincRNAs into a landscape of very long transcripts functioning in the regulation of gene expression in the nucleus. Certain types of vlincRNAs participate at specific stages of normal development and, based on analysis of a limited set of cancerous and primary cell lines, they appear to be co-opted by cancer-associated transcriptional programs. This provides additional understanding of transcriptome regulation during the malignant state, and could lead to additional targets and options for its reversal.

Intronic RNAs constitute the major fraction of the non-coding RNA in mammalian cells

et al. 2012

Background: The function of RNA from the non-coding (the so called “dark matter”) regions of the genome has been a subject of considerable recent debate. Perhaps the most controversy is regarding the function of RNAs found in introns of annotated transcripts, where most of the reads that map outside of exons are usually found. However, it has been reported that the levels of RNA in introns are minor relative to those of the corresponding exons, and that changes in the levels of intronic RNAs correlate tightly with that of adjacent exons. This would suggest that RNAs produced from the vast expanse of intronic space are just pieces of pre-mRNAs or excised introns en route to degradation. Results: We present data that challenges the notion that intronic RNAs are mere by-standers in the cell. By performing a highly quantitative RNAseq analysis of transcriptome changes during an inflammation time course, we show that intronic RNAs have a number of features that would be expected from functional, standalone RNA species. We show that there are thousands of introns in the mouse genome that generate RNAs whose overall abundance, which changes throughout the inflammation timecourse, and other properties suggest that they function in yet unknown ways. Conclusions: So far, the focus of non-coding RNA discovery has shied away from intronic regions as those were believed to simply encode parts of pre-mRNAs. Results presented here suggest a very different situation – the sequences encoded in the introns appear to harbor a yet unexplored reservoir of novel, functional RNAs. As such, they should not be ignored in surveys of functional transcripts or other genomic studies.

Composite Module Analyst: identification of transcription factor binding site combinations using genetic algorithm

Waleev, Shtokalo, Konovalova et al. 2006

Nucleic Acids Research

Composite Module Analyst (CMA) is a novel software tool aiming to identify promoter-enhancer models based on the composition of transcription factor (TF) binding sites and their pairs. CMA is closely interconnected with the TRANSFAC® database. In particular, CMA uses the positional weight matrix (PWM) library collected in TRANSFAC® and therefore provides the possibility to search for a large variety of different TF binding sites. We model the structure of the long gene regulatory regions by a Boolean function that joins several local modules, each consisting of co-localized TF binding sites. Having as an input a set of co-regulated genes, CMA builds the promoter model and optimizes the parameters of the model automatically by applying a genetic-regression algorithm. We use a multicomponent fitness function of the algorithm which includes several statistical criteria in a weighted linear function. We show examples of successful application of CMA to microarray data on transcription profiling of TNF-alpha stimulated primary human endothelial cells. The CMA web server is freely accessible at . An advanced version of CMA is also a part of the commercial system ExPlain™ () designed for causal analysis of gene expression data.

Functional annotation of the vlinc class of non-coding RNAs using systems biology approach

et al. 2016

Functionality of the non-coding transcripts encoded by the human genome is the coveted goal of modern genomics research. While the classical methods of forward genetics are commonly relied on, integration of different genomics datasets in a global Systems Biology fashion presents a more productive avenue of achieving this very complex aim. Here we report application of a Systems Biology-based approach to dissect functionality of a newly identified vast class of very long intergenic non-coding (vlinc) RNAs. Using the highly quantitative FANTOM5 CAGE dataset, we show that these RNAs could be grouped into 1542 novel human genes based on analysis of insulators that we show here indeed function as genomic barrier elements. We show that vlincRNA genes likely function in cis to activate nearby genes. This effect, while most pronounced in closely spaced vlincRNA–gene pairs, can be detected over relatively large genomic distances. Furthermore, we identified 101 vlincRNA genes likely involved in early embryogenesis based on patterns of their expression and regulation. We also found another 109 such genes potentially involved in cellular functions also happening at early stages of development such as proliferation, migration and apoptosis. Overall, we show that Systems Biology-based methods have great promise for functional annotation of non-coding RNAs.

On the importance of small changes in RNA expression

Laurent, Shtokalo, Tackett et al. 2013

Methods

Deep Sequencing Transcriptome Analysis of Murine Wound Healing: Effects of a Multicomponent, Multitarget Natural Product Therapy-Tr14

Laurent, Seilheimer, Tackett et al. 2017

Front. Mol. Biosci.

Wound healing involves an orchestrated response that engages multiple processes, such as hemostasis, cellular migration, extracellular matrix synthesis, and in particular, inflammation. Using a murine model of cutaneous wound repair, the transcriptome was mapped from 12 h to 8 days post-injury, and in response to a multicomponent, multi-target natural product, Tr14. Using single-molecule RNA sequencing (RNA-seq), there were clear temporal changes in known transcripts related to wound healing pathways, and additional novel transcripts of both coding and non-coding genes. Tr14 treatment modulated >100 transcripts related to key wound repair pathways, such as response to wounding, wound contraction, and cytokine response. The results provide the most precise and comprehensive characterization to date of the transcriptome's response to skin damage, repair, and multicomponent natural product therapy. By understanding the wound repair process, and the effects of natural products, it should be possible to intervene more effectively in diseases involving aberrant repair.

Style transfer with variational autoencoders is a promising approach to RNA-Seq data harmonization and analysis

Russkikh, Антонец, Shtokalo et al. 2020

Motivation: The transcriptomic data is being frequently used in the research of biomarker genes of different diseases and biological states. The most common tasks there are data harmonization and treatment outcome prediction. Both of them can be addressed via the style transfer approach. Either technical factors or any biological details about the samples which we would like to control (gender, biological state, treatment etc.) can be used as style components. Results: The proposed style transfer solution is based on Conditional Variational Autoencoders, Y-Autoencoders and adversarial feature decomposition. In order to quantitatively measure the quality of the style transfer, neural network classifiers which predict the style and semantics after training on real expression were used. Comparison with several existing style-transfer based approaches shows that proposed model has the highest style prediction accuracy on all considered datasets while having comparable or the best semantics prediction accuracy. Availability: https://github.com/NRshka/stvae-source Supplementary information: FigShare.com (https://dx.doi.org/10.6084/m9.figshare.9925115)

Genome-wide analysis of A-to-I RNA editing by single-molecule sequencing in Drosophila
