Small non-coding RNAs (ncRNAs) are short non-coding sequences involved in gene regulation in many biological processes and diseases. Their biological functions are still incompletely understood, especially at genome-wide scale, which has called for new computational approaches to annotate their roles. Secondary structure is widely regarded as a key determinant of RNA function, and machine learning approaches have been shown to predict RNA function from secondary structure information. Here we show that RNA function can be predicted with good accuracy from a lightweight representation of sequence information, without computing secondary structure features, which is computationally expensive. This finding appears to go against the dogma that secondary structure is a key determinant of function in RNA. Compared to recent secondary structure based methods, the proposed solution is more robust to sequence boundary noise and drastically reduces the computational cost, allowing the annotation of large data volumes. Scripts and datasets to reproduce the results of the experiments in this study are available at: https://github.com/bioinformatics-sannio/ncrna-deep.
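The abstract above does not say which lightweight sequence representation is used. As a purely illustrative sketch, and not the authors' method, the Python snippet below shows one common structure-free encoding: a fixed-length one-hot matrix built directly from the nucleotide string. The function name `one_hot_encode`, the `ACGU` alphabet order, and the zero-padding scheme are all assumptions made for this example.

```python
from typing import List

# Assumed RNA alphabet and column order (hypothetical choice for this sketch).
ALPHABET = "ACGU"


def one_hot_encode(seq: str, length: int) -> List[List[int]]:
    """Encode a sequence as a `length` x 4 one-hot matrix.

    Sequences longer than `length` are truncated; shorter ones are
    zero-padded. DNA-style T is mapped to U. Ambiguous bases such as N
    become all-zero rows.
    """
    seq = seq.upper().replace("T", "U")
    rows: List[List[int]] = []
    for base in seq[:length]:
        row = [0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    # Zero-pad up to the requested length.
    rows.extend([0] * len(ALPHABET) for _ in range(length - len(rows)))
    return rows


if __name__ == "__main__":
    for row in one_hot_encode("ACGUN", 7):
        print(row)
```

A matrix like this can be fed to a classifier without any folding step, which is the kind of cost saving the abstract describes; the actual representation and model are documented in the linked repository.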
Embryonic stem cells (ESCs) are derived from the inner cell mass (ICM) of the blastocyst. Under serum/LIF culture conditions, they show variable expression of pluripotency genes, which marks cell fluctuation between pluripotency and differentiation metastates. The ESC subpopulation marked by a zygotic genome activation (ZGA) gene signature, including Zscan4, retains a wider differentiation potency than epiblast-derived ESCs. We have recently shown that retinoic acid (RA) significantly enhances the Zscan4 cell population. However, it remains unexplored how RA initiates ESC to 2-cell-like (2C-like) reprogramming. Here we found that RA is decisive for the ESC to 2C-like cell transition, and we reconstructed the gene network surrounding Zscan4. We revealed that RA regulates the 2C-like population by co-activating Dux and Duxbl1. We provide novel evidence that the RA-dependent ESC to 2C-like cell transition is regulated by Dux and antagonized by Duxbl1. Our suggested mechanism could shed light on the role of RA in ESC reprogramming.
Background: Long non-coding RNAs (lncRNAs) represent a novel class of non-coding RNAs with a crucial role in many biological processes. Identifying long non-coding homologs among different species is essential to investigate such roles in model organisms, as homologous genes tend to retain similar molecular and biological functions. Alignment-based metrics effectively capture the conservation of transcribed coding sequences and hence the homology of protein coding genes. However, unlike protein coding genes, long non-coding genes show poor sequence conservation, which makes the identification of their homologs a challenging task. Results: In this study we compare alignment-based and alignment-free string similarity metrics and look at promoter regions as a possible source of conserved information. We show that promoter regions encode relevant information for the conservation of long non-coding genes across species and that such information is better captured by alignment-free metrics. We perform a genome-wide test of this hypothesis in human, mouse, and zebrafish. Conclusions: These results lead us to postulate the new hypothesis that, unlike protein coding genes, long non-coding genes tend to preserve their regulatory machinery rather than their transcribed sequence. All datasets, scripts, and the prediction tools adopted in this study are available at https://github.com/bioinformatics-sannio/lncrna-homologs. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2441-6) contains supplementary material, which is available to authorized users.
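To make the notion of an alignment-free string similarity metric concrete, the sketch below compares two sequences by the cosine similarity of their k-mer count profiles. This is one widely used alignment-free measure and is shown only as an illustration; the abstract does not state which metrics the study evaluates, and the function names and the default `k = 3` are assumptions of this example.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count all overlapping k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_cosine_similarity(a: str, b: str, k: int = 3) -> float:
    """Cosine similarity between the k-mer count vectors of two sequences.

    Returns a value in [0, 1]; 0.0 is returned when either sequence is
    too short to contain a single k-mer.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    dot = sum(count * cb[kmer] for kmer, count in ca.items())
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy sequences, not real promoter data.
    print(kmer_cosine_similarity("ACGTAC", "ACGTTT"))
```

Because the score depends only on k-mer composition, it is insensitive to rearrangements and indels that would penalize an alignment score, which is one reason such metrics are candidates for poorly conserved sequences.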