After Alzheimer’s disease, Parkinson’s disease (PD) is the second most common neurodegenerative disorder. Alpha-synuclein (SNCA) is a major component of Lewy bodies, a neuropathological feature of PD. Five point mutations in SNCA responsible for autosomal dominant PD have been reported so far. This study aims to gain evolutionary and structural insights into SNCA by revealing its sequence and structural evolutionary patterns among sarcopterygians and its paralogous counterparts (SNCB and SNCG). Rate analysis detected strong purifying selection on the entire synuclein family. Structural dynamics analysis reveals that, during the course of sarcopterygian evolutionary history, the region spanning residues 32 to 58 of the N-terminal domain of SNCA acquired its critical functional significance through the epistatic influence of lineage-specific substitutions. In sum, these findings provide evidence that the region from residues 32 to 58 of the N-terminal lipid-binding alpha-helix domain of SNCA is the most critical region, not only from an evolutionary perspective but also for the stability and proper conformation of the protein and for disease pathogenesis, as it harbors critical interaction sites.
The zinc-finger transcription factor GLI3 is a key regulator of development, acting as a primary transducer of Sonic hedgehog (SHH) signaling in a combinatorial, context-dependent fashion and controlling multiple patterning steps in different tissues and organs. Tight temporal and spatial control of gene expression is indispensable; however, cis-acting sequence elements regulating GLI3 expression have not yet been reported. We show that 11 ancient genomic DNA signatures, conserved from the pufferfish Takifugu (Fugu) rubripes to man, are distributed throughout the introns of human GLI3. They map within larger conserved non-coding elements (CNEs) that are found in the tetrapod lineage. Full-length CNEs transiently transfected into human cell cultures acted as cell-type-specific enhancers of gene transcription. The regulatory potential of these elements is conserved and was exploited to direct tissue-specific expression of a reporter gene in zebrafish embryos. Assays of deletion constructs revealed that the human-Fugu conserved sequences within the GLI3 intronic CNEs were essential but not sufficient for full-scale transcriptional activation. The enhancer activity of the CNEs is determined by the combinatorial effect of a core sequence conserved between human and teleosts (Fugu) and flanking tetrapod-specific sequences. This suggests that successive clustering of sequences with regulatory potential around an ancient, highly conserved nucleus might be a mechanism for the evolution of cis-acting regulatory elements.
The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a suite of database resources to support worldwide research activities in both academia and industry. With the explosive growth of multi-omics data, CNCB-NGDC is continually expanding, updating and enriching its core database resources through big data deposition, integration and translation. In the past year, considerable efforts have been devoted to 2019nCoVR, a newly established resource providing a global landscape of SARS-CoV-2 genomic sequences, variants, and haplotypes, as well as Aging Atlas, BrainBase, GTDB (Glycosyltransferases Database), LncExpDB, and TransCirc (Translation potential for circular RNAs). Meanwhile, a series of resources have been updated and improved, including BioProject, BioSample, GWH (Genome Warehouse), GVM (Genome Variation Map), GEN (Gene Expression Nebulas) as well as several biodiversity and plant resources. Particularly, BIG Search, a scalable, one-stop, cross-database search engine, has been significantly updated by providing easy access to a large number of internal and external biological resources from CNCB-NGDC, our partners, EBI and NCBI. All of these resources along with their services are publicly accessible at https://bigd.big.ac.cn.
The National Genomics Data Center (NGDC) provides a suite of database resources to support worldwide research activities in both academia and industry. With rapid advances in higher-throughput, lower-cost sequencing technologies and the resulting exponential growth of multi-omics data, NGDC is continually expanding, updating and enriching its core database resources through big data integration and value-added curation. In the past year, update efforts have been devoted mainly to BioProject, BioSample, GSA, GWH, GVM, NONCODE, LncBook, EWAS Atlas and IC4R. Newly released resources include three human genome databases (PGG.SNV, PGG.Han and CGVD), eLMSG, EWAS Data Hub, GWAS Atlas, iSheep and PADS Arsenal. In addition, four web services, namely eGPS Cloud, BIG Search, BIG Submission and BIG SSO, have been significantly improved and enhanced. All of these resources along with their services are publicly accessible at https://bigd.big.ac.cn.
Background: The zinc-finger transcription factor GLI3 is an important mediator of Sonic hedgehog (SHH) signaling and is crucial for patterning many aspects of the vertebrate body plan. In vertebrates, the mechanism of SHH signal transduction and its action on target genes by means of activating or repressing forms of GLI3 have been studied most extensively during limb development and the specification of the central nervous system. From these studies it has emerged that Gli3 expression must be subject to tight spatiotemporal regulation. However, the genetic mechanisms and the cis-acting elements controlling the expression of Gli3 have remained largely unknown. Results: Here, we demonstrate in chicken and mouse transgenic embryos that human GLI3-intronic conserved non-coding sequence elements (CNEs) autonomously control individual aspects of Gli3 expression. Their combined action shows many aspects of a Gli3-specific pattern of transcriptional activity. In the mouse limb bud, different CNEs enhance Gli3-specific expression in the evolutionarily ancient stylopod and zeugopod versus the modern skeletal structures of the autopod. Limb bud specificity is also found in chicken but had not been detected in zebrafish embryos. Three of these elements govern central nervous system-specific gene expression during mouse embryogenesis, each targeting a subset of endogenous Gli3 transcription sites. Even though fish, birds, and mammals share an ancient repertoire of gene regulatory elements within Gli3, the functions of individual enhancers from this catalog have diverged significantly. During evolution, ancient broad-range regulatory elements within Gli3 attained higher specificity, critical for patterning of more specialized structures, by abolishing the potential for redundant expression control. Conclusion: These results not only demonstrate the high level of complexity in the genetic mechanisms controlling Gli3 expression, but also reveal the evolutionary significance of cis-acting regulatory networks of early developmental regulators in vertebrates.
The zinc-finger transcription factor GLI3 acts during vertebrate development in a combinatorial, context-dependent fashion as a primary transducer of sonic hedgehog (SHH) signaling. In humans, mutations affecting this key regulator of development are associated with GLI3-morphopathies, a group of congenital malformations in which forebrain and limb development are preferentially affected. We show that a noncoding element from intron two of GLI3, ultraconserved in mammals and highly conserved in the pufferfish Fugu, is a transcriptional enhancer. In transient transfection assays, it activates reporter gene transcription in human cell cultures expressing endogenous GLI3 but not in GLI3-negative cells. The identified enhancer element is predicted to contain conserved binding sites for transcription factors crucial for developmental steps in which GLI3 is involved. The regulatory potential of this element is conserved and was used to direct tissue-specific expression of a green fluorescent protein reporter gene in zebrafish embryos and of a beta-galactosidase reporter in transgenic mouse embryos. Time, location, and quantity of reporter gene expression are congruent with part of the pattern previously reported for endogenous GLI3 transcription.
The outbreak of coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is spreading fast worldwide. There is a pressing need to understand how the virus counteracts host innate immune responses. Deleterious clinical manifestations of coronaviruses have been associated with direct virus-induced dysregulation of innate immune responses, occurring via viral macrodomains located within nonstructural protein 3 (Nsp3). However, no substantial information is available concerning the relationship of macrodomains to the unusually high pathogenicity of SARS-CoV-2. Here, we show that structural evolution of macrodomains may play a critical role in the unique pathogenicity of SARS-CoV-2. Using sequence, structural, and phylogenetic analysis, we identify a specific set of historical substitutions that recapitulate the evolution of the macrodomains that counteract the host immune response. These evolutionary substitutions may alter and reposition the secondary structural elements to create new intra-protein contacts and, thereby, may enhance the ability of SARS-CoV-2 to inhibit host immunity. Further, we find that the unusual virulence of this virus is potentially the consequence of Darwinian selection-driven epistasis in protein evolution. Our findings warrant further characterization of macrodomain-specific evolutionary substitutions in in vitro and in vivo models to determine their inhibitory effects on the host immune system.
Motivation: The significance of long non-coding RNAs (lncRNAs) in many biological processes and diseases has attracted intense interest over the past several years. However, computational identification of lncRNAs in a wide range of species remains challenging; it requires prior knowledge of well-established sequences and annotations or species-specific training data, yet only a limited number of species have high-quality sequences and annotations. Results: Here we first characterize lncRNAs in contrast to protein-coding RNAs based on feature relationship and find that the feature relationship between open reading frame (ORF) length and guanine-cytosine (GC) content diverges substantially and universally between lncRNAs and protein-coding RNAs, as observed in a broad variety of species. Based on this feature relationship, we further present LGC, a novel algorithm for identifying lncRNAs that is able to accurately distinguish lncRNAs from protein-coding RNAs in a cross-species manner without any prior knowledge. As validated on large-scale empirical datasets, comparative results show that LGC outperforms existing algorithms by achieving higher accuracy and well-balanced sensitivity and specificity, and is robustly effective (>90% accuracy) in discriminating lncRNAs from protein-coding RNAs across diverse species that range from plants to mammals. To our knowledge, this study is the first to differentially characterize lncRNAs and protein-coding RNAs based on feature relationship, which is further applied in computational identification of lncRNAs. Taken together, our study represents a significant advance in characterization and identification of lncRNAs, and LGC thus bears broad potential utility for computational analysis of lncRNAs in a wide range of species. Availability and implementation: The LGC web server is publicly available at http://bigd.big.ac.cn/lgc/calculator. The scripts and data can be downloaded at http://bigd.big.ac.cn/biocode/tools/BT000004. Supplementary information: Supplementary data are available at Bioinformatics online.
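The two raw features named in the LGC abstract, ORF length and GC content, can be computed for a single transcript roughly as in the sketch below. This is not the LGC algorithm: the abstract does not describe how LGC models the relationship between the two features, so nothing here classifies a sequence. The function names `gc_content` and `longest_orf_length` are hypothetical, and the ORF definition used (forward strand only, in-frame ATG to the first in-frame stop codon, stop included in the length) is an assumption made for illustration.

```python
# Illustrative sketch only: computes the two features named in the abstract.
# It does NOT reproduce LGC's model; names and ORF definition are assumptions.

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf_length(seq: str) -> int:
    """Length in nucleotides of the longest ATG..stop ORF on the forward strand.

    All three forward reading frames are scanned. The stop codon is counted
    in the length. ORFs without an in-frame stop codon are ignored.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    best = 0
    for frame in range(3):
        start = None  # position of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                best = max(best, i + 3 - start)
                start = None
    return best


if __name__ == "__main__":
    transcript = "CCATGAAATGA"  # toy sequence, not real data
    print(longest_orf_length(transcript), gc_content(transcript))
```

Any downstream use of these two values, such as comparing their relationship between coding and non-coding transcripts, would need the model described in the paper itself or the scripts distributed by the authors.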