DNA methylation of transcription units (gene bodies) occurs in the genomes of many animal and plant species. Phylogenetic persistence of gene body methylation implies biological significance; yet, the functional roles of gene body methylation remain elusive. In this study, we analyzed methylation levels of orthologs from four distantly related invertebrate species: the honeybee, silkworm, sea squirt, and sea anemone. We demonstrate that in all four species, gene bodies cluster distinctly into two groups, which correspond to high and low methylation levels. This pattern resembles that of sequence composition arising from the mutagenic effect of DNA methylation. In spite of this effect, our results show that protein sequences of genes targeted by high levels of methylation are conserved relative to genes lacking methylation. Our investigation identified many genes that either gained or lost methylation during the course of invertebrate evolution. Most of these genes appear to have lost methylation in the insect lineages we investigated, particularly in the honeybee. We found that genes that are methylated in all four invertebrate taxa are enriched for housekeeping functions related to transcription and translation, whereas the loss of DNA methylation occurred in genes whose functions include cellular signaling and reproductive processes. Overall, our study helps to illuminate the functional significance of gene body methylation and its impacts on genome evolution in diverse invertebrate taxa.
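The sequence-composition signature mentioned in this abstract can be illustrated with a small sketch. Methylated cytosines in CpG dinucleotides are prone to deamination to thymine, so regions methylated over evolutionary time tend to be depleted of CpGs. A commonly used summary is the normalized CpG content (observed/expected ratio). The function name `cpg_obs_exp` and the exact normalization below are assumptions for illustration; the abstract does not state which formula the authors used.

```python
import math


def cpg_obs_exp(seq: str) -> float:
    """Return the CpG observed/expected ratio of a DNA sequence.

    Assumed formula (one common variant, not taken from the abstract):
        CpG O/E = (N_CpG * L) / (N_C * N_G)
    where L is the sequence length. Returns NaN when the sequence
    contains no C or no G, since the ratio is undefined there.
    """
    s = seq.upper()
    length = len(s)
    n_c = s.count("C")
    n_g = s.count("G")
    # "CG" cannot overlap with itself, so str.count is exact here.
    n_cpg = s.count("CG")
    if n_c == 0 or n_g == 0:
        return math.nan
    return n_cpg * length / (n_c * n_g)


if __name__ == "__main__":
    # Toy sequences, not real gene bodies.
    for name, seq in [("cpg_rich", "CGCGCGCG"), ("cpg_depleted", "CATGCATG")]:
        print(name, cpg_obs_exp(seq))
```

Under this measure, low values are consistent with historical methylation and values near or above 1 with little methylation, which is one way a bimodal distribution across gene bodies could show up.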
DNA methylation at the promoter of a gene is presumed to render it silent, yet a sizable fraction of genes with methylated proximal promoters exhibit elevated expression. Here, we show, through extensive analysis of the methylome and transcriptome in 34 tissues, that in many such cases, transcription is initiated by a distal upstream CpG island (CGI) located several kilobases away that functions as an alternative promoter. Specifically, such genes are expressed precisely when the neighboring CGI is unmethylated but remain silenced otherwise. Based on CAGE and Pol II localization data, we found strong evidence of transcription initiation at the upstream CGI and a lack thereof at the methylated proximal promoter itself. Consistent with their alternative promoter activity, CGI-initiated transcripts are associated with signals of stable elongation and splicing that extend into the gene body, as evidenced by tissue-specific RNA-seq and other DNA-encoded splice signals. Furthermore, based on both inter- and intra-species analyses, such CGIs were found to be under greater purifying selection relative to CGIs upstream of silenced genes. Overall, our study describes a hitherto unreported conserved mechanism of transcription of genes with methylated proximal promoters in a tissue-specific fashion. Importantly, this phenomenon explains the aberrant expression patterns of some cancer driver genes, potentially due to aberrant hypomethylation of distal CGIs, despite methylation at proximal promoters.
After the initial enthusiasm of the human genome project, it became clear that without additional data pertaining to the epigenome (i.e., how the genome is marked at specific developmental periods, in different tissues, and across individuals and species), the promise of the genome sequencing project in understanding biology cannot be fulfilled. This realization prompted several large-scale efforts to map the epigenome, most notably the Encyclopedia of DNA Elements (ENCODE) project. While there is essentially a single genome in an individual, there are hundreds of epigenomes, corresponding to various types of epigenomic marks at different developmental times and in multiple tissue types. Unprecedented advances in next-generation sequencing (NGS) technologies, whose cost and speed continue to improve at a rate beyond what Moore's law anticipates for computer hardware, have revolutionized molecular biology and genetics research. These advances have in turn prompted innovative ways to reduce the problem of measuring cellular events involving DNA or RNA to a sequencing problem. In this article, we provide a brief overview of the epigenome, the various types of epigenomic data afforded by NGS, and some of the novel discoveries yielded by the epigenomics projects. We also provide ample references for the reader to get in-depth information on these topics.
The "developmental hourglass" describes a pattern of increasing morphological divergence towards earlier and later embryonic development, separated by a period of significant conservation across distant species (the "phylotypic stage"). Recent studies have found evidence in support of the hourglass effect at the genomic level. For instance, the phylotypic stage expresses the oldest and most conserved transcriptomes. However, the regulatory mechanism that causes the hourglass pattern remains an open question. Here, we use an evolutionary model of regulatory gene interactions during development to identify the conditions under which the hourglass effect can emerge in a general setting. The model focuses on the hierarchical gene regulatory network that controls the developmental process, and on the evolution of a population under random perturbations in the structure of that network. The model predicts, under fairly general assumptions, the emergence of an hourglass pattern in the structure of a temporal representation of the underlying gene regulatory network. The evolutionary age of the corresponding genes also follows an hourglass pattern, with the oldest genes concentrated at the hourglass waist. The key behind the hourglass effect is that developmental regulators should have an increasingly specific function as development progresses. Analysis of developmental gene expression profiles from Drosophila melanogaster and Arabidopsis thaliana yields results consistent with our theoretical predictions.
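The claim that the phylotypic stage expresses the oldest transcriptomes is often quantified with an expression-weighted mean gene age per developmental stage, usually called a transcriptome age index. The sketch below illustrates that idea only; the abstract does not say this measure was used, and the function name, the age ranks, and the toy expression values are all hypothetical.

```python
def transcriptome_age_index(ages, expression):
    """Expression-weighted mean gene age for one developmental stage.

    ages:        per-gene age ranks (assumed convention: 1 = oldest).
    expression:  per-gene expression levels at this stage, same order.
    """
    if len(ages) != len(expression):
        raise ValueError("ages and expression must have the same length")
    total = sum(expression)
    if total == 0:
        raise ValueError("total expression is zero; index is undefined")
    return sum(a * e for a, e in zip(ages, expression)) / total


if __name__ == "__main__":
    # Hypothetical data: three genes, three stages (early, mid, late).
    ages = [1, 2, 3]
    stages = {
        "early": [1.0, 1.0, 2.0],
        "mid": [4.0, 1.0, 1.0],
        "late": [1.0, 2.0, 3.0],
    }
    for stage, expr in stages.items():
        print(stage, transcriptome_age_index(ages, expr))
```

With ranks assigned so that lower means older, an hourglass pattern would appear as a minimum of this index at the mid-developmental stage.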
We report the first whole-genome sequences for five strains, two carried and three pathogenic, of the emerging pathogen Haemophilus haemolyticus. Preliminary analyses indicate that these genome sequences encode markers that distinguish H. haemolyticus from its closest Haemophilus relatives and provide clues to the identity of its virulence factors.