The prevalent DNA modification in higher organisms is the methylation of cytosine to 5-methylcytosine (5mC), which is partially converted to 5-hydroxymethylcytosine (5hmC) by the Tet (ten eleven translocation) family of dioxygenases. Despite their importance in epigenetic regulation, it is unclear how these cytosine modifications are reversed. Here, we demonstrate that 5mC and 5hmC in DNA are oxidized to 5-carboxylcytosine (5caC) by Tet dioxygenases in vitro and in cultured cells. 5caC is specifically recognized and excised by thymine-DNA glycosylase (TDG). Depletion of TDG in mouse embryonic stem cells leads to accumulation of 5caC to a readily detectable level. These data suggest that oxidation of 5mC by Tet proteins followed by TDG-mediated base excision of 5caC constitutes a pathway for active DNA demethylation.
Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.
This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
The recognition of specific DNA sequences by proteins is thought to depend on two types of mechanisms: one that involves the formation of hydrogen bonds with specific bases, primarily in the major groove, and one involving sequence-dependent deformations of the DNA helix. By comprehensively analyzing the three dimensional structures of protein-DNA complexes, we show that the binding of arginines to narrow minor grooves is a widely used mode for protein-DNA recognition. This readout mechanism exploits the phenomenon that narrow minor grooves strongly enhance the negative electrostatic potential of the DNA. The nucleosome core particle offers a striking example of this effect. Minor groove narrowing is often associated with the presence of A-tracts, AT-rich sequences that exclude the flexible TpA step. These findings suggest that the ability to detect local variations in DNA shape and electrostatic potential is a general mechanism that enables proteins to use information in the minor groove, which otherwise offers few opportunities for the formation of base-specific hydrogen bonds, to achieve DNA binding specificity.
Background: The coronavirus disease 2019 (COVID-19) outbreak originating in Wuhan, Hubei province, China, coincided with chunyun, the period of mass migration for the annual Spring Festival. To contain its spread, China adopted unprecedented nationwide interventions on January 23 2020. These policies included large-scale quarantine, strict controls on travel and extensive monitoring of suspected cases. However, it is unknown whether these policies have had an impact on the epidemic. We sought to show how these control measures impacted the containment of the epidemic. Methods: We integrated population migration data before and after January 23 and most updated COVID-19 epidemiological data into the Susceptible-Exposed-Infectious-Removed (SEIR) model to derive the epidemic curve. We also used an artificial intelligence (AI) approach, trained on the 2003 SARS data, to predict the epidemic. Results: We found that the epidemic of China should peak by late February, showing gradual decline by end of April. A five-day delay in implementation would have increased epidemic size in mainland China three-fold. Lifting the Hubei quarantine would lead to a second epidemic peak in Hubei province in mid-March and extend the epidemic to late April, a result corroborated by the machine learning prediction. Conclusions: Our dynamic SEIR model was effective in predicting the COVID-19 epidemic peaks and sizes. The implementation of control measures on January 23 2020 was indispensable in reducing the eventual COVID-19 epidemic size.
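The SEIR model named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the migration data and the AI component entirely, integrates the standard SEIR equations with a simple forward Euler step, and uses a hypothetical function name (`simulate_seir`) and purely illustrative parameter values (`beta`, `sigma`, `gamma`, population size) that do not come from the study.

```python
def simulate_seir(population, initial_infectious, beta, sigma, gamma, days, dt=0.1):
    """Integrate the standard SEIR equations with forward Euler.

    beta  : transmission rate per day (assumed value, not from the study)
    sigma : rate of leaving the exposed state (1 / incubation period)
    gamma : removal rate (1 / infectious period)
    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt   # S -> E
            new_infectious = sigma * e * dt                # E -> I
            new_removed = gamma * i * dt                   # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


def peak_infectious(history):
    """Return (day, infectious count) at the maximum of the I compartment."""
    day = max(range(len(history)), key=lambda d: history[d][2])
    return day, history[day][2]


if __name__ == "__main__":
    # Illustrative comparison: a lower transmission rate stands in for control measures.
    uncontrolled = simulate_seir(1_000_000, 10, beta=0.6, sigma=1 / 5, gamma=1 / 7, days=300)
    controlled = simulate_seir(1_000_000, 10, beta=0.3, sigma=1 / 5, gamma=1 / 7, days=300)
    print("peak without control (day, I):", peak_infectious(uncontrolled))
    print("peak with control (day, I):", peak_infectious(controlled))
```

In this sketch, lowering `beta` plays the role of an intervention: it reduces and delays the peak of the infectious compartment, which is the kind of qualitative effect the abstract attributes to the January 23 control measures.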
We have analyzed the maize leaf transcriptome using Illumina sequencing. We mapped more than 120 million reads to define gene structure and alternative splicing events and to quantify transcript abundance along a leaf developmental gradient and in mature bundle sheath and mesophyll cells. We detected differential mRNA processing events for most maize genes. We found that 64% and 21% of genes were differentially expressed along the developmental gradient and between bundle sheath and mesophyll cells, respectively. We implemented Gbrowse, an electronic fluorescent pictograph browser, and created a two-cell biochemical pathway viewer to visualize datasets. Cluster analysis of the data revealed a dynamic transcriptome, with transcripts for primary cell wall and basic cellular metabolism at the leaf base transitioning to transcripts for secondary cell wall biosynthesis and C4 photosynthetic development toward the tip. This dataset will serve as the foundation for a systems biology approach to the understanding of photosynthetic development.
The loss of the SOST gene product sclerostin leads to sclerosteosis characterized by high bone mass. In this report, we found that sclerostin could antagonize canonical Wnt signaling in human embryonic kidney A293T cells and mouse osteoblastic MC3T3 cells. This sclerostin-mediated antagonism could be reversed by overexpression of Wnt co-receptor low density lipoprotein receptor-related protein (LRP) 5. In addition, we found that sclerostin bound to LRP5 as well as LRP6 and identified the first two YWTD-EGF repeat domains of LRP5 as being responsible for the binding. Although these two repeat domains are required for transduction of canonical Wnt signals, canonical Wnt did not appear to compete with sclerostin for binding to LRP5. Examination of the expression of sclerostin and Wnt7b, an autocrine canonical Wnt, during primary calvarial osteoblast differentiation revealed that sclerostin is expressed at late stages of osteoblast differentiation coinciding with the expression of osteogenic marker osteocalcin and trailing after the expression of Wnt7b. Given the plethora of evidence indicating that canonical Wnt signaling stimulates osteogenesis, we believe that the high bone mass phenotype associated with the loss of sclerostin may be attributed, at least in part, to an increase in canonical Wnt signaling resulting from the reduction in sclerostin-mediated Wnt antagonism.