Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The Functional Annotation of the Mammalian Genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
Mammalian promoters can be separated into two classes: conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.
This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
Transplantation of dopaminergic neurons can potentially improve the clinical outcome of Parkinson's disease, a neurological disorder resulting from degeneration of mesencephalic dopaminergic neurons. In particular, transplantation of embryonic-stem-cell-derived dopaminergic neurons has been shown to be efficient in restoring motor symptoms in conditions of dopamine deficiency. However, the use of pluripotent-derived cells might lead to the development of tumours if not properly controlled. Here we identified a minimal set of three transcription factors, Mash1 (also known as Ascl1), Nurr1 (also known as Nr4a2) and Lmx1a, that are able to directly generate functional dopaminergic neurons from mouse and human fibroblasts without reverting to a progenitor cell stage. Induced dopaminergic (iDA) cells release dopamine and show spontaneous electrical activity organized in regular spikes consistent with the pacemaker activity featured by brain dopaminergic neurons. The three factors were able to elicit dopaminergic neuronal conversion in prenatal and adult fibroblasts from healthy donors and Parkinson's disease patients. Direct generation of iDA cells from somatic cells might have significant implications for understanding critical processes for neuronal development, in vitro disease modelling and cell replacement therapies.
Most of the mammalian genome is transcribed. This generates a vast repertoire of transcripts that includes protein-coding messenger RNAs, long non-coding RNAs (lncRNAs) and repetitive sequences, such as SINEs (short interspersed nuclear elements). A large percentage of ncRNAs are nuclear-enriched and of unknown function. Antisense lncRNAs may form sense-antisense pairs by pairing with a protein-coding gene on the opposite strand to regulate epigenetic silencing, transcription and mRNA stability. Here we identify a nuclear-enriched lncRNA antisense to mouse ubiquitin carboxy-terminal hydrolase L1 (Uchl1), a gene involved in brain function and neurodegenerative diseases. Antisense Uchl1 increases UCHL1 protein synthesis at a post-transcriptional level, thereby identifying a new functional class of lncRNAs. Antisense Uchl1 activity depends on the presence of a 5' overlapping sequence and an embedded inverted SINEB2 element. These features are shared by other natural antisense transcripts and can confer regulatory activity to an artificial antisense to green fluorescent protein. Antisense Uchl1 function is under the control of stress signalling pathways, as mTORC1 inhibition by rapamycin causes an increase in UCHL1 protein that is associated with the shuttling of antisense Uchl1 RNA from the nucleus to the cytoplasm. Antisense Uchl1 RNA is then required for the association of the overlapping sense protein-coding mRNA with active polysomes for translation. These data reveal another layer of gene expression control at the post-transcriptional level.