Super-enhancers are large clusters of transcriptional enhancers that drive expression of genes that define cell identity. Improved understanding of the roles super-enhancers play in biology would be afforded by knowing the constellation of factors that constitute these domains and by identifying super-enhancers across the spectrum of human cell types. We describe here the population of transcription factors, cofactors, chromatin regulators and transcription apparatus occupying super-enhancers in embryonic stem cells and evidence that super-enhancers are highly transcribed. We produce a catalogue of super-enhancers in a broad range of human cell types, and find that super-enhancers associate with genes that control and define the biology of these cells. Interestingly, disease-associated variation is especially enriched in the super-enhancers of disease-relevant cell types. Furthermore, we find that cancer cells generate super-enhancers at oncogenes and other genes important in tumor pathogenesis. Thus, super-enhancers play key roles in human cell identity in health and disease.
In the Drosophila germline, repeat-associated small interfering RNAs (rasiRNAs) ensure genomic stability by silencing endogenous selfish genetic elements such as retrotransposons and repetitive sequences. Whereas small interfering RNAs (siRNAs) derive from both the sense and antisense strands of their double-stranded RNA precursors, rasiRNAs arise mainly from the antisense strand. rasiRNA production appears not to require Dicer-1, which makes microRNAs (miRNAs), or Dicer-2, which makes siRNAs, and rasiRNAs lack the 2',3' hydroxy termini characteristic of animal siRNA and miRNA. Unlike siRNAs and miRNAs, rasiRNAs function through the Piwi, rather than the Ago, Argonaute protein subfamily. Our data suggest that rasiRNAs protect the fly germline through a silencing mechanism distinct from both the miRNA and RNA interference pathways.
Oncogenes are activated through well-known chromosomal alterations, including gene fusion, translocation and focal amplification. Recent evidence that the control of key genes depends on chromosome structures called insulated neighborhoods led us to investigate whether proto-oncogenes occur within these structures and if oncogene activation can occur via disruption of insulated neighborhood boundaries in cancer cells. We mapped insulated neighborhoods in T-cell acute lymphoblastic leukemia (T-ALL), and found that tumor cell genomes contain recurrent microdeletions that eliminate the boundary sites of insulated neighborhoods containing prominent T-ALL proto-oncogenes. Perturbation of such boundaries in non-malignant cells was sufficient to activate proto-oncogenes. Mutations affecting chromosome neighborhood boundaries were found in many types of cancer. Thus, oncogene activation can occur via genetic alterations that disrupt insulated neighborhoods in malignant cells.
There is considerable evidence that chromosome structure plays important roles in gene control, but we have limited understanding of the proteins that contribute to structural interactions between gene promoters and their enhancer elements. Large DNA loops that encompass genes and their regulatory elements depend on CTCF-CTCF interactions, but most enhancer-promoter interactions do not employ this structural protein. Here, we show that the ubiquitously expressed transcription factor Yin Yang 1 (YY1) contributes to enhancer-promoter structural interactions in a manner analogous to DNA interactions mediated by CTCF. YY1 binds to active enhancers and promoter-proximal elements and forms dimers that facilitate the interaction of these DNA elements. Deletion of YY1 binding sites or depletion of YY1 protein disrupts enhancer-promoter looping and gene expression. We propose that YY1-mediated enhancer-promoter interactions are a general feature of mammalian gene control.
Gene expression analysis is a widely used and powerful method for investigating the transcriptional behavior of biological systems, for classifying cell states in disease and for many other purposes. Recent studies indicate that common assumptions currently embedded in experimental and analytical practices can lead to misinterpretation of global gene expression data. We discuss these assumptions and describe solutions that should minimize erroneous interpretation of gene expression data from multiple analysis platforms.
Many long noncoding RNA (lncRNA) species have been identified in mammalian cells, but the genomic origin and regulation of these molecules in individual cell types is poorly understood. We have generated catalogs of lncRNA species expressed in human and murine embryonic stem cells and mapped their genomic origin. A surprisingly large fraction of these transcripts (>60%) originate from divergent transcription at promoters of active protein-coding genes. The divergently transcribed lncRNA/mRNA gene pairs exhibit coordinated changes in transcription when embryonic stem cells are differentiated into endoderm. Our results reveal that transcription of most lncRNA genes is coordinated with transcription of protein-coding genes.
development | expression
The non-protein-coding portion of the mammalian genome is transcribed into a vast array of RNA species (1), some of which play important roles in cellular regulation, development, and disease (2). The long noncoding RNAs (lncRNAs) are of particular interest because they are known to contribute to gene silencing (3), X inactivation (4), imprinting (5, 6), and development (7-9), but there is limited understanding of the genomic origin, regulation, and function of lncRNA molecules in individual cell types.
Embryonic stem cells (ESCs) are widely used as a model system to study transcriptional control of cell state during early development (10-13), yet there is no catalog of lncRNAs in human (h) ESCs, and it is not clear how lncRNAs are regulated in these cells. Catalogs of lncRNAs have been recently described in various murine (14, 15) and human cell types (16-19), but the majority were limited to spliced lncRNA species (14-16, 18) and those distant from protein-coding genes (14-17). Because lncRNAs tend to be cell-type-specific (16, 18), these catalogs likely contain only a very small fraction of lncRNAs expressed in hESCs.
We describe here catalogs of human and murine ESC lncRNAs and the genomic regions from which these RNA species arise. We find that the majority of these lncRNAs originate from divergent transcription of lncRNA/mRNA gene pairs and that many such gene pairs are coordinately regulated when ESCs differentiate.
Results
lncRNAs Expressed in Human ESCs. We compiled a catalog of lncRNA species expressed in hESCs as summarized in Fig. 1A. An initial pool of RNA candidates was generated by sequencing polyadenylated RNA species from hESCs and supplementing these with EST data from the full-length long Japan (FLJ) collection of sequenced human cDNAs, which contains transcripts expressed in >60 human tissues, including embryonal tissue (20). An initial pool of 170,162 ncRNA candidates (Dataset S1) was obtained after removing protein-coding transcripts based on the National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq). This pool was further filtered by using multiple criteria to identify lncRNAs. The RNA species were required to have a 5′ end that originates from a genomic site where there is corroborating evidence of active transcript...
Transcription factors (TFs) bind specific sequences in promoter-proximal and distal DNA elements in order to regulate gene transcription. RNA is transcribed from both of these DNA elements, and some DNA-binding TFs bind RNA. Hence, RNA transcribed from regulatory elements may contribute to stable TF occupancy at these sites. We show that the ubiquitously expressed TF YY1 binds to both gene regulatory elements and also to their associated RNA species genome-wide. Reduced transcription of regulatory elements diminishes YY1 occupancy whereas artificial tethering of RNA enhances YY1 occupancy at these elements. We propose that RNA makes a modest but important contribution to the maintenance of certain TFs at gene regulatory elements and suggest that transcription of regulatory elements produces a positive feedback loop that contributes to the stability of gene expression programs.
A vast number of small-molecule ligands, including therapeutic drugs under development and in clinical use, elicit their effects by binding specific proteins associated with the genome. An ability to map the direct interactions of a chemical entity with chromatin genome-wide could provide new and important insights into chemical perturbation of cellular function. Here we describe a method that couples ligand-affinity capture and massively parallel DNA sequencing (Chem-seq) to identify the sites bound by small chemical molecules throughout the human genome. We show how Chem-seq can be combined with ChIP-seq to gain unique insights into the interaction of drugs with their target proteins throughout the genome of tumor cells. These methods provide a powerful approach to enhance understanding of therapeutic action and characterize the specificity of chemical entities that interact with DNA or genome-associated proteins.