Each unit of the D4Z4 macrosatellite repeat contains a retrotransposed gene encoding the DUX4 double-homeobox transcription factor. Facioscapulohumeral dystrophy (FSHD) is caused by deletion of a subset of the D4Z4 units in the subtelomeric region of chromosome 4. Although it has been reported that the deletion of D4Z4 units induces the pathological expression of DUX4 mRNA, the association of DUX4 mRNA expression with FSHD has not been rigorously investigated, nor has any human tissue been identified that normally expresses DUX4 mRNA or protein. We show that FSHD muscle expresses a different splice form of DUX4 mRNA compared to control muscle. Control muscle produces low amounts of a splice form of DUX4 encoding only the amino-terminal portion of DUX4. FSHD muscle produces low amounts of a DUX4 mRNA that encodes the full-length DUX4 protein. The low abundance of full-length DUX4 mRNA in FSHD muscle cells represents a small subset of nuclei producing a relatively high abundance of DUX4 mRNA and protein. In contrast to control skeletal muscle and most other somatic tissues, full-length DUX4 transcript and protein are expressed at relatively abundant levels in human testis, most likely in the germ-line cells. Induced pluripotent stem (iPS) cells also express full-length DUX4, and differentiation of control iPS cells to embryoid bodies suppresses expression of full-length DUX4, whereas expression of full-length DUX4 persists in differentiated FSHD iPS cells. Together, these findings indicate that full-length DUX4 is normally expressed at specific developmental stages and is suppressed in most somatic tissues. The contraction of the D4Z4 repeat in FSHD results in a less efficient suppression of the full-length DUX4 mRNA in skeletal muscle cells. Therefore, FSHD represents the first human disease to be associated with the incomplete developmental silencing of a retrogene array normally expressed early in development.
We applied a combinatorial indexing assay, sci-ATAC-seq, to profile genome-wide chromatin accessibility in ∼100,000 single cells from 13 adult mouse tissues. We identify 85 distinct patterns of chromatin accessibility, most of which can be assigned to cell types, and ∼400,000 differentially accessible elements. We use these data to link regulatory elements to their target genes, to define the transcription factor grammar specifying each cell type, and to discover in vivo correlates of heterogeneity in accessibility within cell types. We develop a technique for mapping single cell gene expression data to single-cell chromatin accessibility data, facilitating the comparison of atlases. By intersecting mouse chromatin accessibility with human genome-wide association summary statistics, we identify cell-type-specific enrichments of the heritability signal for hundreds of complex traits. These data define the in vivo landscape of the regulatory genome for common mammalian cell types at single-cell resolution.
Facioscapulohumeral dystrophy (FSHD) is characterized by chromatin relaxation of the D4Z4 macrosatellite array on chromosome 4 and expression of the D4Z4-encoded DUX4 gene in skeletal muscle. The more common form, autosomal dominant FSHD1, is caused by a contraction of the D4Z4 array, whereas the genetic determinants and inheritance of D4Z4 array contraction-independent FSHD2 are unclear. Here we show that mutations in SMCHD1 (structural maintenance of chromosomes flexible hinge domain containing 1) on chromosome 18 reduce SMCHD1 protein levels and segregate with genome-wide D4Z4 CpG hypomethylation in human kindreds. FSHD2 occurs in individuals who inherited both the SMCHD1 mutation and a normal-sized D4Z4 array on a chromosome 4 haplotype permissive for DUX4 expression. Reducing SMCHD1 levels in skeletal muscle results in contraction-independent DUX4 expression. Our study identifies SMCHD1 as an epigenetic modifier of the D4Z4 metastable epiallele and as a causal genetic determinant of FSHD2 and possibly other human diseases subject to epigenetic regulation.
We have isolated and analyzed human CTCF cDNA clones and show here that the ubiquitously expressed 11-zinc-finger factor CTCF is an exceptionally highly conserved protein displaying 93% identity between avian and human amino acid sequences. It binds specifically to regulatory sequences in the promoter-proximal regions of chicken, mouse, and human c-myc oncogenes. CTCF contains two transcription repressor domains transferable to a heterologous DNA binding domain. One CTCF binding site, conserved in mouse and human c-myc genes, is found immediately downstream of the major P2 promoter at a sequence which maps precisely within the region of RNA polymerase II pausing and release. Gel shift assays of nuclear extracts from mouse and human cells show that CTCF is the predominant factor binding to this sequence. Mutational analysis of the P2-proximal CTCF binding site and transient-cotransfection experiments demonstrate that CTCF is a transcriptional repressor of the human c-myc gene. Although there is 100% sequence identity in the DNA binding domains of the avian and human CTCF proteins, the regulatory sequences recognized by CTCF in chicken and human c-myc promoters are clearly diverged. Mutating the contact nucleotides confirms that CTCF binding to the human c-myc P2 promoter requires a number of unique contact DNA bases that are absent in the chicken c-myc CTCF binding site. Moreover, proteolytic-protection assays indicate that several more CTCF Zn fingers are involved in contacting the human CTCF binding site than the chicken site. Gel shift assays utilizing successively deleted Zn finger domains indicate that CTCF Zn fingers 2 to 7 are involved in binding to the chicken c-myc promoter, while fingers 3 to 11 mediate CTCF binding to the human promoter. 
This flexibility in Zn finger usage reveals CTCF to be a unique "multivalent" transcriptional factor and provides the first feasible explanation of how certain homologous genes (i.e., c-myc) of different vertebrate species are regulated by the same factor and maintain similar expression patterns despite significant promoter sequence divergence.
Cohesin is required to prevent premature dissociation of sister chromatids after DNA replication. Although its role in chromatid cohesion is well established, the functional significance of cohesin's association with interphase chromatin is not clear. Using a quantitative proteomics approach, we show that the STAG1 (Scc3/SA1) subunit of cohesin interacts with the CCCTC-binding factor CTCF bound to the c-myc insulator element. Both allele-specific binding of CTCF and Scc3/SA1 at the imprinted IGF2/H19 gene locus and our analyses of human DM1 alleles containing base substitutions at CTCF-binding motifs indicate that cohesin recruitment to chromosomal sites depends on the presence of CTCF. A large-scale genomic survey using ChIP-Chip demonstrates that Scc3/SA1 binding strongly correlates with the CTCF-binding site distribution in chromosomal arms. However, some chromosomal sites interact exclusively with CTCF, whereas others interact with Scc3/SA1 only. Furthermore, immunofluorescence microscopy and ChIP-Chip experiments demonstrate that CTCF associates with both centromeres and chromosomal arms during metaphase. These results link cohesin to gene regulatory functions and suggest an essential role for CTCF during sister chromatid cohesion. These results have implications for the functional role of cohesin subunits in the pathogenesis of Cornelia de Lange and Roberts syndromes.
Keywords: cohesion | transcription | insulator | centromere | metaphase
Prior studies of the DM1 locus have shown that the CTG repeats are a component of a CTCF-dependent insulator element and that repeat expansion results in conversion of the region to heterochromatin. We now show that the DM1 insulator is maintained in a local heterochromatin context: an antisense transcript emanating from the adjacent SIX5 regulatory region extends into the insulator element and is converted into 21 nucleotide (nt) fragments with associated regional histone H3 lysine 9 (H3-K9) methylation and HP1gamma recruitment that is embedded within a region of euchromatin-associated H3 lysine 4 (H3-K4) methylation. CTCF restricts the extent of the antisense RNA at the wild-type (wt) DM1 locus and constrains the H3-K9 methylation to the nucleosome associated with the CTG repeat, whereas the expanded allele in congenital DM1 is associated with loss of CTCF binding, spread of heterochromatin, and regional CpG methylation.
An expansion of a CTG repeat at the DM1 locus causes myotonic dystrophy (DM) by altering the expression of the two adjacent genes, DMPK and SIX5, and through a toxic effect of the repeat-containing RNA. Here we identify two CTCF-binding sites that flank the CTG repeat and form an insulator element between DMPK and SIX5. Methylation of these sites prevents binding of CTCF, indicating that the DM1 locus methylation in congenital DM would disrupt insulator function. Furthermore, CTCF-binding sites are associated with CTG/CAG repeats at several other loci. We suggest a general role for CTG/CAG repeats as components of insulator elements at multiple sites in the human genome.
Eukaryotic transcriptional regulation often involves regulatory elements separated from the cognate genes by long distances, whereas appropriately positioned insulator or enhancer-blocking elements shield promoters from illegitimate enhancer action. Four proteins have been identified in Drosophila mediating enhancer blocking: Su(Hw), Zw5, BEAF32 and GAGA factor. In vertebrates, the single protein CTCF, with 11 highly conserved zinc fingers, confers enhancer blocking in all known chromatin insulators. Here, we characterize an orthologous CTCF factor in Drosophila with a similar domain structure, binding site specificity and transcriptional repression activity as in vertebrates. In addition, we demonstrate that one of the insulators (Fab-8) in the Drosophila Abdominal-B locus mediates enhancer blocking by dCTCF. Therefore, the enhancer-blocking protein CTCF and, most probably, the mechanism of enhancer blocking mediated by this remarkably versatile factor are conserved from Drosophila to humans.