Summary: Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types. We show that promoter interactions are highly cell type specific and enriched for links between active promoters and epigenetically marked enhancers. Promoter interactomes reflect lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched in genetic variants linked with altered expression of genes they contact, highlighting their functional role. We exploit this rich resource to connect non-coding disease variants to putative target promoters, prioritizing thousands of disease-candidate genes and implicating disease pathways. Our results demonstrate the power of primary cell promoter interactomes to reveal insights into genomic regulatory mechanisms underlying common diseases.
Epigenetic genome modifications are thought to be important for specifying the lineage and developmental stage of cells within a multicellular organism. Here, we show that the epigenetic profile of pluripotent embryonic stem cells (ES) is distinct from that of embryonic carcinoma cells, haematopoietic stem cells (HSC) and their differentiated progeny. Silent, lineage-specific genes replicated earlier in pluripotent cells than in tissue-specific stem cells or differentiated cells and had unexpectedly high levels of acetylated H3K9 and methylated H3K4. Unusually, in ES cells these markers of open chromatin were also combined with H3K27 trimethylation at some non-expressed genes. Thus, pluripotency of ES cells is characterized by a specific epigenetic profile where lineage-specific genes may be accessible but, if so, carry repressive H3K27 trimethylation modifications. H3K27 methylation is functionally important for preventing expression of these genes in ES cells as premature expression occurs in embryonic ectoderm development (Eed)-deficient ES cells. Our data suggest that lineage-specific genes are primed for expression in ES cells but are held in check by opposing chromatin modifications.
Cohesins mediate sister chromatid cohesion, which is essential for chromosome segregation and postreplicative DNA repair. In addition, cohesins appear to regulate gene expression and enhancer-promoter interactions. These noncanonical functions remained unexplained because knowledge of cohesin-binding sites and functional interactors in metazoans was lacking. We show that the distribution of cohesins on mammalian chromosome arms is not driven by transcriptional activity, in contrast to S. cerevisiae. Instead, mammalian cohesins occupy a subset of DNase I hypersensitive sites, many of which contain sequence motifs resembling the consensus for CTCF, a DNA-binding protein with enhancer blocking function and boundary-element activity. We find cohesins at most CTCF sites and show that CTCF is required for cohesin localization to these sites. Recruitment by CTCF suggests a rationale for noncanonical cohesin functions and, because CTCF binding is sensitive to DNA methylation, allows cohesin positioning to integrate DNA sequence and epigenetic state.
Capture Hi-C (CHi-C) is a method for profiling chromosomal interactions involving targeted regions of interest, such as gene promoters, globally and at high resolution. Signal detection in CHi-C data involves a number of statistical challenges that are not observed when using other Hi-C-like techniques. We present a background model and algorithms for normalisation and multiple testing that are specifically adapted to CHi-C experiments. We implement these procedures in CHiCAGO (http://regulatorygenomicsgroup.org/chicago), an open-source package for robust interaction detection in CHi-C. We validate CHiCAGO by showing that promoter-interacting regions detected with this method are enriched for regulatory features and disease-associated SNPs. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-0992-2) contains supplementary material, which is available to authorized users.