The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but a similar reference has been lacking for epigenomic studies. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection to date of human epigenomes for primary cells and tissues. Here, we describe the integrative analysis of 111 reference human epigenomes generated as part of the program, profiled for histone modification patterns, DNA accessibility, DNA methylation, and RNA expression. We establish global maps of regulatory elements, and define regulatory modules of coordinated activity together with their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation, and human disease.
Chromatin profiling has emerged as a powerful means for genome annotation and detection of regulatory activity. Here we map nine chromatin marks across nine cell types to systematically characterize regulatory elements, their cell type-specificities, and their functional interactions. Focusing on cell type-specific patterns of promoters and enhancers, we define multi-cell activity profiles for chromatin state, gene expression, regulatory motif enrichment, and regulator expression. We use correlations between these profiles to link enhancers to putative target genes, and predict the cell type-specific activators and repressors that modulate them. The resulting annotations and regulatory predictions have implications for interpreting genome-wide association studies. Top-scoring disease SNPs are frequently positioned within enhancer elements specifically active in relevant cell types, and in some cases affect a motif instance for a predicted regulator, thus proposing a mechanism for the association. Our study presents a general framework for deciphering cis-regulatory connections and their roles in disease.
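The idea of linking enhancers to putative target genes through correlated multi-cell activity profiles can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy profiles, and the 0.8 cutoff are all hypothetical, and a real analysis would also restrict candidate pairs by genomic distance and assess significance.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def link_enhancers(enhancer_activity, gene_expression, threshold=0.8):
    """Pair each enhancer with genes whose expression profile across
    cell types correlates with the enhancer's activity profile.

    The threshold is an illustrative assumption, not a published value.
    """
    links = []
    for enhancer, activity in enhancer_activity.items():
        for gene, expression in gene_expression.items():
            r = pearson(activity, expression)
            if r >= threshold:
                links.append((enhancer, gene, r))
    # Strongest correlations first
    return sorted(links, key=lambda link: -link[2])


# Toy profiles over nine cell types (made-up numbers for illustration).
enhancer_activity = {
    "enh1": [1, 0, 0, 1, 0, 0, 0, 1, 0],
    "enh2": [0, 1, 1, 0, 0, 0, 0, 0, 1],
}
gene_expression = {
    "geneA": [5.0, 0.2, 0.1, 4.5, 0.3, 0.2, 0.1, 6.0, 0.2],
    "geneB": [0.1, 3.0, 2.5, 0.2, 0.1, 0.3, 0.2, 0.1, 4.0],
}

links = link_enhancers(enhancer_activity, gene_expression)
for enhancer, gene, r in links:
    print(f"{enhancer} -> {gene} (r = {r:.2f})")
```

In this toy data each enhancer is active in exactly the cell types where one gene is highly expressed, so only those two pairs pass the cutoff. The same correlation logic could in principle be applied between motif enrichment and regulator expression to suggest activators (positive correlation) or repressors (negative correlation).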
Chromatin is composed of DNA and a variety of modified histones and non-histone proteins, which impact cell differentiation, gene regulation and other key cellular processes. We present a genome-wide chromatin landscape for Drosophila melanogaster based on 18 histone modifications, summarized by 9 prevalent combinatorial patterns. Integrative analysis with other data (non-histone chromatin proteins, DNaseI hypersensitivity, GRO-seq reads produced by engaged polymerase, short/long RNA products) reveals discrete characteristics of chromosomes, genes, regulatory elements, and other functional domains. We find that active genes display distinct chromatin signatures that are correlated with disparate gene lengths, exon patterns, regulatory functions, and genomic contexts. We also demonstrate a diversity of signatures among Polycomb targets that include a subset with paused polymerase. This systematic profiling and integrative analysis of chromatin signatures provides insights into how genomic elements are regulated, and will serve as a resource for future experimental investigations of genome structure and function.
Comparison of related genomes has emerged as a powerful lens for genome interpretation. Here, we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and report constrained elements covering ~4.2% of the genome. We use evolutionary signatures and comparison with experimental datasets to suggest candidate functions for ~60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events, and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements, and ~1,000 primate- and human-accelerated elements. Overlap with disease-associated variants suggests our findings will be relevant for studies of human biology and health.
The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.
A plethora of epigenetic modifications have been described in the human genome and shown to play diverse roles in gene regulation, cellular differentiation, and the onset of disease. While some modifications have been linked with activity levels of different functional elements, their combinatorial patterns remain unresolved, and their potential for systematic de novo genome annotation remains untapped. In this paper, we systematically discover and characterize recurrent spatially-coherent and biologically-meaningful chromatin mark combinations, or chromatin states, in human T-cells. We describe 51 distinct chromatin states, including promoter-associated, transcription-associated, active intergenic, large-scale repressed and repeat-associated states. Each chromatin state shows specific functional, experimental, conservation, annotation, and sequence-motif enrichments, revealing their distinct candidate biological roles. Overall, our work provides a complementary functional annotation of the human genome revealing the genome-wide locations of diverse classes of epigenetic functions, including previously-unsuspected chromatin states enriched in transcription end sites, distinct repeat families, and disease-SNP-associated states.
Here, we leverage a unique collection of 708 prospectively collected autopsied brains to assess the methylation state of the brain's DNA in relation to Alzheimer's disease (AD). We find that the level of methylation at 71 of the 415,848 interrogated CpGs is significantly associated with the burden of AD pathology, including CpGs in the ABCA7 and BIN1 regions, which harbor known AD susceptibility variants. We validate 11 of the differentially methylated regions in an independent set of 117 subjects. Further, we functionally validate these CpG associations and identify the nearby genes whose RNA expression is altered in AD: ANK1, CDH23, DIP2A, RHBDF2, RPL13, RNF34, SERPINF1 and SERPINF2. Our analyses suggest that these DNA methylation changes may have a role in the onset of AD since (1) they are seen in presymptomatic subjects and (2) six of the validated genes connect to a known AD susceptibility gene network.