Linker histones play a pivotal role in shaping chromatin architecture, notably through their globular H1 (GH1) domain that contacts the nucleosome and linker DNA. Yet, the interplay of H1 with chromatin factors along the epigenome landscape is poorly understood. Here, we report that Arabidopsis H1 favors chromatin compaction and H3K27me3 marking on a majority of Polycomb-targeted protein-coding genes while preventing H3K27me3 accumulation on telomeres and pericentromeric interstitial telomeric repeats (ITRs). These contrasting effects of H1 on H3K27me3 enrichment are associated with long-distance effects on the 3D organization of telomeres and ITRs. Mechanistically, H1 prevents ITRs from being invaded by Telomere Repeat Binding 1 (TRB1), a GH1-containing telomere component with an extra-telomeric function in targeting Polycomb to genes bearing telomeric motifs. We propose that reciprocal DNA binding of H1 and TRB1 to clustered telobox motifs prevents H3K27me3 accumulation on large chromosomal blocks, conferring a sequence-specific role to H1 in epigenome homeostasis.
Motivation: Genome-wide chromosomal contact maps are widely used to uncover the 3D organization of genomes. They rely on collecting millions of contacting pairs of genomic loci. Contacts at short range are usually well measured in experiments, whereas much of the information about long-range contacts is missing. Results: We propose to use the sparse information contained in raw contact maps to infer high-confidence contact counts between all pairs of loci. Our algorithmic procedure, Boost-HiC, enables the detection of Hi-C patterns such as chromosomal compartments at a resolution that would otherwise be attainable only by sequencing the experimental Hi-C library a hundred times deeper. Boost-HiC can also be used to compare contact maps at an improved resolution. Availability and implementation: Boost-HiC is available at https://github.com/LeopoldC/Boost-HiC. Supplementary information: Supplementary data are available at Bioinformatics online.
Genome-wide chromosomal contact maps are widely used to uncover the 3D organisation of genomes. They rely on the collection of millions of contacting pairs of genomic loci. Contact frequencies at short range are usually well measured in experiments, whereas much of the information about long-range contacts is missing. We propose to use the sparse information contained in raw contact maps to determine high-confidence contact frequencies between all pairs of loci. Our algorithmic procedure, Boost-HiC, enables the detection of Hi-C patterns such as chromosomal compartments at a resolution that would otherwise be attainable only by sequencing the experimental Hi-C library a hundred times deeper.
The notion of disease-associated single-nucleotide polymorphisms (da-SNPs), as determined in genome-wide association studies (GWAS), is relevant for many complex pathologies, including cancers. It has become apparent that da-SNPs are not only markers of causal genetic variation but may also contribute to disease development by influencing gene expression levels. We argue that understanding this possible functional role of da-SNPs requires considering their embedding in the three-dimensional (3D) multi-scale organization of the human genome. We then focus on the potential impact of da-SNPs on chromatin loops and on the recently observed topologically associating domains (TADs). We show that for some diseases and cancer types, da-SNPs are over-represented in the borders of these topological domains, in a way that cannot be explained by an increased exon density. This analysis of the distribution of da-SNPs within the 3D genome organization suggests candidate loci for further experimental investigation of the mechanisms underlying genetic susceptibility to diseases, in particular cancer. Recently developed chromosome conformation capture techniques combine chemical crosslinking and sequencing to identify genomic loci that contact each other in vivo. They have shown that the mammalian genome displays three main architectural features at the large-scale level (supranucleosomal level, beyond the kb scale), nested in a hierarchical way (Figure 2): chromatin loops, topologically associating domains (TADs) of larger size exhibiting more internal contacts than contacts between domains [34, 35], and a segregation into active and inactive compartments [36].
Background: Genome-wide association studies have identified statistical associations between various diseases, including cancers, and a large number of single-nucleotide polymorphisms (SNPs). However, they provide no direct explanation of the mechanisms underlying the association. Based on the recent discovery that changes in three-dimensional genome organization may have functional consequences on gene regulation favoring diseases, we investigated systematically the genome-wide distribution of disease-associated SNPs with respect to a specific feature of 3D genome organization: topologically associating domains (TADs) and their borders. Results: For each of 449 diseases, we tested whether the associated SNPs are present in TAD borders more often than observed by chance, where chance (i.e., the null model in statistical terms) corresponds to the same number of pointwise loci drawn at random either in the entire genome, or in the entire set of disease-associated SNPs listed in the GWAS catalog. Our analysis shows that a fraction of diseases displays such a preferential localization of their risk loci. Moreover, cancers are relatively more frequent among these diseases, and this predominance is generally enhanced when considering only intergenic SNPs. The structure of SNP-based diseasome networks confirms that localization of risk loci in TAD borders differs between cancers and non-cancer diseases. Furthermore, different TAD border enrichments are observed in embryonic stem cells and differentiated cells, consistent with changes in topological domains along embryogenesis and delineating their contribution to disease risk. Conclusions: Our results suggest that, for certain diseases, part of the genetic risk lies in a local genetic variation affecting the genome partitioning in topologically insulated domains. Investigating this possible contribution to genetic risk is particularly relevant in cancers. 
This study thus opens a way of interpreting genome-wide association studies, by distinguishing two types of disease-associated SNPs: one with an effect on an individual gene, the other acting in interplay with 3D genome organization.
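The enrichment test described above (risk loci in TAD borders compared with the same number of loci drawn at random in the genome) can be illustrated with a small permutation-style sketch. This is only an illustration under stated assumptions, not the authors' implementation: a single linear "genome" of length `genome_length`, half-open border intervals, and the hypothetical function names `count_in_borders` and `border_enrichment_pvalue` are all introduced here for the example.

```python
import numpy as np

def count_in_borders(positions, borders):
    """Count positions lying inside any [start, end) border interval.

    `borders` is assumed sorted by start and non-overlapping, shape (k, 2).
    """
    positions = np.asarray(positions)
    borders = np.asarray(borders)
    starts, ends = borders[:, 0], borders[:, 1]
    # Index of the last border whose start is <= position (-1 if none).
    idx = np.searchsorted(starts, positions, side="right") - 1
    valid = idx >= 0
    inside = np.zeros(positions.shape, dtype=bool)
    inside[valid] = positions[valid] < ends[idx[valid]]
    return int(inside.sum())

def border_enrichment_pvalue(snps, borders, genome_length, n_draws=1000, seed=0):
    """Empirical one-sided p-value for SNP enrichment in borders.

    Null model (assumption): the same number of loci drawn uniformly
    at random along the whole genome.
    """
    rng = np.random.default_rng(seed)
    observed = count_in_borders(snps, borders)
    null_counts = np.array([
        count_in_borders(rng.integers(0, genome_length, size=len(snps)), borders)
        for _ in range(n_draws)
    ])
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + np.sum(null_counts >= observed)) / (1 + n_draws)
    return observed, float(p_value)

# Toy data (invented for illustration only).
toy_borders = np.array([[100, 120], [500, 520]])
toy_snps = [105, 110, 115, 505, 510, 900]
observed, p_value = border_enrichment_pvalue(toy_snps, toy_borders, genome_length=1000)
```

The second null model mentioned in the abstract (drawing from the full set of disease-associated SNPs in the GWAS catalog) would replace the uniform draw with sampling without replacement from that set.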
An increasing number of genomic tracks such as DNA methylation, histone modifications or transcriptomes are being produced to annotate genomes with functional states. The comparison of such high-dimensional vectors obtained under various experimental conditions requires the use of a distance or dissimilarity measure. Pearson, Cosine and $L_{p}$-norm distances are commonly used for both count and binary vectors. In this article, we highlight how enhancement methods such as the contrast-increasing 'mutual proximity' (MP) or 'local scaling' improve common distance measures. We present a systematic approach to evaluate the performance of such enhanced distance measures in terms of separability of groups of experimental replicates to outline their effect. We show that MP applied to the various distance measures drastically increases performance. Depending on the type of epigenetic experiment, MP coupled with Pearson, Cosine, $L_1$, Yule or Jaccard distances proves to be highly efficient in discriminating epigenomic profiles.
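To make the idea of a distance enhancement concrete, here is a minimal sketch of one common empirical formulation of mutual proximity: two profiles are considered close when many other profiles are farther from both of them than they are from each other. The abstract does not state which variant was used, so the formula, the function name `mutual_proximity` and the normalization by n - 2 are assumptions made for this example.

```python
import numpy as np

def mutual_proximity(D):
    """Rescale a symmetric distance matrix D with empirical mutual proximity.

    For each pair (i, j), count the other points that are farther from i
    than j is AND farther from j than i is. A large count means i and j
    are mutual near neighbours, so the returned dissimilarity is small.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    out = np.zeros_like(D)
    for i in range(n):
        for j in range(i + 1, n):
            d = D[i, j]
            farther_from_both = (D[i] > d) & (D[j] > d)
            # Normalize by the n - 2 points other than i and j (assumption).
            similarity = farther_from_both.sum() / (n - 2) if n > 2 else 0.0
            out[i, j] = out[j, i] = 1.0 - similarity
    return out

# Toy example: two tight pairs of points on a line (invented data).
points = np.array([0.0, 1.0, 10.0, 11.0])
D = np.abs(points[:, None] - points[None, :])
M = mutual_proximity(D)
```

Any base distance (Pearson, Cosine, $L_1$, Yule, Jaccard) can be supplied as `D`; the rescaling only uses the ranks of distances within each row, which is what increases the contrast between replicates and non-replicates.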
Genome-wide association studies have identified statistical associations between various diseases, including cancers, and a large number of single-nucleotide polymorphisms (SNPs). However, they provide no direct explanation of the mechanisms underlying the association. Based on the recent discovery that changes in 3-dimensional genome organization may have functional consequences on gene regulation favoring diseases, we investigated systematically the genome-wide distribution of disease-associated SNPs with respect to a specific feature of 3D genome organization: topologically-associating domains (TADs) and their borders. For each of 449 diseases, we tested whether the associated SNPs are present in TAD borders more often than observed by chance, where chance (i.e. the null model in statistical terms) corresponds to the same number of pointwise loci drawn at random either in the entire genome, or in the entire set of disease-associated SNPs listed in the GWAS catalog. Our analysis shows that a fraction of diseases display such a preferential location of their risk loci. Moreover, cancers are relatively more frequent among these diseases, and this predominance is generally enhanced when considering only intergenic SNPs. The structure of SNP-based diseasome networks confirms that TAD border enrichment in risk loci differs between cancers and non-cancer diseases. Different TAD border enrichments are observed in embryonic stem cells and differentiated cells, which agrees with an evolution of the 3D genome organization into topological domains along embryogenesis. Our results suggest that, for certain diseases, part of the genetic risk lies in a local genetic variation affecting the genome partitioning in topologically-insulated domains. Investigating this possible contribution to genetic risk is particularly relevant in cancers. 
This study thus opens a way of interpreting genome-wide association studies, by distinguishing two types of disease-associated SNPs: one with a direct effect on an individual gene, the other acting in interplay with 3D genome organization.