Epigenetic modifications are dynamic control mechanisms involved in the regulation of gene expression. Unlike the DNA sequence itself, they vary not only between individuals but also between different cell types of the same individual. Exposure to environmental factors, somatic mutations, and ageing contribute to epigenomic changes over time, which may constitute early hallmarks or causal factors of disease. Epigenetic changes are reversible and, therefore, promising therapeutic targets. However, mapping efforts to determine an individual's cell-type-specific epigenome are constrained by experimental costs. We developed eDICE, an attention-based deep learning model, to impute epigenomic tracks. eDICE achieves improved overall performance compared to previous models on the reference Roadmap epigenomes. Furthermore, we present a proof of concept for the imputation of personalised epigenomic measurements on the ENTEx dataset, where eDICE correctly predicts individual- and cell-type-specific epigenetic patterns. This case study constitutes an important step towards robustly employing machine-learning-based approaches for personalised epigenomics.
Epigenetic mechanisms coordinate packaging, accessibility and read-out of the DNA sequence within the chromatin context. They significantly contribute to the regulation of gene expression. Thus, they play fundamental roles during differentiation on the one hand and maintenance and propagation of cell identity on the other. Epigenetic malfunctioning is associated with a large range of diseases, from neurodevelopmental disorders to cancer progression. In humans, hundreds of known epigenetic factors and complexes are involved in establishing covalent modifications on the DNA sequence itself and on associated histone proteins. Within the cellular context, the resulting combinatorial epigenomic patterns are neither established nor interpreted independently of each other and therefore exhibit high correlations in a region-specific manner. Post-translational modifications of histone proteins can be analysed using Chromatin Immunoprecipitation followed by sequencing (ChIP-Seq). Often, several assays for a number of different histone modifications are performed as part of the same experimental design. These measurements are, however, confounded by shared biases including chromatin accessibility and mappability. Existing computational methods analyse each histone modification separately. We introduce DecoDen, a new approach that leverages replicates and multi-histone ChIP-Seq experiments for a fixed cell type to learn and remove shared biases. DecoDen (Deconvolve and Denoise) consists of two major steps: first, non-negative matrix factorisation (NMF) is used to learn a joint cell-type-specific background signal; second, half-sibling regression (HSR) is used to correct for these biases in the histone modification signals. We demonstrate that DecoDen is a robust and interpretable method that enables the unbiased discovery of subtle peaks, which are particularly important in an individual-specific context.
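The two steps named in the DecoDen abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified illustration on synthetic data, assuming NumPy is available. The function names `nmf` and `half_sibling_correct`, the rank-1 factorisation of control replicates only, and the single-regressor least-squares fit are all assumptions made for the example. The abstract describes a joint factorisation across assays, which this sketch does not reproduce.

```python
import numpy as np


def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Factorise non-negative V (samples x bins) as W @ H using
    Lee-Seung multiplicative updates for the Frobenius objective."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


def half_sibling_correct(signal, background):
    """Regress `signal` on `background` (least squares with intercept),
    subtract the fitted part, and clip the residual at zero."""
    X = np.column_stack([background, np.ones_like(background)])
    coef, *_ = np.linalg.lstsq(X, signal, rcond=None)
    return np.clip(signal - X @ coef, 0.0, None)


# Synthetic data (hypothetical): one shared bias profile over 200 genomic bins.
rng = np.random.default_rng(1)
n_bins = 200
true_background = rng.gamma(2.0, 1.0, n_bins)

# Three control replicates see the same bias at different sequencing depths.
controls = np.vstack([
    np.clip(depth * true_background + rng.normal(0, 0.05, n_bins), 0, None)
    for depth in (1.0, 1.5, 0.7)
])

# One histone-modification track: shared bias plus three genuine peaks.
peak_bins = [20, 90, 150]
treatment = 0.8 * true_background + rng.normal(0, 0.05, n_bins)
treatment[peak_bins] += 5.0

# Step 1 (simplified): learn a shared background profile from the controls.
W, H = nmf(controls, rank=1)
background_estimate = H[0]

# Step 2: remove the part of the treatment track explained by the background.
corrected = half_sibling_correct(treatment, background_estimate)
top_bins = sorted(np.argsort(corrected)[-3:].tolist())
```

In this toy setting the three strongest bins of `corrected` are the bins where peaks were planted, because the regression removes the component of the track that the background profile can predict. With real ChIP-Seq coverage the inputs would be binned read counts per replicate, and the choice of rank, bin size and regression model would need to follow the published method rather than this sketch.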