Background: Pathogenic mutations in genes that control chromatin function have been implicated in rare genetic syndromes. These chromatin modifiers exhibit extraordinary diversity in the scale of the epigenetic changes they effect, from single-basepair modifications by DNMT1 to whole-genome structural changes by PRM1/2. Patterns of DNA methylation are related to a diverse set of epigenetic features across this full range of epigenetic scale, making DNA methylation valuable for mapping regions of general epigenetic dysregulation. However, existing methods are unable to accurately identify regions of differential methylation across this full range of epigenetic scale directly from DNA methylation data. Results: To address this, we developed DMRscaler, a novel method that uses an iterative windowing procedure to capture regions of differential DNA methylation (DMRs) ranging in size from single basepairs to whole chromosomes. We benchmarked DMRscaler against several DMR callers in simulated and natural data comparing XX and XY peripheral blood samples. DMRscaler was the only method that accurately called DMRs ranging in size from 100 bp to 1 Mb (Pearson's r = 0.94) and up to 152 Mb on the X chromosome. We then analyzed methylation data from rare-disease cohorts that harbor chromatin modifier gene mutations in NSD1, EZH2, and KAT6A, where DMRscaler identified novel DMRs spanning gene clusters involved in development. Conclusion: Taken together, our results show DMRscaler is uniquely able to capture the size of DMR features across the full range of epigenetic scale and identify novel, co-regulated regions that drive epigenetic dysregulation in human disease.
Pathogenic mutations in genes that control chromatin structure and function cause epigenetic aberrations that result in rare genetic syndromes. These chromatin modifiers exhibit extraordinary diversity in the scale of the epigenetic changes they effect, from single-basepair modifications by DNMT1 to whole-genome structural changes by PRM1/2. Patterns of DNA methylation are related to a diverse set of epigenetic features across this full range of epigenetic scale, making DNA methylation valuable for mapping regions of general epigenetic dysregulation. However, no existing methods make use of these relations to accurately identify the scale of epigenetic changes directly from DNA methylation data. To address this, we developed DMRscaler, a novel method that uses an iterative windowing procedure to capture regions of differential DNA methylation (DMRs) ranging in size from single basepairs to whole chromosomes. We benchmarked DMRscaler against several methylation callers in both simulated and natural data comparing XX and XY peripheral blood samples. DMRscaler was the only method that accurately called DMRs ranging in size from 100 bp to 1 Mb (Pearson's r = 0.92) and up to 152 Mb on the X chromosome. We then analyzed methylation data from rare-disease cohorts that harbor mutations in the chromatin modifier genes NSD1, EZH2, and KAT6A. DMRscaler identified full or partial novel DMRs spanning the PCDHA, PCDHB, and PCDHGB gene clusters across these three groups, suggesting that these are common mechanisms driving their dysregulation in early synaptic development. Taken together, our results show DMRscaler is uniquely able to capture the scale of DMR features and identify novel, co-regulated regions that drive epigenetic dysregulation in human disease.
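The benchmark above summarizes size accuracy with a Pearson correlation. As a rough illustration only, the sketch below shows how agreement between simulated and called DMR widths might be scored. The function names and the choice to correlate log10 widths (so that a 100 bp to 1 Mb range is not dominated by the largest regions) are assumptions made here, not details taken from the abstract.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def size_agreement(simulated_bp, called_bp):
    """Correlate log10 widths of simulated DMRs with the widths a caller reported.

    Hypothetical helper: the log transform is an assumption of this sketch.
    """
    log_sim = [math.log10(w) for w in simulated_bp]
    log_called = [math.log10(w) for w in called_bp]
    return pearson_r(log_sim, log_called)
```

A caller that recovers each simulated width exactly would score 1 under this measure, while one that reports similar widths regardless of the true size would score near 0.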
One of the most common human inborn errors of immunity (IEI) is Common Variable Immunodeficiency (CVID), a heterogeneous group of disorders characterized by functional and/or quantitative antibody deficiency and impaired B-cell responses. Although over 30 genes have been associated with the CVID phenotype, over half of CVID patients have no identified monogenic variant. There are currently no laboratory or genetic tests that definitively diagnose CVID, and none are expected to be available in the near future. The extensive heterogeneity of CVID phenotypes causes patients to face a 5- to 15-year delay in diagnosis and initiation of treatment, leading to a critical diagnostic odyssey. In this work, we present PheNet, an algorithm that identifies patients with CVID from their electronic health record (EHR) data. PheNet computes the likelihood of a patient having CVID by learning phenotypic patterns, or EHR-signatures, from a high-quality, clinically curated list of bona fide CVID patients identified from the UCLA Health system (N=197). The prediction model is more accurate than state-of-the-art methods: 57% of cases could be detected within the top 10% of individuals ranked by the algorithm, compared to 37% identified by previous phenotype risk scores. In a retrospective analysis, we show that 64% of CVID patients at UCLA Health could have been identified by PheNet more than 8 months earlier than they had been clinically diagnosed. We validate our approach on a discovery dataset of ~880K patients in the UCLA Health system: in a blinded clinical chart review by an immune specialist, 74 of the top 100 patients ranked by PheNet score (top 0.01% PheNet percentile) were judged highly probable to have CVID.
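The "57% of cases within the top 10% of ranked individuals" figure is a ranking metric: sort everyone by score, keep the top fraction, and ask what share of all known cases falls inside it. The sketch below is a minimal, assumed implementation of that calculation; the function name, argument names, and tie handling are illustrative and are not PheNet's own code.

```python
import math


def fraction_of_cases_in_top(scores, is_case, top_fraction):
    """Share of all known cases that rank within the top `top_fraction` by score.

    Hypothetical helper: `scores` are per-patient risk scores (higher means more
    likely), and `is_case` marks the clinically confirmed cases.
    """
    if len(scores) != len(is_case):
        raise ValueError("scores and is_case must have the same length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_cases = sum(1 for flag in is_case if flag)
    if total_cases == 0:
        raise ValueError("at least one case is required")
    # Number of patients in the top slice, rounded up and never empty.
    n_top = max(1, math.ceil(top_fraction * len(scores)))
    # Patient indices sorted from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    captured = sum(1 for i in ranked[:n_top] if is_case[i])
    return captured / total_cases
```

Comparing this value for two scoring methods at the same `top_fraction` gives the kind of head-to-head comparison reported above (57% versus 37% at the top 10%).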