Advances in DNA sequencing technologies have revolutionised rare disease diagnostics and have led to a dramatic increase in the volume of available genomic data. A key challenge that must be overcome to realise the full potential of these technologies is precisely predicting the effect of genetic variants on molecular and organismal phenotypes. Notably, despite recent progress, there is still a lack of robust in silico tools that accurately assign clinical significance to variants. Genetic alterations in the CACNA1F gene are the commonest cause of X-linked incomplete Congenital Stationary Night Blindness (iCSNB), a condition associated with non-progressive visual impairment. We combined genetic and homology modelling data to produce CACNA1F-vp, an in silico model that differentiates disease-implicated from benign missense CACNA1F changes. CACNA1F-vp predicts variant effects on the structure of the CACNA1F-encoded protein (a calcium channel) using parameters based upon changes in amino acid properties; these include size, charge, hydrophobicity, and position. The model produces an overall score for each variant that can be used to predict its pathogenicity. CACNA1F-vp outperformed four other tools in identifying disease-implicated variants (area under receiver operating characteristic and precision-recall curves = 0.84; Matthews correlation coefficient = 0.52) using a tenfold cross-validation technique. We consider this protein-specific model to be a robust stand-alone diagnostic classifier that could be replicated in other proteins and could enable precise and timely diagnosis.
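For readers less familiar with the Matthews correlation coefficient (MCC) quoted above, the following minimal sketch shows how the metric is computed from the four confusion-matrix counts of a binary classifier. It is not the authors' code: the function name and the example counts are made up for illustration, and returning 0.0 when the denominator vanishes is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return the MCC for a binary classifier from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. By convention (an assumption of this sketch), 0.0 is
    returned when any marginal total is zero and the MCC is undefined.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study.
    print(round(matthews_corrcoef(tp=40, tn=35, fp=15, fn=10), 3))
```

The MCC ranges from -1 (every call wrong) through 0 (no better than chance) to +1 (every call correct), and it stays informative when pathogenic and benign variants are unevenly represented.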
Inherited eye disorders (IED) are a heterogeneous group of Mendelian conditions that are associated with visual impairment. Although these disorders often exhibit incomplete penetrance and variable expressivity, the scale and mechanisms of these phenomena remain largely unknown. Here, we utilize publicly available genomic and transcriptomic datasets to gain insights into variable penetrance in IED. Variants in a curated set of 340 IED-implicated genes were extracted from the Human Gene Mutation Database (HGMD) 2019.1 and cross-checked with the Genome Aggregation Database (gnomAD) 2.1 control-only dataset. Genes for which more than one variant was encountered in both HGMD and gnomAD were considered to be associated with variable penetrance (n = 56). Variability in gene expression levels was then estimated for the subset of these genes that was found to be adequately expressed in two relevant resources: the Genotype-Tissue Expression (GTEx) and Eye Genotype Expression (EyeGEx) datasets. We found that genes suspected to be associated with variable penetrance tended to have significantly more variability in gene expression levels in the general population (p = 0.0000015); this finding was consistent across tissue types. The results of this study point to the possible influence of cis and/or trans-acting elements on the expressivity of variants causing Mendelian disorders. They also highlight the potential utility of quantifying gene expression as part of the investigation of families showing evidence of variable penetrance.
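The gene-flagging step described above can be sketched as a simple set intersection. This is only an illustration under stated assumptions: it reads the criterion as "more than one variant present in both databases for the same gene", the function name and dictionary layout are hypothetical, and the gene and variant identifiers are placeholders rather than real HGMD or gnomAD records.

```python
def variable_penetrance_genes(hgmd, gnomad, min_shared=2):
    """Return genes with at least `min_shared` variants found in both sources.

    `hgmd` and `gnomad` map a gene name to a set of variant identifiers
    (a hypothetical layout chosen for this sketch).
    """
    flagged = []
    for gene, disease_variants in hgmd.items():
        # Variants reported as disease-implicated AND seen in controls.
        shared = disease_variants & gnomad.get(gene, set())
        if len(shared) >= min_shared:
            flagged.append(gene)
    return sorted(flagged)


if __name__ == "__main__":
    # Placeholder identifiers, not real database entries.
    hgmd_toy = {"GENE_A": {"var1", "var2", "var3"}, "GENE_B": {"var4"}}
    gnomad_toy = {"GENE_A": {"var1", "var2"}, "GENE_B": {"var4", "var5"}}
    print(variable_penetrance_genes(hgmd_toy, gnomad_toy))
```

The `min_shared=2` default encodes the "more than one variant" cut-off; real variant matching would also need consistent normalisation of variant representations across the two resources, which is omitted here.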
Background: Improving the clinical interpretation of missense variants can increase the diagnostic yield of genomic testing and lead to personalised management strategies. Currently, due to the imprecision of bioinformatic tools that aim to predict variant pathogenicity, their role in clinical guidelines remains limited. There is a clear need for more accurate prediction algorithms and this study aims to improve performance by harnessing structural biology insights. The focus of this work is missense variants in a subset of genes associated with X linked disorders.
Methods: We have developed a protein-specific variant interpreter (ProSper) that combines genetic and protein structural data. This algorithm predicts missense variant pathogenicity by applying machine learning approaches to the sequence and structural characteristics of variants.
Results: ProSper outperformed seven previously described tools, including meta-predictors, in correctly evaluating whether or not variants are pathogenic; this was the case for 11 of the 21 genes associated with X linked disorders that met the inclusion criteria for this study. We also determined gene-specific pathogenicity thresholds that improved the performance of VEST4, REVEL and ClinPred, the three best-performing tools out of the seven that were evaluated; this was the case in 11, 11 and 12 different genes, respectively.
Conclusion: ProSper can form the basis of a molecule-specific prediction tool that can be implemented into diagnostic strategies. It can allow the accurate prioritisation of missense variants associated with X linked disorders, aiding precise and timely diagnosis. In addition, we demonstrate that gene-specific pathogenicity thresholds for a range of missense prioritisation tools can lead to an increase in prediction accuracy.
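One way to picture a gene-specific pathogenicity threshold is as a cut-off on a tool's score that is tuned separately for each gene. The sketch below is an assumption-laden illustration, not the study's procedure: the abstract does not say how thresholds were chosen, so maximising the Matthews correlation coefficient is used here as one plausible criterion, and the function names and toy scores are hypothetical.

```python
import math


def mcc(labels, preds):
    """Matthews correlation coefficient for two lists of booleans."""
    tp = sum(l and p for l, p in zip(labels, preds))
    tn = sum((not l) and (not p) for l, p in zip(labels, preds))
    fp = sum((not l) and p for l, p in zip(labels, preds))
    fn = sum(l and (not p) for l, p in zip(labels, preds))
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


def best_threshold(scores, labels):
    """Return (threshold, mcc) for the cut-off that maximises MCC.

    A variant is called pathogenic when its score is >= threshold.
    Candidate thresholds are the observed scores themselves.
    """
    best = (None, -1.0)
    for threshold in sorted(set(scores)):
        preds = [s >= threshold for s in scores]
        value = mcc(labels, preds)
        if value > best[1]:
            best = (threshold, value)
    return best


if __name__ == "__main__":
    # Toy predictor scores and labels (True = pathogenic); not study data.
    toy_scores = [0.1, 0.2, 0.6, 0.9]
    toy_labels = [False, False, True, True]
    print(best_threshold(toy_scores, toy_labels))
```

In practice the threshold would be fitted on one set of labelled variants and evaluated on held-out variants for the same gene, since tuning and testing on the same data overstates performance.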