2020
DOI: 10.1371/journal.pgen.1008922

Unified inference of missense variant effects and gene constraints in the human genome

Abstract: A challenge in medical genomics is to identify variants and genes associated with severe genetic disorders. Based on the premise that severe, early-onset disorders often result in a reduction of evolutionary fitness, several statistical methods have been developed to predict pathogenic variants or constrained genes based on the signatures of negative selection in human populations. However, we currently lack a statistical framework to jointly predict deleterious variants and constrained genes from both variant…

Citations: cited by 23 publications (32 citation statements)
References: 90 publications (243 reference statements)
“…2a). Because the UNEECON-G score is a measure of a gene's intolerance to missense mutations, it corroborates a previous finding that missense intolerance is strongly correlated with LOF intolerance at the gene level 32 . Two GO categories 24 , i.e., central nervous system development and embryo development, and the Reactome category of nervous system development 33 had strong positive associations with LOF intolerance, suggesting that developmental genes are highly intolerant to LOF mutations.…”
Section: Results (supporting)
confidence: 82%
“…The expected number of LOF variants in each gene was from a neutral mutation model developed by gnomAD 5 , which took into account the impact of trinucleotide sequence context, CpG methylation level, local mutation rate, and site-wise sequencing coverage on the occurrence of variants. The 18 genomic features included five epigenomic features 20 , four gene categories associated with developmental processes 25 , three protein annotations [26][27][28] , two phastCons conservation scores 15,29 , two gene expression features 30,31 , the promoter CpG density 15 , and the UNEECON-G score 32 . A detailed description of these genomic features is available in Supplementary Table 1.…”
Section: Results (mentioning)
confidence: 99%
“…In this work, we have introduced the MK regression, the first evolutionary model for jointly estimating the effects of multiple, potentially correlated genomic features on the rate of adaptive substitutions. Based on similar ideas, we have previously developed statistical approaches to infer negative selection on genetic variants [54][55][56]. Thus, unifying generalized linear models and evolutionary models may be a powerful strategy to address a variety of statistical problems in evolutionary biology.…”
Section: Discussion (mentioning)
confidence: 99%
“…Also, we utilized interspecies divergence in the chimpanzee lineage to construct a map of local mutation rates in the panTro4 assembly. To do so, we converted putatively neutral regions defined in [56] from hg19 to panTro4 using liftOver. Then, we computed the density of chimpanzee-specific substitutions in putatively neutral regions using a 100Kb non-overlapping sliding window, which was used as a proxy of local mutation rates.…”
Section: Genomic Features (mentioning)
confidence: 99%
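
The windowing step described in the statement above can be illustrated with a minimal sketch. It is not the citing authors' code: the function name `window_density`, the input layout (0-based positions, half-open intervals), and the assumption that the substitutions have already been restricted to putatively neutral regions are all hypothetical choices made for illustration. It counts substitutions in 100 kb non-overlapping windows and divides by the number of neutral bases in each window.

```python
from collections import defaultdict

WINDOW = 100_000  # 100 kb non-overlapping windows, as in the quoted statement


def window_density(substitutions, neutral_regions, window=WINDOW):
    """Substitutions per neutral base in each fixed-size window.

    substitutions:   iterable of (chrom, pos), 0-based, assumed to already
                     fall inside the neutral regions (hypothetical input).
    neutral_regions: iterable of (chrom, start, end), half-open intervals,
                     assumed not to overlap one another.
    Returns a dict mapping (chrom, window_index) to a density.
    """
    # Count neutral bases per window, splitting regions at window boundaries.
    neutral_bases = defaultdict(int)
    for chrom, start, end in neutral_regions:
        pos = start
        while pos < end:
            idx = pos // window
            stop = min(end, (idx + 1) * window)
            neutral_bases[(chrom, idx)] += stop - pos
            pos = stop

    # Count substitutions per window.
    counts = defaultdict(int)
    for chrom, pos in substitutions:
        counts[(chrom, pos // window)] += 1

    # Density is defined only for windows containing neutral sequence.
    return {key: counts[key] / n for key, n in neutral_bases.items() if n > 0}


regions = [("chr1", 50_000, 150_000)]
subs = [("chr1", 60_000), ("chr1", 120_000), ("chr1", 130_000)]
density = window_density(subs, regions)
```

Dividing by the neutral bases in each window, rather than by the full window length, keeps windows with little neutral sequence from appearing to have artificially low rates; whether the original analysis normalized this way is not stated in the excerpt.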
“…In clinical genetic testing, many of missense variants in well-established risk genes are classified as variants of uncertain significance, unless they are highly recurrent in patients. Previously published in silico prediction methods have facilitated the interpretation of missense variants, such as CADD 8 , VEST3 9 , MetaSVM 10 , M-CAP 11 , REVEL 12 , PrimateAI 13 , and UNEECON 14 . However, based on recent de novo mutation data, they all have limited performance with low positive predictive value (Supplementary Data 1), especially in non-constrained genes (defined as ExAC 15 pLI < 0.5).…”
Section: Introduction (mentioning)
confidence: 99%