Background
Integrating rare variation from trio family and case–control studies has successfully implicated specific genes contributing to risk of neurodevelopmental disorders (NDDs) including autism spectrum disorders (ASD), intellectual disability (ID), developmental disorders (DDs), and epilepsy (EPI). For schizophrenia (SCZ), however, while sets of genes have been implicated through the study of rare variation, only two risk genes have been identified.

Methods
We used hierarchical Bayesian modeling of rare-variant genetic architecture to estimate mean effect sizes and risk-gene proportions, analyzing the largest available collection of whole exome sequence data for SCZ (1,077 trios, 6,699 cases, and 13,028 controls), and data for four NDDs (ASD, ID, DD, and EPI; total 10,792 trios, and 4,058 cases and controls).

Results
For SCZ, we estimate there are 1,551 risk genes. There are more risk genes and they have weaker effects than for NDDs. We provide power analyses to predict the number of risk-gene discoveries as more data become available. We confirm and augment prior risk gene and gene set enrichment results for SCZ and NDDs. In particular, we detected 98 new DD risk genes at FDR < 0.05. Correlations of risk-gene posterior probabilities are high across four NDDs (ρ > 0.55), but low between SCZ and the NDDs (ρ < 0.3). An in-depth analysis of 288 NDD genes shows there is highly significant protein–protein interaction (PPI) network connectivity, and functionally distinct PPI subnetworks based on pathway enrichment, single-cell RNA-seq cell types, and multi-region developmental brain RNA-seq.

Conclusions
We have extended a pipeline used in ASD studies and applied it to infer rare genetic parameters for SCZ and four NDDs (https://github.com/hoangtn/extTADA). We find many new DD risk genes, supported by gene set enrichment and PPI network connectivity analyses. We find greater similarity among NDDs than between NDDs and SCZ. NDD gene subnetworks are implicated in postnatally expressed presynaptic and postsynaptic genes, and for transcriptional and post-transcriptional gene regulation in prenatal neural progenitor and stem cells.

Electronic supplementary material
The online version of this article (doi:10.1186/s13073-017-0497-y) contains supplementary material, which is available to authorized users.

Integrating rare variation from family and case/control studies has successfully implicated specific genes contributing to risk of autism spectrum disorder (ASD). In schizophrenia (SCZ), however, while sets of genes have been implicated through the study of rare variation, very few individual risk genes have been identified. Here, we apply hierarchical Bayesian modeling of rare variation in schizophrenia and describe the proportion of risk genes and the distribution of risk variant effect sizes across multiple variant annotation categories. Briefly, we developed a pipeline, based on previous work in ASD studies, to jointly estimate genetic parameters for one or multiple combined populations of any disease. We applied this method to the largest available collection of rare-variant data in schizophrenia (1,077 families, 6,699 cases and 13,028 controls). We defined five variant annotation categories: three classes of de novo mutations, namely disruptive (nonsense, frameshift, and essential splice site mutations), damaging missense (predicted damaging by seven algorithms), and silent-FCPk (silent mutations within frontal cortex-derived DHS peaks); and two classes of case/control singletons, disruptive and damaging missense. We estimated that 8.01% of genes are risk genes (95% credible interval, CI, 4.59-12.9%), with mean effect sizes (95% CIs) of 12.25 (4.8-22.22) for disruptive de novo mutations, 1.44 (1-3.16) for damaging missense de novo mutations, and 1.22 (1-2.16) for silent-FCPk de novo mutations. The mean effect sizes of damaging and disruptive singleton variants for the three case/control populations were 2.09 (1.04-3.54), 2.44 (1.04-5.73) and 1.04 (1-1.19), respectively. Our analysis identified only two known SCZ risk genes with FDR < 0.05, SETD1A and TAF13, and two other genes with FDR < 0.1, RB1CC1 and PRRC2A. We further used FDRs to directly analyze candidate gene sets for enrichment of Bayesian support. 
Significant enrichments were observed for essential genes, which were found to be enriched among autism genes in a recent study, and for central nervous system (CNS) related genes, in addition to gene sets previously found to be enriched (including in these data). We conducted power analyses under our inferred model for SCZ, estimating the number of risk gene discoveries as more data become available, and quantifying the greater value of case/control over trio samples for novel rare variant risk gene discovery. We also applied the method to four other neurodevelopmental disorders: autism spectrum disorder (ASD), intellectual disability (ID), developmental disorders (DD) and epilepsy (EPI), in total 10,792 families, and 4,058 cases and controls. The predicted proportions of risk genes in these disorders were smaller than in SCZ: 4.6% for ASD and < 3% for the other disorders. We report 164 and 58 genes with FDR < 0.05 for DD and ID, respectively, of which 101 and 15 are novel. Overall, replication of previous results confirms the robustness of our approach, and our method is able to identify novel risk genes for SCZ as well as for other diseases.
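
The FDR thresholds quoted above are derived from per-gene posterior probabilities. The following is a minimal sketch, not the extTADA implementation, of one common way to go from Bayes factors to posterior probabilities and then to Bayesian q-values. The function names and the toy Bayes factors are hypothetical, and the prior proportion of 0.08 is only a rounded stand-in for the 8.01% estimate reported for SCZ.

```python
import numpy as np


def posterior_from_bayes_factor(bf, pi):
    """Posterior probability that a gene is a risk gene.

    bf: per-gene Bayes factors (risk model vs. null model).
    pi: prior proportion of risk genes (assumed known here).
    """
    bf = np.asarray(bf, dtype=float)
    return pi * bf / (pi * bf + (1.0 - pi))


def bayesian_qvalues(pp):
    """Bayesian q-values from posterior probabilities.

    Genes are ranked by decreasing posterior probability; the q-value of a
    gene is the mean of (1 - PP) over all genes ranked at or above it.
    """
    pp = np.asarray(pp, dtype=float)
    order = np.argsort(-pp, kind="stable")      # highest PP first
    local_fdr = 1.0 - pp[order]                 # per-gene probability of being null
    q_sorted = np.cumsum(local_fdr) / np.arange(1, pp.size + 1)
    q = np.empty_like(q_sorted)
    q[order] = q_sorted                         # restore the original gene order
    return q


# Toy values for illustration only; these are not results from the study.
bf = np.array([500.0, 40.0, 3.0, 0.5])
pp = posterior_from_bayes_factor(bf, pi=0.08)
q = bayesian_qvalues(pp)
significant = q < 0.05                          # genes passing FDR < 0.05
```

Because the q-value is a running mean of an increasing sequence, it never decreases down the ranked list, so thresholding at FDR < 0.05 always selects a top-ranked prefix of genes.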