Genetic susceptibility factors behind psychiatric disorders typically contribute small effects individually. A possible explanation for the missing heritability is that the effects of common variants are not only polygenic but also non-additive, appearing only when interactions within large groups are taken into account. Here, we tested this hypothesis for schizophrenia (SZ) and bipolar disorder (BP) disease risks, and identified genetic factors shared with posttraumatic stress disorder (PTSD). When considered independently, few single-nucleotide polymorphisms (SNPs) reached genome-wide significance. In contrast, when SNPs were selected in groups (containing up to thousands each) and the collective effects of all interactions were estimated, the association strength for SZ/BP rose dramatically with a combined sample size of 7187 cases and 8309 controls. We identified a large number of genes and pathways whose association was significant only when interaction effects were included. The gene with the highest association was CSMD1, which encodes a negative regulator of complement activation. Pathways for glycosaminoglycan (GAG) synthesis exhibited strong association in multiple contexts. Taken together, highly associated pathways suggested a pathogenesis mechanism in which maternal immune activation causes disruption of neurogenesis (compounded by impaired cell cycle, DNA repair and neuronal migration) and deficits in cortical interneurons, leading to symptoms triggered by synaptic pruning. Increased risks arise from GAG deficiencies causing complement activation and excessive microglial action. Analysis of PTSD data sets suggested an etiology common to SZ/BP: interneuron deficiency can also lead to impaired control of fear responses triggered by trauma. We additionally found PTSD risk factors affecting synaptic plasticity and fatty acid signaling, consistent with the fear extinction model.
Our results suggest that much of the missing heritability of psychiatric disorders resides in non-additive interaction effects.
Background: Researchers have previously developed a multitude of methods designed to identify biological pathways associated with specific clinical or experimental conditions of interest, with the aim of facilitating biological interpretation of high-throughput data. Before practically applying such pathway analysis (PA) methods, we must first evaluate their performance and reliability, using datasets where the pathways perturbed by the conditions of interest have been well characterized in advance. However, such ‘ground truths’ (or gold standards) are often unavailable. Furthermore, previous evaluation strategies that have focused on defining ‘true answers’ are unable to systematically and objectively assess PA methods under a wide range of conditions.
Results: In this work, we propose a novel strategy for evaluating PA methods independently of any gold standard, either established or assumed. The strategy involves the use of two mutually complementary metrics, recall and discrimination. Recall measures the consistency between the perturbed pathways identified by applying a particular analysis method to an original large dataset and those identified by applying the same method to a sub-dataset of the original dataset. In contrast, discrimination measures specificity: the degree to which the perturbed pathways identified by applying a particular method to a dataset from one experiment differ from those identified by applying the same method to a dataset from a different experiment. We used these metrics and 24 datasets to evaluate six widely used PA methods. The results highlighted the common challenge in reliably identifying significant pathways from small datasets.
Importantly, we confirmed the effectiveness of our proposed dual-metric strategy by showing that previous comparative studies corroborate the performance evaluations of the six methods obtained by our strategy.
Conclusions: Unlike any previously proposed strategy for evaluating the performance of PA methods, our dual-metric strategy does not rely on any ground truth, either established or assumed, of the pathways perturbed by a specific clinical or experimental condition. As such, our strategy allows researchers to systematically and objectively evaluate pathway analysis methods by employing any number of datasets for a variety of conditions.
Electronic supplementary material: The online version of this article (10.1186/s12859-017-1866-7) contains supplementary material, which is available to authorized users.
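The recall and discrimination metrics described in the abstract above can be illustrated with a small sketch. The abstract does not give the exact formulas, so the definitions here are assumptions made for illustration only: recall is taken as the fraction of pathways found on the full dataset that are recovered on a sub-dataset, and discrimination as one minus the Jaccard similarity of the pathway sets from two different experiments. The function names and pathway labels are hypothetical.

```python
# Illustrative sketch only: the formulas below are assumed, not taken from the paper.

def recall(full_pathways, sub_pathways):
    """Fraction of pathways called significant on the full dataset
    that are also called significant on a sub-dataset (assumed definition)."""
    full, sub = set(full_pathways), set(sub_pathways)
    if not full:
        return 0.0
    return len(full & sub) / len(full)


def discrimination(pathways_a, pathways_b):
    """One minus the Jaccard similarity of the pathway sets obtained from
    two different experiments (assumed definition); 1.0 means no overlap."""
    a, b = set(pathways_a), set(pathways_b)
    if not a and not b:
        return 0.0  # two empty results are indistinguishable
    return 1.0 - len(a & b) / len(a | b)


# Hypothetical pathway labels standing in for the output of one PA method.
full_run = {"p1", "p2", "p3", "p4"}
sub_run = {"p1", "p2", "p5"}
other_experiment = {"p2", "p3"}

print(recall(full_run, sub_run))                    # 2 of 4 pathways recovered
print(discrimination({"p1", "p2"}, other_experiment))
```

Under these assumed definitions, a method that is robust to sample size would score high on recall, while a method that returns the same pathways regardless of the experiment would score low on discrimination.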
Background: Genome-wide association studies provide important insights into the genetic component of disease risks. However, an existing challenge is how to incorporate collective effects of interactions beyond the level of independent single nucleotide polymorphism (SNP) tests. While methods considering each SNP pair separately have provided insights, a large portion of expected heritability may reside in higher-order interaction effects.
Results: We describe an inference approach (discrete discriminant analysis; DDA) designed to probe collective interactions while treating both genotypes and phenotypes as random variables. The genotype distributions in case and control groups are modeled separately based on empirical allele frequency and covariance data, whose differences yield disease risk parameters. We compared pairwise tests and collective inference methods, the latter based both on DDA and logistic regression. Analyses using simulated data demonstrated that significantly higher sensitivity and specificity can be achieved with collective inference in comparison to pairwise tests, and with DDA in comparison to logistic regression. Using age-related macular degeneration (AMD) data, we demonstrated two possible applications of DDA. In the first application, a genome-wide SNP set is reduced into a small number (∼100) of variants via filtering and SNP pairs with significant interactions are identified. We found that interactions between SNPs with the highest AMD association were epigenetically active in the liver, adipocytes, and mesenchymal stem cells. In the other application, multiple groups of SNPs were formed from the genome-wide data and their relative strengths of association were compared using cross-validation. This analysis allowed us to discover novel collections of loci for which interactions between SNPs play significant roles in their disease association. In particular, we considered pathway-based groups of SNPs containing up to ∼10,000 variants in each group.
In addition to pathways related to complement activation, our collective inference pointed to pathway groups involved in phospholipid synthesis, oxidative stress, and apoptosis, consistent with the AMD pathogenesis mechanism in which the dysfunction of retinal pigment epithelium cells plays a central role.
Conclusions: The simultaneous inference of collective interaction effects within a set of SNPs has the potential to reveal novel aspects of disease association.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2871-3) contains supplementary material, which is available to authorized users.
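The idea of modeling case and control genotype distributions separately and reading risk parameters off their difference can be sketched in a deliberately simplified form. This is not the DDA method itself: it assumes binary (0/1) alleles, treats SNPs as independent (so the covariance and interaction terms central to DDA are omitted), and ignores the prior case/control odds. All function names, the pseudocount, and the toy data are hypothetical.

```python
import math

# Simplified, independent-site sketch; NOT the paper's DDA, which also models
# covariance between SNPs. Binary alleles and the pseudocount are assumptions.

def allele_freqs(genotypes, pseudocount=0.5):
    """Per-SNP frequency of allele 1, smoothed so no frequency is exactly 0 or 1."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [
        (sum(g[i] for g in genotypes) + pseudocount) / (n + 2 * pseudocount)
        for i in range(m)
    ]


def risk_parameters(cases, controls):
    """Fit each group separately; return (h0, h), where h[i] is the difference
    in log-odds of allele 1 between cases and controls at SNP i."""
    p_case = allele_freqs(cases)
    p_ctrl = allele_freqs(controls)
    h = [
        math.log(a / (1 - a)) - math.log(b / (1 - b))
        for a, b in zip(p_case, p_ctrl)
    ]
    h0 = sum(math.log((1 - a) / (1 - b)) for a, b in zip(p_case, p_ctrl))
    return h0, h


def log_odds(sample, h0, h):
    """log P(genotype | case) - log P(genotype | control) under the model."""
    return h0 + sum(hi * ai for hi, ai in zip(h, sample))


# Toy data: SNP 0 is enriched in cases, SNP 1 is equally common in both groups.
cases = [[1, 0], [1, 0], [1, 1], [0, 0]]
controls = [[0, 0], [0, 1], [0, 0], [1, 0]]
h0, h = risk_parameters(cases, controls)
print(h)
print(log_odds([1, 0], h0, h), log_odds([0, 0], h0, h))
```

In this reduced model each SNP contributes additively to the log-odds. Capturing the collective interaction effects described in the abstract would require extending the per-group model with pairwise terms estimated from the covariance data.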