Polygenic risk scores (PRSs) have wide applications in human genetics research, but often include tuning parameters which are difficult to optimize in practice due to limited access to individual-level data. Here, we introduce PUMAS, a novel method to fine-tune PRS models using summary statistics from genome-wide association studies (GWASs). Through extensive simulations, external validations, and analysis of 65 traits, we demonstrate that PUMAS can perform various model-tuning procedures using GWAS summary statistics and effectively benchmark and optimize PRS models under diverse genetic architectures. Furthermore, we show that fine-tuned PRSs will significantly improve statistical power in downstream association analysis.
Marginal effect estimates in genome-wide association studies (GWAS) are mixtures of direct and indirect genetic effects. Existing methods to dissect these effects require family-based, individual-level genetic and phenotypic data from large samples, which are difficult to obtain in practice. Here, we propose a statistical framework to estimate direct and indirect genetic effects using summary statistics from GWAS conducted on own and offspring phenotypes. Applied to birth weight, our method produced results nearly identical to those obtained using individual-level data. We also decomposed direct and indirect genetic effects of educational attainment (EA), which showed distinct patterns of genetic correlations with 45 complex traits. The known genetic correlations between EA and higher height, lower body mass index, less-active smoking behavior, and better health outcomes were mostly explained by the indirect genetic component of EA. In contrast, the consistently identified genetic correlation of autism spectrum disorder (ASD) with higher EA resides in the direct genetic component. A polygenic transmission disequilibrium test showed a significant overtransmission of the direct component of EA from healthy parents to ASD probands. Taken together, we demonstrate that traditional GWAS approaches, in conjunction with offspring phenotypic data collection in existing cohorts, could greatly benefit studies on genetic nurture and shed important light on the interpretation of genetic associations for human complex traits.
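To make the idea of separating direct and indirect effects from two sets of summary statistics concrete, here is a minimal sketch for a single variant. It is not the estimator from the abstract above; it rests on simplifying assumptions that are ours: random mating, an indirect effect that acts through one parent only (the parent whose genotype is used in the offspring-phenotype GWAS), and two GWAS run on non-overlapping samples. Under those assumptions the own-phenotype GWAS estimates `direct + indirect / 2` and the offspring-phenotype GWAS estimates `direct / 2 + indirect`, which is a 2x2 linear system. The function name `decompose_effects` is hypothetical.

```python
import math


def decompose_effects(beta_own, se_own, beta_off, se_off):
    """Solve for direct and indirect effects of one variant.

    Assumed model (illustrative, not taken from the source):
        beta_own = direct + indirect / 2
        beta_off = direct / 2 + indirect
    Standard errors assume the two GWAS samples are independent.
    """
    direct = (4.0 * beta_own - 2.0 * beta_off) / 3.0
    indirect = (4.0 * beta_off - 2.0 * beta_own) / 3.0
    # Variance of a linear combination of two independent estimates.
    se_direct = math.sqrt(16.0 * se_own ** 2 + 4.0 * se_off ** 2) / 3.0
    se_indirect = math.sqrt(16.0 * se_off ** 2 + 4.0 * se_own ** 2) / 3.0
    return direct, se_direct, indirect, se_indirect


if __name__ == "__main__":
    # Marginal estimates generated from direct = 0.10 and indirect = 0.05.
    d, se_d, i, se_i = decompose_effects(0.125, 0.01, 0.10, 0.01)
    print(f"direct={d:.3f} (SE {se_d:.3f}), indirect={i:.3f} (SE {se_i:.3f})")
```

Note that both standard errors are larger than either input standard error, which is one reason such decompositions need large GWAS on both phenotypes. Sample overlap between the two GWAS would add a covariance term that this sketch ignores.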
Polygenic risk scores (PRSs) have wide applications in human genetics research. Notably, most PRS models include tuning parameters which improve predictive performance when properly selected. However, existing model-tuning methods require validation data that is independent of both training and testing samples. These data rarely exist in practice, creating a significant gap between PRS methodology and applications. Here, we introduce PUMAS, a novel method to fine-tune PRS models using summary statistics from genome-wide association studies (GWASs). Through extensive simulations, external validations, and analysis of 65 GWAS traits, we demonstrate that PUMAS can perform a variety of model-tuning procedures (e.g., cross-validation) using GWAS summary statistics and can effectively benchmark and optimize PRS models under diverse genetic architectures. Applied to 211 neuroimaging traits and Alzheimer's disease, we show that fine-tuned PRSs will improve statistical power in association analysis. We believe our method resolves a fundamental problem without a current solution and will greatly benefit genetic prediction applications.
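The following sketch illustrates, in a deliberately simplified setting, how a tuning parameter can be selected from summary statistics alone. It is an illustration of the general subsampling idea, not the PUMAS implementation. The assumptions are ours: SNPs are independent and standardized, the phenotype has variance 1, and the full-sample statistic for each SNP is the sum of genotype-phenotype products. Given that sum over N individuals, the sum over a random training subset of size n is approximately normal with mean (n/N) times the full sum and variance n(N - n)/N; the validation sum is the remainder. All function names are hypothetical, and the tuning parameter is a |z| cutoff on the training statistics.

```python
import math
import random


def split_summary_stats(xty_full, n_full, n_train, var_y=1.0, rng=None):
    """Sample training sums given full-sample sums; validation is the rest."""
    rng = rng or random.Random()
    frac = n_train / n_full
    sd = math.sqrt(n_train * (n_full - n_train) / n_full * var_y)
    train = [frac * s + rng.gauss(0.0, sd) for s in xty_full]
    valid = [s - t for s, t in zip(xty_full, train)]
    return train, valid


def prs_r2(weights, xty_valid, n_valid, var_y=1.0):
    """Approximate squared correlation between the PRS and the phenotype."""
    cov = sum(w * s for w, s in zip(weights, xty_valid)) / n_valid
    var_prs = sum(w * w for w in weights)  # independent, standardized SNPs
    if var_prs == 0.0:
        return 0.0
    return cov * cov / (var_prs * var_y)


def tune_threshold(xty_full, n_full, z_cutoffs, train_frac=0.75, seed=0):
    """Return {cutoff: validation R^2} for a thresholding PRS."""
    rng = random.Random(seed)
    n_train = int(n_full * train_frac)
    n_valid = n_full - n_train
    train, valid = split_summary_stats(xty_full, n_full, n_train, rng=rng)
    results = {}
    for cutoff in z_cutoffs:
        weights = []
        for s in train:
            beta = s / n_train               # marginal effect estimate
            z = beta * math.sqrt(n_train)    # SE is about 1/sqrt(n_train)
            weights.append(beta if abs(z) >= cutoff else 0.0)
        results[cutoff] = prs_r2(weights, valid, n_valid)
    return results


if __name__ == "__main__":
    n = 10000
    # Toy input: 5 SNPs with effect 0.1 and 45 SNPs with no signal.
    xty = [0.1 * n] * 5 + [0.0] * 45
    for cutoff, r2 in tune_threshold(xty, n, [0.0, 1.0, 2.0, 3.0]).items():
        print(f"|z| >= {cutoff}: R^2 = {r2:.4f}")
```

In practice one would repeat the split several times and average the R^2 values, in the spirit of repeated cross-validation. Linkage disequilibrium between SNPs, which this sketch ignores, changes both the sampling step and the PRS variance term.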
for the Cooperative Studies Program (CSP) #572 and Million Veteran Program (MVP)
IMPORTANCE Serious mental illnesses, including schizophrenia, bipolar disorder, and depression, are heritable, highly multifactorial disorders and major causes of disability worldwide.
OBJECTIVE To benchmark the penetrance of current neuropsychiatric polygenic risk scores (PRSs) in the Veterans Health Administration health care system and to explore associations between PRS and broad categories of human disease via phenome-wide association studies.
DESIGN, SETTING, AND PARTICIPANTS Extensive Veterans Health Administration's electronic health records were assessed from October 1999 to January 2021, and an embedded cohort of 9378 individuals with confirmed diagnoses of schizophrenia or bipolar 1 disorder were found. The performance of schizophrenia, bipolar disorder, and major depression PRSs was compared in participants of African or European ancestry in the Million Veteran Program (approximately 400 000 individuals), and associations between PRSs and 1650 disease categories based on ICD-9/10 billing codes were explored. Last, genomic structural equation modeling was applied to derive novel PRSs indexing common and disorder-specific genetic factors. Analysis took place from January 2021 to January 2022.
MAIN OUTCOMES AND MEASURES Diagnoses based on in-person structured clinical interviews were compared with ICD-9/10 billing codes. PRSs were constructed using summary statistics from genome-wide association studies of schizophrenia, bipolar disorder, and major depression.
RESULTS Of 707 299 enrolled study participants, 459 667 were genotyped at the time of writing; 84 806 were of broadly African ancestry (mean [SD] age, 58 [12.1] years) and 314 909 were of broadly European ancestry (mean [SD] age, 66.4 [13.5] years). Among 9378 individuals with confirmed diagnoses of schizophrenia or bipolar 1 disorder, 8962 (95.6%) were correctly identified using ICD-9/10 codes (2 or more). Among those of European ancestry, PRSs were robustly associated with having received a diagnosis of schizophrenia (odds ratio [OR], 1.81 [95% CI, 1.76-1.87]; P < 10^−257) or bipolar disorder (OR, 1.42 [95% CI,; P < 10^−295). Corresponding effect sizes in participants of African ancestry were considerably smaller for schizophrenia (OR, 1.35 [95% CI,; P < 10^−38) and bipolar disorder (OR, 1.16 [95% CI,; P < 10^−10). Neuropsychiatric PRSs were associated with increased risk for a range of psychiatric and physical health problems.
CONCLUSIONS AND RELEVANCE Using diagnoses confirmed by in-person structured clinical interviews and current neuropsychiatric PRSs, the validity of an electronic health records-based phenotyping approach in US veterans was demonstrated, highlighting the potential of PRSs for disentangling biological and mediated pleiotropy.
The integration of genetic data within large-scale social and health surveys provides new opportunities to test long-standing theories of parental investments in children and within-family inequality. Genetic predictors, called polygenic scores, allow novel assessments of young children's abilities that are uncontaminated by parental investments, and family-based samples allow indirect tests of whether children's abilities are reinforced or compensated. We use over 16,000 sibling pairs from the UK Biobank to test whether the relative ranking of siblings' polygenic scores for educational attainment is consequential for actual attainments. We find strong evidence of compensatory processes, on average, where the association between genotype and phenotype of educational attainment is reduced by over 20% for the higher-ranked sibling compared to the lower-ranked sibling. These effects are most pronounced in high socioeconomic status areas. We find no evidence that similar processes hold in the case of height or for relatives who are not full biological siblings (e.g., cousins). Our results provide a new use of polygenic scores to understand processes that generate within-family inequalities and also suggest important caveats to causal interpretations of the effects of polygenic scores using sibling-difference designs.