Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional enrichments to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N = 365K) and samples of other European ancestries as validation data (avg N = 22K), to minimize confounding. LDpred-funct attained a +9% relative improvement in average prediction accuracy (avg prediction R² = 0.145; highest R² = 0.413 for height) compared to LDpred (the best method that does not incorporate functional information), consistent with simulations. For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N = 1107K; higher heritability in UK Biobank cohort) increased prediction R² to 0.429. Our results show that modeling functional enrichment improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.

Genetic variants in functional regions of the genome are enriched for complex trait heritability [1-6].
In this study, we aim to leverage functional enrichment to improve polygenic prediction [7, …]. Several studies have shown that incorporating prior distributions on causal effect sizes can improve prediction accuracy [9-12], compared to standard Best Linear Unbiased Prediction (BLUP) or Pruning+Thresholding methods [13-15]. Recent efforts to incorporate functional information have produced promising results [16,17], but may be limited by dichotomizing between functional and non-functional variants [16] or restricting their analyses to genotyped variants [17].

Here, we introduce a new method, LDpred-funct, for leveraging trait-specific functional enrichments to increase polygenic prediction accuracy. We fit functional priors using our recently developed baseline-LD model [18], which includes coding, conserved, regulatory and LD-related annotations. LDpred-funct first analytically estimates posterior mean causal effect sizes, accounting for functional priors and LD between variants. LDpred-funct then uses cross-validation within validation samples to regularize causal effect size estimates in bins of different magnitude, improving prediction accuracy for sparse architectures. We show that LDpred-funct attains higher polygenic prediction accuracy than other methods in simulations with real genotypes, analys...
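
The two steps described above (an analytic posterior mean, then cross-validated regularization in bins of effect-size magnitude) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian prior with a per-variant variance (standing in for the functional prior), marginal estimates distributed as N(Dβ, D/N) for an LD matrix D, bins holding roughly equal total squared weight, and ordinary least squares for the per-bin weights. All function names, the bin count, and the fold count are hypothetical choices made for illustration.

```python
import numpy as np


def posterior_mean_effects(beta_hat, ld, prior_var, n_train):
    """Posterior mean of causal effects (illustrative assumptions only).

    Assumes beta ~ N(0, diag(prior_var)) and
    beta_hat | beta ~ N(ld @ beta, ld / n_train), which gives
    E[beta | beta_hat] = (ld + diag(1 / (n_train * prior_var)))^-1 beta_hat.
    """
    precision = ld + np.diag(1.0 / (n_train * prior_var))
    return np.linalg.solve(precision, beta_hat)


def magnitude_bins(weights, n_bins):
    """Assign each variant to a bin, ordered by decreasing |weight|.

    Bin boundaries are placed so that each bin holds roughly the same
    total squared weight (an assumption of this sketch).
    """
    order = np.argsort(-np.abs(weights))
    squared = weights[order] ** 2
    start = np.cumsum(squared) - squared  # squared weight ranked ahead of each variant
    bin_of_sorted = np.minimum((start / squared.sum() * n_bins).astype(int), n_bins - 1)
    bins = np.empty_like(bin_of_sorted)
    bins[order] = bin_of_sorted
    return bins


def cross_validated_bin_scores(genotypes, phenotype, weights,
                               n_bins=10, n_folds=10, seed=0):
    """Out-of-fold predictions from per-bin partial scores.

    For each fold, one regression coefficient per bin (plus an intercept)
    is fitted on the remaining validation samples and applied to the
    held-out samples.
    """
    bins = magnitude_bins(weights, n_bins)
    # One partial polygenic score per bin: n_samples x n_bins.
    partial = np.column_stack(
        [genotypes[:, bins == b] @ weights[bins == b] for b in range(n_bins)]
    )
    design = np.column_stack([np.ones(len(phenotype)), partial])
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(phenotype)) % n_folds  # balanced fold labels
    predicted = np.empty(len(phenotype))
    for fold in range(n_folds):
        held_out = folds == fold
        coef, *_ = np.linalg.lstsq(design[~held_out], phenotype[~held_out], rcond=None)
        predicted[held_out] = design[held_out] @ coef
    return predicted
```

With an identity LD matrix the first function reduces to per-variant shrinkage by `prior_var / (prior_var + 1 / n_train)`, so variants with larger prior variance are shrunk less. In the second step, fitting a separate coefficient per bin lets bins of small, mostly noisy weights be down-weighted, which is one way to read the benefit described for sparse architectures.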
Prioritizing disease-critical cell types by integrating genome-wide association studies (GWAS) with functional data is a fundamental goal. Single-cell chromatin accessibility (scATAC-seq) and gene expression (scRNA-seq) have characterized cell types at high resolution, and early work on integrating GWAS with scRNA-seq has shown promise, but work on integrating GWAS with scATAC-seq has been limited. Here, we identify disease-critical fetal and adult brain cell types by integrating GWAS summary statistics from 28 brain-related diseases and traits (average N=298K) with 3.2 million scATAC-seq and scRNA-seq profiles from 83 cell types. We identified disease-critical fetal (resp. adult) brain cell types for 22 (resp. 23) of 28 traits using scATAC-seq data, and for 8 (resp. 17) of 28 traits using scRNA-seq data. Notable findings using scATAC-seq data included highly significant enrichments of fetal photoreceptor cells for major depressive disorder, fetal ganglion cells for BMI, fetal astrocytes for ADHD, and adult VGLUT2 excitatory neurons for schizophrenia. Our findings improve our understanding of brain-related diseases and traits, and inform future analyses of other diseases/traits.