2021
DOI: 10.1016/j.ajhg.2021.04.014

Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction

Abstract: The accuracy of polygenic risk scores (PRSs) to predict complex diseases increases with the training sample size. PRSs are generally derived based on summary statistics from large meta-analyses of multiple genome-wide association studies (GWASs). However, it is now common for researchers to have access to large individual-level data as well, such as the UK Biobank data. To the best of our knowledge, it has not yet been explored how best to combine both types of data (summary statistics and i…

Cited by 28 publications (21 citation statements)
References 64 publications
“…First, we benchmarked the lasso multi-PGS against the single PGS trained on the external largest available GWAS summary statistics and the single PGS trained on the individual-level genotype and phenotype information (BLUP PGS, details in methods section). As shown previously 31 , the BLUP PGS vs. single GWAS PGS varied in the relative proportion of variance explained according to the psychiatric disorder, as they are largely dependent on the training sample sizes and genetic correlation (Figure 4A). We observed these large differences also in terms of log OR of separating the top 10% to the bottom 10% of the sample (Figure 4B).…”
Section: Results; citation type: supporting
confidence: 66%
“…The process has already been described in Albiñana et al. 2021 31. Using the set of 20 PCs, we defined genetically homogeneous individuals as having <4.5 log distance units to the multidimensional center of the 20 PCs (calculated using the function dist_ogk from the R package bigutilsr 39,40 ).…”
Section: Methods; citation type: mentioning
confidence: 99%
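
The filtering step quoted above can be illustrated with a minimal Python sketch. It is not the cited authors' code: the source uses the robust estimator `dist_ogk` from the R package bigutilsr, whereas this sketch swaps in the classical sample mean and covariance with NumPy. The function names, the simulated data, and the choice to take the log of the squared distance are all assumptions made for illustration; only the 20 PCs and the 4.5 threshold come from the quoted text.

```python
import numpy as np

def log_mahalanobis_sq(pcs):
    """Log of the squared Mahalanobis distance of each row to the sample center.

    Assumption: classical mean/covariance, NOT the robust OGK estimator
    that the quoted text uses via bigutilsr::dist_ogk.
    """
    pcs = np.asarray(pcs, dtype=float)
    diff = pcs - pcs.mean(axis=0)
    cov = np.cov(pcs, rowvar=False)
    # Solve cov @ x = diff.T rather than forming an explicit inverse.
    solved = np.linalg.solve(cov, diff.T)
    d2 = np.einsum("ij,ji->i", diff, solved)
    return np.log(d2)

def homogeneous_mask(pcs, threshold=4.5):
    """True for individuals whose log distance is below the threshold."""
    return log_mahalanobis_sq(pcs) < threshold

# Simulated stand-in for 20 principal components of 500 individuals.
rng = np.random.default_rng(0)
pcs = rng.normal(size=(500, 20))
pcs[0] += 50.0  # one hypothetical, strongly outlying individual
mask = homogeneous_mask(pcs)
```

A classical covariance is inflated by the very outliers it is meant to detect, which is one reason a robust estimator such as OGK is usually preferred for this kind of filter.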
“…54 Moreover, we do not use external summary statistics, which means that polygenic scores derived from large GWAS meta-analyses would probably outperform the ones we derived here. Nevertheless, Albiñana et al. 55 have shown that an efficient strategy to improve predictive ability of polygenic scores consists in combining two different polygenic scores, one derived using external summary statistics and another one derived using internal individual-level data. Therefore, the polygenic scores we derived here could be combined with polygenic scores derived using external summary statistics; we will release these PGSs publicly and share them in databases such as the PGS Catalog and the Cancer-PRSweb.…”
Section: Discussion; citation type: mentioning
confidence: 99%
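
One simple way to combine two polygenic scores, as the statement above describes, is to fit a linear combination on a training subset and evaluate it on held-out individuals. The Python sketch below assumes ordinary least squares as the combining model, which may differ from the procedure in the cited paper; the function names and the simulated scores are hypothetical.

```python
import numpy as np

def fit_combination(pgs_external, pgs_internal, phenotype):
    """Least-squares weights (intercept, external, internal) for two scores."""
    design = np.column_stack(
        [np.ones_like(pgs_external), pgs_external, pgs_internal]
    )
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return coef

def combine(coef, pgs_external, pgs_internal):
    """Apply fitted weights to obtain the combined score."""
    return coef[0] + coef[1] * pgs_external + coef[2] * pgs_internal

def r_squared(score, phenotype):
    """Squared Pearson correlation between a score and the phenotype."""
    return np.corrcoef(score, phenotype)[0, 1] ** 2

# Simulated data: two noisy scores tracking the same hypothetical genetic value.
rng = np.random.default_rng(1)
n = 20000
genetic = rng.normal(size=n)
pgs_ext = genetic + rng.normal(size=n)  # stands in for the summary-statistics score
pgs_int = genetic + rng.normal(size=n)  # stands in for the individual-level score
pheno = genetic + rng.normal(size=n)

train, test = slice(0, n // 2), slice(n // 2, None)
coef = fit_combination(pgs_ext[train], pgs_int[train], pheno[train])
combined = combine(coef, pgs_ext[test], pgs_int[test])

r2_ext = r_squared(pgs_ext[test], pheno[test])
r2_int = r_squared(pgs_int[test], pheno[test])
r2_combined = r_squared(combined, pheno[test])
```

Fitting the weights on one subset and scoring another keeps the comparison between single and combined scores from being biased by overfitting.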
“…Although primarily intended for GWAS, BOLT-LMM can and has been used in multiple studies for deriving polygenic risk scores; to our knowledge the methods and corresponding results have been described in at least 6 papers by other investigators (Albiñana et al, 2021; Loh et al, 2015; Loh et al, 2018; Loh et al, 2020; Speed & Balding, 2014; Weissbrod et al, 2021; Yang et al, 2011; Zhang et al, 2020), including the original BOLT-LMM papers by Loh et al (2015) and Loh et al (2018) and a third more recently (Loh et al, 2020). In addition, we recently described and evaluated the BOLT-LMM PRS method that we used (Albiñana et al, 2021). In brief, one can obtain the prediction weights using a BOLT-LMM command-line option (--predBetasFile; not reflected in the user manual but can be found on the software's --helpFull flag), and use them as variant weights for genomic prediction (polygenic scores).…”
Section: Methods; citation type: mentioning
confidence: 99%
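
Using per-variant weights for genomic prediction, as the statement above describes, amounts to a weighted sum of allele dosages for each individual. The Python sketch below shows only that arithmetic. It makes no assumption about the layout of the file written by `--predBetasFile`: the weights are taken as an array already aligned to the same variants and effect alleles as the dosage matrix, and the function name and toy values are hypothetical.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for each individual.

    dosages: (individuals x variants) array of effect-allele counts in [0, 2].
    weights: per-variant prediction weights, in the same variant order and
             for the same effect alleles as the dosage columns (assumed).
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_variants)")
    return dosages @ weights

# Toy example: three individuals, three variants.
dosages = np.array([[0, 1, 2],
                    [2, 0, 1],
                    [1, 1, 0]])
weights = np.array([0.10, -0.05, 0.20])
scores = polygenic_score(dosages, weights)
```

In practice the alignment step (matching variant identifiers and flipping weights when the effect allele differs) matters as much as the sum itself, and it is deliberately left out of this sketch.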
“…The ASD PRS was derived using BOLT-LMM 20 for risk prediction. Although primarily intended for GWAS, BOLT-LMM can and has been used in multiple studies for deriving polygenic risk scores; to our knowledge the methods and corresponding results have been described in at least 6 papers by other investigators (Albiñana et al, 2021; Loh et al, 2015; Loh et al, 2018; Loh et al, 2020; Speed & Balding, 2014; Weissbrod et al, 2021; Yang et al, 2011; Zhang et al, 2020), including the original BOLT-LMM papers by Loh et al (2015) and Loh et al (2018) and a third more recently (Loh et al, 2020). In addition, we recently described and evaluated the BOLT-LMM PRS method that we used (Albiñana et al, 2021).…”
Section: Methods; citation type: mentioning
confidence: 99%