2019
DOI: 10.1101/847020
Preprint

A principal component approach to improve association testing with polygenic risk scores

Abstract: Polygenic risk scores (PRSs) have become an increasingly popular approach for demonstrating polygenic influences on complex traits and for establishing common polygenic signals between different traits. PRSs are typically constructed using pruning and thresholding (P+T), but the best choice of parameters is uncertain; thus multiple settings are used and the best is chosen. This optimization can lead to inflated type I error. To correct this, permutation procedures can be used but they can be computationally in…

Citations: cited by 20 publications (23 citation statements)
References: 26 publications (35 reference statements)
“…We then performed a principal component analysis (PCA) on the resulting PRS and used the first PRS-PCA in subsequent association tests.[40] The PCA reweights the variants included in the PRS to achieve maximum variation across all the p_t.[40] This PRS-PCA approach avoids the optimization step in standard pruning and thresholding models to determine the optimal p_t, which can inflate type 1 error and result in overfitting.[40] As a sensitivity analysis, we further excluded variants located ±250 kb of the APOE ε4-defining SNP, rs429358.…”
Section: Polygenic Risk Scores (mentioning)
confidence: 99%
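
The PRS-PCA step described in this statement can be illustrated with a minimal sketch. It assumes a matrix with one PRS column per p-value threshold, standardizes each column before the PCA, and orients the sign of the first component so that loadings sum to a positive value; the function name `prs_pca`, the standardization choice, and the simulated data are illustrative assumptions, not details taken from the cited paper.

```python
import numpy as np


def prs_pca(prs_matrix):
    """Return the first principal component of PRSs built at several thresholds.

    prs_matrix: array of shape (n_samples, n_thresholds); column j holds the
    PRS computed with the j-th p-value threshold (hypothetical input layout).
    """
    prs = np.asarray(prs_matrix, dtype=float)
    # Standardize each column so no single threshold dominates (an assumption).
    z = (prs - prs.mean(axis=0)) / prs.std(axis=0, ddof=1)
    # SVD of the standardized matrix; rows of vt are the component loadings.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a principal component is arbitrary; fix it for readability.
    if loadings.sum() < 0:
        loadings = -loadings
    return z @ loadings, loadings


# Simulated toy data (not from any study): 10 correlated threshold-specific scores.
rng = np.random.default_rng(0)
n_samples, n_thresholds = 500, 10
shared = rng.normal(size=(n_samples, 1))
prs_matrix = shared + 0.5 * rng.normal(size=(n_samples, n_thresholds))

pc1, loadings = prs_pca(prs_matrix)
print(pc1.shape, np.round(loadings, 3))
```

Because the first component is computed without looking at the outcome, a single association test on `pc1` replaces the search over thresholds.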
“…[40] The PCA reweights the variants included in the PRS to achieve maximum variation across all the p_t.[40] This PRS-PCA approach avoids the optimization step in standard pruning and thresholding models to determine the optimal p_t, which can inflate type 1 error and result in overfitting.[40] As a sensitivity analysis, we further excluded variants located ±250 kb of the APOE ε4-defining SNP, rs429358. The association between each exposure PRS and AD was evaluated using logistic regression adjusting for age, sex, APOE ε4 dose, and 10 principal components.…”
Section: Polygenic Risk Scores (mentioning)
confidence: 99%
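
A covariate-adjusted logistic regression of the kind described in this statement might look like the following sketch. It fits the model by Newton-Raphson in NumPy rather than with a statistics package, and every variable name, effect size, and data value is simulated for illustration; none of it comes from the citing study.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, tol=1e-10):
    """Fit logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        # Newton step: solve (X' W X) delta = X' (y - p).
        hessian = X.T @ (X * w[:, None])
        delta = np.linalg.solve(hessian, X.T @ (y - p))
        beta = beta + delta
        if np.max(np.abs(delta)) < tol:
            break
    # Standard errors from the inverse information matrix at the final estimate.
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta, np.sqrt(np.diag(cov))


# Simulated data: outcome depends on the PRS component, age, and APOE e4 dose.
rng = np.random.default_rng(1)
n = 2000
prs_pc1 = rng.normal(size=n)
age = rng.normal(70, 8, size=n)
sex = rng.integers(0, 2, size=n)
apoe4_dose = rng.binomial(2, 0.15, size=n)
ancestry_pcs = rng.normal(size=(n, 10))

log_odds = -0.5 + 0.5 * prs_pc1 + 0.03 * (age - 70) + 0.8 * apoe4_dose
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

# Design matrix: intercept, PRS component, then the adjustment covariates.
X = np.column_stack(
    [np.ones(n), prs_pc1, (age - 70) / 8, sex, apoe4_dose, ancestry_pcs]
)
beta, se = fit_logistic(X, y)
print(f"PRS log-odds ratio: {beta[1]:.3f} (SE {se[1]:.3f})")
```

The coefficient at index 1 is the PRS effect conditional on age, sex, APOE ε4 dose, and the 10 ancestry components.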
“…All PGS were regressed on genotype batch and 10 principal components to account for possible structural artefacts in the data. Next, in order to have a single polygenic predictor per neurodevelopmental trait in the analyses, we extracted the first principal component from analyses of scores at all 10 thresholds for a given phenotype, an approach recently shown by Coombes, Ploner, Bergen, and Biernacka (2020) to maximise prediction while reducing the risk of over-fitting. Full details of the parameters used in generating the PGS and correlations between the PGS at each threshold and the principal component score (henceforth: the PGS) used in these analyses are presented in online Supplementary eMethods 2.…”
Section: Polygenic Scores (mentioning)
confidence: 99%
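
The two steps in this statement, residualizing the scores on batch and ancestry components and then taking the first principal component across thresholds, can be sketched as follows. The helper name `residualize`, the dummy coding of batch, and the simulated data are assumptions made for illustration only.

```python
import numpy as np


def residualize(scores, covariates):
    """Return OLS residuals of each score column regressed on covariates plus an intercept."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return scores - design @ coef


# Simulated data: 10 threshold-specific PGS contaminated by batch and ancestry.
rng = np.random.default_rng(2)
n, n_thresholds = 400, 10
ancestry_pcs = rng.normal(size=(n, 10))
batch = rng.integers(0, 3, size=n)
# Dummy-code batch, dropping the first level as the reference category.
batch_dummies = np.eye(3)[batch][:, 1:]
covariates = np.column_stack([batch_dummies, ancestry_pcs])

signal = rng.normal(size=(n, 1))
pgs = (
    signal
    + 0.3 * ancestry_pcs[:, [0]]
    + 0.2 * batch[:, None]
    + 0.5 * rng.normal(size=(n, n_thresholds))
)

# Step 1: remove variation explained by batch and the 10 ancestry components.
pgs_resid = residualize(pgs, covariates)

# Step 2: first principal component of the standardized residual scores.
z = (pgs_resid - pgs_resid.mean(axis=0)) / pgs_resid.std(axis=0, ddof=1)
_, _, vt = np.linalg.svd(z, full_matrices=False)
pgs_pc1 = z @ vt[0]
print(pgs_pc1.shape)
```

Residualizing first means the resulting component is uncorrelated with the covariates by construction, so it captures shared variation across thresholds rather than batch or ancestry structure.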
“…Two further methods, SBLUP and SBayesR, can be considered pseudovalidation approaches as they also do not require a tuning sample to identify optimal parameters. Rather than selecting a single tuning parameter, some studies have suggested that combining polygenic scores across p-value thresholds whilst taking into account their correlation using either PCA or model stacking can improve prediction [14,15].…”
Section: Introduction (mentioning)
confidence: 99%