Polygenic scores in biomedical research

Kullo, Iftikhar J.; Lewis, Cathryn M.; Inouye, Michael; Martin, Alicia R.; Ripatti, Samuli; Chatterjee, Nilanjan

doi:10.1038/s41576-022-00470-z

Cited by 98 publications

(65 citation statements)

References 58 publications


“…Polygenic scores (PGS), estimates of an individual's genetic predisposition for complex traits/diseases (i.e., genetic value), are a promising application of large-scale genome-wide association studies (GWAS) to personalized genomic medicine [1][2][3][4], disease risk prediction and prevention [5][6][7][8]. The portability of PGS across different ancestry and socio-demographic groups is limited due to Euro-centric sampling of GWAS data coupled with differences in linkage disequilibrium (LD), minor allele frequency (MAF) and/or disease genetic architecture [3][9][10][11][12][13], which poses a critical equity barrier that has prevented widespread adoption of PGS for personalized medicine.…”

Section: Introduction (mentioning)

confidence: 99%

“…genetic value), are a promising application of large-scale genome-wide association studies (GWAS) to personalized genomic medicine [1][2][3][4], disease risk prediction and prevention [5][6][7][8]. The portability of PGS across different ancestry and socio-demographic groups is limited due to Euro-centric sampling of GWAS data coupled with differences in linkage disequilibrium (LD), minor allele frequency (MAF) and/or disease genetic architecture [3][9][10][11][12][13], which poses a critical equity barrier that has prevented widespread adoption of PGS for personalized medicine. For example, PGS are significantly more accurate for individuals of European ancestries as compared to other genetic ancestries [10][14]; furthermore, PGS accuracy varies across socio-genomic features (e.g., sex, age and social economic status) [11], thus complicating interpretability of PGS across groups with different environmental exposures.…”

Section: Introduction (mentioning)

confidence: 99%


Polygenic scoring accuracy varies across the genetic ancestry continuum in all human populations

Ding, Hou, Xu, et al. 2022 (preprint)

Polygenic scores (PGS) have limited portability across different groupings of individuals (e.g., by genetic ancestries and/or social determinants of health), preventing their equitable use. PGS portability has typically been assessed using a single aggregate population-level statistic (e.g., R2), ignoring inter-individual variation within the population. Here we evaluate PGS accuracy at individual-level resolution, independent of its annotated genetic ancestries. We show that PGS accuracy varies between individuals across the genetic ancestry continuum in all ancestries, even within traditionally "homogeneous" genetic ancestry clusters. Using a large and diverse Los Angeles biobank (ATLAS, N= 36,778) along with the UK Biobank (UKBB, N= 487,409), we show that PGS accuracy decreases along a continuum of genetic ancestries in all considered populations and the trend is well-captured by a continuous measure of genetic distance (GD) from the PGS training data; Pearson correlation of -0.95 between GD and PGS accuracy averaged across 84 traits. When applying PGS models trained in UKBB "white British" individuals to European-ancestry individuals of ATLAS, individuals in the highest GD decile have 14% lower accuracy relative to the lowest decile; notably the lowest GD decile of Hispanic/Latino American ancestry individuals showed similar PGS performance as the highest GD decile of European ancestry ATLAS individuals. GD is significantly correlated with PGS estimates themselves for 82 out of 84 traits, further emphasizing the importance of incorporating the continuum of genetic ancestry in PGS interpretation. Our results highlight the need for moving away from discrete genetic ancestry clusters towards the continuum of genetic ancestries when considering PGS and their applications.


“…Polygenic scores have been extensively studied for disease risk assessment [8][9][10][11][12][13][14][15]. However, the polygenic scores are often inadequate for accurate disease prediction, which is partially due to: 1) complex diseases are caused not only by genetic factors but by a combination of genetic, environmental, and lifestyle factors; and 2) the linear polygenic models lack the expressive power and model capacity to capture the complex non-linear, non-additive interactions that are inherent in the genotype-phenotype relationship.…”

Section: Introduction (mentioning)

confidence: 99%

Deep transfer learning provides a Pareto improvement for multi-ancestral clinico-genomic prediction of diseases

Gao, Cui 2022 (preprint)

Accurate predisposition assessment is essential for the prevention and early detection of diseases. Polygenic scores and machine learning models have been developed for disease prediction based on genetic variants and other risk factors. However, over 80% of genomic data were acquired from individuals of European descent. Other ethnic groups comprise the vast majority of the world population and have a severe data disadvantage. Due to the lack of suitable training data, clinico-genomic risk prediction is less accurate for the non-European population. Here we employ a transfer learning strategy to improve the clinico-genomic prediction of disease occurrence for data-disadvantaged populations. Our multiethnic machine learning experiments on real and synthetic datasets show that transfer learning can significantly improve disease prediction accuracy for data-disadvantaged populations. Under the transfer learning scheme, the prediction accuracy for the data-disadvantaged populations can be improved without compromising the prediction accuracy for other populations. Therefore, transfer learning provides a Pareto improvement toward equitable machine learning for genomic medicine.


“…PGSs combine the effect of many genetic variants on a phenotype, which can either be qualitative (e.g., disease status) or quantitative (e.g., blood biomarker level). While ethical and societal implications need careful consideration before widespread deployment [17][18], PGSs are increasingly being considered for their clinical utility, e.g., in the context of precision medicine [17][19][20][21][22][23]. Numerous methods exist for computing trait and disease PGSs on individual-level data [24][25][26][27][28] and summary statistics [29][30][31][32][33][34], but they generally only model additive relationships between genotype and target.…”

Section: Introduction (mentioning)

confidence: 99%

Improved prediction of blood biomarkers using deep learning

Sigurdsson, Ravn, Winther, et al. 2022 (preprint)

Blood and urine biomarkers are an essential part of modern medicine, not only for diagnosis, but also for their direct influence on disease. Many biomarkers have a genetic component, and they have been studied extensively with genome-wide association studies (GWAS) and methods that compute polygenic scores (PGSs). However, these methods generally assume both an additive allelic model and an additive genetic architecture for the target outcome, and thereby risk not capturing non-linear allelic effects nor epistatic interactions. Here, we trained and evaluated deep-learning (DL) models for PGS prediction of 34 blood and urine biomarkers in the UK Biobank cohort, and compared them to linear methods. For lipid traits, the DL models greatly outperformed the linear methods, which we found to be consistent across diverse populations. Furthermore, the DL models captured non-linear effects in covariates, non-additive genotype (allelic) effects, and epistatic interactions between SNPs. Finally, when using only genome-wide significant SNPs from GWAS, the DL models performed equally well or better for all 34 traits tested. Our findings suggest that DL can serve as a valuable addition to existing methods for genotype-phenotype modelling in the era of increasing data availability.
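The additive model that the abstract above contrasts with deep learning is commonly written as a weighted sum of allele dosages. The sketch below assumes that standard form; the function name `additive_pgs`, the three example weights, and the dosage values are hypothetical and do not come from the cited work.

```python
def additive_pgs(dosages, weights):
    """Additive polygenic score: sum over variants of weight * dosage.

    dosages: effect-allele counts per variant (0, 1 or 2; imputed
             values may be fractional).
    weights: per-variant effect sizes, e.g. GWAS betas (assumed given).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical example with three variants.
example_weights = [0.12, -0.05, 0.30]
example_dosages = [2, 1, 0]
score = additive_pgs(example_dosages, example_weights)
print(round(score, 2))  # 0.12*2 - 0.05*1 + 0.30*0, about 0.19
```

Each variant contributes independently and linearly here, which is why a model of this form cannot represent the non-additive allelic effects or epistatic interactions that the abstract mentions.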
