2018
DOI: 10.1534/genetics.118.301267

Accurate Genomic Prediction of Human Height

Abstract: We construct genomic predictors for heritable but extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). The constructed predictors explain, respectively, ~40, 20, and 9% of total variance for the three traits, in data not used for training. For example, predicted heights correlate ~0.65 with actual height; actual heights of most individuals in validation samples are within a few centimete…
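
The two height figures in the abstract can be checked against each other. The sketch below assumes that the variance explained by a predictor is approximately the squared correlation between predicted and actual values; that relation is an assumption made here for illustration and is not stated in the abstract. The symbol $r$ is introduced only for this check.

```latex
% Assumption: variance explained is approximately r^2, with r the
% correlation between predicted and actual phenotype.
\begin{align*}
r_{\text{height}} &\approx 0.65
  &\Rightarrow\quad r_{\text{height}}^{2} &\approx 0.42 \approx 40\% \\
% Implied correlations for the other two traits under the same assumption:
r_{\text{bone density}} &\approx \sqrt{0.20} \approx 0.45 \\
r_{\text{education}} &\approx \sqrt{0.09} = 0.30
\end{align*}
```

Under that assumption, the reported correlation of about 0.65 and the reported 40% of variance for height agree to within rounding.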

Cited by 147 publications (143 citation statements)
References 26 publications
“…Previous studies have mostly focused on improving parameter estimation, through increasing sample size and methodological improvement. For example, increasing sample size substantially increased accuracy of polygenic prediction of height within individuals of European ancestry (Lello et al 2018). Inclusion of samples of different backgrounds in the training data also helped (Martin et al 2019) ( Figure S2).…”
Section: Discussion (mentioning)
confidence: 99%
“…Recently Lello et al (2018) apply a lasso based method to predict height and other phenotypes on the UK Biobank. Instead of fitting on all QC-satisfied SNPs (as stated in Section 4.1), they pre-screen 50K or 100K most significant SNPs in terms of p-value and apply lasso on that set only.…”
Section: Methods (mentioning)
confidence: 99%
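
The screen-then-lasso idea described in the statement above can be sketched as follows. This is a minimal illustration on synthetic data, not the pipeline used by Lello et al. or by the citing authors. The function names (`screen_top_k`, `lasso_cd`), the toy sample sizes, and the penalty value are all hypothetical. Ranking by absolute marginal correlation stands in for ranking by univariate p-value, and a plain coordinate-descent solver stands in for whatever lasso implementation the papers used.

```python
import numpy as np

def screen_top_k(X, y, k):
    """Keep the k columns with the largest absolute marginal correlation with y.
    For single-variant linear regression at a fixed sample size, this ordering
    is the same as ordering by p-value (assumption used for this sketch)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Constant columns get correlation 0 instead of a division by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return np.sort(np.argsort(corr)[::-1][:k])

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.
    Expects centered X columns and centered y (no intercept is fitted)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()  # equals y - X @ beta throughout
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return beta

# Synthetic stand-in for genotype data: 300 samples, 500 variants coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(300, 500)).astype(float)
y = X[:, :5] @ np.array([1.0, -1.0, 0.8, -0.8, 0.6]) + rng.normal(size=300)

# Step 1: one-shot univariate screening (50 variants here, 50K-100K in the text).
keep = screen_top_k(X, y, k=50)

# Step 2: lasso restricted to the screened set.
Xs = X[:, keep] - X[:, keep].mean(axis=0)
yc = y - y.mean()
lam_max = np.max(np.abs(Xs.T @ yc)) / len(yc)  # smallest penalty giving all zeros
beta = lasso_cd(Xs, yc, lam=0.1 * lam_max)
selected = keep[beta != 0]  # original column indices with nonzero weight
```

Variants dropped in step 1 can never enter the model in step 2, which is the limitation the citing authors point to when they fit the lasso on all QC-satisfied SNPs instead.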
“…While GWAS focus on identifying SNPs that may be marginally associated with the outcome using univariate tests, we would like to find relevant SNPs in a multivariate prediction model using the lasso. A recent study (Lello et al, 2018) fits the lasso to a similar subset of the dataset after one-shot univariate p-value screening and suggests improvement in explaining the variation in the phenotypes. However, the left-out variants with relatively weak marginal association may still provide additional predictive power in a multivariate environment.…”
Section: Application: UK Biobank (mentioning)
confidence: 99%
“…There is a range of machine learning analyses in use in molecular epidemiology. For example, lasso regression analysis of GWAS data is being used to develop prediction algorithms [69], although these have not yet been applied to studies of aging-related phenotypes; elastic-net regression analysis of epigenomic data is being used to develop novel measurements of the aging rate [24]; and neural-net analysis of a range of data types, including transcriptomic data, is being used to develop aging biomarkers and identify drug targets [70]. As machine learning approaches continue to mature, studies will be needed to compare methods and define best practices for implementation.…”
Section: New Developments In Molecular Epidemiology Of Aging (mentioning)
confidence: 99%