Predicting individual quantitative trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans, for plant and animal breeding, and for the study of adaptive evolution. However, this is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms and causal variants individually have small effects on the traits. We hypothesized that mapping molecular polymorphisms to genomic features such as genes and their gene ontology categories could increase the accuracy of genomic prediction models. We developed a genomic feature best linear unbiased prediction (GFBLUP) model that implements this strategy and applied it to three quantitative traits (startle response, starvation resistance, and chill coma recovery) in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel. Our results indicate that subsetting markers based on genomic features increases the predictive ability relative to the standard genomic best linear unbiased prediction (GBLUP) model. Both models use all markers, but GFBLUP allows differential weighting of the individual genetic marker relationships, whereas GBLUP weights the genetic marker relationships equally. Simulation studies show that it is possible to further increase the accuracy of genomic prediction for complex traits using this model, provided the genomic features are enriched for causal variants. Our GFBLUP model using prior information on genomic features enriched for causal variants can increase the accuracy of genomic predictions in populations of unrelated individuals and provides a formal statistical framework for leveraging and evaluating information across multiple experimental studies to provide novel insights into the genetic architecture of complex traits.
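The contrast between equal and differential weighting of marker relationships can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the genetic covariance is split into two components, one relationship matrix built from markers inside a genomic feature and one from the remaining markers, each with its own variance. The genotypes are synthetic, the feature membership is hypothetical, and names such as `relationship_matrix` and `genetic_covariance` are illustrative only.

```python
import numpy as np

def relationship_matrix(genotypes):
    """G = W W^T / m, where W holds column-standardised marker genotypes."""
    W = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    return W @ W.T / genotypes.shape[1]

rng = np.random.default_rng(0)
n_lines = 20
# Synthetic genotypes for inbred lines, coded 0/2 (illustrative data only).
genotypes = 2.0 * rng.integers(0, 2, size=(n_lines, 200))
genotypes = genotypes[:, genotypes.std(axis=0) > 0]  # drop monomorphic markers
n_markers = genotypes.shape[1]

# Hypothetical genomic feature: pretend the first 30 markers map to one GO term.
in_feature = np.zeros(n_markers, dtype=bool)
in_feature[:30] = True
m_f, m_r = in_feature.sum(), (~in_feature).sum()

G = relationship_matrix(genotypes)                    # all markers
G_f = relationship_matrix(genotypes[:, in_feature])   # feature markers
G_r = relationship_matrix(genotypes[:, ~in_feature])  # remaining markers

def genetic_covariance(var_f, var_r):
    """Assumed two-component genetic covariance: var_f * G_f + var_r * G_r."""
    return var_f * G_f + var_r * G_r

# Equal per-marker weighting: each component's variance is proportional to its
# marker count, which reproduces the single-matrix covariance var_g * G.
var_g = 1.0
equal_weighting = genetic_covariance(var_g * m_f / n_markers, var_g * m_r / n_markers)
gblup_equivalent = np.allclose(equal_weighting, var_g * G)

# Differential weighting: the feature markers carry more variance than their
# share of the marker count would give them (values chosen arbitrarily).
feature_weighted = genetic_covariance(0.6, 0.4)
```

Under these assumptions, equal weighting is the special case in which each component's variance is proportional to its number of markers. In practice the two variances would be estimated from the data rather than fixed by hand as they are here.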
Background
- There is considerable interest in whether genetic data can be used to improve standard cardiovascular disease risk calculators, which are routinely used in clinical practice to manage preventative treatment.
Methods
- Using the UK Biobank (UKB) resource, we developed our own polygenic risk score (PRS) for coronary artery disease (CAD). We used an additional 60,000 UKB individuals to develop an integrated risk tool (IRT) that combined our PRS with established risk tools (either the American Heart Association/American College of Cardiology's Pooled Cohort Equations (PCE) or the UK's QRISK3), and we tested our IRT in an additional, independent set of 186,451 UKB individuals.
Results
- The novel CAD PRS shows superior predictive power for CAD events compared to other published PRSs and is largely uncorrelated with PCE and QRISK3. When combined with PCE into an integrated risk tool, it has superior predictive accuracy. Overall, 10.4% of incident CAD cases were misclassified as low risk by PCE and correctly classified as high risk by the IRT, compared to 4.4% misclassified by the IRT and correctly classified by PCE. The overall net reclassification improvement for the IRT was 5.9% (95% CI 4.7-7.0). When individuals were stratified into age-by-sex subgroups, the improvement was larger for all subgroups (range 8.3%-15.4%), with the best performance in men aged 40-54 years (15.4%, 95% CI 11.6-19.3). Comparable results were found using a different risk tool (QRISK3) and also a broader definition of cardiovascular disease. Use of the IRT is estimated to avoid up to 12,000 deaths in the USA over a 5-year period.
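For readers unfamiliar with the metric, the following is a minimal sketch of a two-category net reclassification improvement, using the standard definition: the net fraction of cases moved up to high risk plus the net fraction of non-cases moved down to low risk. The abstract does not state the risk threshold or the exact estimator used, so the function name, the boolean high-risk flags, and the ten-person toy data are all hypothetical.

```python
import numpy as np

def net_reclassification_improvement(old_high, new_high, event):
    """Two-category NRI for a new risk tool versus an old one.

    old_high, new_high: boolean arrays, True if classified as high risk.
    event: boolean array, True if the individual had an incident event.
    Returns (event component, non-event component, total NRI).
    """
    old_high = np.asarray(old_high, dtype=bool)
    new_high = np.asarray(new_high, dtype=bool)
    event = np.asarray(event, dtype=bool)

    moved_up = new_high & ~old_high    # low risk -> high risk
    moved_down = old_high & ~new_high  # high risk -> low risk

    # Among cases, moving up is a correct reclassification.
    nri_events = moved_up[event].mean() - moved_down[event].mean()
    # Among non-cases, moving down is a correct reclassification.
    nri_nonevents = moved_down[~event].mean() - moved_up[~event].mean()
    return nri_events, nri_nonevents, nri_events + nri_nonevents

# Toy data (made up for illustration): 5 cases followed by 5 non-cases.
event    = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
old_high = [0, 0, 1, 1, 0, 1, 1, 0, 0, 0]
new_high = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]

nri_events, nri_nonevents, nri_total = net_reclassification_improvement(
    old_high, new_high, event
)
```

In this toy example, two of five cases move up and one moves down, and two of five non-cases move down and one moves up, so each component is 0.2 and the total is 0.4. The two case percentages reported in the results play the role of the moved-up and moved-down fractions among events in this definition.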
Conclusions
- An integrated risk tool that includes polygenic risk outperforms current risk stratification tools and offers greater opportunity for early interventions. Given the plummeting costs of genetic tests, future iterations of CAD risk tools would be enhanced with the addition of a person's polygenic risk.
The deleterious consequences of inbreeding, especially in the form of inbreeding depression, are well known. However, little is known about how inbreeding affects genome-wide gene expression. Here, we show that inbreeding changes transcription levels for a number of genes. Gene expression profiles of Drosophila melanogaster lines inbred to F ≈ 0.67 at different rates changed relative to those of noninbred lines, but the rate of inbreeding did not significantly affect gene expression patterns. Genes that are differentially expressed with inbreeding are disproportionately involved in metabolism and stress responses, suggesting that inbreeding acts like an environmental stress factor.