The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world [1]. Here we describe the release of exome-sequence data for the first 49,960 study participants, revealing approximately 4 million coding variants (of which around 98.6% have a frequency of less than 1%). The data include 198,269 autosomal predicted loss-of-function (LOF) variants, a more than 14-fold increase compared to the imputed sequence. Nearly all genes (more than 97%) had at least one carrier with a LOF variant, and most genes (more than 69%) had at least ten carriers with a LOF variant. We illustrate the power of characterizing LOF variants in this population through association analyses across 1,730 phenotypes. In addition to replicating established associations, we found novel LOF variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical importance, and show that 2% of this population has a medically actionable variant. Furthermore, we characterize the penetrance of cancer in carriers of pathogenic BRCA1 and BRCA2 variants. Exome sequences from the first 49,960 participants highlight the promise of genome sequencing in large population-based studies and are now accessible to the scientific community.
SUMMARY: The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world. Here we describe the first tranche of large-scale exome sequence data for 49,960 study participants, revealing approximately 4 million coding variants (of which ~98.4% have frequency < 1%). The data include 231,631 predicted loss of function variants, a >10-fold increase compared to imputed sequence for the same participants. Nearly all genes (>97%) had ≥1 predicted loss of function carrier, and most genes (>69%) had ≥10 loss of function carriers. We illustrate the power of characterizing loss of function variation in this large population through association analyses across 1,741 phenotypes. In addition to replicating a range of established associations, we discover novel loss of function variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical significance in this population, finding that 2% of the population has a medically actionable variant. Additionally, we leverage the phenotypic data to characterize the relationship between rare BRCA1 and BRCA2 pathogenic variants and cancer risk. Exomes from the first 49,960 participants are now made accessible to the scientific community and highlight the promise offered by genomic sequencing in large-scale population-based studies.
Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine learning method called REGENIE for fitting a whole genome regression model that is orders of magnitude faster than alternatives, while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes, and only requires local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. The method is applicable to both quantitative and binary phenotypes, including rare variant analysis of binary traits with unbalanced case-control ratios where we introduce a fast, approximate Firth logistic regression test. The method is ideally suited to take
M Gorski et al.: Rapid kidney function decline (clinical investigation)
Sequencing of large cohorts offers an unprecedented opportunity to identify rare genetic variants and to find novel contributors to human disease. We used gene-based collapsing tests to identify genes associated with glucose, HbA1c and type 2 diabetes (T2D) diagnosis in 379,066 exome-sequenced participants in the UK Biobank. We identified associations for variants in GCK, HNF1A and PDX1, which are known to be involved in Mendelian forms of diabetes. Notably, we uncovered novel associations for GIGYF1, a gene not previously implicated by human genetics in diabetes. GIGYF1 predicted loss of function (pLOF) variants were associated with increased levels of glucose (0.77 mmol/L increase, p = 4.42 × 10⁻¹²) and HbA1c (4.33 mmol/mol increase, p = 1.28 × 10⁻¹⁴) as well as T2D diagnosis (OR = 4.15, p = 6.14 × 10⁻¹¹). Multiple rare variants contributed to these associations, including singleton variants. GIGYF1 pLOF was also associated with decreased cholesterol levels as well as an increased risk of hypothyroidism. The association of GIGYF1 pLOF with T2D diagnosis replicated in an independent cohort from the Geisinger Health System. In addition, a common variant association for glucose and T2D was identified at the GIGYF1 locus. Our results highlight the role of GIGYF1 in regulating insulin signaling and protecting from diabetes.
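The idea behind a gene-based collapsing test can be sketched in a few lines: each participant is reduced to a single carrier flag for the gene (at least one qualifying rare variant), and carrier status is then compared between cases and controls. The sketch below is only an illustration under stated assumptions: the sample names, genotype counts and case set are made-up toy data, the function names are hypothetical, and a two-sided Fisher's exact test is swapped in as a simple stand-in. The abstract does not say which test statistic or covariate adjustment the authors used.

```python
from math import comb


def collapse_carriers(genotypes):
    """Collapse per-variant alt-allele counts into one carrier flag per sample."""
    return {sample: any(count > 0 for count in counts)
            for sample, counts in genotypes.items()}


def carrier_table(carriers, cases):
    """Return the 2x2 counts (carrier cases, carrier controls,
    non-carrier cases, non-carrier controls)."""
    a = sum(1 for s, is_carrier in carriers.items() if is_carrier and s in cases)
    b = sum(1 for s, is_carrier in carriers.items() if is_carrier and s not in cases)
    c = sum(1 for s, is_carrier in carriers.items() if not is_carrier and s in cases)
    d = sum(1 for s, is_carrier in carriers.items() if not is_carrier and s not in cases)
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def pmf(x):
        # Hypergeometric probability of x carrier cases with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = pmf(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(pmf(x) for x in range(lowest, highest + 1)
                if pmf(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Toy data (hypothetical): alt-allele counts at two qualifying variants in one gene.
genotypes = {
    "s1": [0, 1], "s2": [0, 0], "s3": [1, 0], "s4": [0, 0],
    "s5": [0, 0], "s6": [2, 0], "s7": [0, 0], "s8": [0, 1],
}
cases = {"s1", "s3", "s6", "s7"}

carriers = collapse_carriers(genotypes)
table = carrier_table(carriers, cases)
p_value = fisher_exact_two_sided(*table)
print(table, round(p_value, 4))
```

Collapsing trades per-variant resolution for power: singleton variants that could never reach significance on their own still contribute to the gene-level carrier count.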
A major challenge in genetic association studies is that most associated variants fall in the non-coding part of the human genome. We searched for variants associated with bone mineral density (BMD) after enriching the discovery cohort for loss-of-function (LoF) mutations by sequencing a subset of the Nord-Trøndelag Health Study, followed by imputation in the remaining sample (N = 19,705), and identified ten known BMD loci. However, one previously unreported variant, a LoF mutation in MEPE, p.(Lys70IlefsTer26) (minor allele frequency [MAF] = 0.8%), was associated with decreased ultradistal forearm BMD (P-value = 2.1 × 10⁻¹⁸), and increased osteoporosis (P-value = 4.2 × 10⁻⁵) and fracture risk (P-value = 1.6 × 10⁻⁵). The MEPE LoF association with BMD and fractures was further evaluated in 279,435 UK (MAF = 0.05%, heel bone estimated BMD P-value = 1.2 × 10⁻¹⁶, any fracture P-value = 0.05) and 375,984 Icelandic samples (MAF = 0.03%, arm BMD P-value = 0.12, forearm fracture P-value = 0.005). Screening for the MEPE LoF mutations before adulthood could potentially prevent osteoporosis and fractures due to the lifelong effect on BMD observed in the study. A key implication for precision medicine is that high-impact functional variants missing from the publicly available cosmopolitan panels could be clinically more relevant than polygenic risk scores.
BACKGROUND: Mood disorders and strokes are often comorbid, and their health toll worldwide is huge. This study characterizes prognostic and causal roles of mood disorders in stroke. METHODS: We tested if genetic susceptibilities for mood disorders were associated with all strokes and ischemic strokes in the Malmö Diet and Cancer cohort (24 631 individuals with a median follow-up of 21.3 [interquartile range: 16.6–23.2] years). We further examined the causal effects of mood disorders on all strokes and ischemic strokes using summary statistics from large genome-wide association studies of mood disorders (up to 609 424 individuals, Psychiatric Genomics Consortium) and of all strokes and ischemic strokes (up to 446 696 individuals, MEGASTROKE Consortium). RESULTS: Among 24 366 stroke-free participants at baseline, 2632 individuals developed strokes, 2172 of them ischemic, during follow-up. After properly adjusting for well-known risk factors, participants in the highest quintile of polygenic risk scores for mood disorders had 1.45× (95% CI, 1.21–1.74) higher risk of strokes and 1.44× (95% CI, 1.18–1.76) higher risk of ischemic strokes compared with the lowest quintile in women. Mendelian randomization analyses suggested that mood disorders had a causal effect on strokes (odds ratio, 1.07 [95% CI, 1.03–1.11]) and ischemic strokes (odds ratio, 1.09 [95% CI, 1.04–1.13]). CONCLUSIONS: Our results suggest a causal role of mood disorders in the risk of stroke. High-risk women could be identified early in life using polygenic risk scores to ultimately prevent mood disorders and strokes.
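As a rough illustration of how a Mendelian randomization estimate can be formed from summary statistics alone, the sketch below computes an inverse-variance weighted (IVW) estimate. This is an assumption made for illustration: the abstract does not state which estimator the authors used, the function name is hypothetical, and the per-variant effect sizes are invented toy numbers rather than values from the cited consortia.

```python
from math import exp, sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal log odds ratio and its standard error.

    Each variant contributes a ratio beta_outcome / beta_exposure, weighted by
    beta_exposure**2 / se_outcome**2 (first-order weights).
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, sqrt(1.0 / denominator)


# Toy summary statistics (hypothetical): per-variant effects on the exposure
# (mood disorder liability) and on the outcome (stroke, log odds scale).
beta_exposure = [0.10, 0.20]
beta_outcome = [0.01, 0.02]
se_outcome = [0.01, 0.01]

beta_ivw, se_ivw = ivw_estimate(beta_exposure, beta_outcome, se_outcome)

# Convert the log odds ratio and its 95% confidence limits to the odds ratio scale.
odds_ratio = exp(beta_ivw)
ci_low = exp(beta_ivw - 1.96 * se_ivw)
ci_high = exp(beta_ivw + 1.96 * se_ivw)
print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3))
```

The estimate is only interpretable as causal if the variants affect the outcome solely through the exposure, which is why studies of this kind typically report sensitivity analyses alongside the main estimate.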