Large-scale cis-and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression
SummaryWhile many disease-associated variants have been identified through genome-wide association studies, their downstream molecular consequences remain unclear.To identify these effects, we performed cis- and trans-expression quantitative trait locus (eQTL) analysis in blood from 31,684 individuals through the eQTLGen Consortium.We observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a different genetic architecture compared to disease-associated variants, limiting our ability to use cis-eQTLs to pinpoint causal genes within susceptibility loci.In contrast, trans-eQTLs (detected for 37% of 10,317 studied trait-associated variants) were more informative. Multiple unlinked variants, associated to the same complex trait, often converged on trans-genes that are known to play central roles in disease etiology.We observed the same when ascertaining the effect of polygenic scores calculated for 1,263 genome-wide association study (GWAS) traits. Expression levels of 13% of the studied genes correlated with polygenic scores, and many resulting genes are known to drive these traits.
Population isolates such as those in Finland benefit genetic research because deleterious alleles are often concentrated on a small number of low-frequency variants (0.1% ≤ minor allele frequency < 5%). These variants survived the founding bottleneck rather than being distributed over a large number of ultrarare variants. Although this effect is well established in Mendelian genetics, its value in common disease genetics is less explored1,2. FinnGen aims to study the genome and national health register data of 500,000 Finnish individuals. Given the relatively high median age of participants (63 years) and the substantial fraction of hospital-based recruitment, FinnGen is enriched for disease end points. Here we analyse data from 224,737 participants from FinnGen and study 15 diseases that have previously been investigated in large genome-wide association studies (GWASs). We also include meta-analyses of biobank data from Estonia and the United Kingdom. We identified 30 new associations, primarily low-frequency variants, enriched in the Finnish population. A GWAS of 1,932 diseases also identified 2,733 genome-wide significant associations (893 phenome-wide significant (PWS), P < 2.6 × 10–11) at 2,496 (771 PWS) independent loci with 807 (247 PWS) end points. Among these, fine-mapping implicated 148 (73 PWS) coding variants associated with 83 (42 PWS) end points. Moreover, 91 (47 PWS) had an allele frequency of <5% in non-Finnish European individuals, of which 62 (32 PWS) were enriched by more than twofold in Finland. These findings demonstrate the power of bottlenecked populations to find entry points into the biology of common diseases through low-frequency, high impact variants.
Summary Height is a highly heritable, classic polygenic trait with ∼700 common associated variants identified so far through genome-wide association studies. Here, we report 83 height-associated coding variants with lower minor allele frequencies (range of 0.1-4.8%) and effects of up to 2 cm/allele (e.g. in IHH, STC2, AR and CRISPLD2), >10 times the average effect of common variants. In functional follow-up studies, rare height-increasing alleles of STC2 (+1-2 cm/allele) compromised proteolytic inhibition of PAPP-A and increased cleavage of IGFBP-4 in vitro, resulting in higher bioavailability of insulin-like growth factors. These 83 height-associated variants overlap genes mutated in monogenic growth disorders and highlight new biological candidates (e.g. ADAMTS3, IL11RA, NOX4) and pathways (e.g. proteoglycan/glycosaminoglycan synthesis) involved in growth. Our results demonstrate that sufficiently large sample sizes can uncover rare and low-frequency variants of moderate to large effect associated with polygenic human phenotypes, and that these variants implicate relevant genes and pathways.
Genome-wide association studies (GWAS) have identified more than 100 genetic variants contributing to BMI, a measure of body size, or waist-to-hip ratio (adjusted for BMI, WHRadjBMI), a measure of body shape. Body size and shape change as people grow older and these changes differ substantially between men and women. To systematically screen for age- and/or sex-specific effects of genetic variants on BMI and WHRadjBMI, we performed meta-analyses of 114 studies (up to 320,485 individuals of European descent) with genome-wide chip and/or Metabochip data by the Genetic Investigation of Anthropometric Traits (GIANT) Consortium. Each study tested the association of up to ~2.8M SNPs with BMI and WHRadjBMI in four strata (men ≤50y, men >50y, women ≤50y, women >50y) and summary statistics were combined in stratum-specific meta-analyses. We then screened for variants that showed age-specific effects (G x AGE), sex-specific effects (G x SEX) or age-specific effects that differed between men and women (G x AGE x SEX). For BMI, we identified 15 loci (11 previously established for main effects, four novel) that showed significant (FDR<5%) age-specific effects, of which 11 had larger effects in younger (<50y) than in older adults (≥50y). No sex-dependent effects were identified for BMI. For WHRadjBMI, we identified 44 loci (27 previously established for main effects, 17 novel) with sex-specific effects, of which 28 showed larger effects in women than in men, five showed larger effects in men than in women, and 11 showed opposite effects between sexes. No age-dependent effects were identified for WHRadjBMI. This is the first genome-wide interaction meta-analysis to report convincing evidence of age-dependent genetic effects on BMI. In addition, we confirm the sex-specificity of genetic effects on WHRadjBMI. These results may provide further insights into the biology that underlies weight change with age or the sexually dimorphism of body shape.
Population isolates such as Finland provide benefits in genetic studies because the allelic spectrum of damaging alleles in any gene is often concentrated on a small number of low-frequency variants (0.1% ≤ minor allele frequency < 5%), which survived the founding bottleneck, as opposed to being distributed over a much larger number of ultra--rare variants. While this advantage is well-- established in Mendelian genetics, its value in common disease genetics has been less explored. FinnGen aims to study the genome and national health register data of 500,000 Finns, already reaching 224,737 genotyped and phenotyped participants. Given the relatively high median age of participants (63 years) and dominance of hospital-based recruitment, FinnGen is enriched for many disease endpoints often underrepresented in population-based studies (e.g., rarer immune-mediated diseases and late onset degenerative and ophthalmologic endpoints). We report here a genome-wide association study (GWAS) of 1,932 clinical endpoints defined from nationwide health registries. We identify genome--wide significant associations at 2,491 independent loci. Among these, finemapping implicates 148 putatively causal coding variants associated with 202 endpoints, 104 with low allele frequency (AF<10%) of which 62 were over two-fold enriched in Finland.We studied a benchmark set of 15 diseases that had previously been investigated in large genome-wide association studies. FinnGen discovery analyses were meta-analysed in Estonian and UK biobanks. We identify 30 novel associations, primarily low-frequency variants strongly enriched, in or specific to, the Finnish population and Uralic language family neighbors in Estonia and Russia.These findings demonstrate the power of bottlenecked populations to find unique entry points into the biology of common diseases through low-frequency, high impact variants. Such high impact variants have a potential to contribute to medical translation including drug discovery.
Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs, indicating their potential involvement in regulation of gene expression. Here, we propose a transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) that uses multiple SNPs as instruments and multiple gene expression traits as exposures, simultaneously. Applied to 43 human phenotypes, it uncovers 3,913 putatively causal gene–trait associations, 36% of which have no genome-wide significant SNP nearby in previous GWAS. Using independent association summary statistics, we find that the majority of these loci were missed by GWAS due to power issues. Noteworthy among these links is educational attainment-associated BSCL2 , known to carry mutations leading to a Mendelian form of encephalopathy. We also find pleiotropic causal effects suggestive of mechanistic connections. TWMR better accounts for pleiotropy and has the potential to identify biological mechanisms underlying complex traits.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.