Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs, indicating their potential involvement in the regulation of gene expression. Here, we propose a transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) that uses multiple SNPs as instruments and multiple gene expression traits as exposures, simultaneously. Applied to 43 human phenotypes, it uncovers 3,913 putatively causal gene–trait associations, 36% of which have no genome-wide significant SNP nearby in previous GWAS. Using independent association summary statistics, we find that the majority of these loci were missed by GWAS due to power issues. Noteworthy among these links is educational attainment-associated BSCL2, known to carry mutations leading to a Mendelian form of encephalopathy. We also find pleiotropic causal effects suggestive of mechanistic connections. TWMR better accounts for pleiotropy and has the potential to identify biological mechanisms underlying complex traits.
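The idea of using multiple SNPs as instruments and multiple gene expression traits as exposures at once can be illustrated with a generic multivariable Mendelian Randomization estimator on summary statistics. The sketch below is an assumption-laden illustration, not the authors' implementation: it assumes standardized effect sizes, a known SNP-SNP correlation (LD) matrix, and a generalized least-squares estimator of the form (E'C⁻¹E)⁻¹E'C⁻¹G. The names `multivariable_mr`, `E`, `G` and `C` are hypothetical.

```python
import numpy as np

def multivariable_mr(E, G, C):
    """Joint causal effect estimates of several exposures on one outcome.

    E : (n_snps, n_genes) array of SNP-to-expression (eQTL) effects
    G : (n_snps,) array of SNP-to-trait (GWAS) effects
    C : (n_snps, n_snps) SNP correlation (LD) matrix, assumed invertible
    Returns an (n_genes,) array of estimated causal effects.
    """
    E = np.asarray(E, dtype=float)
    G = np.asarray(G, dtype=float)
    C = np.asarray(C, dtype=float)
    # Solve linear systems instead of forming an explicit inverse of C.
    Cinv_E = np.linalg.solve(C, E)
    Cinv_G = np.linalg.solve(C, G)
    S = E.T @ Cinv_E              # E' C^-1 E
    return np.linalg.solve(S, E.T @ Cinv_G)

# Toy usage with simulated (not real) summary statistics.
rng = np.random.default_rng(1)
E_toy = rng.normal(size=(8, 3))
C_toy = 0.2 * np.ones((8, 8)) + 0.8 * np.eye(8)   # mild uniform LD
G_toy = E_toy @ np.array([0.3, 0.0, -0.1])
alpha_hat = multivariable_mr(E_toy, G_toy, C_toy)
```

Estimating all exposures jointly means that the effect attributed to one gene is conditional on the others in the model, which is one way such an approach can account for pleiotropy acting through neighbouring genes.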
Previous genome-wide association studies (GWASs) of stroke — the second leading cause of death worldwide — were conducted predominantly in populations of European ancestry1,2. Here, in cross-ancestry GWAS meta-analyses of 110,182 patients who have had a stroke (five ancestries, 33% non-European) and 1,503,898 control individuals, we identify association signals for stroke and its subtypes at 89 (61 new) independent loci: 60 in primary inverse-variance-weighted analyses and 29 in secondary meta-regression and multitrait analyses. On the basis of internal cross-ancestry validation and an independent follow-up in 89,084 additional cases of stroke (30% non-European) and 1,013,843 control individuals, 87% of the primary stroke risk loci and 60% of the secondary stroke risk loci were replicated (P < 0.05). Effect sizes were highly correlated across ancestries. Cross-ancestry fine-mapping, in silico mutagenesis analysis3, and transcriptome-wide and proteome-wide association analyses revealed putative causal genes (such as SH3PXD2A and FURIN) and variants (such as at GRK5 and NOS3). Using a three-pronged approach4, we provide genetic evidence for putative drug effects, highlighting F11, KLKB1, PROC, GP1BA, LAMC2 and VCAM1 as possible targets, with drugs already under investigation for stroke for F11 and PROC. A polygenic score integrating cross-ancestry and ancestry-specific stroke GWASs with vascular-risk factor GWASs (integrative polygenic scores) strongly predicted ischaemic stroke in populations of European, East Asian and African ancestry5. Stroke genetic risk scores were predictive of ischaemic stroke independent of clinical risk factors in 52,600 clinical-trial participants with cardiometabolic disease. Our results provide insights to inform biology, reveal potential drug targets and derive genetic risk prediction tools across ancestries.
Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs (eQTLs), indicating their potential involvement in the regulation of gene expression. Here, we propose an advanced summary statistics-based Mendelian Randomization approach that uses multiple SNPs jointly as instruments and multiple gene expression traits as exposures, simultaneously. When applied to 43 human phenotypes, it uncovered 2,277 putative genes whose blood expression is causally associated with at least one phenotype, resulting in 5,009 gene-trait associations; of note, 55% of them had no genome-wide significant SNP nearby in previous GWAS analysis. Using independent association summary statistics (UK Biobank), we confirmed that the majority of these loci were missed by conventional GWAS due to power issues. Noteworthy among these novel links are height- and intelligence-associated PEX19 and CDC42, known to carry mutations leading to short stature and Takenouchi-Kosaki syndrome, respectively. We similarly unraveled novel pleiotropic causal effects suggestive of mechanistic connections, e.g. the shared genetic effects of TSPAN14 in rheumatoid arthritis, Crohn's disease and inflammatory bowel disease. Finally, we show that causal genes can be highly tissue-specific. Our advanced Mendelian Randomization unlocks hidden value from published GWAS through higher power in detecting associations. It better accounts for pleiotropy and unravels new biological mechanisms underlying complex and clinical traits.
Comparing transcript levels between healthy and diseased individuals allows the identification of differentially expressed genes, which may be causes, consequences or mere correlates of the disease under scrutiny. We propose a method to decompose the observational correlation between gene expression and phenotypes into components driven by confounders, forward causal effects and reverse causal effects. The bi-directional causal effects between gene expression and complex traits are obtained by Mendelian Randomization integrating summary-level data from GWAS and whole-blood eQTLs. Applying this approach to complex traits reveals that forward effects make a negligible contribution. For example, BMI- and triglycerides-gene expression correlation coefficients robustly correlate with trait-to-expression causal effects (r_BMI = 0.11, P_BMI = 2.0 × 10^−51 and r_TG = 0.13, P_TG = 1.1 × 10^−68), but not detectably with expression-to-trait effects. Our results demonstrate that studies comparing the transcriptome of diseased and healthy subjects are more prone to reveal disease-induced gene expression changes than disease-causing ones.
Elevated C-reactive protein (CRP) concentrations in the blood are associated with acute and chronic infections and inflammation. Nevertheless, the functional role of increased CRP in multiple bacterial and viral infections as well as in chronic inflammatory diseases remains unclear. Here, we studied the relationship between CRP and gene expression levels in the blood in 491 individuals from the Estonian Biobank cohort, to elucidate the role of CRP in these inflammatory mechanisms. As a result, we identified a set of 1,614 genes associated with changes in CRP levels with a high proportion of interferon-stimulated genes. Further, we performed likelihood-based causality model selection and Mendelian randomization analysis to discover causal links between CRP and the expression of CRP-associated genes. Strikingly, our computational analysis and cell culture stimulation assays revealed increased CRP levels to drive the expression of complement regulatory protein CD59, suggesting CRP to have a critical role in protecting blood cells from the adverse effects of the immune defence system. Our results show the benefit of integrative analysis approaches in hypothesis-free uncovering of causal relationships between traits.
High-dimensional omics datasets provide valuable resources to determine the causal role of molecular traits in mediating the path from genotype to phenotype. Making use of molecular quantitative trait loci (QTL) and genome-wide association study (GWAS) summary statistics, we propose a multivariable Mendelian randomization (MVMR) framework to quantify the proportion of the impact of the DNA methylome (DNAm) on complex traits that is propagated through the assayed transcriptome. Evaluating 50 complex traits, we find that on average at least 28.3% (95% CI: [26.9%–29.8%]) of DNAm-to-trait effects are mediated through (typically multiple) transcripts in the cis-region. Several regulatory mechanisms are hypothesized, including methylation of the promoter probe cg10385390 (chr1:8’022’505) increasing the risk for inflammatory bowel disease by reducing PARK7 expression. The proposed integrative framework can be extended to other omics layers to identify causal molecular chains, providing a powerful tool to map and interpret GWAS signals.
Comparing transcript levels between healthy and diseased individuals allows the identification of differentially expressed genes, which may be causes, consequences or mere correlates of the disease under scrutiny. Here, we propose a bi-directional Transcriptome-Wide Mendelian Randomization (TWMR) approach that integrates summary-level data from GWAS and whole-blood eQTLs in an MR framework to investigate the causal effects between gene expression and complex traits. Whereas we have previously developed a TWMR approach to elucidate gene expression-to-trait causal effects, here we adapt the method to shed light on the causal imprint of complex traits on transcript levels. We termed this new approach reverse TWMR (revTWMR). Integrating bi-directional causal effects between gene expression and complex traits makes it possible to evaluate their respective contributions to the correlation between gene expression and traits. We found that whole-blood gene expression-trait correlation is mainly driven by the causal effect of the phenotype on expression rather than the reverse. For example, BMI- and triglycerides-gene expression correlation coefficients robustly correlate with trait-to-expression causal effects (r = 0.09, P = 1.54 × 10^−39 and r = 0.09, P = 1.19 × 10^−34, respectively), but not detectably with expression-to-trait effects. Genes implicated by revTWMR confirmed known associations, such as rheumatoid arthritis- and Crohn's disease-induced changes in expression of TRBV and GBP2, respectively. They also shed light on how clinical biomarkers can influence their own levels. For instance, we observed that high levels of high-density lipoprotein (HDL) cholesterol lower the expression of genes involved in cholesterol biosynthesis (SQLE, FDFT1) and increase the expression of genes responsible for cholesterol efflux (ABCA1, ABCG1), two key molecular pathways in determining HDL levels.
Importantly, revTWMR is more robust to pleiotropy than polygenic risk score (PRS) approaches, which can be misled by pleiotropic outliers. As one example, revTWMR revealed that the previously reported association between educational attainment PRS and STX1B is exclusively driven by a highly pleiotropic SNP (rs2456973), which is strongly associated with several hematological and anthropometric traits. In conclusion, our method disentangles the relationship between gene expression and phenotypes and reveals that complex traits have a more pronounced impact on gene expression than the reverse. We demonstrated that studies comparing the transcriptome of diseased and healthy subjects are more prone to reveal disease-induced gene expression changes than disease-causing ones.
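The sensitivity of a trait-to-expression estimate to a single pleiotropic SNP can be illustrated with a standard fixed-effect inverse-variance-weighted (IVW) Mendelian Randomization estimator and a leave-one-out check. This is a minimal sketch under stated assumptions (independent instruments, negligible uncertainty in the SNP-trait effects), and it is not claimed to be the revTWMR procedure itself. The function names `ivw_estimate` and `leave_one_out` and the toy numbers are hypothetical.

```python
import numpy as np

def ivw_estimate(beta_trait, beta_expr, se_expr):
    """Fixed-effect IVW estimate of the trait-to-expression causal effect.

    beta_trait : SNP effects on the trait (the exposure here)
    beta_expr  : effects of the same SNPs on gene expression (the outcome)
    se_expr    : standard errors of beta_expr
    Returns (estimate, standard_error).
    """
    bx = np.asarray(beta_trait, dtype=float)
    by = np.asarray(beta_expr, dtype=float)
    w = 1.0 / np.asarray(se_expr, dtype=float) ** 2
    denom = np.sum(w * bx ** 2)
    return np.sum(w * bx * by) / denom, np.sqrt(1.0 / denom)

def leave_one_out(beta_trait, beta_expr, se_expr):
    """IVW estimates obtained after dropping each SNP in turn."""
    bx = np.asarray(beta_trait, dtype=float)
    by = np.asarray(beta_expr, dtype=float)
    se = np.asarray(se_expr, dtype=float)
    masks = ~np.eye(len(bx), dtype=bool)   # row i excludes SNP i
    return np.array([ivw_estimate(bx[m], by[m], se[m])[0] for m in masks])

# Toy data: three well-behaved SNPs plus one outlier on the expression side.
bx_toy = np.array([0.1, 0.2, 0.3, 0.4])
by_toy = 0.2 * bx_toy
by_toy[3] = 0.5                            # simulated pleiotropic outlier
se_toy = np.full(4, 0.05)
all_snps, _ = ivw_estimate(bx_toy, by_toy, se_toy)
dropped = leave_one_out(bx_toy, by_toy, se_toy)
```

If the estimate changes sharply when one SNP is removed, as it does for the last SNP in the toy data, the apparent effect is likely driven by that variant and not by the trait as a whole.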
Major biotechnological advances have facilitated a tremendous boost to the collection of (gen-/transcript-/prote-/methyl-/metabol-)omics data in very large sample sizes worldwide. Coordinated efforts have yielded a deluge of studies associating diseases with genetic markers (genome-wide association studies) or with molecular phenotypes. Whereas omics-disease associations have led to biologically meaningful and coherent mechanisms, the identified (non-germline) disease biomarkers may simply be correlates or consequences of the explored diseases. To move beyond this realm, Mendelian randomization provides a principled framework to integrate information on omics- and disease-associated genetic variants to pinpoint molecular traits causally driving disease development. In this review, we show the latest advances in this field, flag up key challenges for the future, and propose potential solutions.