Clinical studies of non-communicable diseases identify multimorbidities that suggest a common set of predisposing factors. Despite the fact that humans have ~24,000 genes, we do not understand the genetic pathways that contribute to the development of multimorbid non-communicable disease. Here we create a multimorbidity atlas of traits based on pleiotropy of spatially regulated genes. Using chromatin interaction and expression Quantitative Trait Loci (eQTL) data, we analyse 20,782 variants (p < 5 × 10⁻⁶) associated with 1,351 phenotypes to identify 16,248 putative spatial eQTL-eGene pairs that are involved in 76,013 short- and long-range regulatory interactions (FDR < 0.05) in different human tissues. Convex biclustering of spatial eGenes that are shared among phenotypes identifies complex interrelationships between nominally different phenotype-associated SNPs. Our approach enables the simultaneous elucidation of variant interactions with target genes that are drivers of multimorbidity, and those that contribute to unique phenotype-associated characteristics.
Machine learning has shown utility in detecting patterns within large, unstructured, and complex datasets. One of the promising applications of machine learning is in precision medicine, where disease risk is predicted using patient genetic data. However, creating an accurate prediction model based on genotype data remains challenging due to the so-called “curse of dimensionality” (i.e., the number of features far exceeds the number of samples). Therefore, the generalizability of machine learning models benefits from feature selection, which aims to extract only the most “informative” features and remove noisy, “non-informative,” irrelevant, and redundant features. In this article, we provide a general overview of the different feature selection methods, their advantages, disadvantages, and use cases, focusing on the detection of relevant features (i.e., SNPs) for disease risk prediction.
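As a minimal sketch of one family of feature selection methods (a univariate filter), the following ranks SNPs by a Pearson chi-squared statistic against case/control status and keeps the top k. The function names, the 0/1/2 genotype coding, and the toy data are assumptions made for illustration; they are not taken from the article, which surveys many methods.

```python
from collections import Counter


def chi2_score(genotypes, labels):
    """Pearson chi-squared statistic for one SNP (genotypes coded 0/1/2)
    against a binary case/control label."""
    n = len(labels)
    observed = Counter(zip(genotypes, labels))
    genotype_totals = Counter(genotypes)
    label_totals = Counter(labels)
    stat = 0.0
    for g, g_count in genotype_totals.items():
        for l, l_count in label_totals.items():
            expected = g_count * l_count / n
            stat += (observed[(g, l)] - expected) ** 2 / expected
    return stat


def select_top_k(snp_matrix, labels, k):
    """snp_matrix maps a SNP id to its list of genotypes, one per sample.
    Returns the k SNP ids with the highest chi-squared score."""
    scores = {snp: chi2_score(g, labels) for snp, g in snp_matrix.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data, fabricated purely for illustration (8 samples, 3 hypothetical SNPs).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
snps = {
    "snp_a": [2, 2, 1, 2, 0, 0, 1, 0],  # genotype tracks case status
    "snp_b": [0, 1, 2, 0, 1, 2, 0, 1],  # weak association
    "snp_c": [1, 1, 1, 1, 1, 1, 1, 1],  # constant, carries no information
}
top = select_top_k(snps, labels, k=1)
```

A filter like this scores each SNP independently, so it is cheap but blind to redundancy and interactions between SNPs.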
The mechanisms that underlie the association between obesity and type 2 diabetes are not fully understood. Here, we investigated the role of the 3D genome organization in the pathogeneses of obesity and type 2 diabetes. We interpreted the combined and differential impacts of 196 diabetes- and 390 obesity-associated single nucleotide polymorphisms (SNPs) by integrating data on the genes with which they physically interact (as captured by Hi-C) and the functional [i.e., expression quantitative trait loci (eQTL)] outcomes associated with these interactions. We identified 861 spatially regulated genes (e.g., AP3S2, ELP5, SVIP, IRS1, FADS2, WFS1, RBM6, HORMAD1, PYROXD2), which are enriched in tissues (e.g., adipose, skeletal muscle, pancreas) and biological processes and canonical pathways (e.g., lipid metabolism, leptin, and glucose-insulin signaling pathways) that are important for the pathogenesis of type 2 diabetes and obesity. Our discovery-based approach also identifies enrichment for eQTL SNP-gene interactions in tissues that are not classically associated with diabetes or obesity. We propose that the combinatorial action of active obesity and diabetes spatial eQTL SNPs on their gene pairs within different tissues reduces the ability of these tissues to contribute to the maintenance of a healthy energy metabolism.
ABSTRACT: Background: GBA mutations are numerically the most significant genetic risk factor for Parkinson's disease (PD), yet these mutations have low penetrance, suggesting additional mechanisms. Objectives: The objective of this study was to determine if the penetrance of GBA in PD can be explained by regulatory effects on GBA and modifier genes. Methods: Genetic variants associated with the regulation of GBA were identified by screening 128 common single nucleotide polymorphisms (SNPs) in the GBA locus for spatial cis-expression quantitative trait loci (supported by chromatin interactions). Results: We identified common noncoding SNPs within GBA that (1) regulate GBA expression in peripheral tissues, some of which display α-synuclein pathology, and (2) coregulate potential modifier genes in the central nervous system and/or peripheral tissues. Haplotypes based on 3 of these SNPs delay disease onset by 5 years. In addition, SNPs on 6 separate chromosomes coregulate GBA expression specifically in either the substantia nigra or cortex, and their combined effect potentially modulates motor and cognitive symptoms, respectively. Conclusions: This work provides a new perspective on the haplotype-specific effects of GBA and the genetic etiology of PD, expanding the role of GBA from the gene encoding the β-glucocerebrosidase (GCase) to that of a central regulator and modifier of PD onset, with GBA expression itself subject to distant regulation. Some idiopathic patients might possess insufficient GBA-encoded GCase activity in the substantia nigra as the result of distant regulatory variants and therefore might benefit from GBA-targeting therapeutics. The SNPs' regulatory impacts provide a plausible explanation for the variable phenotypes also observed in GBA-centric Gaucher's disease and dementia with Lewy bodies.
Genetic variation in the genomic regulatory landscape likely plays a crucial role in the pathology of disease. Non-coding variants associated with disease can influence the expression of long intergenic non-coding RNAs (lincRNAs), which in turn function in the control of protein-coding gene expression. Here, we investigate the function of two independent serum urate-associated signals (SUA1 and SUA2) in close proximity to lincRNAs and an enhancer that reside ∼60 kb and ∼300 kb upstream of MAF, respectively. Variants within SUA1 are expression quantitative trait loci (eQTL) for LINC01229 and MAFTRR, both co-expressed with MAF. We have also identified that variants within SUA1 are trans-eQTL for genes that are active in kidney- and serum urate-relevant pathways. Serum urate-associated variants rs4077450 and rs4077451 within SUA2 lie within an enhancer that recruits the transcription factor HNF4α and forms long-range interactions with LINC01229 and MAFTRR. The urate-raising alleles of rs4077450 and rs4077451 increase enhancer activity and associate with increased expression of LINC01229. We show that the SUA2 enhancer region drives expression in the zebrafish pronephros, recapitulating endogenous MAF expression. Depletion of MAFTRR and LINC01229 in HEK293 cells in turn leads to increased MAF expression. Collectively, our results are consistent with serum urate variants mediating long-range transcriptional regulation of the lincRNAs LINC01229 and MAFTRR and urate-relevant genes (e.g., SLC5A8 and EHHADH) in trans.
Clinical studies of non-communicable diseases identify multimorbidities that reflect our relatively limited fixed metabolic capacity. Despite the fact that we have ∼24,000 genes, we do not understand the genetic pathways that contribute to the development of multimorbid non-communicable disease. We created a “multimorbidity atlas” of traits based on pleiotropy of spatially regulated genes using convex biclustering. Using chromatin interaction and expression Quantitative Trait Loci (eQTL) data, we analysed 20,782 variants (p < 5 × 10⁻⁶) associated with 1,351 phenotypes, to identify 16,248 putative eQTL-eGene pairs that are involved in 76,013 short- and long-range regulatory interactions (FDR < 0.05) in different human tissues. Convex biclustering of eGenes that are shared between phenotypes identified complex inter-relationships between nominally different phenotype-associated SNPs. Notably, the loci at the centre of these inter-relationships were subject to complex tissue- and disease-specific regulatory effects. The largest cluster, 40 phenotypes that are related to fat and lipid metabolism, inflammatory disorders, and cancers, is centred on the FADS1-FADS3 locus (chromosome 11). Our novel approach enables the simultaneous elucidation of variant interactions with genes that are drivers of multimorbidity and those that contribute to unique phenotype-associated characteristics.
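The notion of eGenes shared between phenotypes can be sketched as a pairwise set overlap. This is not convex biclustering and not the authors' pipeline; it only illustrates the kind of phenotype-by-eGene overlap that a clustering step could take as input. The function name and the placeholder phenotype and gene labels are hypothetical.

```python
from itertools import combinations


def shared_egenes(phenotype_to_egenes):
    """Map each phenotype pair to the set of eGenes both are linked to.
    Pairs with no shared eGene are omitted."""
    shared = {}
    for a, b in combinations(sorted(phenotype_to_egenes), 2):
        common = phenotype_to_egenes[a] & phenotype_to_egenes[b]
        if common:
            shared[(a, b)] = common
    return shared


# Placeholder data, fabricated for illustration only.
toy = {
    "pheno_1": {"gene_a", "gene_b"},
    "pheno_2": {"gene_b", "gene_c"},
    "pheno_3": {"gene_d"},
}
overlap = shared_egenes(toy)
```

In this toy input only pheno_1 and pheno_2 share an eGene, so they would be candidates for grouping, while pheno_3 stays on its own.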
Serum urate is the end-product of purine metabolism. Elevated serum urate is causal of gout and a predictor of renal disease, cardiovascular disease and other metabolic conditions. Genome-wide association studies (GWAS) have reported dozens of loci associated with serum urate control; however, there has been little progress in understanding the molecular basis of the associated loci. Here we employed trans-ancestral meta-analysis using data from European and East Asian populations to identify ten new loci for serum urate levels. Genome-wide colocalization with cis-expression quantitative trait loci (eQTL) identified a further five new loci. By cis- and trans-eQTL colocalization analysis we identified 24 and 20 genes, respectively, where the causal eQTL variant has a high likelihood that it is shared with the serum urate-associated locus. One new locus identified was SLC22A9 that encodes organic anion transporter 7 (OAT7). We demonstrate that OAT7 is a very weak urate-butyrate exchanger. Newly implicated genes identified in the eQTL analysis include those encoding proteins that make up the dystrophin complex, a scaffold for signaling proteins and transporters at the cell membrane; MLXIP that, with the previously identified MLXIPL, is a transcription factor that may regulate serum urate via the pentose-phosphate pathway; and MRPS7 and IDH2 that encode proteins necessary for mitochondrial function. Trans-ancestral functional fine-mapping identified six loci (RREB1, INHBC, HLF, UBE2Q2, SFMBT1, HNF4G) with colocalized eQTL that contained putative causal SNPs (posterior probability of causality > 0.8). This systematic analysis of serum urate GWAS loci has identified candidate causal genes at 19 loci and a network of previously unidentified genes likely involved in control of serum urate levels, further illuminating the molecular mechanisms of urate control.
Author Summary: High serum urate is a prerequisite for gout and a risk factor for metabolic disease. Previous GWAS have identified numerous loci that are associated with serum urate control; however, only a small handful of these loci have known molecular consequences. The majority of loci are within the non-coding regions of the genome and therefore it is difficult to ascertain how these variants might influence serum urate levels without tangible links to gene expression and/or protein function. We have applied a novel bioinformatic pipeline where we combined population-specific GWAS data with gene expression and genome connectivity information to identify putative causal genes for serum urate-associated loci. Overall, we identified 15 novel serum urate loci and show that these loci along with previously identified loci are linked to the expression of 44 genes. We show that some of the variants within these loci have strong predicted regulatory function which can be further tested in functional analyses. This study expands on previous GWAS by ident...
High serum urate is a prerequisite for gout and is associated with metabolic disease. Genome-wide association studies (GWAS) have reported dozens of loci associated with serum urate control; however, there has been little progress in understanding the molecular basis of the associated loci. Here, we employed trans-ancestral meta-analysis using data from European and East Asian populations to identify 10 new loci for serum urate levels. Genome-wide colocalization with cis-expression quantitative trait loci (eQTL) identified a further five new candidate loci. By cis- and trans-eQTL colocalization analysis, we identified 34 and 20 genes, respectively, where the causal eQTL variant has a high likelihood that it is shared with the serum urate-associated locus. One new locus identified was SLC22A9 that encodes organic anion transporter 7 (OAT7). We demonstrate that OAT7 is a very weak urate-butyrate exchanger. Newly implicated genes identified in the eQTL analysis include those encoding proteins that make up the dystrophin complex, a scaffold for signaling proteins and transporters at the cell membrane; MLXIP that, with the previously identified MLXIPL, is a transcription factor that may regulate serum urate via the pentose–phosphate pathway; and MRPS7 and IDH2 that encode proteins necessary for mitochondrial function. Functional fine mapping identified six loci (RREB1, INHBC, HLF, UBE2Q2, SFMBT1 and HNF4G) with colocalized eQTL containing putative causal SNPs. This systematic analysis of serum urate GWAS loci identified candidate causal genes at 24 loci and a network of previously unidentified genes likely involved in control of serum urate levels, further illuminating the molecular mechanisms of urate control.