DNA sequence variation has been associated with quantitative changes in molecular phenotypes such as gene expression, but its impact on chromatin states is poorly characterized. To understand the interplay between chromatin and genetic control of gene regulation we quantified allelic variability in transcription factor binding, histone modifications, and gene expression within humans. We found abundant allelic specificity in chromatin and extensive local, short-, and long-range allelic coordination among the studied molecular phenotypes. We observed genetic influence on most of these phenotypes, with histone modifications exhibiting strong context-dependent behavior. Our results implicate transcription factors as primary mediators of sequence-specific regulation of gene expression programs, with histone modifications frequently reflecting the primary regulatory event.
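As a rough illustration of what "allelic specificity" can mean at a single heterozygous site, the sketch below tests whether reads from the two alleles depart from an even split. The abstract does not say which test the authors used. The exact two-sided binomial test, the function name `allelic_imbalance_pvalue`, and the 0.5 null fraction are all assumptions made for this example.

```python
from math import comb

def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Two-sided exact binomial p-value for a departure from the expected
    reference-allele fraction (hypothetical helper, not the authors' method)."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    # Probability of every possible reference-read count under the null
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties)
    total = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, total)

balanced = allelic_imbalance_pvalue(5, 5)    # even split
skewed = allelic_imbalance_pvalue(10, 0)     # all reads from one allele
```

In practice a site-level test like this would need corrections for mapping bias and multiple testing, which are outside the scope of this sketch.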
Chromatin state variation at gene regulatory elements is abundant across individuals, yet we understand little about the genetic basis of this variability. Here, we profiled several histone modifications, the transcription factor (TF) PU.1, RNA polymerase II, and gene expression in lymphoblastoid cell lines from 47 whole-genome sequenced individuals. We observed that distinct cis-regulatory elements exhibit coordinated chromatin variation across individuals in the form of variable chromatin modules (VCMs) at sub-Mb scale. VCMs were associated with thousands of genes and preferentially cluster within chromosomal contact domains. We mapped strong proximal and weak, yet more ubiquitous, distal-acting chromatin quantitative trait loci (cQTL) that frequently explain this variation. cQTLs were associated with molecular activity at clusters of cis-regulatory elements and mapped preferentially within TF-bound regions. We propose that local, sequence-independent chromatin variation emerges as a result of genetic perturbations in cooperative interactions between cis-regulatory elements that are located within the same genomic domain.
Population-scale studies combining genetic information with molecular phenotypes (for example, gene expression) have become a standard approach for dissecting the effects of genetic variants on organismal phenotypes. These kinds of data sets require powerful, fast and versatile methods able to discover molecular Quantitative Trait Loci (molQTL). Here we propose such a solution, QTLtools, a modular framework that contains multiple new and well-established methods to prepare the data, to discover proximal and distal molQTLs and, finally, to integrate them with GWAS variants and functional annotations of the genome. We demonstrate its utility by performing a complete expression QTL study in a few easy-to-perform steps. QTLtools is open source and available at https://qtltools.github.io/qtltools/.
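To make the idea of testing one variant against one molecular phenotype concrete, here is a minimal sketch of a nominal association test with an empirical permutation p-value. It is not QTLtools code and does not use its command-line interface. The function names, the Pearson correlation statistic, and the toy genotype dosages and expression values are all assumptions made for illustration.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / sqrt(sxx * syy)

def permutation_pvalue(genotypes, phenotype, n_perm=1000, seed=0):
    """Empirical p-value: how often a shuffled phenotype correlates with the
    genotypes at least as strongly as the observed phenotype does."""
    rng = random.Random(seed)
    observed = abs(pearson_r(genotypes, phenotype))
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(genotypes, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)

# Toy data (invented): alternate-allele dosage and expression for 10 samples
dosage = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
expression = [1.0, 1.2, 2.1, 1.9, 3.2, 2.8, 0.9, 2.0, 3.1, 2.2]
r = pearson_r(dosage, expression)
p = permutation_pvalue(dosage, expression)
```

A real study would repeat this over many variants per phenotype, adjust for covariates, and correct for multiple testing.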
Most signals detected by genome-wide association studies map to non-coding sequence and their tissue-specific effects influence transcriptional regulation. However, key tissues and cell-types required for functional inference are absent from large-scale resources. Here we explore the relationship between genetic variants influencing predisposition to type 2 diabetes (T2D) and related glycemic traits, and human pancreatic islet transcription using data from 420 donors. We find: (a) 7741 cis-eQTLs in islets with a replication rate across 44 GTEx tissues between 40% and 73%; (b) marked overlap between islet cis-eQTL signals and active regulatory sequences in islets, with reduced eQTL effect size observed in the stretch enhancers most strongly implicated in GWAS signal location; (c) enrichment of islet cis-eQTL signals with T2D risk variants identified in genome-wide association studies; and (d) colocalization between 47 islet cis-eQTLs and variants influencing T2D or glycemic traits, including DGKB and TCF7L2. Our findings illustrate the advantages of performing functional and regulatory studies in disease relevant tissues.
How to interpret the biological causes underlying the predisposing markers identified through genome-wide association studies (GWAS) remains an open question. One direct and powerful way to assess the genetic causality behind GWAS is through analysis of expression quantitative trait loci (eQTLs). Here we describe a new approach to estimate the tissues behind the genetic causality of a variety of GWAS traits, using the cis-eQTLs in 44 tissues from the Genotype-Tissue Expression (GTEx) Consortium. We have adapted the regulatory trait concordance (RTC) score to measure the probability of eQTLs being active in multiple tissues and to calculate the probability that a GWAS-associated variant and an eQTL tag the same functional effect. By normalizing the GWAS-eQTL probabilities by the tissue-sharing estimates for eQTLs, we generate relative tissue-causality profiles for GWAS traits. Our approach not only implicates the gene likely mediating individual GWAS signals, but also highlights tissues where the genetic causality for an individual trait is likely manifested.
Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues.
Interpretation of biological causes of the predisposing markers identified through Genome Wide Association Studies (GWAS) remains an open question [1]. One direct and powerful way to assess the genetic causality behind GWAS is through expression quantitative trait loci (eQTLs) [2]. Here we describe a novel approach to estimate the tissues giving rise to the genetic causality behind a wide variety of GWAS traits, using the cis-eQTLs identified in 44 tissues of the GTEx consortium [3,4]. We have adapted the Regulatory Trait Concordance (RTC) score [5] to, on the one hand, measure the tissue sharing probabilities of eQTLs, and also to calculate the probability that a GWAS and an eQTL variant tag the same underlying functional effect. We show that our tissue sharing estimates significantly correlate with commonly used estimates of tissue sharing. By normalizing the GWAS-eQTL probabilities with the tissue sharing estimates of the eQTLs, we can estimate the tissues from which GWAS genetic causality arises. Our approach not only indicates the gene mediating individual GWAS signals, but also can highlight tissues where the genetic causality for an individual trait is manifested. Over the last decade, Genome Wide Association Studies (GWAS) have become the norm in describing genetic variants associated with common complex human diseases and traits [1,6]. Although
Objectives: Systemic lupus erythematosus (SLE) diagnosis and treatment remain empirical and the molecular basis for its heterogeneity elusive. We explored the genomic basis for disease susceptibility and severity. Methods: mRNA sequencing and genotyping in blood from 142 patients with SLE and 58 healthy volunteers. Abundances of cell types were assessed by CIBERSORT and cell-specific effects by interaction terms in linear models. Differentially expressed genes (DEGs) were used to train classifiers (linear discriminant analysis) of SLE versus healthy individuals in 80% of the dataset and were validated in the remaining 20% running 1000 iterations. Transcriptome/genotypes were integrated by expression quantitative trait loci (eQTL) analysis; tissue-specific genetic causality was assessed by regulatory trait concordance (RTC). Results: SLE has a ‘susceptibility signature’ present in patients in clinical remission, an ‘activity signature’ linked to genes that regulate immune cell metabolism, protein synthesis and proliferation, and a ‘severity signature’ best illustrated in active nephritis, enriched in druggable granulocyte and plasmablast/plasma-cell pathways. Patients with SLE also have perturbed mRNA splicing enriched in immune system and interferon signalling genes. A novel transcriptome index distinguished active versus inactive disease (but not low disease activity) and correlated with disease severity. DEGs discriminate SLE versus healthy individuals with median sensitivity 86% and specificity 92%, suggesting a potential use in diagnostics. Combined eQTL analysis from the Genotype Tissue Expression (GTEx) project and SLE-associated genetic polymorphisms demonstrates that susceptibility variants may regulate gene expression in the blood but also in other tissues. Conclusion: Specific gene networks confer susceptibility to SLE, activity and severity, and may facilitate personalised care.
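Median sensitivity and specificity over repeated held-out splits, as reported above, can be computed along the lines of the sketch below. It covers only the scoring step, not the linear discriminant analysis classifier or the authors' pipeline. The function names, the label coding (1 = SLE, 0 = healthy), and the toy predictions are assumptions made for illustration.

```python
from statistics import median

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1 = SLE, 0 = healthy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # patients called SLE
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # patients missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy called healthy
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy called SLE
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

def summarize_runs(runs):
    """Median sensitivity and specificity over repeated held-out evaluations.
    Each run is a (true_labels, predicted_labels) pair."""
    scores = [sensitivity_specificity(t, p) for t, p in runs]
    return median(s for s, _ in scores), median(s for _, s in scores)

# Toy held-out results from three hypothetical iterations (invented data)
truth = [1, 1, 1, 1, 0, 0, 0, 0]
runs = [
    (truth, [1, 1, 1, 0, 0, 0, 0, 1]),
    (truth, [1, 1, 1, 1, 0, 0, 0, 0]),
    (truth, [1, 1, 0, 0, 0, 0, 0, 0]),
]
median_sensitivity, median_specificity = summarize_runs(runs)
```

Reporting the median rather than a single split reduces the influence of any one unusually easy or hard 20% test set.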