Despite the dramatic underrepresentation of non-European populations in human genetics studies, researchers continue to exclude participants of non-European ancestry, as well as variants rare in European populations, even when these data are available. This practice perpetuates existing research disparities and can lead to important and large effect size associations being missed. Here, we conducted genome-wide association studies (GWAS) of 31 serum and urine biomarker quantitative traits in African (n=9354), East Asian (n=2559) and South Asian (n=9823) ancestry UK Biobank (UKBB) participants. We adjusted for all known GWAS catalog variants for each trait, as well as novel signals identified in a recent European ancestry focused analysis of UKBB participants. We identify 7 novel signals in African ancestry and 2 novel signals in South Asian ancestry participants (p < 1.61E-10). Many of these signals are highly plausible, including a cis pQTL for the gene encoding gamma-glutamyl transferase and PIEZO1 and G6PD variants with impacts on HbA1c through likely erythrocytic mechanisms. This work illustrates the importance of using the genetic data we already have in diverse populations, with novel discoveries possible in even modest sample sizes.
Polygenic risk scores (PRS) have shown success in clinical settings, but most PRS methods have focused only on individuals with one primary continental ancestry and thus poorly accommodate recently admixed individuals. Here, we develop GAUDI, a novel penalized-regression-based method specifically designed for admixed individuals by explicitly modeling ancestry-specific effects and jointly estimating ancestry-shared effects. We demonstrate marked advantages of GAUDI over other methods through comprehensive simulation and real data analyses.
Despite the dramatic underrepresentation of non-European populations in human genetics studies, researchers continue to exclude participants of non-European ancestry, even when these data are available. This practice perpetuates existing research disparities and can lead to important and large effect size associations being missed. Here, we conducted genome-wide association studies (GWAS) of 31 serum and urine biomarker quantitative traits in African (n=9354), East Asian (n=2559) and South Asian (n=9823) ancestry UK Biobank participants. We adjusted for all known GWAS catalog variants for each trait, as well as novel signals identified in European ancestry UK Biobank participants alone. We identify 12 novel signals in African ancestry and 3 novel signals in South Asian ancestry participants (p < 1.61 × 10⁻¹⁰). Many of these signals are highly plausible and rare in Europeans (1% or lower minor allele frequency), including cis pQTLs for the genes encoding serum biomarkers like gamma-glutamyl transferase and apolipoprotein A, PIEZO1 and G6PD variants with impacts on HbA1c through likely erythrocytic mechanisms, and a coding variant in GPLD1, a gene encoding an enzyme that cleaves GPI anchors, associated with serum levels of the normally GPI-anchored protein alkaline phosphatase. This work illustrates the importance of using the genetic data we already have in diverse populations, with many novel discoveries possible in even modest sample sizes.
Previous genome-wide association studies (GWAS) of hematological traits have identified over 10 000 distinct trait-specific risk loci. However, at these loci, the underlying causal mechanisms remain incompletely characterized. To elucidate novel biology and better understand causal mechanisms at known loci, we performed a transcriptome-wide association study (TWAS) of 29 hematological traits in 399 835 UK Biobank (UKB) participants of European ancestry using gene expression prediction models trained from whole blood RNA-seq data in 922 individuals. We discovered 557 gene-trait associations for hematological traits distinct from previously reported GWAS variants in European populations. Among the 557 associations, 301 were available for replication in a cohort of 141 286 participants of European ancestry from the Million Veteran Program (MVP). Of these 301 associations, 108 replicated at a strict Bonferroni adjusted threshold ($\alpha$ = 0.05/301). Using our TWAS results, we systematically assigned 4261 out of 16 900 previously identified hematological trait GWAS variants to putative target genes. Compared to coloc, our TWAS results show reduced specificity and increased sensitivity in external datasets to assign variants to target genes.
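As a quick check of the replication threshold quoted above, a Bonferroni correction divides the family-wise error rate by the number of tests. The sketch below uses only the two figures given in the abstract (α = 0.05 and 301 tested associations); the function name is illustrative and not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Figures from the abstract: alpha = 0.05, 301 associations tested in MVP.
threshold = bonferroni_threshold(0.05, 301)
print(f"Replication threshold: {threshold:.3e}")
```

An association would count as replicated here only if its replication p-value falls below roughly 1.66 × 10⁻⁴.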
Hematological measures are important intermediate clinical phenotypes for many acute and chronic diseases and are highly heritable. Although genome‐wide association studies (GWAS) have identified thousands of loci containing trait‐associated variants, the causal genes underlying these associations are often uncertain. To better understand the underlying genetic regulatory mechanisms, we performed a transcriptome‐wide association study (TWAS) to systematically investigate the association between genetically predicted gene expression and hematological measures in 54,542 Europeans from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. We found 239 significant gene‐trait associations with hematological measures; we replicated 71 associations at p < 0.05 in a TWAS meta‐analysis consisting of up to 35,900 Europeans from the Women's Health Initiative, Atherosclerosis Risk in Communities Study, and BioMe Biobank. Additionally, we attempted to refine this list of candidate genes by performing conditional analyses, adjusting for individual variants previously associated with hematological measures, and by further fine‐mapping TWAS loci. To facilitate interpretation of our findings, we designed an R Shiny application to interactively visualize our TWAS results by integrating them with additional genetic data sources (GWAS, TWAS from multiple reference panels, conditional analyses, known GWAS variants, etc.). Our results and application highlight frequently overlooked TWAS challenges and illustrate the complexity of TWAS fine‐mapping.
Hi-C data provide population-averaged estimates of three-dimensional chromatin contacts across cell types and states in bulk samples. Effective analysis of Hi-C data entails controlling for the potential confounding factor of differential cell type proportions across heterogeneous bulk samples. We propose a novel unsupervised deconvolution method for inferring cell type composition from bulk Hi-C data, the Two-step Hi-c UNsupervised DEconvolution appRoach (THUNDER). We conducted extensive simulations to test THUNDER based on combining two published single-cell Hi-C (scHi-C) datasets. THUNDER more accurately estimates the underlying cell type proportions compared to reference-free methods (e.g., TOAST and NMF) and is more robust than reference-dependent methods (e.g., MuSiC). We further demonstrate the practical utility of THUNDER to estimate cell type proportions and identify cell-type-specific interactions in Hi-C data from adult human cortex tissue samples. THUNDER will be a useful tool in adjusting for varying cell type composition in population samples, facilitating valid and more powerful downstream analysis such as differential chromatin organization studies. Additionally, THUNDER-estimated contact profiles provide a useful exploratory framework to investigate cell-type-specificity of the chromatin interactome while experimental data are still rare.
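To make the idea of reference-free deconvolution concrete, here is a minimal sketch using plain non-negative matrix factorization (NMF) with multiplicative updates. This is not THUNDER itself, whose two-step procedure is not described in this abstract; it is closer to the generic NMF baseline. The matrix shapes, function names, and simulated data are all assumptions made for illustration.

```python
import numpy as np


def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor V (features x samples) as W @ H with non-negative W and H,
    using Lee-Seung multiplicative updates for the Frobenius loss."""
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def estimate_proportions(H):
    """Rescale each sample's (column's) loadings so they sum to 1."""
    return H / H.sum(axis=0, keepdims=True)


# Simulated bulk data (hypothetical sizes): 200 contact features,
# 3 cell types, 30 bulk samples.
rng = np.random.default_rng(1)
true_profiles = rng.gamma(shape=2.0, scale=1.0, size=(200, 3))
true_props = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=30).T  # 3 x 30
bulk = true_profiles @ true_props

W, H = nmf(bulk, k=3)
props = estimate_proportions(H)
rel_error = np.linalg.norm(bulk - W @ H) / np.linalg.norm(bulk)
print(f"Relative reconstruction error: {rel_error:.4f}")
```

Because NMF is identifiable only up to scaling and permutation of its components, the normalized loadings should be read as relative estimates that still need to be matched to cell types, not as calibrated proportions.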
Background: Thousands of genetic variants have been associated with hematological traits, though target genes remain unknown at most loci. Moreover, limited analyses have been conducted in African ancestry and Hispanic/Latino populations; hematological trait associated variants more common in these populations have likely been missed. Methods: To derive gene expression prediction models, we used ancestry-stratified datasets from the Multi-Ethnic Study of Atherosclerosis (MESA, including n = 229 African American and n = 381 Hispanic/Latino participants, monocytes) and the Depression Genes and Networks study (DGN, n = 922 European ancestry participants, whole blood). We then performed a transcriptome-wide association study (TWAS) for platelet count, hemoglobin, hematocrit, and white blood cell count in African (n = 27,955) and Hispanic/Latino (n = 28,324) ancestry participants. Results: Our results revealed 24 suggestive signals (p < 1 × 10⁻⁴) that were conditionally distinct from known GWAS-identified variants, and we successfully replicated these signals in European ancestry subjects from UK Biobank. We found modestly improved correlation of predicted and measured gene expression in an independent African American cohort (the Genetic Epidemiology Network of Arteriopathy (GENOA) study, n = 802, lymphoblastoid cell lines) using the larger DGN reference panel; however, some genes were well predicted using MESA but not DGN. Conclusions: These analyses demonstrate the importance of performing TWAS and other genetic analyses across diverse populations and of balancing sample size and ancestry background matching when selecting a TWAS reference panel.
Hematological measures are important intermediate clinical phenotypes for many acute and chronic diseases. Hematological measures are highly heritable, and although genome-wide association studies (GWAS) have identified thousands of loci containing trait-associated variants, the causal genes underlying these associations are often uncertain. To better understand the underlying genetic regulatory mechanisms, we performed a transcriptome-wide association study (TWAS) using PrediXcan to systematically investigate the association between genetically predicted gene expression and hematological measures in 54,542 individuals of European ancestry from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. We found 239 significant gene-trait associations with hematological measures. Among this set of 239 associations, we replicated 71 at p < 0.05 with the same direction of effect for the blood cell trait in a meta-analysis of TWAS results consisting of up to 35,900 European ancestry individuals from the Women’s Health Initiative (WHI), the Atherosclerosis Risk in Communities Study (ARIC), and BioMe Biobank. We further attempted to refine this list of candidate genes by performing conditional analyses, adjusting for individual variants previously associated with these hematological measures, and by further fine-mapping TWAS loci. To assist with the interpretation of TWAS findings, we designed an R Shiny application to interactively visualize TWAS results, one genomic locus at a time, by integrating our TWAS results with additional genetic data sources (GWAS, TWAS from other gene expression reference panels, conditional analyses, known GWAS variants, etc.). Our results and R Shiny application highlight frequently overlooked challenges with TWAS and illustrate the complexity of TWAS fine-mapping efforts.
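The two-stage logic of a PrediXcan-style TWAS can be sketched as follows: predict expression from genotypes using pre-trained weights, then regress the trait on the predicted expression. This is a toy illustration on simulated data, not the PrediXcan software or the authors' pipeline; every name, sample size, weight, and effect size below is hypothetical, and covariate adjustment is omitted.

```python
import numpy as np


def predict_expression(dosages, weights):
    """Genetically predicted expression: weighted sum of allele dosages."""
    return dosages @ weights


def association_test(trait, predicted_expr):
    """Simple linear regression of the trait on predicted expression.
    Returns the slope, its standard error, and the z-score."""
    x = predicted_expr - predicted_expr.mean()
    y = trait - trait.mean()
    beta = (x @ y) / (x @ x)
    resid = y - beta * x
    dof = len(y) - 2
    se = np.sqrt((resid @ resid) / dof / (x @ x))
    return beta, se, beta / se


# Simulated data (all values hypothetical).
rng = np.random.default_rng(42)
n_individuals, n_variants = 5000, 20
maf = rng.uniform(0.05, 0.5, size=n_variants)
dosages = rng.binomial(2, maf, size=(n_individuals, n_variants)).astype(float)
weights = rng.normal(0.0, 0.3, size=n_variants)  # stand-in for trained eQTL weights

pred = predict_expression(dosages, weights)
trait = 0.5 * pred + rng.normal(0.0, 1.0, size=n_individuals)  # simulated effect of 0.5

beta, se, z = association_test(trait, pred)
print(f"beta = {beta:.3f}, se = {se:.3f}, z = {z:.1f}")
```

In a real analysis this test would be repeated for each gene and trait, typically with covariates such as age, sex, and ancestry principal components included, and the resulting p-values compared against a multiple-testing threshold.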