Explaining the genetics of many diseases is challenging because most associations localize to incompletely characterized regulatory regions. We show that transcription factors (TFs) occupy multiple loci of individual complex genetic disorders using novel computational methods. Application to 213 phenotypes and 1,544 TF binding datasets identifies 2,264 relationships between hundreds of TFs and 94 phenotypes, including AR in prostate cancer and GATA3 in breast cancer. Strikingly, nearly half of the systemic lupus erythematosus risk loci are occupied by the Epstein-Barr virus EBNA2 protein and many co-clustering human TFs, revealing gene-environment interaction. Similar EBNA2-anchored associations exist in multiple sclerosis, rheumatoid arthritis, inflammatory bowel disease, type 1 diabetes, juvenile idiopathic arthritis, and celiac disease. Instances of allele-dependent DNA binding and downstream effects on gene expression at plausibly causal variants support genetic mechanisms dependent upon EBNA2. Our results nominate mechanisms that operate across risk loci within disease phenotypes, suggesting new paradigms for disease origins.
Systemic lupus erythematosus (SLE) is an autoimmune disease with marked gender and ethnic disparities. We report a large transancestral association study of SLE using Immunochip genotype data from 27,574 individuals of European (EA), African (AA) and Hispanic Amerindian (HA) ancestry. We identify 58 distinct non-HLA regions in EA, 9 in AA and 16 in HA (∼50% of these regions have multiple independent associations); these include 24 novel SLE regions (P<5 × 10−8), refined association signals in established regions, extended associations to additional ancestries, and a disentangled complex HLA multigenic effect. The risk allele count (genetic load) exhibits an accelerating pattern of SLE risk, leading us to posit a cumulative hit hypothesis for autoimmune disease. Comparing results across the three ancestries identifies both ancestry-dependent and ancestry-independent contributions to SLE risk. Our results are consistent with the unique and complex histories of the populations sampled, and collectively help clarify the genetic architecture and ethnic disparities in SLE.
Eosinophilic esophagitis (EoE) is a chronic inflammatory disorder associated with allergic hypersensitivity to food. We interrogated >1.5 million genetic variants in European EoE cases and subsequently in a multi-site cohort with local and out-of-study control subjects. In addition to replication of the 5q22 locus (meta-analysis p = 1.9×10−16), we identified association at 2p23 (encoding CAPN14, p = 2.5×10−10). CAPN14 was specifically expressed in the esophagus, dynamically upregulated as a function of disease activity and genetic haplotype and after exposure of epithelial cells to IL-13, and located in an epigenetic hotspot modified by IL-13. There was enriched esophageal expression for the genes neighboring the top 208 EoE sequence variants. Multiple allergic sensitization loci were associated with EoE susceptibility (4.8×10−2 < p < 5.1×10−11). We propose a model that elucidates the tissue specific nature of EoE that involves the interplay of allergic sensitization with an EoE-specific, IL-13–inducible esophageal response involving CAPN14.
Background Despite evidence that genetic factors contribute to gestational length and preterm birth, robust associations with genetic variants have not been identified. We hypothesized that analyzing larger data sets with gestational length information by genomewide association would reveal trait-influencing variants. Methods We performed a genomewide association study in a discovery data set of 43,568 women of European ancestry from 23andMe, Inc., for gestational length as a continuous trait and for term or preterm (<37 weeks) birth as a dichotomous outcome. We used three Nordic data sets (8,643 women) for replication of 14 genomic loci achieving either genomewide (P < 5×10-8) or suggestive association (P < 1×10-6). Results In the discovery stage, for gestational length, four loci (EBF1, EEFSEC, AGTR2 and WNT4) achieved genomewide significance, all of which were replicated in the Nordic data sets. Functional analysis of the WNT4 locus indicated the likely causative variant alters the binding of ESR1. ADCY5 and RAP2C, which had suggestive significance in the discovery stage, were significantly replicated and achieved genomewide significance in joint analysis. Common variants in EBF1, EEFSEC and AGTR2 were also associated with preterm birth with genomewide significance. Analysis of mother-infant dyads indicated that these findings likely resulted from maternal genome actions. Conclusions Our study is the first to identify maternal genetic variants robustly associated with gestational length and preterm birth. Roles of these loci in uterine development, maternal nutrition, and vascular control support their mechanistic involvement and create opportunities to investigate new risk factors for prevention of preterm birth.
Background Eosinophilic esophagitis (EoE) is a chronic antigen-driven allergic inflammatory disease, likely involving the interplay of genetic and environmental factors, yet their respective contributions to heritability are unknown. Objective To quantify risk associated with genes and environment on familial clustering of EoE. Methods Family history was obtained from a hospital-based cohort of 914 EoE probands (n=2192 first-degree “Nuclear-Family” relatives) and the new international registry of monozygotic and dizygotic twins/triplets (n=63 EoE “Twins” probands). Frequencies, recurrence risk ratios (RRRs), heritability and twin concordance were estimated. Environmental exposures were preliminarily examined. Results Analysis of the Nuclear-Family–based cohort revealed that the rate of EoE in first-degree relatives of a proband was 1.8% (unadjusted) and 2.3% (sex-adjusted). RRRs ranged from 10–64, depending on the family relationship, and were higher in brothers (64.0; p=0.04), fathers (42.9; p=0.004) and males (50.7; p<0.001) compared to sisters, mothers and females, respectively. Risk of EoE for other siblings was 2.4%. In the Nuclear-Families, combined gene and common environment heritability (h²gc) was 72.0±2.7% (p<0.001). In the Twins cohort, genetic heritability was 14.5±4.0% (p<0.001), and common family environment contributed 81.0±4% (p<0.001) to phenotypic variance. Proband-wise concordance in MZ co-twins was 57.9±9.5% compared to 36.4±9.3% in DZ (p=0.11). Greater birth-weight difference between twins (p=0.01), breastfeeding (p=0.15) and Fall birth season (p=0.02) were associated with twin discordance in disease status. Conclusions EoE recurrence risk ratios are increased 10–64-fold compared with the general population. EoE in relatives is 1.8–2.4%, depending upon relationship and sex. Nuclear-Family heritability appeared to be high (72.0%). However, Twins cohort analysis revealed a powerful role for common environment (81.0%) compared with additive genetic heritability (14.5%).
The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 51,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic R2 (estimated correlation between the imputed and true genotypes), and the relationship between allelic R2 and minor allele frequency. Computation time and memory resources required by two different software packages (BEAGLE and IMPUTE2) were also evaluated. A number of challenges were encountered due to the complexity of using two different imputation software packages, multiple ancestral populations, and many different genotyping platforms. We present lessons learned and describe the pipeline implemented here to impute and merge genomic data sets. The eMERGE imputed dataset will serve as a valuable resource for discovery, leveraging the clinical data that can be mined from the EHR.
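The abstract above describes allelic R2 as the estimated correlation between imputed and true genotypes. As a minimal sketch of that idea, the snippet below computes a squared Pearson correlation between imputed allele dosages and known genotypes at a single variant. The function name and the toy dosage values are hypothetical, and this direct calculation assumes the true genotypes are available (for example, from masked genotyped sites); it is not the estimator used inside BEAGLE or IMPUTE2, which work from imputation probabilities.

```python
import math


def squared_correlation(imputed, truth):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    Hypothetical helper for illustration; returns NaN when either input
    has zero variance (e.g. a monomorphic site).
    """
    n = len(imputed)
    mean_i = sum(imputed) / n
    mean_t = sum(truth) / n
    cov = sum((a - mean_i) * (b - mean_t) for a, b in zip(imputed, truth))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_t = sum((b - mean_t) ** 2 for b in truth)
    if var_i == 0 or var_t == 0:
        return math.nan
    return (cov * cov) / (var_i * var_t)


# Toy data (assumed, not from the eMERGE study): true genotypes coded 0/1/2
# and imputed dosages on the same scale for six individuals at one variant.
truth = [0, 1, 2, 1, 0, 2]
imputed = [0.1, 0.9, 1.8, 1.2, 0.0, 2.0]

r2 = squared_correlation(imputed, truth)
print(f"allelic R2 (toy example): {r2:.3f}")
```

Evaluating this quantity per variant and plotting it against minor allele frequency would give one view of the relationship between allelic R2 and minor allele frequency that the abstract lists as an evaluation metric.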
Objective Systemic lupus erythematosus (SLE), an autoimmune disorder, has been associated with nearly 100 susceptibility loci. Nevertheless, these loci only partially explain SLE heritability and their putative causal variants are rarely prioritised, which makes it challenging to elucidate disease biology. To detect new SLE loci and causal variants, we performed the largest genome-wide meta-analysis for SLE in East Asian populations. Methods We newly genotyped 10 029 SLE cases and 180 167 controls and subsequently meta-analysed them jointly with 3348 SLE cases and 14 826 controls from published studies in East Asians. We further applied a Bayesian statistical approach to localise the putative causal variants for SLE associations. Results We identified 113 genetic regions including 46 novel loci at genome-wide significance (p<5×10−8). Conditional analysis detected 233 association signals within these loci, which suggests widespread allelic heterogeneity. We detected genome-wide associations at six new missense variants. Bayesian statistical fine-mapping analysis prioritised the putative causal variants to a small set of variants (95% credible set size ≤10) for 28 association signals. We identified 110 putative causal variants with posterior probabilities ≥0.1 for 57 SLE loci, among which we prioritised 10 most likely putative causal variants (posterior probability ≥0.8). Linkage disequilibrium score regression detected genetic correlations for SLE with albumin/globulin ratio (rg=−0.242) and non-albumin protein (rg=0.238). Conclusion This study reiterates the power of large-scale genome-wide meta-analysis for novel genetic discovery. These findings shed light on genetic and biological understandings of SLE.
Objective More than 80% of autoimmune disease is female dominant, but the mechanism for this female bias is poorly understood. We suspected an X chromosome dose effect and hypothesized that trisomy X (47,XXX, 1 in ~1,000 live female births) would be increased in female predominant diseases (e.g. systemic lupus erythematosus [SLE], primary Sjögren’s syndrome [SS], primary biliary cirrhosis [PBC] and rheumatoid arthritis [RA]) compared to diseases without female predominance (sarcoidosis) and controls. Methods We identified 47,XXX subjects using aggregate data from single nucleotide polymorphism (SNP) arrays and confirmed, when possible, by fluorescent in situ hybridization (FISH) or quantitative polymerase chain reaction (q-PCR). Results We found 47,XXX in seven of 2,826 SLE and three of 1,033 SS female patients, but only in two of the 7,074 female controls (p=0.003, OR=8.78, 95% CI: 1.67-86.79 and p=0.02, OR=10.29, 95% CI: 1.18-123.47; respectively). One 47,XXX subject was present for ~404 SLE women and ~344 SS women. 47,XXX was present in excess among SLE and SS subjects. Conclusion The estimated prevalence of SLE and SS in women with 47,XXX was respectively ~2.5 and ~2.9 times higher than in 46,XX women and ~25 and ~41 times higher than in 46,XY men. No statistically significant increase of 47,XXX was observed in other female-biased diseases (PBC or RA), supporting the idea of multiple pathways to sex bias in autoimmunity.
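The counts in the abstract above are enough to re-derive its headline figures by hand. The sketch below, a plain illustration and not the authors' analysis code, computes the sample (cross-product) odds ratio from each 2x2 table and the number of patients per 47,XXX carrier. The function name is hypothetical. The cross-product estimate for SS comes out near 10.30, slightly different from the reported 10.29; that gap may reflect truncation or a different estimator in the original analysis, which is an assumption here. The p-values and confidence intervals are not reproduced.

```python
def odds_ratio(case_pos, case_total, ctrl_pos, ctrl_total):
    """Sample (cross-product) odds ratio from carrier counts and group sizes."""
    case_neg = case_total - case_pos  # cases without 47,XXX
    ctrl_neg = ctrl_total - ctrl_pos  # controls without 47,XXX
    return (case_pos * ctrl_neg) / (case_neg * ctrl_pos)


# Counts taken from the abstract: 7 of 2,826 SLE, 3 of 1,033 SS,
# and 2 of 7,074 female controls carried 47,XXX.
sle_or = odds_ratio(7, 2826, 2, 7074)
ss_or = odds_ratio(3, 1033, 2, 7074)

# Patients per 47,XXX carrier in each disease group.
sle_per_carrier = 2826 / 7
ss_per_carrier = 1033 / 3

print(f"SLE odds ratio: {sle_or:.2f}")
print(f"SS odds ratio: {ss_or:.2f}")
print(f"One carrier per ~{sle_per_carrier:.0f} SLE women")
print(f"One carrier per ~{ss_per_carrier:.0f} SS women")
```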