Candidate gene and genome-wide association studies (GWAS) have identified genetic variants that modulate risk for human disease; many of these associations require further study to replicate the results. Here we report the first large-scale application of the phenome-wide association study (PheWAS) paradigm within electronic medical records (EMRs), an unbiased approach to replication and discovery that interrogates relationships between targeted genotypes and multiple phenotypes. We scanned for associations between 3,144 single-nucleotide polymorphisms (previously implicated by GWAS as mediators of human traits) and 1,358 EMR-derived phenotypes in 13,835 individuals of European ancestry. This PheWAS replicated 66% (51/77) of sufficiently powered prior GWAS associations and revealed 63 potentially pleiotropic associations with P < 4.6 × 10⁻⁶ (false discovery rate < 0.1); the strongest of these novel associations were replicated in an independent cohort (n = 7,406). These findings validate PheWAS as a tool to allow unbiased interrogation across multiple phenotypes in EMR-based cohorts and to enhance analysis of the genomic basis of human disease.
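The abstract above reports a P-value cutoff tied to a false discovery rate below 0.1 but does not say how that rate was controlled. As a minimal sketch only, assuming the Benjamini-Hochberg step-up procedure (an assumption, not confirmed by the abstract), the function below marks which tests in a list of P values pass at a chosen FDR. The function name `benjamini_hochberg` and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Mark which tests pass the Benjamini-Hochberg step-up procedure.

    Assumption: BH is used here for illustration only; the abstract does
    not state which FDR method the authors applied.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose P value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical P values, not taken from the study.
example = [0.001, 0.2, 0.03, 0.5, 0.04]
flags = benjamini_hochberg(example, fdr=0.1)
```

Because the procedure is step-up, a test can pass even when a smaller P value alone would miss its own threshold, so the cutoff is determined by the largest qualifying rank rather than the first failure.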
SUMMARY FOXA1, estrogen receptor α (ERα) and GATA3 independently predict favorable outcome in breast cancer patients, and their expression correlates with a differentiated, luminal tumor subtype. As transcription factors, each functions in the morphogenesis of various organs, with ERα and GATA3 being established regulators of mammary gland development. Interdependency between these three factors in breast cancer and normal mammary development has been suggested, but the specific role for FOXA1 is not known. Herein, we report that Foxa1 deficiency causes a defect in hormone-induced mammary ductal invasion associated with a loss of terminal end bud formation and ERα expression. By contrast, Foxa1 null glands maintain GATA3 expression. Unlike ERα and GATA3 deficiency, Foxa1 null glands form milk-producing alveoli, indicating that the defect is restricted to expansion of the ductal epithelium, further emphasizing the novel role for FOXA1 in mammary morphogenesis. Using breast cancer cell lines, we also demonstrate that FOXA1 regulates ERα expression, but not GATA3. These data reveal that FOXA1 is necessary for hormonal responsiveness in the developing mammary gland and ERα-positive breast cancers, at least in part, through its control of ERα expression.
Genetic association studies often examine features independently, potentially missing subpopulations with multiple phenotypes that share a single cause. We describe an approach that aggregates phenotypes on the basis of patterns described by Mendelian diseases. We mapped the clinical features of 1204 Mendelian diseases into phenotypes captured from the electronic health record (EHR) and summarized this evidence as phenotype risk scores (PheRSs). In an initial validation, PheRS distinguished cases and controls of five Mendelian diseases. Applying PheRS to 21,701 genotyped individuals uncovered 18 associations between rare variants and phenotypes consistent with Mendelian diseases. In 16 patients, the rare genetic variants were associated with severe outcomes such as organ transplants. PheRS can augment rare-variant interpretation and may identify subsets of patients with distinct genetic causes for common diseases.
Chronic obstructive pulmonary disease (COPD) is a common, complex disease associated with substantial morbidity and mortality. COPD is defined by irreversible airflow obstruction; airflow obstruction is typically determined by reductions in quantitative spirometric indices, including forced expiratory volume at 1 s (FEV(1)) and the ratio of FEV(1) to forced vital capacity (FVC). To identify genetic determinants of quantitative spirometric phenotypes, an autosomal 10-cM genomewide scan of short tandem repeat (STR) polymorphic markers was performed in 72 pedigrees (585 individuals) ascertained through probands with severe early-onset COPD. Multipoint variance-component linkage analysis (using SOLAR) was performed for quantitative phenotypes, including FEV(1), FVC, and FEV(1)/FVC. In the initial genomewide scan, significant evidence for linkage to FEV(1)/FVC was demonstrated on chromosome 2q (LOD score 4.12 at 222 cM). Suggestive evidence was found for linkage to FEV(1)/FVC on chromosomes 1 (LOD score 1.92 at 120 cM) and 17 (LOD score 2.03 at 67 cM) and to FVC on chromosome 1 (LOD score 2.05 at 13 cM). The highest LOD score for FEV(1) in the initial genomewide scan was 1.53, on chromosome 12, at 36 cM. After inclusion of 12 additional STR markers on chromosome 12p, which had been previously genotyped in this population, suggestive evidence for linkage of FEV(1) (LOD score 2.43 at 37 cM) to this region was demonstrated. These observations provide both significant evidence for an early-onset COPD-susceptibility locus on chromosome 2 and suggestive evidence for linkage of spirometry-related phenotypes to several other genomic regions. The significant linkage of FEV(1)/FVC to chromosome 2q could reflect one or more genes influencing the development of airflow obstruction or dysanapsis.
Background: New drugs are routinely screened for acute IKr blocking properties thought to predict QT prolonging and arrhythmogenic liability. However, recent data suggest that chronic (hours) drug exposure to PI3 kinase (PI3K) inhibitors used in cancer can prolong QT by inhibiting potassium currents and increasing late sodium current (INa-L) in cardiomyocytes. We tested the extent to which IKr blockers with known QT liability generate arrhythmias through this pathway. Methods and Results: Acute exposure to dofetilide, an IKr blocker without other recognized electropharmacologic actions, produced no change in ion currents or action potentials in adult mouse cardiomyocytes, which lack IKr. By contrast, 2–48 hours’ exposure to the drug generated arrhythmogenic afterdepolarizations and up to 15-fold increases in INa-L. Including PIP3, a downstream effector for the PI3K pathway, in the pipette inhibited these effects. INa-L was also increased, and inhibitable by PIP3, with hours of dofetilide exposure in human iPSC-derived cardiomyocytes and in CHO cells transfected with SCN5A, encoding INa. Cardiomyocytes from dofetilide-treated mice similarly demonstrated increased INa-L and afterdepolarizations. Other agents with variable IKr blocking potencies and arrhythmia liability produced a range of effects on INa-L, from marked increases (E-4031, d-sotalol, thioridazine, erythromycin) to little or no effect (haloperidol, moxifloxacin, verapamil). Conclusions: Some but not all drugs designated as arrhythmogenic IKr blockers can generate arrhythmias by augmenting INa-L through the PI3K pathway. These data identify a potential mechanism for individual susceptibility to proarrhythmia and highlight the need for a new paradigm to screen drugs for QT prolonging and arrhythmogenic liability.
Background: Electrocardiographic QRS duration, a measure of cardiac intraventricular conduction, varies ~2-fold in individuals without cardiac disease. Slow conduction may promote reentrant arrhythmias. Methods and Results: We performed a genome-wide association study (GWAS) to identify genomic markers of QRS duration in 5,272 individuals without cardiac disease selected from electronic medical record (EMR) algorithms at five sites in the Electronic Medical Records and Genomics (eMERGE) network. The most significant loci were evaluated within the CHARGE consortium QRS GWAS meta-analysis. Twenty-three single nucleotide polymorphisms in 5 loci, previously described by CHARGE, were replicated in the eMERGE samples; 18 SNPs were in the chromosome 3 SCN5A and SCN10A loci, where the most significant SNPs were rs1805126 in SCN5A with p=1.2×10⁻⁸ (eMERGE) and p=2.5×10⁻²⁰ (CHARGE) and rs6795970 in SCN10A with p=6×10⁻⁶ (eMERGE) and p=5×10⁻²⁷ (CHARGE). The other loci were in NFIA, near CDKN1A, and near C6orf204. We then performed phenome-wide association studies (PheWAS) on variants in these five loci in 13,859 European Americans to search for diagnoses associated with these markers. PheWAS identified atrial fibrillation and cardiac arrhythmias as the most common associated diagnoses with SCN10A and SCN5A variants. SCN10A variants were also associated with subsequent development of atrial fibrillation and arrhythmia in the original 5,272 “heart-healthy” study population. Conclusions: We conclude that DNA biobanks coupled to EMRs provide a platform not only for GWAS but may also allow broad interrogation of the longitudinal incidence of disease associated with genetic variants. The PheWAS approach implicated sodium channel variants modulating QRS duration in subjects without cardiac disease as predictors of subsequent arrhythmias.
The use of electronic medical record data linked to biological specimens in health care settings is expected to enable cost-effective and rapid genomic analyses. Here, we present a model that highlights potential advantages for genomic discovery and describe the operational infrastructure that facilitated multiple simultaneous discovery efforts.
While many phenotypes have been associated with variants in human leukocyte antigen (HLA) genes, the full phenotypic impact of HLA variants across all diseases is unknown. We imputed HLA genomic variation from two populations of 28,839 and 8,431 European ancestry individuals and tested association of HLA variation with 1,368 phenotypes. A total of 104 four-digit and 92 two-digit HLA allele-phenotype associations were significant in both discovery and replication cohorts, the strongest being HLA-DQB1*03:02 and type 1 diabetes. Four previously unidentified associations were identified across the spectrum of disease with two- and four-digit HLA alleles and ten with non-synonymous variants. Some conditions associated with multiple HLA variants and stronger associations with more severe disease manifestations were identified. A comprehensive, publicly available catalog of clinical phenotypes associated with HLA variation is provided. Examining HLA variant disease associations in this large dataset allows comprehensive definition of disease associations to drive further mechanistic insights.