Motivation: Emergence of genetic data coupled to longitudinal electronic medical records (EMRs) offers the possibility of phenome-wide association scans (PheWAS) for disease–gene associations. We propose a novel method to scan phenomic data for genetic associations using International Classification of Disease (ICD9) billing codes, which are available in most EMR systems. We have developed a code translation table to automatically define 776 different disease populations and their controls using prevalent ICD9 codes derived from EMR data. As a proof of concept of this algorithm, we genotyped the first 6005 European–Americans accrued into BioVU, Vanderbilt's DNA biobank, at five single nucleotide polymorphisms (SNPs) with previously reported disease associations: atrial fibrillation, Crohn's disease, carotid artery stenosis, coronary artery disease, multiple sclerosis, systemic lupus erythematosus and rheumatoid arthritis. The PheWAS software generated case and control populations across all ICD9 code groups for each of these five SNPs, and disease–SNP associations were analyzed. The primary outcome of this study was replication of seven previously known SNP–disease associations for these SNPs.
Results: Four of seven known SNP–disease associations were replicated using the PheWAS algorithm, with P-values between 2.8 × 10⁻⁶ and 0.011. The PheWAS algorithm also identified 19 previously unknown statistical associations between these SNPs and diseases at P < 0.01. This study indicates that PheWAS analysis is a feasible method to investigate SNP–disease associations. Further evaluation is needed to determine the validity of these associations and the appropriate statistical thresholds for clinical significance.
Availability: The PheWAS software and code translation table are freely available at http://knowledgemap.mc.vanderbilt.edu/research.
Contact: josh.denny@vanderbilt.edu
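The case/control definition described in the abstract above can be illustrated with a minimal sketch. It is not the published PheWAS software: the three-row translation table, the prefix-matching rule, the "everyone without the phenotype is a control" rule, and the toy patient records are all assumptions made for illustration.

```python
# Hypothetical translation table: ICD9 code prefix -> phenotype group.
# The abstract describes 776 groups; these three rows are illustrative only.
CODE_TO_PHENOTYPE = {
    "427.31": "atrial fibrillation",
    "555": "Crohn's disease",
    "714.0": "rheumatoid arthritis",
}


def phenotypes_for(codes):
    """Map a patient's ICD9 codes to the set of phenotype groups they match."""
    hits = set()
    for code in codes:
        for prefix, phenotype in CODE_TO_PHENOTYPE.items():
            if code.startswith(prefix):
                hits.add(phenotype)
    return hits


def allelic_odds_ratio(patients, phenotype):
    """Odds of the minor allele in cases relative to controls.

    patients: list of (icd9_codes, minor_allele_count), count in {0, 1, 2}.
    Assumption: every patient without the phenotype serves as a control.
    Returns None when any cell of the 2x2 allele table is zero.
    """
    counts = {True: [0, 0], False: [0, 0]}  # is_case -> [minor, major]
    for codes, minor in patients:
        is_case = phenotype in phenotypes_for(codes)
        counts[is_case][0] += minor
        counts[is_case][1] += 2 - minor
    (a, b), (c, d) = counts[True], counts[False]
    if 0 in (a, b, c, d):
        return None
    return (a * d) / (b * c)


# Toy records, invented for this example (not study data).
patients = [
    (["427.31"], 2),
    (["427.31", "401.9"], 1),
    (["401.9"], 0),
    (["250.00"], 1),
    (["555.9"], 0),
]

print(allelic_odds_ratio(patients, "atrial fibrillation"))  # 15.0
```

A full scan would repeat this for every phenotype group and every SNP, and would replace the bare odds ratio with a significance test.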
Candidate gene and genome-wide association studies (GWAS) have identified genetic variants that modulate risk for human disease; many of these associations require further study to replicate the results. Here we report the first large-scale application of the phenome-wide association study (PheWAS) paradigm within electronic medical records (EMRs), an unbiased approach to replication and discovery that interrogates relationships between targeted genotypes and multiple phenotypes. We scanned for associations between 3,144 single-nucleotide polymorphisms (previously implicated by GWAS as mediators of human traits) and 1,358 EMR-derived phenotypes in 13,835 individuals of European ancestry. This PheWAS replicated 66% (51/77) of sufficiently powered prior GWAS associations and revealed 63 potentially pleiotropic associations with P < 4.6 × 10⁻⁶ (false discovery rate < 0.1); the strongest of these novel associations were replicated in an independent cohort (n = 7,406). These findings validate PheWAS as a tool to allow unbiased interrogation across multiple phenotypes in EMR-based cohorts and to enhance analysis of the genomic basis of human disease.
The promise of “personalized medicine” guided by an understanding of each individual’s genome has been fostered by increasingly powerful and economical methods to acquire clinically relevant features. We describe operational implementation of prospective genotyping linked to an advanced clinical decision support system to guide individualized healthcare in a large academic health center. This approach to personalized medicine includes patient and healthcare provider engagement, identifying relevant genetic variation for implementation, assay reliability, point-of-care decision support, and necessary institutional investments. In one year, approximately 3,000 patients, most scheduled for cardiac catheterization, were genotyped on a multiplexed platform including CYP2C19 variants that modulate response to the widely-used antiplatelet drug clopidogrel. These data are deposited into the Electronic Medical Record and point-of-care decision support is deployed when clopidogrel is prescribed for those with variant genotypes. The establishment of programs such as this is a first step toward implementing and evaluating strategies for personalized medicine.
Since September 2010, over 10,000 patients have undergone preemptive, panel-based pharmacogenomic testing through the Vanderbilt Pharmacogenomic Resource for Enhanced Decisions in Care and Treatment (PREDICT) program. Analysis of the genetic data from the first 9,589 individuals reveals that the frequency of genetic variants is concordant with published allele frequencies. Based on five currently implemented drug-genome interactions, the multiplexed test identified one or more actionable variants in 91% of the genotyped patients and in 96% of African-American patients. Using medication exposure data from electronic medical records, we compared a theoretical “reactive,” prescription-triggered, serial single-gene testing strategy to our preemptive, multiplexed genotyping approach. Reactive genotyping would have generated 14,656 genetic tests. These data highlight three advantages of preemptive genotyping: 1) the vast majority of patients carry at least one pharmacogene variant; 2) data are available at the point of care; and 3) there is a substantial reduction in testing burden compared to a reactive strategy.
The authors designed ResearchMatch, a disease-neutral, web-based recruitment registry to help match individuals who wish to participate in clinical research studies with researchers actively searching for volunteers throughout the United States. In this article, they describe ResearchMatch’s stakeholders, workflow model, technical infrastructure, and, for the registry’s first 19 months of operation, utilization metrics. Having launched volunteer registration tools in November 2009 and researcher registration tools in March 2010, ResearchMatch had, as of June 2011, registered 15,871 volunteer participants from all 50 states. The registry was created as a collaborative project for institutions in the Clinical and Translational Science Awards (CTSA) consortium. Also as of June 2011, a total of 751 researchers from 61 participating CTSA institutions had registered to use the tool to recruit participants into 540 active studies and trials. ResearchMatch has proven successful in connecting volunteers with researchers, and the authors are currently evaluating regulatory and workflow options to open access to researchers at non-CTSA institutions.
Large-scale DNA databanks linked to electronic medical record (EMR) systems have been proposed as an approach for rapidly generating large, diverse cohorts for discovery and replication of genotype-phenotype associations. However, the extent to which such resources are capable of delivering on this promise is unknown. We studied whether an EMR-linked DNA biorepository can be used to detect known genotype-phenotype associations for five diseases. Twenty-one SNPs previously implicated as common variants predisposing to atrial fibrillation, Crohn disease, multiple sclerosis, rheumatoid arthritis, or type 2 diabetes were successfully genotyped in 9483 samples accrued over 4 mo into BioVU, the Vanderbilt University Medical Center DNA biobank. Previously reported odds ratios (OR(PR)) ranged from 1.14 to 2.36. For each phenotype, natural language processing techniques and billing-code queries were used to identify cases (n = 70-698) and controls (n = 808-3818) from deidentified health records. Each of the 21 tests of association yielded point estimates in the expected direction. Previous genotype-phenotype associations were replicated (p < 0.05) in 8/14 cases when the OR(PR) was > 1.25, and in 0/7 with lower OR(PR). Statistically significant associations were detected in all analyses that were adequately powered. In each of the five diseases studied, at least one previously reported association was replicated. These data demonstrate that phenotypes representing clinical diagnoses can be extracted from EMR systems, and they support the use of DNA resources coupled to EMR systems as tools for rapid generation of large data sets required for replication of associations found in research cohorts and for discovery in genome science.
BioVU, the Vanderbilt DNA Databank, is one of the few biobanks that qualify as non‐human subjects research as determined by the local IRB and the federal Office of Human Research Protections (OHRP). BioVU accrues DNA samples extracted from leftover blood remaining from routine clinical testing. The resource is linked to a de‐identified version of data extracted from an Electronic Medical Record (EMR) system, termed the Synthetic Derivative (SD), in which all personal identifiers have been removed. Thus, there is no identifiable private information attached to the records. The Belmont Report enumerates the importance of the boundary between practice and research, and three principles: Respect for Persons, Beneficence, and Justice, which constitute the essential ethical framework by which IRBs and ethics committees judge the risks and benefits of research involving human subjects. BioVU was developed by designing and implementing new procedures, for which there were no previously established methods, that are consistent with the principles of the Belmont Report. These included special oversight and governance, new informatics technologies, provisions to accommodate patients’ preferences, as well as an extensive public education and communications component. Consideration of core principles and protections in the practical implementation of BioVU is the focus of this paper. Clin Trans Sci 2010; Volume #: 1–7
Routine integration of genotype data into drug decision-making could improve patient safety, particularly if many relevant genetic variants can be assayed simultaneously before target drug prescribing. The frequency of pharmacogenetic prescribing opportunities and the potential adverse events (AE) mitigated are unknown. We examined the frequency with which 56 medications with known outcomes influenced by variant alleles were prescribed in a cohort of 52,942 medical home patients at Vanderbilt University Medical Center. Within a five-year window, we estimated that 64.8% (95% CI: 64.4%-65.2%) of individuals were exposed to at least one medication with an established pharmacogenetic association. Using previously published results for six medications with well-characterized, severe genetically-linked AEs, we estimated that 398 events (95% CI, 225 - 583) could have been prevented with an effective preemptive genotyping program. Our results suggest that multiplexed, preemptive genotyping may represent an efficient alternative approach to current single use (“reactive”) methods and may improve safety.
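As a quick arithmetic check on the exposure estimate above, the reported interval can be reproduced with a normal-approximation (Wald) interval for a proportion. The abstract does not say which interval method the authors used, so the choice of the Wald formula here is an assumption; the function name `wald_ci` is likewise hypothetical.

```python
import math


def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    return p_hat - z * se, p_hat + z * se


# 64.8% of 52,942 patients exposed, as reported in the abstract.
low, high = wald_ci(0.648, 52942)
print(f"{low:.1%} to {high:.1%}")  # 64.4% to 65.2%
```

The narrow interval follows directly from the cohort size: with n near 53,000 the standard error is about 0.2 percentage points.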