Background: The text descriptions in electronic medical records are a rich source of information. We have developed a Health Information Text Extraction (HITEx) tool and used it to extract key findings for a research study on airways disease.
Laypersons ("consumers") often have difficulty finding, understanding, and acting on health information due to gaps in their domain knowledge. Ideally, consumer health vocabularies (CHVs) would reflect the different ways consumers express and think about health topics, helping to bridge this vocabulary gap. However, despite the recent research on mismatches between consumer and professional language (e.g., lexical, semantic, and explanatory), there have been few systematic efforts to develop and evaluate CHVs. This paper presents the point of view that CHV development is practical and necessary for extending research on informatics-based tools to facilitate consumer health information seeking, retrieval, and understanding. In support of the view, we briefly describe a distributed, bottom-up approach for (1) exploring the relationship between common consumer health expressions and professional concepts and (2) developing an open-access, preliminary (draft) "first-generation" CHV. While recognizing the limitations of the approach (e.g., not addressing psychosocial and cultural factors), we suggest that such exploratory research and development will yield insights into the nature of consumer health expressions and assist developers in creating tools and applications to support consumer health information seeking.
β-Galactosidase (β-gal) has been widely used as a transgene reporter enzyme, and several substrates are available for its in vitro detection. The ability to image β-gal expression in living animals would further extend the use of this reporter. Here we show that DDAOG, a conjugate of β-galactoside and 7-hydroxy-9H-(1,3-dichloro-9,9-dimethylacridin-2-one) (DDAO), is not only a chromogenic β-gal substrate but that the cleavage product has far-red fluorescence properties detectable by imaging. Importantly, the cleavage product shows a 50-nm red shift, enabling its specific detection in a background of intact probe, a highly desirable feature for in vivo imaging. Specifically, we show that β-gal-expressing 9L gliomas are readily detectable by red fluorescence imaging in comparison with the native 9L gliomas. We furthermore show that herpes simplex virus amplicon-mediated LacZ gene transfer into tumors can be transiently and thus serially visualized over time. The results indicate that in vivo real-time detection of β-gal activity is possible by fluorescence imaging technology.
The Guideline Interchange Format (GLIF) is a model for representation of sharable computer-interpretable guidelines. The current version of GLIF (GLIF3) is a substantial update and enhancement of the model since the previous version (GLIF2). GLIF3 enables encoding of a guideline at three levels: a conceptual flowchart, a computable specification that can be verified for logical consistency and completeness, and an implementable specification that is intended to be incorporated into particular institutional information systems. The representation has been tested on a wide variety of guidelines that are typical of the range of guidelines in clinical use. It builds upon GLIF2 by adding several constructs that enable interpretation of encoded guidelines in computer-based decision-support systems. GLIF3 leverages standards being developed in Health Level 7 in order to allow integration of guidelines with clinical information systems. The GLIF3 specification consists of an extensible object-oriented model and a structured syntax based on the Resource Description Framework (RDF). Empirical validation of the ability to generate appropriate recommendations using GLIF3 has been tested by executing encoded guidelines against actual patient data. GLIF3 is accordingly ready for broader experimentation and prototype use by organizations that wish to evaluate its ability to capture the logic of clinical guidelines, to implement them in clinical systems, and thereby to provide integrated decision support to assist clinicians.
Tens of thousands of subjects may be required to obtain reliable evidence relating disease characteristics to the weak effects typically reported from common genetic variants. The costs of assembling, phenotyping, and studying these large populations are substantial, recently estimated at three billion dollars for 500,000 individuals. They are also decade-long efforts. We hypothesized that automation and analytic tools can repurpose the informational byproducts of routine clinical care, bringing sample acquisition and phenotyping to the same high-throughput pace and commodity price-point as is currently true of genome-wide genotyping. Described here is a demonstration of the capability to acquire samples and data from densely phenotyped and genotyped individuals in the tens of thousands for common diseases (e.g., in a 1-yr period: N = 15,798 for rheumatoid arthritis; N = 42,238 for asthma; N = 34,535 for major depressive disorder) in one academic health center at an order of magnitude lower cost. Even for rare diseases caused by rare, highly penetrant mutations such as Huntington disease (N = 102) and autism (N = 756), these capabilities are also of interest. A common thread in the recent flurry of studies relating characteristics of complex diseases to the generally weak effects of individual genetic variants is that very large numbers of subjects are needed to obtain reproducible results, closer to 200,000 individuals (Manolio et al. 2006) than the few thousand typical of recent publications. The costs of assembling, phenotyping, and studying these huge populations are estimated at three billion dollars for 500,000 individuals (Spivey 2006). Reciprocally, studying rare diseases often requires searching through very large populations, and sufficient sample sizes are hard to achieve. Coincidentally, the United States spends over two trillion dollars in healthcare per year (Catlin et al. 
2008), and of those costs, the total investment in information technology (IT) is at least seven billion dollars per year (Girosi et al. 2005). The stimulus package recently enacted by the U.S. Congress includes a very significant increase in spending on electronic health records, prompting interest in the secondary use of the data gathered in such records. Yet there is widespread, often justified skepticism about our ability to use routinely collected electronic health records (EHRs) for research-quality phenotype data, given the well-known biases and coarse-grained nature of billing/claims diagnoses and procedures (Safran 1991; Jollis et al. 1993). By the same measure, the consistency of phenotypic definitions in large genome-wide association studies (GWAS), especially when they consist of the aggregation of several existing studies, and the consequent effect upon these study results, has been questioned (Ioannidis 2007; Wojczynski and Tiwari 2008; Buyske et al. 2009). To meet these challenges, we have undertaken a series of institutional experiments that collectively demonstrate that automated systems for mining of EHRs are essentia...
Background Dementia is underdiagnosed in both the general population and among Veterans. This underdiagnosis decreases quality of life, reduces opportunities for interventions, and increases health-care costs. New approaches are therefore necessary to facilitate the timely detection of dementia. This study seeks to identify cases of undiagnosed dementia by developing and validating a weakly supervised machine-learning approach that incorporates the analysis of both structured and unstructured electronic health record (EHR) data. Methods A topic modeling approach that included latent Dirichlet allocation, stable topic extraction, and random sampling was applied to VHA EHRs. Topic features from unstructured data and features from structured data were compared between Veterans with (n = 1861) and without (n = 9305) ICD-9 dementia codes. A logistic regression model was used to develop dementia prediction scores, and manual reviews were conducted to validate the machine-learning results. Results A total of 853 features were identified (290 topics, 174 non-dementia ICD codes, 159 CPT codes, 59 medications, and 171 note types) for the development of logistic regression prediction scores. These scores were validated in a subset of Veterans without ICD-9 dementia codes (n = 120) by experts in dementia who performed manual record reviews and achieved a high level of inter-rater agreement. The manual reviews were used to develop a receiver operating characteristic (ROC) curve with different thresholds for case detection, including a threshold of 0.061, which produced an optimal sensitivity (0.825) and specificity (0.832). Conclusions Dementia is underdiagnosed, and thus, ICD codes alone cannot serve as a gold standard for diagnosis. 
However, this study suggests that imperfect data (e.g., ICD codes in combination with other EHR features) can serve as a silver standard to develop a risk model, apply that model to patients without dementia codes, and then select a case-detection threshold. The study is one of the first to utilize both structured and unstructured EHRs to develop risk scores for the diagnosis of dementia.
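The pipeline this abstract describes (topic features from notes via latent Dirichlet allocation, combined with structured-code features, scored by logistic regression, with a case-detection threshold read off a ROC curve) can be sketched roughly as follows. Everything here is a hedged illustration on synthetic toy data: the notes, structured features, and chosen threshold are invented for the example and do not reproduce the study's VHA implementation or its 0.061 threshold.

```python
# Illustrative sketch only: LDA topic features + structured features
# -> logistic regression risk scores -> ROC-based detection threshold.
# All data below is synthetic; no part of this is the study's actual code.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)

# Toy clinical notes (stand-ins for unstructured EHR text).
notes = [
    "memory loss confusion wandering",
    "hypertension diabetes follow up",
    "forgetful disoriented caregiver stress",
    "knee pain physical therapy",
] * 25
labels = np.array([1, 0, 1, 0] * 25)  # 1 = ICD dementia code present

# Unstructured features: per-document topic proportions from LDA.
counts = CountVectorizer().fit_transform(notes)
topics = LatentDirichletAllocation(
    n_components=5, random_state=0
).fit_transform(counts)

# Structured features (e.g., medication / CPT indicators), synthetic here.
structured = rng.integers(0, 2, size=(len(notes), 3))
X = np.hstack([topics, structured])

# Logistic regression turns the combined features into a risk score.
clf = LogisticRegression(max_iter=1000).fit(X, labels)
scores = clf.predict_proba(X)[:, 1]

# Pick a case-detection threshold from the ROC curve; Youden's J
# (tpr - fpr) is one common choice for balancing sensitivity/specificity.
fpr, tpr, thresholds = roc_curve(labels, scores)
best = thresholds[np.argmax(tpr - fpr)]
flagged = scores >= best  # patients to prioritize for manual review
print(f"threshold={best:.3f}, flagged={int(flagged.sum())} of {len(notes)}")
```

In the study itself the threshold was chosen against expert chart review rather than against the training labels, since the ICD codes serve only as a silver standard.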
Genomic alterations of the epidermal growth factor receptor (EGFR) gene play a crucial role in the pathogenesis of glioblastoma multiforme (GBM). By systematic analysis of GBM genomic data, we have identified and characterized a novel exon 27 deletion mutation occurring within the EGFR carboxyl-terminus domain (CTD), and identified further examples of previously reported deletion mutations in this region. We show that the GBM-derived EGFR CTD deletion mutants are able to induce cellular transformation in vitro and in vivo in the absence of ligand and receptor autophosphorylation. Treatment with the EGFR-targeted monoclonal antibody, cetuximab, or the small molecule EGFR inhibitor, erlotinib, effectively impaired tumorigenicity of oncogenic EGFR CTD deletion mutants. Cetuximab in particular prolonged the survival of intracranially xenografted mice with oncogenic EGFR CTD deletion mutants, compared to untreated control mice. Therefore, we propose that erlotinib and especially cetuximab treatment may be a promising therapeutic strategy in GBM patients harboring EGFR CTD deletion mutants.