The lack of standardized structure names in radiotherapy (RT) data limits interoperability, data sharing, and the ability to perform big data analysis. To standardize radiotherapy structure names, we developed an integrated natural language processing (NLP) and machine learning (ML) based system that can map physician-given structure names to American Association of Physicists in Medicine (AAPM) Task Group 263 (TG-263) standard names. The dataset consists of 794 prostate and 754 lung cancer patients across the 40 different radiation therapy centers managed by the Veterans Health Administration (VA). Additionally, data from the Radiation Oncology department at Virginia Commonwealth University (VCU) were collected to serve as a test set. Domain experts identified nine prostate and ten lung organs-at-risk (OAR) structures as anatomically significant and manually labeled them according to the TG-263 standard; the remaining structures were labeled as Non_OAR. We experimented with six different classification algorithms and three feature vector methods, and the final model was built with the fastText algorithm. Multiple validation techniques were used to assess the robustness of the proposed methodology. The macro-averaged F1 score was used as the main evaluation metric. The model achieved an F1 score of 0.97 for prostate structures and 0.99 for lung structures from the VA dataset. The model also performed well on the test (VCU) dataset, achieving an F1 score of 0.93 for prostate structures and 0.95 for lung structures. In this work, we demonstrate that NLP and ML based approaches can be used to standardize physician-given RT structure names with high fidelity. This standardization can help with big data analytics in the radiation therapy domain using population-derived datasets, including standardization of the treatment planning process, clinical decision support systems, treatment quality improvement programs, and hypothesis-driven clinical research.
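To make the evaluation metric in the abstract above concrete, the following is a minimal sketch of how a macro-averaged F1 score can be computed: the F1 score of each class is calculated separately and the results are averaged without weighting by class frequency. The function name `macro_f1` and the toy label lists are assumptions for illustration only; they are not the authors' code or data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative (made-up) labels, not taken from the VA or VCU datasets.
y_true = ["Bladder", "Rectum", "Non_OAR", "Non_OAR", "Bladder", "Rectum"]
y_pred = ["Bladder", "Rectum", "Non_OAR", "Bladder", "Bladder", "Non_OAR"]
score = macro_f1(y_true, y_pred)
print(f"macro-averaged F1: {score:.3f}")
```

Because every class contributes equally to the average, a frequent class such as Non_OAR cannot mask poor performance on a rare structure, which is one plausible reason to prefer this metric when classes are imbalanced.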
The Radiotherapy Incident Reporting and Analysis System (RIRAS) receives incident reports from Radiation Oncology facilities across the US Veterans Health Affairs (VHA) enterprise and Virginia Commonwealth University (VCU). In this work, we propose a computational pipeline for the analysis of radiation oncology incident reports. Our pipeline uses machine learning (ML) and natural language processing (NLP) based methods to predict the severity of the incidents reported in the RIRAS platform from the textual description of each incident. Incidents in RIRAS are reviewed by a radiation oncology subject matter expert (SME), who initially triages some incidents based on the salient elements in the incident report. To automate the triage process, we used data from the VHA treatment centers and the VCU radiation oncology department. We used NLP combined with traditional ML algorithms, including a support vector machine (SVM) with a linear kernel, and compared it against a transfer learning approach using the universal language model fine-tuning (ULMFiT) algorithm. In RIRAS, severities are divided into four categories, A, B, C, and D, with A being the most severe and D the least. In this work, we built models to predict High (A & B) vs. Low (C & D) severity rather than all four categories. Models were evaluated with macro-averaged precision, recall, and F1-Score. The traditional ML (SVM-linear) approach did well on the VHA dataset with 0.78 F1-Score but performed poorly on the VCU dataset with 0.5 F1-Score. The transfer learning approach did well on both datasets, with 0.81 F1-Score on the VHA dataset and 0.68 F1-Score on the VCU dataset. Overall, our methods show promise in automating the triage and severity determination process from radiotherapy incident reports.
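The grouping of the four RIRAS severity categories into a binary target, as described in the abstract above, can be sketched as a simple lookup. This is only an illustration under stated assumptions: the names `SEVERITY_GROUP` and `binarize_severity` and the example report texts are hypothetical and do not come from the RIRAS platform or the authors' pipeline.

```python
# Assumed mapping, following the abstract: A and B are High, C and D are Low.
SEVERITY_GROUP = {"A": "High", "B": "High", "C": "Low", "D": "Low"}


def binarize_severity(category):
    """Map a four-level severity category (A-D) to 'High' or 'Low'."""
    key = category.strip().upper()
    if key not in SEVERITY_GROUP:
        raise ValueError(f"unknown severity category: {category!r}")
    return SEVERITY_GROUP[key]


# Hypothetical incident descriptions with made-up severity labels.
reports = [
    ("Wrong treatment site selected before delivery", "A"),
    ("Plan approved without second physics check", "B"),
    ("Patient appointment rescheduled twice", "C"),
    ("Typographical error in setup note", "D"),
]

# Pair each report text with its binary training label.
binary_reports = [(text, binarize_severity(sev)) for text, sev in reports]
for text, label in binary_reports:
    print(f"{label:>4}: {text}")
```

Collapsing four categories into two leaves more training examples per class, at the cost of losing the distinction within each group.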
Rigorous radiotherapy quality surveillance and comprehensive outcome assessment require electronic capture and automatic abstraction of clinical, radiation treatment planning, and delivery data. We present the design and implementation framework of an integrated data abstraction, aggregation, storage, curation, and analytics software: the Health Information Gateway and Exchange (HINGE), which collates data for cancer patients receiving radiotherapy. The HINGE software abstracts structured DICOM-RT data from the treatment planning system (TPS), treatment data from the treatment management system (TMS), and clinical data from the electronic health records (EHRs). HINGE has disease site-specific "Smart" templates that facilitate the entry of relevant clinical information by physicians and clinical staff in a discrete manner as part of routine clinical documentation. Radiotherapy data abstracted from these disparate sources and the smart templates are processed for quality and outcome assessment. The predictive data analyses are performed using well-defined clinical and dosimetry quality measures defined by disease site experts in radiation oncology. The HINGE application software connects seamlessly to the local IT/medical infrastructure via interfaces and cloud services and performs data extraction and aggregation functions without human intervention. It provides tools to assess variations in radiation oncology practices and outcomes and determines gaps in the radiotherapy quality delivered by each provider.
KEYWORDS: big data in radiation oncology, quality surveillance
1 | INTRODUCTION
Advanced technologies in health care are bringing a sharper focus on clinical outcome assessment and the assessment of health care quality. Manual abstraction, collation, and subsequent analysis of health care quality from patient treatment and outcome data are onerous, expensive, and impractical.
Advances in computer storage, computing power, and the ability to electronically mine data from disparate sources (e.g., demographics, genetics, imaging, treatment, clinical decisions, and outcomes) have enabled big data research in medicine. The evolution of several initiatives in the realm of interconnectivity of health care data sources and the availability of advanced computing frameworks have opened doors for answering a broad array of questions related to quality, safety, and outcomes of
Alzheimer’s disease (AD) and Parkinson’s disease (PD) are the most common neurodegenerative disorders related to aging. Though several risk factors are shared between these two diseases, the exact relationship between them is still unknown. In this paper, we analyzed how these two diseases relate to each other from the genomic, epigenomic, and transcriptomic viewpoints. Using extensive literature mining, we first compiled the list of genes from major genome-wide association studies (GWAS). Based on these studies, we observed that only one gene (HLA-DRB5) was shared between AD and PD. A subsequent literature search identified a few other genes involved in these two diseases, among which SIRT1 seemed to be the most prominent one. While we listed all the miRNAs that have been previously reported for AD and PD separately, we found only 15 different miRNAs that were reported in both diseases. To gain better insight, we predicted the gene co-expression network for both AD and PD using network analysis algorithms applied to two GEO datasets. The network analysis revealed six clusters of genes related to AD and four clusters of genes related to PD; however, there was very low functional similarity between these clusters, pointing to insignificant similarity between AD and PD even at the level of affected biological processes. Finally, we postulated the putative epigenetic regulator modules that are common to AD and PD.