Patient classification has widespread biomedical and clinical applications, including diagnosis, prognosis, and treatment response prediction. A clinically useful prediction algorithm should be accurate and generalizable, integrate diverse data types, and handle sparse data. A clinical predictor based on genomic data needs to be interpretable to drive hypothesis‐driven research into new treatments. We describe netDx, a novel supervised patient classification framework based on patient similarity networks, which meets these criteria. In a cancer survival benchmark dataset integrating up to six data types in four cancer types, netDx significantly outperforms most other machine‐learning approaches across most cancer types. Compared to traditional machine‐learning‐based patient classifiers, netDx results are more interpretable, visualizing the decision boundary in the context of patient similarity space. When patient similarity is defined by pathway‐level gene expression, netDx identifies biological pathways important for outcome prediction, as demonstrated in breast cancer and asthma. netDx can serve as a patient classifier and as a tool for discovery of biological features characteristic of disease. We provide a free software implementation of netDx with automation workflows.
Contextual word embedding models, such as BioBERT and Bio_ClinicalBERT, have achieved state-of-the-art results in biomedical natural language processing tasks by focusing their pre-training process on domain-specific corpora. However, such models do not take into consideration structured expert domain knowledge from a knowledge base. We introduce UmlsBERT, a contextual embedding model that integrates domain knowledge during the pre-training process via a novel knowledge augmentation strategy. More specifically, the augmentation of UmlsBERT with the Unified Medical Language System (UMLS) Metathesaurus is performed in two ways: (i) connecting words that have the same underlying 'concept' in UMLS and (ii) leveraging semantic type knowledge in UMLS to create clinically meaningful input embeddings. By applying these two strategies, UmlsBERT can encode clinical domain knowledge into word embeddings and outperform existing domain-specific models on common named-entity recognition (NER) and clinical natural language inference tasks.
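The two augmentation ideas named in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the vocabulary, the concept identifiers, the semantic type label, and the function names `concept_targets` and `input_embedding` are invented, and the abstract does not specify how UmlsBERT implements either step. The sketch treats (i) as a multi-hot prediction target that marks every word sharing the masked word's concept, and (ii) as adding a semantic-type vector to the word vector.

```python
# Toy sketch only: words, concept IDs, semantic types, and vectors are invented
# for illustration and are not taken from UMLS or from the UmlsBERT code.
WORD_TO_CONCEPT = {
    "lung": "C_LUNG", "lungs": "C_LUNG", "pulmonary": "C_LUNG",
    "heart": "C_HEART", "cardiac": "C_HEART",
}
WORD_TO_SEMTYPE = {word: "anatomy" for word in WORD_TO_CONCEPT}
VOCAB = ["the", "lung", "lungs", "pulmonary", "heart", "cardiac"]


def concept_targets(masked_word):
    """Multi-hot target over VOCAB: the masked word plus words sharing its concept."""
    concept = WORD_TO_CONCEPT.get(masked_word)
    targets = []
    for word in VOCAB:
        same_word = word == masked_word
        same_concept = concept is not None and WORD_TO_CONCEPT.get(word) == concept
        targets.append(1 if same_word or same_concept else 0)
    return targets


def input_embedding(word, word_vectors, semtype_vectors):
    """Add a semantic-type vector to the word vector; untyped words are unchanged."""
    vector = word_vectors[word]
    semtype = WORD_TO_SEMTYPE.get(word)
    if semtype is None:
        return list(vector)
    return [a + b for a, b in zip(vector, semtype_vectors[semtype])]
```

Under these assumptions, masking "lung" yields a target that also marks "lungs" and "pulmonary", while a word with no concept, such as "the", keeps an ordinary one-hot target.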
Presentation at research conferences serves as a valuable channel for individuals to share new knowledge and advances in a given clinical field. The ultimate goal of a conference presentation is usually subsequent manuscript publication. The process of submitting a manuscript for publication typically requires extensive peer review and revisions (1). Research quality plays a major role in whether a study is accepted by a journal. Therefore, assessing the rate at which presentations at major conferences are published can serve as a surrogate marker of the quality of conference presentations. Insight into publication rate raises the question of which factors, if any, increase the likelihood of a conference abstract ultimately being published in a peer-reviewed journal. For example, past studies have demonstrated publication bias, in which studies with positive or statistically significant results are more likely to be accepted by a journal compared with those with nonsignificant results (2). The publication rate of presentations at major conferences varies depending on the specialty and conference. Publication rates for various national surgical conferences have been shown to range from 36% to 65% (3-6). Other example publication rates include 32% for emergency medicine, 35%-50% for pediatrics, and 50% for cardiology (3, 7). A previous meta-analysis reviewed the number of abstracts that went on to publication for several specialty conferences and reported publication rates that varied from 32% to 67% (3). PURPOSE: We aimed to determine the publication rate and factors predictive of publication of oral presentations at the annual meetings of the Cardiovascular and Interventional Radiology Society of Europe (CIRSE) and the Society of Interventional Radiology (SIR).
METHODS: Keywords and authors from oral presentation abstracts at the 2012 CIRSE and SIR annual meetings were used to search PubMed and Google Scholar for subsequent publication. Logistic regression was performed to identify whether number of authors, country of origin, subject category, methodology, study type, and/or study results were predictive of publication. RESULTS: A total of 421 abstracts (CIRSE-126, SIR-295) met the inclusion criteria. The overall publication rate across both conferences was 44.9%. Time from conference presentation to publication was 15±8.9 months for CIRSE and 16.3±8.8 months for SIR (P > 0.05), with a combined time interval of 15.9±8.8 months for both. The median impact factor of published abstracts was 2.075 (interquartile range, 2.075-2.775) for CIRSE and 2.093 (2.075-2.856) for SIR (P > 0.05). The most common country of origin for published abstracts was Germany (27.1%) at CIRSE and the United States (69%) at SIR. Logistic regression did not identify factors that were predictive of future publication. CONCLUSION: Publication rates were similar for CIRSE and SIR. Factors such as country of origin, topic of study and study results were not predictive of future pub...
There have been many recently published studies exploring machine learning (ML) and deep learning applications within neuroradiology. The improvement in performance of these techniques has resulted in an ever-increasing number of commercially available tools for the neuroradiologist. In this narrative review, recent publications exploring ML in neuroradiology are assessed with a focus on several key clinical domains. In particular, major advances are reviewed in the context of: (1) intracranial hemorrhage detection, (2) stroke imaging, (3) intracranial aneurysm screening, (4) multiple sclerosis imaging, (5) neuro-oncology, (6) head and tumor imaging, and (7) spine imaging.
Patient classification based on clinical and genomic data will further the goal of precision medicine. Interpretability is of particular relevance for models based on genomic data, where sample sizes are relatively small (in the hundreds), increasing overfitting risk. netDx is a machine learning method to integrate multi-modal patient data and build a patient classifier. Patient data are converted into networks of patient similarity, a representation that is intuitive to clinicians, who also use patient similarity for medical diagnosis. Features passing selection are integrated, and new patients are assigned to the class with the greatest profile similarity. netDx has excellent performance, outperforming most machine-learning methods in binary cancer survival prediction. It handles missing data – a common problem in real-world data – without requiring imputation. netDx also has excellent interpretability, with native support to group genes into pathways for mechanistic insight into predictive features. The netDx Bioconductor package provides multiple workflows for users to build custom patient classifiers. It provides turnkey functions for one-step predictor generation from multi-modal data, including feature selection over multiple train/test data splits. Workflows offer versatility through custom feature design and choice of similarity metric; speed is improved by parallel execution. Built-in functions and examples allow users to compute model performance metrics such as AUROC, AUPR, and accuracy. netDx uses RCy3 to visualize top-scoring pathways and the final integrated patient network in Cytoscape. Advanced users can build more complex predictor designs with functional building blocks used in the default design. Finally, the netDx Bioconductor package provides a novel workflow for pathway-based patient classification from sparse genetic data.
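The idea of assigning a new patient to the class with the greatest profile similarity can be illustrated with a minimal sketch. This is not the netDx implementation or its API (netDx is an R/Bioconductor package with feature selection and network integration steps not shown here). The function names, the choice of Pearson correlation as the similarity metric, and the toy data are all assumptions. Missing values, written as `None`, are skipped pairwise rather than imputed.

```python
import math


def similarity(a, b):
    """Pearson correlation over features observed in both profiles (None = missing)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0  # too few shared features to estimate a correlation
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x, _ in pairs))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for _, y in pairs))
    if sd_a == 0 or sd_b == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_a * sd_b)


def classify(new_patient, training):
    """Return (label, scores): the class whose patients are most similar on average.

    `training` maps each class label to a list of patient profiles.
    """
    scores = {
        label: sum(similarity(new_patient, p) for p in profiles) / len(profiles)
        for label, profiles in training.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy profiles: four features per patient, two outcome classes.
training = {
    "good_survival": [[1, 2, 3, 4], [2, 3, 4, 6]],
    "poor_survival": [[4, 3, 2, 1], [5, 3, 2, 0]],
}
label, scores = classify([1, None, 3, 5], training)
```

In this toy example the new patient's profile rises across features, like the "good_survival" profiles, so that class receives the higher mean similarity even though one feature is missing.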