Background: Surveying the scientific literature is an important part of early drug discovery, and with the ever-increasing number of biomedical publications it is imperative to focus on the most interesting articles. Here we present a project that highlights new understanding (e.g. recently discovered modes of action) and identifies potential drug targets, via a novel, data-driven text mining approach to score type 2 diabetes (T2D) relevance. We focused on monitoring trends and jumps in T2D relevance so that we are informed of important breakthroughs in a timely manner. Methods: We extracted over 7 million n-grams from PubMed abstracts and then clustered the roughly 240,000 linked to T2D into almost 50,000 T2D-relevant 'semantic concepts'. To score papers, we weighted the concepts based on co-mentioning with core T2D proteins. A protein's T2D relevance was determined by combining the scores of the papers mentioning it in the five preceding years. Each week all proteins were ranked according to their T2D relevance. Furthermore, the historical distribution of changes in rank from one week to the next was used to calculate the significance of a change in rank by T2D relevance for each protein. Results: We show that T2D-relevant papers, even those not mentioning T2D explicitly, were prioritised by relevant semantic concepts. Well-known T2D proteins were therefore enriched among the top-scoring proteins. Our 'high jumpers' identified important past developments
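The abstract does not say how the significance of a rank change is computed from the historical distribution. One possible reading is an empirical p-value: the fraction of historical week-to-week rank changes that are at least as large as the observed one. The sketch below assumes this reading. The function name `empirical_p_value`, the sign convention (a positive change means the protein moved up the ranking) and the add-one smoothing are illustrative assumptions, not details from the paper.

```python
from bisect import bisect_left


def empirical_p_value(historical_changes, observed_change):
    """Estimate how unusual a rank jump is, given past week-to-week changes.

    Hypothetical convention: change = previous_rank - current_rank,
    so a positive value means the protein moved up the ranking.
    """
    if not historical_changes:
        raise ValueError("need at least one historical rank change")
    ordered = sorted(historical_changes)
    # Count historical changes that are >= the observed change.
    n_at_least = len(ordered) - bisect_left(ordered, observed_change)
    # Add-one smoothing (an assumption) keeps the estimate above zero.
    return (n_at_least + 1) / (len(ordered) + 1)


# Toy data, invented for illustration only.
history = [0, 1, -1, 2, -2, 0, 3, -3, 1, 50]
p_big_jump = empirical_p_value(history, 50)
p_no_move = empirical_p_value(history, 0)
```

Under this reading, a 'high jumper' would be a protein whose weekly rank change receives a small p-value. The threshold for "small" is a separate choice that the abstract does not state.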
Background: Medication errors have been identified as the most common preventable cause of adverse events. The lack of granularity in medication error terminology has led pharmacovigilance experts to rely on information in individual case safety reports' (ICSRs) codes and narratives for signal detection, which is both time-consuming and labour-intensive. Thus, there is a need for complementary methods for the detection of medication errors from ICSRs. The aim of this study is to evaluate the utility of two natural language processing text mining methods as complementary tools to the traditional approach followed by pharmacovigilance experts for medication error signal detection. Methods: The safety surveillance advisor (SSA) method, I2E text mining and University of Copenhagen Center for Protein Research (CPR) text mining were evaluated for their ability to extract cases containing a type of medication error in which patients extracted insulin from a prefilled pen or cartridge with a syringe. A total of 154,209 ICSRs were retrieved from Novo Nordisk's safety database from January 1987 to February 2018. Each method was evaluated by recall (sensitivity) and precision (positive predictive value). Results: We manually annotated 2533 ICSRs to investigate whether these contained the sought medication error. All these ICSRs were then analysed using the three methods. The recall was 90.4, 88.1 and 78.5% for the CPR text mining, the SSA method and the I2E text mining, respectively. Precision was low for all three methods, ranging from 3.4% for the SSA method to 1.9 and 1.6% for the CPR and I2E text mining methods, respectively.
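Recall and precision, as used above, follow their standard definitions: recall is the share of true cases that a method flags, and precision is the share of flagged cases that are true. The sketch below is a minimal illustration with invented report identifiers. The function name `precision_recall` and the toy data are assumptions for the example and do not come from the study.

```python
def precision_recall(flagged, true_cases):
    """Return (precision, recall) for a set of flagged report IDs.

    flagged    : IDs a method marked as containing the medication error
    true_cases : IDs that manual annotation confirmed as containing it
    """
    flagged, true_cases = set(flagged), set(true_cases)
    true_positives = len(flagged & true_cases)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(true_cases) if true_cases else 0.0
    return precision, recall


# Toy identifiers, invented for illustration only.
precision, recall = precision_recall(flagged={1, 2, 3, 4}, true_cases={2, 4, 5})
```

High recall combined with low precision, the pattern reported in the results, means a method finds most of the true cases but also flags many reports that do not contain the error.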