Detecting negation and uncertainty is crucial for medical text mining applications; otherwise, extracted information can be incorrectly identified as real or factual events. Although several approaches have been proposed to detect negation and uncertainty in clinical texts, most efforts have focused on the English language. Most proposals developed for Spanish have focused mainly on negation detection and do not deal with uncertainty. In this paper, we propose a deep learning-based approach for both negation and uncertainty detection in clinical texts written in Spanish. The proposed approach explores two deep learning methods to achieve this goal: (i) Bidirectional Long Short-Term Memory with a Conditional Random Field layer (BiLSTM-CRF) and (ii) Bidirectional Encoder Representations from Transformers (BERT). The approach was evaluated using NUBES and IULA, two public corpora for the Spanish language. It achieved F-scores of 92% and 80% in the scope recognition task for negation and uncertainty, respectively. We also present the results of a validation process conducted using a real-life annotated dataset from clinical notes belonging to cancer patients. The proposed approach shows the feasibility of deep learning-based methods to detect negation and uncertainty in Spanish clinical texts. Experiments also showed that this approach improves performance in the scope recognition task compared to other proposals in the biomedical domain.
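To make the scope recognition metric concrete, the following is a minimal sketch of a token-level F-score for one scope label. It is an illustration only: the abstract does not say whether the authors score scopes per token or per exact span, and the sentence, the label names (`O`, `CUE`, `NEG`) and the function `scope_f_score` are assumptions introduced here.

```python
from typing import Sequence


def scope_f_score(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """Token-level F-score for a single scope label (assumed scoring scheme)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no correct tokens: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentence: "Paciente sin fiebre ni tos" (patient without fever or cough).
# "sin" is the negation cue; "fiebre ni tos" is the gold negation scope.
tokens = ["Paciente", "sin", "fiebre", "ni", "tos"]
gold = ["O", "CUE", "NEG", "NEG", "NEG"]
pred = ["O", "CUE", "NEG", "NEG", "O"]  # the tagger misses the last scope token

score = scope_f_score(gold, pred, "NEG")
print(f"NEG scope F-score: {score:.2f}")
```

Here precision is 2/2 and recall is 2/3, so the F-score is 0.80. An exact-match scheme would instead count this whole scope as an error.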
Stream mining is a set of cutting-edge techniques designed to process data streams in real time in order to extract knowledge. In the particular case of classification, stream mining has to adapt its behavior to volatile underlying data distributions, a phenomenon known as concept drift. Concept drift may lead to situations where predictive models become invalid and therefore have to be updated to represent the concepts currently present in the data. There is a specific type of concept drift, known as recurrent concept drift, in which the concepts represented by the data have already appeared in the past. In those cases the learning process can be avoided, or at least minimized, by applying a previously trained model. To deal with this scenario, meta-models can be used to enhance the drift detection mechanisms of data stream algorithms by representing and predicting when the change will occur. There are real-world situations where a concept reappears, as in intrusion detection systems (IDS), where the same incidents or adaptations of them usually reappear over time. In these environments, early prediction of drift based on better knowledge of past models can help to anticipate the change, thus reducing the number of training instances the model needs. Furthermore, as a complement to meta-models, a mechanism to assess the similarity between classification models is also needed when dealing with recurrent concepts. When reusing a previously trained model, a rough comparison between concepts is usually made by applying Boolean logic. Introducing fuzzy logic comparisons between models could lead to more efficient reuse of previously seen concepts, by applying not just equal models but also similar ones.
This work addresses these open issues by means of the MM-PRec system, which integrates a meta-model mechanism and a fuzzy similarity function. The theoretical proposal of MM-PRec is also validated in this paper through experiments using both synthetic and real datasets.
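The difference between Boolean and fuzzy model comparison can be sketched as follows. This is not the MM-PRec similarity function, which the abstract does not specify. It assumes, purely for illustration, that similarity is the fraction of a reference sample on which two classifiers agree; the names `similarity` and `find_reusable` and the toy threshold classifiers are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Instance = Tuple[float, ...]
Model = Callable[[Instance], int]


def similarity(a: Model, b: Model, sample: Sequence[Instance]) -> float:
    """Assumed similarity: share of sample instances on which both models agree."""
    agree = sum(1 for x in sample if a(x) == b(x))
    return agree / len(sample)


def find_reusable(current: Model, repository: Dict[str, Model],
                  sample: Sequence[Instance], threshold: float) -> Optional[str]:
    """Return the most similar stored model whose similarity reaches the threshold."""
    best_name, best_score = None, -1.0
    for name, stored in repository.items():
        s = similarity(current, stored, sample)
        if s >= threshold and s > best_score:
            best_name, best_score = name, s
    return best_name


# Toy one-dimensional classifiers standing in for trained stream models.
current = lambda x: int(x[0] > 0.5)
repository = {
    "past_a": lambda x: int(x[0] > 0.65),  # close to the current concept
    "past_b": lambda x: int(x[0] > 0.95),  # a different concept
}
sample = [(i / 10,) for i in range(10)]

boolean_match = find_reusable(current, repository, sample, threshold=1.0)
fuzzy_match = find_reusable(current, repository, sample, threshold=0.8)
print(boolean_match, fuzzy_match)
```

With a threshold of 1.0, which corresponds to the Boolean "equal models only" comparison, nothing is reused. Relaxing the threshold to 0.8 lets the similar model `past_a` be reused instead of training from scratch.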
Background: Since the onset of the pandemic, the unCoVer network has been identifying real-world data from electronic medical records (EMR) of patients hospitalised with COVID-19 across countries. These heterogeneous data are integrated into a multi-user data repository operated through Opal/DataSHIELD, an interoperable open-source server application that provides privacy-preserving access to individual-level information for federated data analyses.
Methods: unCoVer's federated data platform provided access to EMR collected between 02/2020 and 04/2022 from 6 hospitals in Bosnia and Herzegovina (1), Romania (2), Spain (2), and Turkey (1), for a total of 14,236 patients. Demographics and comorbidities at admission, length of hospital stay, and intensive care unit (ICU) needs are presented according to the patients' status at discharge.
Results: A total of 11,248 (79.0%) of all patients reviewed recovered from COVID-19 after an average of 11.5 (SD 10.8) days in hospital, with only 4.09% of them needing ICU. A smaller proportion of patients were transferred (5.93%), and 2143 (15.1%) died in hospital after an average of 11.6 (SD 10.5) days, most of them (81.2%) having needed ICU. Recovered patients had a mean age of 57.7 (SD 16.3) years and were evenly split by sex (51.2% men), whereas deceased patients had a mean age of 74.2 (SD 12.4) years (59.7% men). Current smoking was infrequent among both recovered and deceased patients (3.27% and 2.83%, respectively). Cardiometabolic conditions were less commonly reported among patients who later recovered than among deceased patients: obesity (10.7% vs 12.1%), diabetes (15.9% vs 27.4%), hypertension (23.2% vs 42.7%), and CVD (9.33% vs 44.9%). Chronic pulmonary disease was also more frequent among deceased patients (10.3% vs 18.1%).
Conclusions: Characteristics of hospitalised COVID-19 patients differ according to outcome at discharge, with more in-hospital deaths reported among older patients with chronic conditions across 6 hospitals in 4 countries.
Key messages
• Federated analyses provide unique opportunities for robust results through privacy-preserving access to individual-level data from heterogeneous data sources.
• The unCoVer network aims to demonstrate the usability of the infrastructure to address research questions related to COVID-19 while extending the concept to other clinical areas.
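The idea behind a federated summary can be illustrated with a minimal sketch in which each site shares only aggregate counts and sums, and a coordinator combines them into a pooled mean and standard deviation. This does not use or reproduce the Opal/DataSHIELD API. The class `SiteSummary`, the functions `summarise` and `pooled_mean_sd`, and the age values are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates a site is assumed to share instead of patient-level rows."""
    n: int
    total: float
    total_sq: float


def summarise(values: List[float]) -> SiteSummary:
    """Runs locally at a site; only the returned aggregates leave the site."""
    return SiteSummary(len(values), sum(values), sum(v * v for v in values))


def pooled_mean_sd(summaries: Iterable[SiteSummary]) -> Tuple[float, float]:
    """Combine per-site aggregates into an overall mean and sample SD."""
    parts = list(summaries)
    n = sum(s.n for s in parts)
    total = sum(s.total for s in parts)
    total_sq = sum(s.total_sq for s in parts)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(max(variance, 0.0))  # guard against rounding below zero


# Hypothetical ages held at two sites; the values are made up for illustration.
site_a = [52.0, 61.0, 70.0]
site_b = [48.0, 66.0]

mean, sd = pooled_mean_sd([summarise(site_a), summarise(site_b)])
print(f"pooled mean {mean:.1f}, SD {sd:.2f}")
```

The pooled result equals what a direct computation over all five values would give, although the coordinator never sees an individual value. A real deployment would add disclosure controls, such as minimum cell counts, which this sketch omits.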