Background
Bleeding events are common and critical and may cause significant morbidity and mortality. High incidences of bleeding events are associated with cardiovascular disease in patients on anticoagulant therapy. Prompt and accurate detection of bleeding events is essential to prevent serious consequences. As bleeding events are often described in clinical notes, automatic detection of bleeding events from electronic health record (EHR) notes may improve drug-safety surveillance and pharmacovigilance.
Objective
We aimed to develop a natural language processing (NLP) system to automatically classify whether an EHR note sentence contains a bleeding event.
Methods
We expert annotated 878 EHR notes (76,577 sentences and 562,630 word-tokens) to identify bleeding events at the sentence level. This annotated corpus was used to train and validate our NLP systems. We developed an innovative hybrid convolutional neural network (CNN) and long short-term memory (LSTM) autoencoder (HCLA) model that integrates a CNN architecture with a bidirectional LSTM (BiLSTM) autoencoder model to leverage large unlabeled EHR data.
Results
HCLA achieved the best area under the receiver operating characteristic curve (0.957) and F1 score (0.938) to identify whether a sentence contains a bleeding event, thereby surpassing the strong baseline support vector machines and other CNN and autoencoder models.
Conclusions
By incorporating a supervised CNN model and a pretrained unsupervised BiLSTM autoencoder, the HCLA achieved high performance in detecting bleeding events.
Neural machine translation (NMT) models have achieved state-of-the-art translation quality with a large quantity of parallel corpora available. However, their performance suffers significantly when it comes to domain-specific translations, in which training data are usually scarce. In this paper, we present a novel NMT model with a new word embedding transition technique for fast domain adaption. We propose to split parameters in the model into two groups: model parameters and meta parameters. The former are used to model the translation while the latter are used to adjust the representational space to generalize the model to different domains. We mimic the domain adaptation of the machine translation model to low-resource domains using multiple translation tasks on different domains. A new training strategy based on meta-learning is developed along with the proposed model to update the model parameters and meta parameters alternately. Experiments on datasets of different domains showed substantial improvements of NMT performances on a limited amount of data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.