Abstract: Detecting negated concepts in clinical texts is an important part of NLP information extraction systems. However, the generalizability of negation systems is lacking, as cross-domain experiments suffer dramatic performance losses. We examine the performance of multiple unsupervised domain adaptation algorithms on clinical negation detection, finding only modest gains that fall well short of in-domain performance.
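One common baseline in the family of unsupervised domain adaptation methods the abstract evaluates is self-training: fit a classifier on labelled source-domain data, pseudo-label unlabelled target-domain data, keep only the confident predictions, and refit. The sketch below is a generic, hypothetical illustration of that idea; the classifier choice, confidence threshold, and round count are assumptions, not the specific algorithms from the paper.

```python
# Generic self-training sketch for unsupervised domain adaptation
# (a simple member of the method family the abstract evaluates;
# hypothetical, not the paper's algorithms).
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_src, y_src, X_tgt, threshold=0.9, rounds=3):
    """Fit on labelled source data, then iteratively add confidently
    pseudo-labelled target-domain examples and refit."""
    clf = LogisticRegression(max_iter=1000).fit(X_src, y_src)
    for _ in range(rounds):
        proba = clf.predict_proba(X_tgt)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # nothing confident enough to pseudo-label
        X_aug = np.vstack([X_src, X_tgt[confident]])
        y_aug = np.concatenate([y_src, proba[confident].argmax(axis=1)])
        clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
    return clf

# Toy usage with random features standing in for negation-context features.
rng = np.random.default_rng(0)
X_src, y_src = rng.normal(size=(40, 5)), rng.integers(0, 2, 40)
X_tgt = rng.normal(size=(30, 5)) + 0.5  # shifted "target domain"
model = self_train(X_src, y_src, X_tgt)
```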
“…For the in-domain but out-of-sample case, a domain fine-tuned rule-based system seems to transfer well (Sykes et al., 2020). For all other cases, transfer is challenging, both for rule-based and machine-learning models (Wu et al., 2014; Miller et al., 2017; Sykes et al., 2020), with machine learning models benefiting from the addition of in-domain data to the training set. Lin et al. (2020) demonstrate that a pretrained BERT model can improve the results of domain transfer for negation detection, but the results are still lower for out-of-domain datasets than in-domain datasets if we compare to the results of earlier models in Miller et al. (2017).…”
Section: Related Work (mentioning, confidence: 99%)
“…Despite the amount of progress on negation detection for clinical texts, however, there is still ample evidence that while fitting systems on a particular dataset is straightforward, generalising negation detection across datasets is challenging (Wu et al., 2014). This is true both for out-of-domain evaluation, such as training on a dataset of medical articles with evaluation on a dataset of clinical text (Wu et al., 2014; Miller et al., 2017), and for out-of-sample evaluation, where the training and test datasets are from the same domain but may have differences due to annotation style or the distribution of named entities (Sykes et al., 2020). For the in-domain but out-of-sample case, a domain fine-tuned rule-based system seems to transfer well (Sykes et al., 2020).…”
Section: Related Work (mentioning, confidence: 99%)
“…For all other cases, transfer is challenging, both for rule-based and machine-learning models (Wu et al., 2014; Miller et al., 2017; Sykes et al., 2020), with machine learning models benefiting from the addition of in-domain data to the training set. Lin et al. (2020) demonstrate that a pretrained BERT model can improve the results of domain transfer for negation detection, but the results are still lower for out-of-domain datasets than in-domain datasets if we compare to the results of earlier models in Miller et al. (2017). In our work we concur with previous findings: our neural models do not generalise negation detection across datasets, despite both datasets comprising radiology reports with stroke findings, such as acute ischemic stroke (AIS).…”
We present an in-depth comparison of three clinical information extraction (IE) systems designed to perform entity recognition and negation detection on brain imaging reports: EdIE-R, a bespoke rule-based system, and two neural network models, EdIE-BiLSTM and EdIE-BERT, multi-task learning models with a BiLSTM and a BERT encoder, respectively. We compare our models on both an in-sample and an out-of-sample dataset containing mentions of stroke findings and draw on our error analysis to suggest improvements for effective annotation when building clinical NLP models for a new domain. Our analysis finds that our rule-based system outperforms the neural models on both datasets and seems to generalise to the out-of-sample dataset. The neural models, on the other hand, do not generalise negation detection to the out-of-sample dataset, despite metrics on the in-sample dataset suggesting otherwise.
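As a concrete illustration of the multi-task setup described above, a shared encoder can feed separate entity-recognition and negation tagging heads, with both losses backpropagating into the same encoder. The PyTorch sketch below is a minimal, hypothetical example; the layer sizes, tag counts, and training details are illustrative assumptions, not the published EdIE-BiLSTM configuration.

```python
# Hypothetical sketch of a multi-task sequence tagger in the spirit of
# a shared-encoder EdIE-BiLSTM-style model: one BiLSTM encoder, two
# tagging heads (entities and negation). Sizes are illustrative only.
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    def __init__(self, vocab_size, n_entity_tags, n_negation_tags,
                 emb_dim=100, hidden_dim=200):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Shared bidirectional LSTM encoder over the report tokens.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, bidirectional=True,
                               batch_first=True)
        # One tagging head per task, both reading the shared states.
        self.entity_head = nn.Linear(2 * hidden_dim, n_entity_tags)
        self.negation_head = nn.Linear(2 * hidden_dim, n_negation_tags)

    def forward(self, token_ids):
        states, _ = self.encoder(self.embedding(token_ids))
        return self.entity_head(states), self.negation_head(states)

# Training sums the two tagging losses on the same batch, so the encoder
# is shaped jointly by entity recognition and negation detection.
model = MultiTaskTagger(vocab_size=5000, n_entity_tags=9, n_negation_tags=3)
loss_fn = nn.CrossEntropyLoss()
tokens = torch.randint(1, 5000, (2, 30))  # toy batch of 2 reports
ent_gold = torch.randint(0, 9, (2, 30))
neg_gold = torch.randint(0, 3, (2, 30))
ent_logits, neg_logits = model(tokens)
loss = (loss_fn(ent_logits.view(-1, 9), ent_gold.view(-1))
        + loss_fn(neg_logits.view(-1, 3), neg_gold.view(-1)))
loss.backward()
```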
“…The interpretable vignettes also revealed that classification of prostate cancer death was problematic when negation appeared in the text. Our bag-of-words feature representation would not be expected to handle negation, so the application of methods to detect negation in clinical text data (37, 38) would likely boost performance. Off-the-shelf classifiers achieved good performance on the CAP dataset.…”
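The negation-detection methods the snippet refers to are often rule-based, in the style of NegEx: a trigger term negates concept mentions that fall within a short window of following tokens. The sketch below is a deliberately minimal illustration of that rule family, not the published NegEx algorithm; the trigger list and window size are toy assumptions.

```python
# Minimal illustration of a NegEx-style rule: a pre-negation trigger term
# negates concept mentions within a short window of following tokens.
# Trigger list and window size are toy assumptions, not NegEx resources.
import re

NEGATION_TRIGGERS = {"no", "denies", "without", "negative"}
WINDOW = 3  # tokens after a trigger considered inside the negation scope

def negated_concepts(text, concepts):
    """Return {concept: bool} marking whether each concept mention falls
    inside the scope of a preceding negation trigger."""
    tokens = re.findall(r"\w+", text.lower())
    negated = {c: False for c in concepts}
    for i, tok in enumerate(tokens):
        if tok in NEGATION_TRIGGERS:
            scope = tokens[i + 1 : i + 1 + WINDOW]
            for c in concepts:
                if c in scope:
                    negated[c] = True
    return negated

print(negated_concepts("Patient denies chest pain, reports nausea.",
                       ["pain", "nausea"]))
# -> {'pain': True, 'nausea': False}
```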
Purpose: Accurately assigning cause of death is vital to understanding health outcomes in the population and improving health care provision. Cancer-specific cause of death is a key outcome in clinical trials, but assignment of cause of death from death certification is prone to misattribution and can therefore affect cancer-specific mortality outcome measures in trials.
Methods: We developed an interpretable machine learning classifier to predict prostate cancer death from free-text summaries of the medical history of prostate cancer patients (CAP). We developed visualisations to highlight the predictive elements of the free-text summaries, which the project analysts used to gain insight into how the predictions were made.
Results: Compared to independent human expert assignment, the classifier showed >90% accuracy in predicting prostate cancer death on a test subset of the CAP dataset. Informal feedback suggested that the visualisations would require adaptation to be useful to clinical experts assessing the appropriateness of these ML predictions in a clinical setting. Notably, the key features the classifier used to predict prostate cancer death, and which the visualisations emphasised, were considered clinically important signs of progressing prostate cancer based on prior knowledge of the dataset.
Conclusion: The results suggest that our interpretability approach improves analyst confidence in the tool, and reveal how the approach could be developed to produce a decision-support tool that would be useful to health care reviewers. As such, we have published the code on GitHub to allow others to apply our methodology to their data (https://zenodo.org/badge/latestdoi/294910364).
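As a sketch of the kind of pipeline the abstract describes (an interpretable bag-of-words classifier whose per-token contributions drive the highlighting in the visualisations), the hypothetical example below pairs TF-IDF features with logistic regression; the toy texts, labels, and scoring scheme are illustrative assumptions, not the CAP data or the published code.

```python
# Sketch of an interpretable bag-of-words classifier: logistic regression
# over TF-IDF features, with per-token contributions (coefficient x
# feature value) used to highlight the words driving each prediction.
# Texts and labels are toy placeholders, not CAP data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "bone metastases, rising psa, started palliative care",
    "psa stable, no evidence of recurrence, routine follow up",
]
labels = [1, 0]  # 1 = prostate cancer death, 0 = other cause

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

def token_contributions(text):
    """Score each token's contribution to the prediction, most
    influential first; positive values push toward class 1."""
    row = vectorizer.transform([text])
    vocab = vectorizer.get_feature_names_out()
    return sorted(
        ((vocab[j], row[0, j] * clf.coef_[0, j])
         for j in row.nonzero()[1]),
        key=lambda t: -abs(t[1]),
    )

print(token_contributions("rising psa and new bone metastases"))
```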
“…Our method is therefore capable of learning from the large historic EMR, even if these datasets were not annotated for this purpose. This is important because FP classifiers trained on one dataset do not perform as well as those trained on in-domain data [5, 21], with a similar finding on a veterinary disease classification task [22]. Cheng et al. [5] showed that a classifier trained to detect negation cues and scope on out-of-domain data in the form of human clinical notes performed similarly to the rule-based NegEx [10] algorithm.…”
Clinicians often include references in clinical notes to diseases that have not been diagnosed in their patients. For some disease terms, the majority of disease references written in the patient notes may not refer to a true disease diagnosis. These references occur because clinicians often use their clinical notes to speculate about disease existence (differential diagnosis) or to state that the disease has been ruled out. To train classifiers for disambiguating disease references, previous researchers built training sets by manually annotating sentences. We show how to create very large training sets without the need for manual annotation, and we obtain state-of-the-art classification performance with a bidirectional long short-term memory model trained to distinguish between disease references written for patients with and without the disease diagnosis in veterinary clinical notes.
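The core idea above, building a large training set without manual annotation, can be sketched as weak labelling: a sentence that mentions a disease term is labelled by whether the patient carries a coded diagnosis for that disease. The example below is a hypothetical illustration of that construction; the record format, field names, and the single regex are assumptions, not the paper's pipeline.

```python
# Hedged sketch of weak labelling from coded diagnoses: sentences that
# mention a disease term are labelled by whether the patient actually has
# a coded diagnosis for it, so no manual sentence annotation is needed.
# Field names and example records are hypothetical.
import re

DISEASE_TERM = re.compile(r"\bdiabetes\b", re.IGNORECASE)

records = [
    {"note": "Ruled out diabetes. Owner declined bloodwork.",
     "coded_diagnoses": {"otitis"}},
    {"note": "Diabetes confirmed; starting insulin today.",
     "coded_diagnoses": {"diabetes"}},
]

def weak_labels(records, disease="diabetes"):
    """Yield (sentence, label) pairs: label 1 iff the patient is coded
    with the disease, a noisy proxy for an affirmed mention."""
    for rec in records:
        for sentence in rec["note"].split(". "):
            if DISEASE_TERM.search(sentence):
                yield sentence, int(disease in rec["coded_diagnoses"])

for sentence, label in weak_labels(records):
    print(label, sentence)
# -> 0 Ruled out diabetes
# -> 1 Diabetes confirmed; starting insulin today.
```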