Abstract: Detecting negated concepts in clinical texts is an important part of NLP information extraction systems. However, the generalizability of negation systems is lacking, as cross-domain experiments suffer dramatic performance losses. We examine the performance of multiple unsupervised domain adaptation algorithms on clinical negation detection, finding only modest gains that fall well short of in-domain performance.
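One common baseline in the family of unsupervised domain adaptation methods the abstract evaluates is self-training: fit a classifier on labelled source-domain data, pseudo-label unlabelled target-domain data, keep only the confident predictions, and refit. The sketch below is a generic, hypothetical illustration of that idea; the classifier choice, confidence threshold, and round count are assumptions, not the specific algorithms from the paper.

```python
# Generic self-training sketch for unsupervised domain adaptation
# (a simple member of the method family the abstract evaluates;
# hypothetical, not the paper's algorithms).
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_src, y_src, X_tgt, threshold=0.9, rounds=3):
    """Fit on labelled source data, then iteratively add confidently
    pseudo-labelled target-domain examples and refit."""
    clf = LogisticRegression(max_iter=1000).fit(X_src, y_src)
    for _ in range(rounds):
        proba = clf.predict_proba(X_tgt)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break  # nothing confident enough to pseudo-label
        X_aug = np.vstack([X_src, X_tgt[confident]])
        y_aug = np.concatenate([y_src, proba[confident].argmax(axis=1)])
        clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
    return clf

# Toy usage with random features standing in for negation-context features.
rng = np.random.default_rng(0)
X_src, y_src = rng.normal(size=(40, 5)), rng.integers(0, 2, 40)
X_tgt = rng.normal(size=(30, 5)) + 0.5  # shifted "target domain"
model = self_train(X_src, y_src, X_tgt)
```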
“…For the in-domain but out-of-sample case, a domain fine-tuned rule-based system seems to transfer well (Sykes et al., 2020). For all other cases, transfer is challenging, both for rule-based and machine-learning models (Wu et al., 2014; Miller et al., 2017; Sykes et al., 2020), with machine learning models benefiting from the addition of in-domain data to the training set. Lin et al. (2020) demonstrate that a pretrained BERT model can improve the results of domain transfer for negation detection, but the results are still lower for out-of-domain datasets than in-domain datasets if we compare to the results of earlier models in Miller et al. (2017).…”
Section: Related Work (mentioning, confidence: 99%)
“…Despite the amount of progress on negation detection for clinical texts, however, there is still ample evidence that while fitting systems on a particular dataset is straightforward, generalising negation detection across datasets is challenging (Wu et al., 2014). This is true both for out-of-domain evaluation, such as training on a dataset of medical articles with evaluation on a dataset of clinical text (Wu et al., 2014; Miller et al., 2017), and for out-of-sample evaluation, where the training and test datasets are from the same domain but may have differences due to annotation style or the distribution of named entities (Sykes et al., 2020). For the in-domain but out-of-sample case, a domain fine-tuned rule-based system seems to transfer well (Sykes et al., 2020).…”
Section: Related Work (mentioning, confidence: 99%)
“…For all other cases, transfer is challenging, both for rule-based and machine-learning models (Wu et al., 2014; Miller et al., 2017; Sykes et al., 2020), with machine learning models benefiting from the addition of in-domain data to the training set. Lin et al. (2020) demonstrate that a pretrained BERT model can improve the results of domain transfer for negation detection, but the results are still lower for out-of-domain datasets than in-domain datasets if we compare to the results of earlier models in Miller et al. (2017). In our work we concur with previous findings: our neural models do not generalise negation detection across datasets, despite both datasets comprising radiology reports with stroke findings, such as acute ischemic stroke (AIS).…”
We present an in-depth comparison of three clinical information extraction (IE) systems designed to perform entity recognition and negation detection on brain imaging reports: EdIE-R, a bespoke rule-based system, and two neural network models, EdIE-BiLSTM and EdIE-BERT, multi-task learning models with a BiLSTM and a BERT encoder, respectively. We compare our models on both an in-sample and an out-of-sample dataset containing mentions of stroke findings and draw on our error analysis to suggest improvements for effective annotation when building clinical NLP models for a new domain. Our analysis finds that our rule-based system outperforms the neural models on both datasets and seems to generalise to the out-of-sample dataset. The neural models, on the other hand, do not generalise negation detection to the out-of-sample dataset, despite metrics on the in-sample dataset suggesting otherwise.
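As a concrete illustration of the multi-task setup described above, a shared encoder can feed separate entity-recognition and negation tagging heads, with both losses backpropagating into the same encoder. The PyTorch sketch below is a minimal, hypothetical example; the layer sizes, tag counts, and training details are illustrative assumptions, not the published EdIE-BiLSTM configuration.

```python
# Hypothetical sketch of a multi-task sequence tagger in the spirit of
# a shared-encoder EdIE-BiLSTM-style model: one BiLSTM encoder, two
# tagging heads (entities and negation). Sizes are illustrative only.
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    def __init__(self, vocab_size, n_entity_tags, n_negation_tags,
                 emb_dim=100, hidden_dim=200):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Shared bidirectional LSTM encoder over the report tokens.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, bidirectional=True,
                               batch_first=True)
        # One tagging head per task, both reading the shared states.
        self.entity_head = nn.Linear(2 * hidden_dim, n_entity_tags)
        self.negation_head = nn.Linear(2 * hidden_dim, n_negation_tags)

    def forward(self, token_ids):
        states, _ = self.encoder(self.embedding(token_ids))
        return self.entity_head(states), self.negation_head(states)

# Training sums the two tagging losses on the same batch, so the encoder
# is shaped jointly by entity recognition and negation detection.
model = MultiTaskTagger(vocab_size=5000, n_entity_tags=9, n_negation_tags=3)
loss_fn = nn.CrossEntropyLoss()
tokens = torch.randint(1, 5000, (2, 30))  # toy batch of 2 reports
ent_gold = torch.randint(0, 9, (2, 30))
neg_gold = torch.randint(0, 3, (2, 30))
ent_logits, neg_logits = model(tokens)
loss = (loss_fn(ent_logits.view(-1, 9), ent_gold.view(-1))
        + loss_fn(neg_logits.view(-1, 3), neg_gold.view(-1)))
loss.backward()
```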
“…The interpretable vignettes also revealed that classification of prostate cancer death was problematic when negation appeared in the text. Our bag-of-words feature representation would not be expected to handle negation, so the application of methods to detect negation in clinical text data (37, 38) would likely boost performance. Off-the-shelf classifiers achieved good performance on the CAP dataset.…”
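The negation-detection methods the snippet refers to are often rule-based, in the style of NegEx: a trigger term negates concept mentions that fall within a short window of following tokens. The sketch below is a deliberately minimal illustration of that rule family, not the published NegEx algorithm; the trigger list and window size are toy assumptions.

```python
# Minimal illustration of a NegEx-style rule: a pre-negation trigger term
# negates concept mentions within a short window of following tokens.
# Trigger list and window size are toy assumptions, not NegEx resources.
import re

NEGATION_TRIGGERS = {"no", "denies", "without", "negative"}
WINDOW = 3  # tokens after a trigger considered inside the negation scope

def negated_concepts(text, concepts):
    """Return {concept: bool} marking whether each concept mention falls
    inside the scope of a preceding negation trigger."""
    tokens = re.findall(r"\w+", text.lower())
    negated = {c: False for c in concepts}
    for i, tok in enumerate(tokens):
        if tok in NEGATION_TRIGGERS:
            scope = tokens[i + 1 : i + 1 + WINDOW]
            for c in concepts:
                if c in scope:
                    negated[c] = True
    return negated

print(negated_concepts("Patient denies chest pain, reports nausea.",
                       ["pain", "nausea"]))
# -> {'pain': True, 'nausea': False}
```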
Purpose: Accurately assigning cause of death is vital to understanding health outcomes in the population and improving health care provision. Cancer-specific cause of death is a key outcome in clinical trials, but assignment of cause of death from death certification is prone to misattribution and can therefore affect cancer-specific mortality outcome measures in trials.
Methods: We developed an interpretable machine learning classifier to predict prostate cancer death from free-text summaries of the medical history of prostate cancer patients (CAP). We developed visualisations to highlight the predictive elements of the free-text summaries, which the project analysts used to gain insight into how the predictions were made.
Results: Compared to independent human expert assignment, the classifier showed >90% accuracy in predicting prostate cancer death on a test subset of the CAP dataset. Informal feedback suggested that the visualisations would require adaptation to be useful to clinical experts assessing the appropriateness of these ML predictions in a clinical setting. Notably, the key features the classifier used to predict prostate cancer death, and which the visualisations emphasised, were considered clinically important signs of progressing prostate cancer based on prior knowledge of the dataset.
Conclusion: The results suggest that our interpretability approach improves analyst confidence in the tool, and reveal how the approach could be developed to produce a decision-support tool that would be useful to health care reviewers. As such, we have published the code on GitHub to allow others to apply our methodology to their data (https://zenodo.org/badge/latestdoi/294910364).
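As a sketch of the kind of pipeline the abstract describes (an interpretable bag-of-words classifier whose per-token contributions drive the highlighting in the visualisations), the hypothetical example below pairs TF-IDF features with logistic regression; the toy texts, labels, and scoring scheme are illustrative assumptions, not the CAP data or the published code.

```python
# Sketch of an interpretable bag-of-words classifier: logistic regression
# over TF-IDF features, with per-token contributions (coefficient x
# feature value) used to highlight the words driving each prediction.
# Texts and labels are toy placeholders, not CAP data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "bone metastases, rising psa, started palliative care",
    "psa stable, no evidence of recurrence, routine follow up",
]
labels = [1, 0]  # 1 = prostate cancer death, 0 = other cause

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

def token_contributions(text):
    """Score each token's contribution to the prediction, most
    influential first; positive values push toward class 1."""
    row = vectorizer.transform([text])
    vocab = vectorizer.get_feature_names_out()
    return sorted(
        ((vocab[j], row[0, j] * clf.coef_[0, j])
         for j in row.nonzero()[1]),
        key=lambda t: -abs(t[1]),
    )

print(token_contributions("rising psa and new bone metastases"))
```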
“…Our method is therefore capable of learning from the large historic EMR, even if these datasets were not annotated for this purpose. This is important because FP classifiers trained on one dataset do not perform as well as those trained on in-domain data [5, 21], with a similar finding on a veterinary disease classification task [22]. Cheng et al. [5] showed that a classifier trained to detect negation cues and scope on out-of-domain data in the form of human clinical notes performed similarly to the rule-based NegEx [10] algorithm.…”
Clinicians often include references in clinical notes to diseases that have not been diagnosed in their patients. For some disease terms, the majority of disease references written in the patient notes may not refer to a true disease diagnosis. These references occur because clinicians often use their clinical notes to speculate about disease existence (differential diagnosis) or to state that the disease has been ruled out. To train classifiers for disambiguating disease references, previous researchers built training sets by manually annotating sentences. We show how to create very large training sets without the need for manual annotation, and we obtain state-of-the-art classification performance with a bidirectional long short-term memory model trained to distinguish between disease references written for patients with and without the disease diagnosis in veterinary clinical notes.
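The core idea above, building a large training set without manual annotation, can be sketched as weak labelling: a sentence that mentions a disease term is labelled by whether the patient carries a coded diagnosis for that disease. The example below is a hypothetical illustration of that construction; the record format, field names, and the single regex are assumptions, not the paper's pipeline.

```python
# Hedged sketch of weak labelling from coded diagnoses: sentences that
# mention a disease term are labelled by whether the patient actually has
# a coded diagnosis for it, so no manual sentence annotation is needed.
# Field names and example records are hypothetical.
import re

DISEASE_TERM = re.compile(r"\bdiabetes\b", re.IGNORECASE)

records = [
    {"note": "Ruled out diabetes. Owner declined bloodwork.",
     "coded_diagnoses": {"otitis"}},
    {"note": "Diabetes confirmed; starting insulin today.",
     "coded_diagnoses": {"diabetes"}},
]

def weak_labels(records, disease="diabetes"):
    """Yield (sentence, label) pairs: label 1 iff the patient is coded
    with the disease, a noisy proxy for an affirmed mention."""
    for rec in records:
        for sentence in rec["note"].split(". "):
            if DISEASE_TERM.search(sentence):
                yield sentence, int(disease in rec["coded_diagnoses"])

for sentence, label in weak_labels(records):
    print(label, sentence)
# -> 0 Ruled out diabetes
# -> 1 Diabetes confirmed; starting insulin today.
```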