Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.57
Enhancing Clinical BERT Embedding using a Biomedical Knowledge Base

Abstract: Domain knowledge is important for building Natural Language Processing (NLP) systems for low-resource settings, such as in the clinical domain. In this paper, a novel joint training method is introduced for adding knowledge base information from the Unified Medical Language System (UMLS) into language model pre-training for some clinical domain corpus. We show that in three different downstream clinical NLP tasks, our pre-trained language model outperforms the corresponding model with no knowledge base informa…
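The paper's exact objective is not reproduced on this page, but the joint training idea described in the abstract (adding UMLS knowledge base information to language model pre-training) can be illustrated with a minimal sketch: a masked language modeling loss combined with an auxiliary UMLS concept-prediction loss. All names below (JointKBPretrainingHead, kb_loss_weight, num_umls_concepts) and the simple weighted-sum formulation are assumptions for illustration, not the authors' implementation.

# Hypothetical sketch of joint pre-training: MLM loss plus a UMLS concept-prediction loss.
# The naming and weighting scheme are assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn

class JointKBPretrainingHead(nn.Module):
    def __init__(self, hidden_size, vocab_size, num_umls_concepts, kb_loss_weight=1.0):
        super().__init__()
        self.mlm_head = nn.Linear(hidden_size, vocab_size)        # predicts masked tokens
        self.kb_head = nn.Linear(hidden_size, num_umls_concepts)  # predicts UMLS concept labels
        self.kb_loss_weight = kb_loss_weight
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=-100)     # -100 marks unlabeled positions

    def forward(self, hidden_states, mlm_labels, concept_labels):
        # hidden_states: (batch, seq_len, hidden_size) from a BERT-style encoder
        mlm_logits = self.mlm_head(hidden_states)
        kb_logits = self.kb_head(hidden_states)
        mlm_loss = self.loss_fn(mlm_logits.view(-1, mlm_logits.size(-1)), mlm_labels.view(-1))
        kb_loss = self.loss_fn(kb_logits.view(-1, kb_logits.size(-1)), concept_labels.view(-1))
        # Joint objective: both losses are optimized together during pre-training.
        return mlm_loss + self.kb_loss_weight * kb_loss

In practice the encoder would be a clinical BERT variant and the concept labels would come from UMLS entity annotations over the pre-training corpus; the weighted sum here simply stands in for whatever joint objective the paper actually uses.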

Cited by 38 publications (24 citation statements)
References 14 publications
“…Which LM model? Several published works have found ClinicalBERT to outperform the other considered biomedical LMs on biomedical NLP tasks (Alsentzer et al, 2019; Kearns et al, 2019; Hao et al, 2020). In our results, however, SciBERT achieves the most consistent performance, clearly outperforming ClinicalBERT on the Procedures → Disease and Test → Disease categories, while performing similar to ClinicalBERT on the remaining categories.…”
Section: Discussion (supporting)
confidence: 46%
“…They obtained the best results with a BERT model that was pre-trained on PubMed abstracts and MIMIC-III clinical notes. Another large-scale evaluation of biomedical LMs has been carried out by Lewis et al (2020) (Michalopoulos et al, 2020; Hao et al, 2020).…”
Section: Related Work and Background (mentioning)
confidence: 99%
“…OWL2Vec*, in collaboration with ZB MED - Information Centre for Life Sciences, also aims at being applied to identify clusters in an ontology and assign these clusters as topics (i.e., a set of ontology classes) to a corpus of documents to enhance the results of an information retrieval task (Ritchie et al, 2021). In addition, OWL2Vec*, as an ontology tailored word embedding model, could replace the original word embedding models to increase performance in some domain specific tasks such as biomedical text analysis (Hao et al, 2020). This is also a promising direction worth studying.…”
Section: Discussion and Outlook (mentioning)
confidence: 99%
“…Jointly optimizing the two objectives can implicitly integrate knowledge from external knowledge graphs into language models. Here we adopt the pre-trained Clinical KB-BERT (Hao et al, 2020) in our analysis. ClinicalBERT-EE-KB-MLM: In this method, we pre-train BERT with UMLS information with only the masked language model (MLM) objective.…”
Section: ClinicalBERT-EE-KGE (mentioning)
confidence: 99%