Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.748
A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization

Abstract: Concept normalization, the task of linking textual mentions of concepts to concepts in an ontology, is challenging because ontologies are large. In most cases, annotated datasets cover only a small sample of the concepts, yet concept normalizers are expected to predict all concepts in the ontology. In this paper, we propose an architecture consisting of a candidate generator and a list-wise ranker based on BERT. The ranker considers pairings of concept mentions and candidate concepts, allowing it to make predi…
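The two-stage architecture the abstract describes can be sketched in miniature. The following is an illustrative toy only: a character-trigram Jaccard score stands in for both the candidate generator (Lucene/multi-class BERT in the paper) and the BERT list-wise ranker, and the ontology entries are invented.

```python
# Toy generate-and-rank concept normalizer (illustrative sketch):
# stage 1 cheaply recalls a short candidate list from the full ontology,
# stage 2 re-scores only that short list. Trigram overlap stands in for
# the paper's retrieval and BERT ranking components.

def trigrams(text):
    t = f"##{text.lower()}##"
    return {t[i:i + 3] for i in range(len(t) - 2)}

def overlap(a, b):
    # Jaccard similarity over character trigrams.
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

# Invented mini-ontology: CUI -> preferred name.
ONTOLOGY = {
    "C0018681": "headache",
    "C0027497": "nausea",
    "C0015967": "fever",
}

def generate_candidates(mention, k=2):
    # Stage 1: keep the top-k concepts by cheap lexical similarity.
    scored = sorted(ONTOLOGY.items(),
                    key=lambda kv: overlap(mention, kv[1]), reverse=True)
    return [cui for cui, _ in scored[:k]]

def rank(mention, candidates):
    # Stage 2: re-score the short candidate list (a BERT ranker in the paper).
    return max(candidates, key=lambda cui: overlap(mention, ONTOLOGY[cui]))

print(rank("head ache", generate_candidates("head ache")))  # C0018681
```

The point of the split is efficiency: the expensive ranker only ever sees a handful of candidates rather than the whole ontology.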

Cited by 31 publications (9 citation statements) | References 35 publications
“…Neural architectures have been widely used in recent state-of-the-art models for MCN from scientific texts, user reviews, social media texts and clinical notes (Ji et al, 2020; Leaman and Lu, 2016; Li et al, 2017, 2019; Miftahutdinov and Tutubalina, 2019; Sung et al, 2020; Xu et al, 2020; Zhao et al, 2019; Zhu et al, 2020). Most models share the limitations of a supervised classification framework: (i) to retrieve concepts from a particular terminology for a given entity mention, the models require re-training; (ii) they use an additional classification or ranking layer, and therefore during inference must compute the similarity between a given mention and every concept name in the dictionary through this layer and sort these scores in descending order.…”
Section: Related Work (mentioning)
confidence: 99%
“…For instance, Ji et al (2020) fine-tuned BERT with a binary classifier layer. Xu et al (2020) adopted a BERT-based multi-class classifier to generate a list of candidate concepts for each mention, and a BERT-based list-wise classifier to select the most likely candidate. We note that this multi-class candidate generator requires re-training for cross-terminology mapping.…”
Section: Related Work (mentioning)
confidence: 99%
“…These fine-tuned versions of BERT-based models are often combined with various machine learning approaches to deliver good performance on biomedical normalization tasks. Ji et al [39] applied an ensemble approach based on Lucene and a pair-wise BERT classifier, and Xu et al [40] also proposed a hybrid system based on Lucene or a multi-class BERT classifier for candidate generation, with a list-wise BERT classifier for ranking. BioSyn [41] utilized entity representations from a BERT-based model and developed a synonym marginalization method with marginal maximum likelihood.…”
Section: Related Work (mentioning)
confidence: 99%
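BioSyn's synonym-marginalization idea mentioned in the statement above can be illustrated with a toy scorer: the probability of a concept is the summed probability of all of its synonym strings, and training maximizes that marginal likelihood for the gold concept. A minimal sketch, with invented similarity scores standing in for BERT embedding similarities:

```python
import math

# Toy synonym marginalization (illustrative sketch): each concept owns
# several synonym strings; we softmax-normalize mention-synonym scores
# over ALL synonyms, then sum the probability mass per concept.

# Invented (CUI, synonym) pairs.
SYNONYMS = [
    ("C0015967", "fever"),
    ("C0015967", "pyrexia"),
    ("C0027497", "nausea"),
]

def marginal_probs(scores):
    # scores[i] is the (made-up) similarity of the mention to SYNONYMS[i].
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    probs = {}
    for (cui, _), e in zip(SYNONYMS, exps):
        probs[cui] = probs.get(cui, 0.0) + e / z  # marginalize over synonyms
    return probs

# A mention may be close to only one synonym of a concept ("pyrexia"
# here); the marginal still concentrates mass on the right concept.
probs = marginal_probs([1.0, 2.5, 0.2])
print(max(probs, key=probs.get))  # C0015967
```

The design point is that no single synonym has to match well; evidence from all of a concept's synonyms is pooled before the prediction is made.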
“…[13] propose a multi-view convolutional neural network (CNN) and a multi-task framework to normalize both procedure and disease mentions. When the terminology knowledge base is large, [5, 15-18, 21] propose a recall model to generate candidate terminologies, followed by a ranking model to sort them. [16, 17] first generate candidates with BM25, then rank the terminologies with a CNN and BERT, respectively.…”
Section: Medical Terminology Normalization (mentioning)
confidence: 99%
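The BM25-then-rank recipe in the statement above can be sketched with a bare-bones BM25 scorer (the standard Okapi formula; the toy terminology and parameter values here are illustrative, not taken from the cited systems):

```python
import math

# Minimal BM25 candidate recall over a toy terminology (illustrative).
# In the cited systems this stage produces a short list that a CNN or
# BERT ranker then re-orders.

DOCS = {
    "C0018681": "headache cranial pain",
    "C0027497": "nausea sick stomach",
    "C0015967": "fever high temperature",
}
K1, B = 1.5, 0.75  # common default-ish BM25 parameters

tokenized = {cui: name.split() for cui, name in DOCS.items()}
avgdl = sum(len(t) for t in tokenized.values()) / len(tokenized)
N = len(tokenized)
df = {}
for toks in tokenized.values():
    for term in set(toks):
        df[term] = df.get(term, 0) + 1

def idf(term):
    n = df.get(term, 0)
    return math.log((N - n + 0.5) / (n + 0.5) + 1)  # smoothed idf

def bm25(query, toks):
    score = 0.0
    for term in query.split():
        tf = toks.count(term)
        if tf == 0:
            continue
        norm = tf * (K1 + 1) / (tf + K1 * (1 - B + B * len(toks) / avgdl))
        score += idf(term) * norm
    return score

def recall(query, k=2):
    # Return the top-k concept IDs by BM25 score.
    ranked = sorted(tokenized, key=lambda cui: bm25(query, tokenized[cui]),
                    reverse=True)
    return ranked[:k]

print(recall("high fever"))  # the fever concept is recalled first
```

Because BM25 is purely lexical, it is cheap and terminology-agnostic, which is exactly why these systems reserve the neural model for the second, ranking stage.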