Medical named entity recognition (NER) and normalization (NEN) are fundamental for constructing knowledge graphs and building question-answering systems. Existing pipelines for medical NER and NEN suffer from error propagation between the two tasks: mentions mispredicted by NER directly degrade the results of NEN, making the NER module the bottleneck of the whole system. Moreover, features learned jointly across both tasks can benefit model performance. To avoid these disadvantages of existing models and exploit representations generalized across the two tasks, we design an end-to-end progressive multi-task learning model that jointly models medical NER and NEN in an effective way. The framework contains three levels of tasks of progressively increasing difficulty. This incremental task setting reduces error propagation: lower-level tasks receive supervised signals, rather than propagated errors, from the higher-level tasks, which improves their performance. In addition, context features are exploited to enrich the semantic information of the entity mentions extracted by NER, and the performance of NEN profits from the enhanced mention features. Standard entities from knowledge bases are also introduced into the NER module to help extract the corresponding entity mentions correctly. Empirical results on two publicly available medical literature datasets demonstrate the superiority of our method over nine typical baselines.
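The joint training of progressively harder tasks can be sketched as a weighted multi-task objective. This is a minimal illustration, not the paper's actual formulation; the task names and uniform weights are assumptions:

```python
def progressive_joint_loss(losses, weights=(1.0, 1.0, 1.0)):
    """Combine losses from three progressively harder tasks, e.g.:
    losses[0]: mention-boundary detection (lowest level)
    losses[1]: entity-mention recognition (NER)
    losses[2]: entity normalization (NEN, highest level)

    Because all three losses are optimized jointly, lower-level tasks
    receive supervised gradient signals from the higher-level objectives
    instead of only consuming their propagated errors, as in a pipeline.
    """
    assert len(losses) == len(weights) == 3
    return sum(w * l for w, l in zip(weights, losses))

# Toy usage with scalar stand-ins for per-task losses:
total = progressive_joint_loss([0.5, 0.3, 0.2])  # -> 1.0
```

In a real model the three losses would be computed by shared-encoder task heads and backpropagated together, so the shared representation serves all three levels.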
Automated medical named entity recognition and normalization are fundamental for constructing knowledge graphs and building question-answering systems. Annotating medical text demands domain expertise and professionalism, which makes it costly. Existing methods use active learning to reduce the cost of corpus annotation, as well as multi-task learning to model the correlations between different tasks. However, existing models account for neither task-specific features nor the diversity of query samples. To address these limitations, this paper proposes a multi-task adversarial active learning model for medical named entity recognition and normalization. In our model, adversarial learning preserves the effectiveness of both the multi-task learning module and the active learning module: a task discriminator eliminates the influence of irregular task-specific features, and a diversity discriminator exploits the heterogeneity between samples to satisfy the diversity constraint. Empirical results on two medical benchmarks demonstrate the effectiveness of our model against existing methods.
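The diversity constraint on query samples can be illustrated with a simple greedy max-min (farthest-point) selection over sample features. This is a distance-based stand-in for intuition only; the paper's diversity discriminator is a learned adversarial component, not a distance heuristic:

```python
import math

def greedy_diverse_query(features, k):
    """Pick k sample indices so that each new pick maximizes its minimum
    distance to the already-selected set, i.e. queried samples are
    mutually heterogeneous rather than redundant near-duplicates.

    `features` is a list of equal-length numeric tuples (hypothetical
    sample embeddings); selection is seeded arbitrarily at index 0.
    """
    selected = [0]
    while len(selected) < k:
        best, best_d = None, -1.0
        for i in range(len(features)):
            if i in selected:
                continue
            # Distance to the closest already-selected sample.
            d = min(math.dist(features[i], features[j]) for j in selected)
            if d > best_d:
                best, best_d = i, d
        selected.append(best)
    return selected

# Two near-duplicate points and two distant ones: the duplicate (index 1)
# is skipped in favor of the spread-out samples.
picks = greedy_diverse_query([(0, 0), (0.1, 0), (5, 5), (10, 0)], 3)
# -> [0, 3, 2]
```

The adversarial formulation replaces this hand-crafted distance with a discriminator that scores whether a candidate batch is distinguishable from already-labeled data.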