As academic research advances and living standards rise, the demand for medical entity recognition is growing. Medical entity recognition has developed rapidly because it supports building medical knowledge graphs and predicting disease, which in turn helps improve diagnostic accuracy, enable disease prevention, simplify clinical decision-making, and reduce healthcare costs. In Chinese named entity recognition, the input is typically a character-level vector representation, yet in Chinese text words are the smallest units that express meaning. The BERT model trains well, and character-level input avoids the noise that word-level vectors can introduce, but it leaves unused the richer information that Chinese words carry. We therefore propose a pre-training model based on Tag Embedding and Simple Lexicon word strengthening with word information fusion, which incorporates word boundary information into the encoded text representation and so increases the amount of information it carries. The aim of this study is to build a named entity recognition model with deep neural networks that improves the accuracy of medical entity recognition by making better use of the information in medical datasets in the era of big data, and to contribute to related research.
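To make the idea of attaching word boundary information to each character concrete, here is a minimal sketch. It assumes that "Simple Lexicon word strengthening" works by matching lexicon words against the sentence and recording, for every character, which words it begins (B), sits inside (M), ends (E), or forms alone (S). The abstract does not spell out the mechanism, so the function name `bmes_word_sets`, the `max_word_len` parameter, and the toy lexicon are all hypothetical.

```python
def bmes_word_sets(sentence, lexicon, max_word_len=4):
    """Collect, for each character, the lexicon words that cover it.

    Words are grouped by the character's position inside the word:
    B (begin), M (middle), E (end), S (single-character word).
    This is an illustrative sketch, not the paper's implementation.
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Try every substring that starts here, up to max_word_len characters.
        for end in range(start + 1, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]["S"].add(word)
                continue
            sets[start]["B"].add(word)
            sets[end - 1]["E"].add(word)
            for mid in range(start + 1, end - 1):
                sets[mid]["M"].add(word)
    return sets


# Hypothetical toy lexicon and sentence ("diabetes patient").
toy_lexicon = {"糖尿病", "糖尿", "患者", "病"}
word_sets = bmes_word_sets("糖尿病患者", toy_lexicon)
for char, groups in zip("糖尿病患者", word_sets):
    print(char, groups)
```

In a full model, each of the four sets would presumably be pooled into a vector and concatenated with the character's BERT representation; that fusion step is omitted here because the abstract gives no details about it.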
With the development of Internet technology, more and more scholars are applying computer technology to medical research. This paper investigates named entity recognition. Building on the BERT-CRF model, we propose a named entity recognition model based on a multi-task pre-training model with adversarial learning and network sharing, and apply it to entity recognition in the medical field with the aim of improving its accuracy. The model introduces multi-task joint learning and adversarial learning modules to improve entity boundary detection and to address the noise in word boundary information, while also enriching the information available to the model. On the CMeEE (Chinese Medical Entity Extraction) dataset, the model showed a significant improvement in accuracy, recall, and F1 score.
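The abstract reports recall and F1 without stating how they are computed. The sketch below shows one common convention for entity recognition, exact-match span scoring, in which a predicted entity counts only if its boundaries and type both match the gold annotation, and precision takes the place the abstract gives to accuracy. It assumes flat BIO tags; the helper names `extract_spans` and `prf1` and the labels `dis` and `sym` are hypothetical, and the paper may score differently.

```python
def extract_spans(tags):
    """Return the set of (start, end_exclusive, type) spans in a BIO tag list."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues_span = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues_span:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def prf1(gold_tags, pred_tags):
    """Exact-match precision, recall, and F1 over entity spans."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the second entity is predicted with a wrong boundary.
gold = ["B-dis", "I-dis", "I-dis", "O", "B-sym", "I-sym"]
pred = ["B-dis", "I-dis", "I-dis", "O", "B-sym", "O"]
print(prf1(gold, pred))
```

Because a boundary error makes the whole entity wrong under this scheme, improvements in boundary detection translate directly into higher span-level scores.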