Fang Han scite author profile

Fang Han

2 Publications

2 Citation Statements Received

29 Citation Statements Given

Affiliations

Xinjiang Institute of Engineering

Publications

Multifeature Fusion Keyword Extraction Algorithm Based on TextRank

2022

Keyword extraction is a prerequisite for many tasks, and its results directly affect search, recommendation, classification, and other downstream applications. In this study, we take Chinese text as the research object and propose BSKT, a multi-feature fusion keyword extraction algorithm that combines BERT semantics with the K-Truss graph. BSKT builds on the TextRank algorithm and combines BERT semantic features, K-Truss features, and other features. First, BSKT obtains word vectors from the BERT pretrained model and uses them to calculate semantic differences, which are used to optimize the iterative process on the TextRank word graph. Then, BSKT decomposes the TextRank word graph to obtain its K-Truss graph and the truss-level feature of each word. Finally, BSKT scores the words by combining their IDF and truss-level features and extracts the keywords. Experimental results show that BSKT outperforms SCTR, the latest keyword extraction algorithm, when extracting 1 to 10 keywords. Furthermore, F1 increased by 11.2% when BSKT was used to extract three keywords from the Sensor dataset.
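
The abstract builds on TextRank and a final scoring step that involves word IDF. As a rough illustration only, the sketch below runs plain TextRank on a co-occurrence word graph and multiplies each rank by a smoothed IDF. It is not the BSKT algorithm: the BERT-based semantic weighting and the K-Truss decomposition are omitted, and the function names, the window size, the damping factor, and the multiplicative fusion are all assumptions made for this example.

```python
import math
from collections import defaultdict


def build_word_graph(tokens, window=3):
    """Undirected co-occurrence graph: link words that appear within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Unweighted TextRank: each word shares its score equally among its neighbours."""
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        updated = {}
        for word in graph:
            incoming = sum(scores[n] / len(graph[n]) for n in graph[word])
            updated[word] = (1 - damping) + damping * incoming
        scores = updated
    return scores


def idf(word, documents):
    """Smoothed inverse document frequency over a list of token sets."""
    doc_freq = sum(1 for doc in documents if word in doc)
    return math.log((1 + len(documents)) / (1 + doc_freq)) + 1.0


def extract_keywords(tokens, documents, top_k=3):
    """Rank words by TextRank score times IDF (an assumed fusion rule, not the paper's)."""
    ranks = textrank(build_word_graph(tokens))
    combined = {word: ranks[word] * idf(word, documents) for word in ranks}
    return sorted(combined, key=combined.get, reverse=True)[:top_k]
```

For Chinese text, `tokens` would come from a word segmenter and `documents` would be the token sets of a reference corpus; both are left to the caller here.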

Named Entity Recognition for Chinese Electronic Medical Records Based on Multitask and Transfer Learning

Guo

Han

2022

IEEE Access

Current work on named entity recognition for Chinese electronic medical records requires training a separate model for each type of record, and the performance of each model depends on the amount of training data available for its dataset. However, different types of electronic medical records share similar semantic information, and current models do not take full advantage of this potentially common knowledge. To overcome this problem, we propose a multi-task learning framework that transfers knowledge across multiple types of electronic medical records through a shared encoder. Experiments demonstrate that our model performs substantially better than the single-task model based on BERT. F1 scores improved by more than 1% on average across the four datasets, and precision improved by more than 3.5% on individual datasets. Further analysis shows that our model still achieves better F1 scores on long-tail and small datasets.
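
The shared-encoder idea in the abstract can be pictured structurally as one encoder reused by several task-specific output layers. The toy sketch below shows only that wiring, with a forward pass and no training. It is not the paper's model: the encoder here is a random embedding table rather than BERT, and the class names, dimensions, and record-type names are hypothetical.

```python
import random


class SharedEncoder:
    """Toy encoder: one embedding vector per token id, reused by every task."""

    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        self.embeddings = [
            [rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)
        ]

    def encode(self, token_ids):
        return [self.embeddings[t] for t in token_ids]


class TaskHead:
    """Task-specific linear layer that maps each encoded token to a label index."""

    def __init__(self, dim, num_labels, seed):
        rng = random.Random(seed)
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_labels)
        ]

    def predict(self, encoded):
        labels = []
        for vec in encoded:
            scores = [sum(w * x for w, x in zip(row, vec)) for row in self.weights]
            labels.append(max(range(len(scores)), key=scores.__getitem__))
        return labels


class MultiTaskTagger:
    """One shared encoder plus one head per record type (hard parameter sharing)."""

    def __init__(self, vocab_size, dim, labels_per_task):
        self.encoder = SharedEncoder(vocab_size, dim)
        self.heads = {
            name: TaskHead(dim, num_labels, seed=i + 1)
            for i, (name, num_labels) in enumerate(labels_per_task.items())
        }

    def tag(self, task, token_ids):
        # Every task passes through the same encoder; only the head differs.
        return self.heads[task].predict(self.encoder.encode(token_ids))
```

In a trained version of this layout, gradients from every record type would update the same encoder parameters, which is how a dataset with little training data could benefit from the others.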
