Extracting clinical information from free-text of pathology and operation notes via Chinese natural language processing

Zeng, Qiang; Zhang, Xiaoyan; Li, Zuofeng; Liu, Lei; Wei-de, Zhang

doi:10.1109/bibmw.2010.5703867

Cited by 6 publications

(3 citation statements)

References 13 publications

(8 reference statements)

“…However, different patterns of thinking and habits of Chinese expression often cause large differences in the flexibility of word order and parsing for Chinese health questions [21]. Several studies on Chinese NLP focused on clinical named entity recognition [22], disease- or drug-related clinical information extraction [23, 24], and speculation detection [25] from the free text of pathology and operation notes. The main challenges in these tasks were word segmentation and feature representation and selection.…”

Section: Introduction (mentioning)

confidence: 99%

Classifying Chinese Questions Related to Health Care Posted by Consumers Via the Internet

Guo, Xu, Hou et al. 2017

J Med Internet Res


Background: In question answering (QA) system development, question classification is crucial for identifying information needs and improving the accuracy of returned answers. Although the questions are domain-specific, they are asked by non-professionals, making the question classification task more challenging.

Objective: This study aimed to classify health care–related questions posted by the general public (Chinese speakers) on the Internet.

Methods: A topic-based classification schema for health-related questions was built by manually annotating randomly selected questions. The Kappa statistic was used to measure the interrater reliability of multiple annotation results. Using the above corpus, we developed a machine-learning method to automatically classify these questions into one of the following six classes: Condition Management, Healthy Lifestyle, Diagnosis, Health Provider Choice, Treatment, and Epidemiology.

Results: The consumer health question schema was developed with a four-hierarchical-level of specificity, comprising 48 quaternary categories and 35 annotation rules. The 2000 sample questions were coded with 2000 major codes and 607 minor codes. Using natural language processing techniques, we expressed the Chinese questions as a set of lexical, grammatical, and semantic features. Furthermore, the effective features were selected to improve the question classification performance. From the 6-category classification results, we achieved an average precision of 91.41%, recall of 89.62%, and F1 score of 90.24%.

Conclusions: In this study, we developed an automatic method to classify questions related to Chinese health care posted by the general public. It enables Artificial Intelligence (AI) agents to understand Internet users’ information needs on health care.
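The abstract above mentions a Kappa statistic for interrater reliability but does not say which variant was used. As a minimal sketch, assuming Cohen's kappa for exactly two annotators (the function name `cohen_kappa` and the sample labels are hypothetical, not taken from the paper), the computation could look like this:

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators and nominal categories; the cited study
    may have used a different kappa variant.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations using two of the six classes named above.
annotator_1 = ["Diagnosis", "Treatment", "Diagnosis", "Treatment"]
annotator_2 = ["Diagnosis", "Treatment", "Treatment", "Treatment"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual choice instead.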


“…In the clinical domain, various natural language processing (NLP) systems for Chinese clinical text have been created, such as named entity recognition [24], clinical information extraction [26, 27], and speculation detection [28]. The main challenges in these tasks include word segmentation and feature representation and selection.…”

Section: Introduction (mentioning)

confidence: 99%

Mining and standardizing Chinese consumer health terms

Hou, Kang, Yan et al. 2018

BMC Med Inform Decis Mak


Background: Health professionals and consumers use different terms to express medical events or concerns, which creates communication barriers between professionals and consumers. This may lead to bias in diagnosis or treatment due to misunderstanding or incomplete understanding. To solve the issue, a consumer health vocabulary was developed to map consumer-used health terms to professional-used medical terms.

Methods: In this study, we extracted Chinese consumer health terms from both an online health forum and patient education monographs, and manually mapped them to medical terms used by professionals (terms in medical thesauri or in medical books). To ensure the quality of this annotation, we developed annotation guidelines.

Results: We applied our method to extract consumer-used disease terms in endocrinology, cardiology, gastroenterology and dermatology. In this study, we identified 1349 medical mentions from 8436 questions posted in an online health forum and 1428 articles for patient education monographs. After manual annotation and review, we released 1036 Chinese consumer health terms with mapping to 480 medical terms. Four annotators worked on the manual annotation work following the Chinese consumer health term annotation guidelines. Their average inter-annotator agreement (IAA) score was 93.91%, ensuring high consistency of the released terms.

Conclusions: We extracted Chinese consumer health terms from an online forum and patient education monographs, and mapped them to medical terms used by professionals. Manual annotation efforts have been made for term annotating and mapping. Our study may contribute to the Chinese consumer health vocabulary construction. In addition, our annotated corpus, both the contexts of consumer health terms and consumer-professional term mapping, would be a useful resource for automatic methodology development. The dataset of the Chinese consumer health terms (CHT) is publicly available at http://www.phoc.org.cn/cht/.


“…Many text mining applications require processing casual text data, which are often semistructured or unstructured text, such as clinical document analysis [3,5], emails, instant messages, free-text of medical records, operational notes, etc., and the application of this research is in automotive diagnostic text mining.…”

Section: Introduction (mentioning)

confidence: 99%

Vehicle Fault Diagnostics Using Text Mining, Vehicle Engineering Structure and Machine Learning

Murphey, Huang, Wang et al. 2015

IJIIS


This paper presents an intelligent vehicle fault diagnostics system, SeaProSel (Search-Prompt-Select). SeaProSel takes a casual description of vehicle problems as input and searches for a diagnostic code that accurately matches the problem description. SeaProSel was developed using automatic text classification and machine learning techniques combined with a prompt-and-select technique based on the vehicle diagnostic engineering structure to provide robust classification of the diagnostic code that accurately matches the problem description. Machine learning algorithms are developed to automatically learn words and terms, and their variations, commonly used in verbal descriptions of vehicle problems, and to build a TCW (Term-Code-Weight) matrix that is used for measuring similarity between a document vector and a diagnostic code class vector. When no exactly matched diagnostic code is found based on the direct search using the TCW matrix, the SeaProSel system searches the vehicle fault diagnostic structure for the proper questions to pose to the user in order to obtain more details about the problem. An LSI (Latent Semantic Indexing) model is also presented and analyzed in the paper. The performances of the LSI model and TCW models are presented and discussed. An in-depth study of different term weight functions and their performances is presented. All experiments are conducted on real-world vehicle diagnostic data, and the results show that the proposed SeaProSel system generates accurate results efficiently for vehicle fault diagnostics.
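The abstract describes matching a document vector against diagnostic code class vectors through a TCW matrix, without stating the similarity measure or the weight function. The following is only an illustrative sketch under stated assumptions: cosine similarity, raw term counts for the document vector, and a made-up vocabulary, code list, and weight matrix. None of the names (`rank_codes`, `tcw`, `C1`, `C2`) come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_codes(doc_terms, vocabulary, tcw, codes):
    """Rank diagnostic codes by similarity to a tokenized problem description.

    tcw[i][j] is the (hypothetical) weight of vocabulary[i] for codes[j].
    Assumption: cosine similarity; the paper may use another measure.
    """
    # Document vector: raw count of each vocabulary term in the description.
    doc_vec = [doc_terms.count(term) for term in vocabulary]
    scores = []
    for j, code in enumerate(codes):
        # Column j of the matrix is the class vector for codes[j].
        class_vec = [row[j] for row in tcw]
        scores.append((code, cosine(doc_vec, class_vec)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, for illustration only.
vocabulary = ["brake", "noise", "engine", "stall"]
codes = ["C1", "C2"]
tcw = [
    [2.0, 0.0],  # brake
    [1.0, 0.0],  # noise
    [0.0, 2.0],  # engine
    [0.0, 1.0],  # stall
]
ranked = rank_codes(["brake", "noise", "brake"], vocabulary, tcw, codes)
```

In this toy setup the description shares terms only with the first code, so that code ranks first; a score of zero for every code would correspond to the "no match" case in which the abstract says the system falls back to prompting the user.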
