Background: A large collection of dialogues between patients and doctors needs to be annotated with medical named entities to build intelligence for telemedicine. However, because patients in telemedicine consultations often express medical concepts informally, as long multi-word expressions that span much of a sentence, tagging named entities in telemedicine dialogue data is challenging. This study aims to address this issue.
Methods: On telemedicine dialogue data from Haodf, we developed annotation guidelines and followed a two-round procedure to tag six types of named entities: disease, symptom, time, pharmaceutical, operation, and examination. Moreover, we evaluated four deep-learning models on the dataset to establish a benchmark for named entity recognition.
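To illustrate the tagging task, the sketch below extracts entity spans from a character-level BIO-tagged sentence. BIO is a common annotation scheme for named entity recognition; the abstract does not specify the dataset's actual scheme, and the sentence and labels here are hypothetical illustrations, not drawn from the dataset.

```python
# Minimal sketch: recover (entity_text, entity_type) spans from
# character-level BIO tags over the six label types (disease, symptom,
# time, pharmaceutical, operation, examination). Hypothetical example,
# not taken from the actual dataset.

def extract_entities(chars, tags):
    """Collect (entity_text, entity_type) spans from BIO tags."""
    entities, current, etype = [], [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if current:  # close the previous entity before starting a new one
                entities.append(("".join(current), etype))
            current, etype = [ch], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(ch)
        else:  # "O" tag (or stray "I-") ends any open entity
            if current:
                entities.append(("".join(current), etype))
            current, etype = [], None
    if current:  # flush an entity that runs to the end of the sentence
        entities.append(("".join(current), etype))
    return entities

# Hypothetical patient utterance: "头疼三天了" ("headache for three days"),
# containing one symptom entity and one time entity.
chars = list("头疼三天了")
tags = ["B-symptom", "I-symptom", "B-time", "I-time", "O"]
print(extract_entities(chars, tags))
# → [('头疼', 'symptom'), ('三天', 'time')]
```

Because the dataset's entities average only a few characters but can stretch into long multi-word expressions, span-recovery logic like this must handle entities that end only at a sentence boundary, which the final flush covers.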
Results: The distilled dataset contains 2,383 consultations between doctors and patients, with 13,411 sentences from doctors and 17,929 from patients. Each consultation contains 1,100 characters on average. The dataset holds 63,560 named entities in total, with an average length of 4.33 characters per entity. Moreover, the experimental results suggest that LatticeLSTM performs best on our dataset across all metrics, including accuracy, precision, and F1.
Conclusion: Compared with other existing datasets, the novelty of this dataset is reflected in three facets. First, this study tackles the intricate tagging of long multi-word expressions for medical named entities. Second, it is one of the first attempts to mark temporal entities. Third, the dataset is balanced across the six label types. We believe that this dataset will play a considerable role in advancing AI for telemedicine.