Objective
Natural language processing (NLP) of symptoms from electronic health records (EHRs) could contribute to the advancement of symptom science. We aim to synthesize the literature on the use of NLP to process or analyze symptom information documented in EHR free-text narratives.
Materials and Methods
Our search of PubMed and EMBASE yielded 1964 records, which were narrowed to 27 eligible articles. Data related to the purpose, free-text corpus, patients, symptoms, NLP methodology, evaluation metrics, and quality indicators were extracted for each study.
Results
Symptom-related information was presented as a primary outcome in 14 studies. EHR narratives represented various inpatient and outpatient clinical specialties, with general, cardiology, and mental health occurring most frequently. Studies encompassed a wide variety of symptoms, including shortness of breath, pain, nausea, dizziness, disturbed sleep, constipation, and depressed mood. NLP approaches included previously developed NLP tools, classification methods, and manually curated rule-based processing. Only one-third (n = 9) of studies reported patient demographic characteristics.
Discussion
NLP is used to extract information from EHR free-text narratives written by a variety of healthcare providers on an expansive range of symptoms across diverse clinical specialties. The current focus of this field is on the development of methods to extract symptom information and the use of symptom information for disease classification tasks rather than the examination of symptoms themselves.
Conclusion
Future NLP studies should concentrate on the investigation of symptoms and symptom documentation in EHR free-text narratives. Efforts should be undertaken to examine patient characteristics and to make symptom-related NLP algorithms or pipelines, along with their vocabularies, openly available.