Abstract: While much of the data within a patient's electronic health record (EHR) is coded, crucial information concerning the patient's care and management remains buried in unstructured clinical notes, making it difficult and time-consuming for physicians to review during their usual clinical workflow. In this paper, we present our clinical note processing pipeline, which extends beyond basic medical natural language processing (NLP) with concept recognition and relation detection to also include components specific to EHR d…
“…NLP for medical notes. The NLP community has worked extensively on medical notes to alleviate information overload, ranging from summarization (McInerney et al., 2020; Liang et al., 2019; Alsentzer and Kim, 2018) to information extraction (Wiegreffe et al., 2019; Zheng et al., 2014; Wang et al., 2018). For instance, information extraction aims to automatically extract valuable information from existing medical notes.…”
Machine learning models depend on the quality of their input data. As electronic health records are widely adopted, the amount of data in health care is growing, along with complaints about the quality of medical notes. We use two prediction tasks, readmission prediction and in-hospital mortality prediction, to characterize the value of information in medical notes. We show that, taken as a whole, medical notes provide additional predictive power over structured information only in readmission prediction. We further propose a probing framework to select the parts of notes that enable more accurate predictions than using all notes, even though the selected information leads to a distribution shift from the training data ("all notes"). Finally, we demonstrate that models trained on the selected valuable information achieve even better predictive performance, using only 6.8% of all the tokens for readmission prediction.
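The probing idea in this abstract can be sketched in miniature. The following is a hypothetical illustration, not the paper's method: the section names, toy notes, and bag-of-words classifier are all invented. A classifier is fit on each candidate note section alone, and the section whose predictions score best on held-out data is kept as the "valuable" subset.

```python
# Hypothetical sketch of section-level probing: fit a classifier on each
# candidate note section and keep the section that predicts best on
# held-out data. Section names and toy notes are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def make_note(label):
    # Only the "assessment" section carries signal in this toy corpus.
    assessment = ("worsening symptoms readmit risk" if label
                  else "stable improving discharged well")
    boilerplate = "patient seen today vitals recorded routine exam"
    return {"assessment": assessment, "boilerplate": boilerplate}

labels = [1, 0] * 20                      # alternating readmit / no-readmit
notes = [make_note(y) for y in labels]

def probe(section):
    """Train on the first 30 notes using one section; score on the rest."""
    texts = [note[section] for note in notes]
    X = CountVectorizer().fit_transform(texts)
    clf = LogisticRegression().fit(X[:30], labels[:30])
    return accuracy_score(labels[30:], clf.predict(X[30:]))

scores = {s: probe(s) for s in ("assessment", "boilerplate")}
best_section = max(scores, key=scores.get)
```

In the paper's setting the probe compares selected note segments against full notes on readmission labels; in this toy version the informative section wins the probe while the uninformative one scores at chance.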
“…Similar investigations into latent EHR data have identified benefits to extracting cardiovascular data, 1 pulmonary function tests, 16 health maintenance history, immunizations, and other clinical data that may exist unstructured within patient notes. 17 In the current generation of commercial EHRs, this information does not necessarily trigger or satisfy health maintenance reminders, and unless it is manually read and entered, what is contained in these scanned records may not be reflected in the EHR past medical history, patient problem lists, or lists of allergies. As others have noted, the literature devoted to scanned documents and images within EHRs is smaller than we expected given the importance of this commonly used means for HIE in the early decades of EHR use in our country.…”
Background Clinicians express concern that they may be unaware of important information contained in the voluminous scanned and other outside documents within electronic health records (EHRs). An example is “unrecognized EHR risk factor information,” defined as risk factors for heritable cancer that exist within a patient's EHR but are not known to current treating providers. In a related study using manual EHR chart review, we found that half of the women whose EHRs contained risk factor information met criteria for further genetic risk evaluation for heritable forms of breast and ovarian cancer, yet were not referred for genetic counseling.
Objectives The purpose of this study was to compare the use of automated methods (optical character recognition with natural language processing) versus human review in their ability to identify risk factors for heritable breast and ovarian cancer within EHR scanned documents.
Methods We evaluated the accuracy of chart review by comparing our criterion standard (physician chart review) with an automated method involving Amazon's Textract service (Amazon.com, Seattle, Washington, United States), the Clinical Language Annotation, Modeling, and Processing toolkit (CLAMP; Center for Computational Biomedicine, The University of Texas Health Science Center, Houston, Texas, United States), and a custom-written Java application.
Results We found that automated methods identified most of the cancer risk factor information that would otherwise require manual clinician review and would therefore be at risk of being missed.
Conclusion The use of automated methods for identification of heritable risk factors within EHRs may provide an accurate yet rapid review of patients' past medical histories. These methods could be further strengthened via improved analysis of handwritten notes, tables, and colloquial phrases.
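The pipeline described in this abstract pairs OCR with clinical NLP. As a minimal stand-in for the final extraction step only (the study's actual stack is Amazon Textract plus CLAMP; the phrase list below is a hypothetical simplification), a pattern matcher over OCR output might look like:

```python
# Illustrative stand-in for the risk-factor extraction step applied to
# OCR output. The real study used Textract for OCR and CLAMP for
# annotation; this regex list is a hypothetical simplification.
import re

RISK_PATTERNS = [
    r"family history of (?:breast|ovarian) cancer",
    r"BRCA[12]",
    r"ashkenazi jewish",
]

def find_risk_factors(ocr_text):
    """Return hereditary-cancer risk phrases found in OCR'd text."""
    hits = []
    for pattern in RISK_PATTERNS:
        for match in re.finditer(pattern, ocr_text, flags=re.IGNORECASE):
            hits.append(match.group(0))
    return hits
```

A full clinical annotator also handles negation, section context, and misspellings from OCR noise, which is where toolkit-based NLP earns its keep over plain keyword matching.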
“…Generating a medical summary from a clinician-patient conversation can be cast as a supervised learning task, 32 where an ML algorithm is trained with a large set of past medical conversation transcripts along with the gold standard summary associated with each conversation. 7,33 The input to the summarization model would be a clinician-patient transcript and the output would be an appropriate summary. 34,35 However, obtaining the gold standard summary of each conversation is costly because of the medical expertise required to complete the task 14 and the high variability in clinician notes' content, style, organization, and quality.…”
Clinicians spend a large amount of time on clinical documentation of patient encounters, often impacting quality of care and clinician satisfaction, and causing physician burnout. Advances in artificial intelligence (AI) and machine learning (ML) open the possibility of automating clinical documentation with digital scribes, using speech recognition to eliminate manual documentation by clinicians or medical scribes. However, developing a digital scribe is fraught with problems due to the complex nature of clinical environments and clinical conversations. This paper identifies and discusses major challenges associated with developing automated speech-based documentation in clinical settings: recording high-quality audio, converting audio to transcripts using speech recognition, inducing topic structure from conversation data, extracting medical concepts, generating clinically meaningful summaries of conversations, and obtaining clinical data for AI and ML algorithms.

npj Digital Medicine (2019) 2:114; https://doi.
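The supervised-learning cast described in the quoted passage above can be made concrete with a simple data layout. The field names and the example pair below are invented for illustration, not drawn from any real corpus: each training example pairs a conversation transcript (model input) with its gold-standard summary (training target).

```python
# Illustrative layout for supervised conversation summarization:
# input = clinician-patient transcript, target = gold-standard summary.
# The example pair below is invented for illustration only.
from dataclasses import dataclass

@dataclass
class SummarizationExample:
    transcript: str  # model input: the full conversation
    summary: str     # training target: the gold-standard clinical summary

corpus = [
    SummarizationExample(
        transcript="Doctor: How is the cough? Patient: Better since starting the inhaler.",
        summary="Cough improving on inhaler.",
    ),
]
```

The expense the passage describes is in producing the `summary` field at scale: each target requires clinical expertise to write, and different clinicians write it differently.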