Neural Natural Language Processing for Unstructured Data in Electronic Health Records: a Review

Li, Irene; Pan, Jessica; Goldwasser, Jeremy; Verma, Neha; Wong, Wai Pan; Nuzumlalı, Muhammed Yavuz; Rosand, Benjamin; Li, Yixin; Zhang, Matthew; Chang, David C.; Taylor, Andrew; Krumholz, Harlan M.; Radev, Dragomir

doi:10.48550/arxiv.2107.02975

Cited by 6 publications (6 citation statements)

References 193 publications (248 reference statements)

“…NLP models are increasingly implemented on medical health care records for information extraction, representation learning and phenotyping [33]. We have compared multiple approaches, including state of the art T5 and Google BERT transformer models, in which Google PubMedBERT showed the highest model performance.…”

Section: Discussion (mentioning)

confidence: 99%

Natural language processing and modeling of clinical disease trajectories across brain disorders

Mekkes, Groot, Wehrens, et al. 2022

Preprint

Brain disorders, including neurodegenerative diseases and mental illnesses, are often difficult to diagnose and study due to clinical and pathological heterogeneity, overlap in clinical manifestations between disorders, and frequent comorbidities, hampering drug development and fundamental research. Hence, there is a clear need for data-driven approaches to disentangle these complex disorders. Here, we established a computational pipeline to process clinical summaries from donors with a wide range of brain disorders that were neuropathologically diagnosed by the Netherlands Brain Bank (NBB). First, we identified and defined 90 cross-disorder signs and symptoms within cognitive, motor, sensory, psychiatric, and general domains. Second, we trained and optimized natural language processing (NLP) models to identify these signs and symptoms in individual sentences of the extensive clinical summaries from donors of the NBB, resulting in temporal disease trajectories. Third, we studied the temporal manifestation and survival profiles across rare and complex dementias, alpha-synucleinopathies, frontotemporal dementia subtypes, and mental illnesses. Lastly, we trained a recurrent neural network to predict the neuropathological diagnosis. Taken together, this integrated approach resulted in a unique resource that can facilitate research into cross-disorder symptomatology.

“…authors have reviewed the possibility of automating a number of tasks including disease diagnoses, disease prediction, phenotype modelling, disease classification, and developing training representations of medical concepts, such as diseases and medications. Li et al [8] focused on a broad scope of tasks, such as classification and prediction, word embedding, extraction, generation, and similar matters such as question answering, phenotyping, generating knowledge graphs, forming medical dialogue, and supporting multilingual communication and interpretability. They reviewed multiple recent studies that showed how such tasks could be supported by electronic health records and health informatics, concluding that Deep learning methods in the general field of NLP have achieved remarkable success, but that applying them to the field of biomedicine remains challenging due to limited data availability and additional difficulties associated with domain-specific text data.…”

Section: Related Work (mentioning)

confidence: 99%

Classification of specialities in textual medical reports based on natural language processing and feature selection

Almuhana, Abbas 2022

IJEECS

Nowadays, a great deal of detailed information about patients, including disease status, medication history, and side effects, is collected in an electronic format called an electronic medical record (EMR), and the data serves as a valuable resource for further analysis, diagnosis, and treatment. However, the huge quantity of detailed patient information in these medical texts makes processing this data efficiently a major challenge. Machine learning (ML) algorithms, artificial intelligence techniques, and natural language processing tools have the potential to simplify unstructured data, which could positively affect medical report analysis. Natural language processing (NLP) has recently made huge advances on a variety of tasks. In this paper, an automatic system was thus produced to classify specialist consultant interactions based on patients’ medical reports. NLP was used as a pre-processing step on a dataset formed of unstructured medical reports. Feature extraction and selection methods were used to convert the textual reports into sets of features and to extract the most effective features to increase classification accuracy and reduce execution time. Various classification methods were then applied (ML perceptron, logistic regression, random forest (RF), and linear support vector classifier (LSVC)). The highest accuracy (99.39%) was achieved by the ML perceptron classifier.

“…According to van der Lee et al [32], there are three families of data-to-text generation methods: statistical machine translation [16,20,28,31], neural machine translation [5,8,15,17,19,24,25,30,40], and rule-based linguistic summarization [2,10,27]. Neural and statistical methods generally involve training models to automatically generate natural language summaries of data, while rule-based methods depend on the use of protoforms to model their summary output.…”

Section: Related Work (mentioning)

confidence: 99%

A Framework for Generating Summaries from Temporal Personal Health Data

Harris, Chen, Zaki 2021

ACM Trans. Comput. Healthcare

Although it has become easier for individuals to track their personal health data (e.g., heart rate, step count, and nutrient intake data), there is still a wide chasm between the collection of data and the generation of meaningful summaries to help users better understand what their data means to them. With an increased comprehension of their data, users will be able to act upon the newfound information and work toward their health goals. We aim to bridge the gap between data collection and summary generation by mining the data for interesting behavioral findings that may provide hints about a user’s tendencies. Our focus is on improving the explainability of temporal personal health data via a set of informative summary templates, or “protoforms.” These protoforms span both evaluation-based summaries that help users evaluate their health goals and pattern-based summaries that explain their implicit behaviors. In addition to individual-level summaries, the protoforms we use are also designed for population-level summaries. We apply our approach to generate summaries (both univariate and multivariate) from real user health data and show that the summaries our system generates are both interesting and useful.
