Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare

Névéol, Aurélie; Zweigenbaum, Pierre

doi:10.15265/iy-2015-035

Cited by 27 publications

(18 citation statements)

References 33 publications


“…For example, between 2007 and 2018, the number of PubMed records with "free text" or "unstructured text" more than tripled [2]. Advances in natural language processing and machine learning, and access to de-identified clinical datasets, have contributed to this increase [3].…”

Section: Introduction (mentioning)

confidence: 99%

Customization scenarios for de-identification of clinical notes

Hartman, Howell, Dean, et al. 2020

BMC Med Inform Decis Mak


Background: Automated machine-learning systems are able to de-identify electronic medical records, including free-text clinical notes. Use of such systems would greatly boost the amount of data available to researchers, yet their deployment has been limited due to uncertainty about their performance when applied to new datasets. Objective: We present practical options for clinical note de-identification, assessing performance of machine learning systems ranging from off-the-shelf to fully customized. Methods: We implement a state-of-the-art machine learning de-identification system, training and testing on pairs of datasets that match the deployment scenarios. We use clinical notes from two i2b2 competition corpora, the Physionet Gold Standard corpus, and parts of the MIMIC-III dataset. Results: Fully customized systems remove 97-99% of personally identifying information. Performance of off-the-shelf systems varies by dataset, with performance mostly above 90%. Providing a small labeled dataset or large unlabeled dataset allows for fine-tuning that improves performance over off-the-shelf systems. Conclusion: Health organizations should be aware of the levels of customization available when selecting a deidentification deployment solution, in order to choose the one that best matches their resources and target performance level.


“…The review by Meystre et al [5] presents an excellent overview of health-related text processing and its applications until 2007. The research presented in the present manuscript includes, for social media and clinical records, advances to fundamental NLP methods such as classification, concept extraction, and normalization published since 2008, mostly omitting what has been included in similar recent reviews [6][7][8], except when required to complete and highlight recent advances. For each data source, after reviewing advances in fundamental methods, we reviewed specific applications that capture the patient's perspective for specific conditions, treatments, or phenotypes.…”

Section: Introduction (mentioning)

confidence: 99%

“…For the applications, we built on the 2016 review by Demner-Fushman and Elhadad [7] and highlighted major achievements from 2013 to 2016 that are relevant to the patient's perspective focus of this review. The search and selection criteria used were similar to the ones used by Névéol and Zweigenbaum [8], from January 1st, 2013 through December 31st, 2016, resulting in 464 papers. A total of 62 papers focusing on clinical records were selected from this set.…”

Section: Introduction (mentioning)

confidence: 99%

Capturing the Patient’s Perspective: a Review of Advances in Natural Language Processing of Health-Related Text

Gonzalez-Hernandez, Sarker, O’Connor, et al. 2017

Yearb Med Inform


Summary. Background: Natural Language Processing (NLP) methods are increasingly being utilized to mine knowledge from unstructured health-related texts. Recent advances in noisy text processing techniques are enabling researchers and medical domain experts to go beyond the information encapsulated in published texts (e.g., clinical trials and systematic reviews) and structured questionnaires, and obtain perspectives from other unstructured sources such as Electronic Health Records (EHRs) and social media posts. Objectives: To review the recently published literature discussing the application of NLP techniques for mining health-related information from EHRs and social media posts. Methods: Literature review included the research published over the last five years based on searches of PubMed, conference proceedings, and the ACM Digital Library, as well as on relevant publications referenced in papers. We particularly focused on the techniques employed on EHRs and social media data. Results: A set of 62 studies involving EHRs and 87 studies involving social media matched our criteria and were included in this paper. We present the purposes of these studies, outline the key NLP contributions, and discuss the general trends observed in the field, the current state of research, and important outstanding problems.


“…We omit discussing basic research recently reviewed by Névéol and Zweigenbaum [3] that is however still needed and ongoing. Some examples include exciting new approaches proposed in the context of community challenges: 2012 i2b2 event and time extraction [4], 2014 i2b2/UTHealth modeling of risk factors for heart disease [5], ShARe SemEval 2014 recognition and normalization of disorders [6,7], and ShARe SemEval 2015 disorder and template filling shared tasks [8].…”

Section: Introduction (mentioning)

confidence: 99%

“…Since Névéol and Zweigenbaum provide information about methods [3], we only mention here that the methods in the included papers range from regular expressions that dominate research in social-media text processing, to event extraction in a supervised setting. More recently, there has been more and more progress in incorporating the principles of distributional semantics into an NLP pipeline, and a shift towards more semantic parsing, however more work in these areas is needed.…”

Section: Introduction (mentioning)

confidence: 99%

Aspiring to Unintended Consequences of Natural Language Processing: A Review of Recent Developments in Clinical and Consumer-Generated Text Processing

Demner-Fushman, Elhadad 2016

Yearb Med Inform


Summary. Objectives: This paper reviews work over the past two years in Natural Language Processing (NLP) applied to clinical and consumer-generated texts. Methods: We included any application or methodological publication that leverages text to facilitate healthcare and address the health-related needs of consumers and populations. Results: Many important developments in clinical text processing, both foundational and task-oriented, were addressed in community-wide evaluations and discussed in corresponding special issues that are referenced in this review. These focused issues and in-depth reviews of several other active research areas, such as pharmacovigilance and summarization, allowed us to discuss in greater depth disease modeling and predictive analytics using clinical texts, and text analysis in social media for healthcare quality assessment, trends towards online interventions based on rapid analysis of health-related posts, and consumer health question answering, among other issues. Conclusions: Our analysis shows that although clinical NLP continues to advance towards practical applications and more NLP methods are used in large-scale live health information applications, more needs to be done to make NLP use in clinical applications a routine widespread reality. Progress in clinical NLP is mirrored by developments in social media text analysis: the research is moving from capturing trends to addressing individual health-related posts, thus showing potential to become a tool for precision medicine and a valuable addition to the standard healthcare quality evaluation tools.
