A shared task involving multi-label classification of clinical free text

Pestian, John; Brew, Chris; Matykiewicz, Paweł; Hovermale, DJ; Johnson, Neil; Cohen, K. Bretonnel; Duch, Włodzisław

doi:10.3115/1572392.1572411

Cited by 247 publications

(161 citation statements)

References 9 publications

(6 reference statements)


“…This is in part because the use of patient records is subject to strict regulation. Thus, the corpus used for most auto-coding research up to date consists of about two thousand documents annotated with 45 ICD-9 codes (Pestian et al, 2007). It was used in a shared task at the 2007 BioNLP workshop and gave rise to papers studying a variety of rule-based and statistical methods, which are too numerous to list here.…”

Section: Related Work (mentioning)

confidence: 99%

A System for Predicting ICD-10-PCS Codes from Electronic Health Records

Subotin, Davis

2014

Proceedings of BioNLP 2014


Medical coding is a process of classifying health records according to standard code sets representing procedures and diagnoses. It is an integral part of health care in the U.S., and the high costs it incurs have prompted adoption of natural language processing techniques for automatic generation of these codes from the clinical narrative contained in electronic health records. The need for effective auto-coding methods becomes even greater with the impending adoption of ICD-10, a code inventory of greater complexity than the currently used code sets. This paper presents a system that predicts ICD-10 procedure codes from the clinical narrative using several levels of abstraction. First, partial hierarchical classification is used to identify potentially relevant concepts and codes. Then, for each of these concepts we estimate the confidence that it appears in a procedure code for that document. Finally, confidence values for the candidate codes are estimated using features derived from concept confidence scores. The concept models can be trained on data with ICD-9 codes to supplement sparse ICD-10 training resources. Evaluation on held-out data shows promising results.


“…These corpora are publicly available and are explained below. ICD9 dataset is an open challenge dataset published by the Computational Medicine Center in 2007 (Pestian et al, 2007). The dataset consists of clinical free text which is a set of 978 anonymized radiology reports and their corresponding ICD-9-CM codes.…”

Section: Datasets (mentioning)

confidence: 99%

“…In 2007 Pestian et al (2007) organised a shared task which introduced a dataset of radiology reports to be autocoded with ICD9 codes. This multi-label classification task attracted a large body of research over the years, e.g., (Farkas and Szarvas, 2008; Suominen et al, 2008), which tackled the problem with methods such as rule-based, decision trees, entropy and SVM classifiers.…”

Section: Related Work (mentioning)

confidence: 99%

Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods

Karimi, Dai, Hassanzadeh et al. 2017

BioNLP 2017


Diagnosis autocoding is intended to both improve the productivity of clinical coders and the accuracy of the coding. We investigate the applicability of deep learning at autocoding of radiology reports using International Classification of Diseases (ICD). Deep learning methods are known to require large training data. Our goal is to explore how to use these methods when the training data is sparse, skewed and relatively small, and how their effectiveness compares to conventional methods. We identify optimal parameters for setting up a convolutional neural network for autocoding with comparable results to that of conventional methods.


“…Data contained around 350,000 abstracts from the MEDLINE database over five years, manually created topics, and a topic set based on the standardised MeSH. 10 The Genomics Track [23] [28] and 2011 [29] addressed automated diagnosis coding of radiology reports and classifying the emotions found in suicide notes. In 2007, 1,954 de-identified radiology reports in English from a US radiology department for children were used.…”

Section: Introduction (mentioning)

confidence: 99%

Overview of the ShARe/CLEF eHealth Evaluation Lab 2013

Suominen, Salanterä, Velupillai et al. 2013

Lecture Notes in Computer Science


Abstract. Discharge summaries and other free-text reports in healthcare transfer information between working shifts and geographic locations. Patients are likely to have difficulties in understanding their content, because of their medical jargon, non-standard abbreviations, and ward-specific idioms. This paper reports on an evaluation lab with an aim to support the continuum of care by developing methods and resources that make clinical reports in English easier to understand for patients, and which helps them in finding information related to their condition. This ShARe/CLEFeHealth2013 lab offered student mentoring and shared tasks: identification and normalisation of disorders (1a and 1b) and normalisation of abbreviations and acronyms (2) in clinical reports with respect to terminology standards in healthcare as well as information retrieval (3) to address questions patients may have when reading clinical reports. The focus on patients' information needs as opposed to the specialised information needs of physicians and other healthcare workers was the main feature of the lab distinguishing it from previous shared tasks. De-identified clinical reports for the three tasks were from US intensive care and originated from the MIMIC II database. Other text documents for Task 3 were from the Internet and originated from the Khresmoi project. Task 1 annotations originated from the ShARe annotations. For Tasks 2 and 3, new annotations, queries, and relevance assessments were created. 64, 56, and 55 people registered their interest in Tasks 1, 2, and 3, respectively. 34 unique teams (3 members per team on average) participated with 22, 17, 5, and 9 teams in Tasks 1a, 1b, 2 and 3, respectively. The teams were from Australia, China, France, India, Ireland, Republic of Korea, Spain, UK, and USA. Some teams developed and used additional annotations, but this strategy contributed to the system performance only in Task 2. The best systems had the F1 score of 0.75 in Task 1a; Accuracies of 0.59 and 0.72 in Tasks 1b and 2; and Precision at 10 of 0.52 in Task 3. The results demonstrate the substantial community interest and capabilities of these systems in making clinical reports easier to understand for patients. The organisers have made data and tools available for future research and development.
