2014 IEEE International Conference on Big Data (Big Data) 2014
DOI: 10.1109/bigdata.2014.7004390

A general supervised approach to segmentation of clinical texts

Abstract: Segmentation of clinical texts is critical for all sorts of tasks such as medical coding for billing, auto drafting of discharge summaries, patient problem list generation and many such applications. While there have been previous studies on using supervised approaches to segmentation of clinical texts, these existing approaches were trained and tested on a fairly limited data set showing low adaptability to new unseen documents. We propose a highly generalized supervised model for segmenting clinical texts, b…

Cited by 16 publications (17 citation statements)
References: 9 publications

“…Finally, seven hybrid approaches use rule-based methods during the creation of training and test data sets, and then apply ML methods. This is the case of Apostolova et al [1], Sadoughi et al [46], Ni et al [40], Chen et al [5], Dai et al [7], Jancsary et al [20], and Ganesan and Subotin [16]. Other ones use rules for detecting the explicit sections and a ML algorithm for detecting implicit sections like done in Cho et al [6].…”
Section: Results (mentioning)
confidence: 99%
“…By contrast, studies and resources related to the recognition of EHR sections are still very limited. Ganesan and Subotin [12] proposed an L1-regularized logistic regression model that is capable of recognizing the header, footer, and all of the top-level sections of a clinical note. Tepper et al [17] showed that the two-step approach, which first recognized the section headings and then categorized them, achieved a better performance than the one that combines the two tasks in one step.…”
Section: Discussion (mentioning)
confidence: 99%
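
To make the term in the statement above concrete, here is a minimal pure-Python sketch of L1-regularized logistic regression fitted by proximal gradient descent (soft-thresholding). It is not the authors' implementation: the two per-line features (`ends_with_colon`, `contains_digit`), the toy data, the hyperparameters, and the omission of a bias term are all assumptions made for illustration. The point it shows is that the L1 penalty drives the weight of an uninformative feature to exactly zero.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam=0.15, lr=0.5, steps=500):
    """Proximal gradient descent for mean logistic loss + lam * ||w||_1 (no bias term)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        probs = [sigmoid(sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        # Gradient of the mean logistic loss with respect to each weight.
        grad = [
            sum((p - t) * row[j] for p, t, row in zip(probs, y, X)) / n
            for j in range(d)
        ]
        # Gradient step followed by the L1 proximal (soft-thresholding) step.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w


# Hypothetical features per line: [ends_with_colon, contains_digit].
# Label 1 = section-heading line, 0 = body line. In this toy data only
# the first feature carries any signal.
X = [[1, 1], [1, 0], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
weights = fit_l1_logistic(X, y)
```

On this toy data the weight for `ends_with_colon` ends up positive while the weight for `contains_digit` stays at exactly zero, which is the feature-selection behaviour that motivates an L1 penalty when many candidate features are available.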
“…In view of this issue, this paper compiled a section-heading recognition corpus on top of the dataset released by the i2b2 2014 shared task [9] and presents a machine learning approach based on the conditional random fields (CRF) model [10] to handle the section-heading recognition task for EHRs. Based on the assumption that the narratives following a recognized section heading should belong to this corresponding section, this work modeled the task as a sequential token labeling problem in a given text, which differs from most of the previous works [11, 12] that formulated the problem as a sentence-by-sentence classification task. The compiled corpus along with the developed model and section-heading recognition tool is publicly available at https://www.sites.google.com/site/hongjiedai/projects/nttmuclinicalnet and http://btm.tmu.edu.tw/nttmuclinicalnet/ in an attempt to facilitate clinical research.…”
Section: Introduction (mentioning)
confidence: 99%
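
The assumption quoted above, that text following a recognized section heading belongs to that section, can be sketched in a few lines of Python. This is only an illustration: the cited work recognizes headings with a CRF over tokens, whereas the sketch below swaps in a hypothetical regular expression (`HEADING_RE`) and works line by line. The function name, the default label, and the sample note are also assumptions.

```python
import re

# Hypothetical heading pattern: a short line ending in a colon,
# e.g. "Chief Complaint:". A real system would use a trained model here.
HEADING_RE = re.compile(r"^\s*([A-Za-z][A-Za-z /&-]{1,60}):\s*$")


def label_lines(lines, default="UNKNOWN"):
    """Give every line the most recently recognized heading (or `default`)."""
    current = default
    labeled = []
    for line in lines:
        match = HEADING_RE.match(line)
        if match:
            # A new heading starts a new section; the heading line itself
            # is labeled with the section it opens.
            current = match.group(1).strip().upper()
        labeled.append((current, line))
    return labeled


note = [
    "Patient seen in clinic.",
    "Chief Complaint:",
    "Chest pain for two days.",
    "Medications:",
    "Aspirin 81 mg daily.",
    "Metoprolol 25 mg twice daily.",
]
labels = [section for section, _ in label_lines(note)]
```

Here the first line precedes any heading and keeps the default label, while each later line inherits the heading above it. Once headings are found, section membership follows from position alone, which is why heading recognition can be treated as the core task.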
“…Research may concentrate on section detection only (Ganesan and Subotin, 2014; Dai et al, 2015), section classification (with section boundaries assumed to be known) (Li et al, 2010; Haug et al, 2014) or both (Apostolova et al, 2009; Denny et al, 2009; Tepper et al, 2012). In this paper we focus on section-level classification and section classification at the sentence level.…”
Section: Related Work (mentioning)
confidence: 99%
“…In this paper we focus on section-level classification and section classification at the sentence level. Prior approaches to section prediction include Support Vector Machines leveraging features computed by bi-gram tf-idf vector representations (Apostolova et al, 2009), Hidden Markov Models (HMM) with sections regarded as part of a sequence (Li et al, 2010), Maximum Entropy Classifiers (Tepper et al, 2012), L1-Regularized Logistic Regression (Ganesan and Subotin, 2014), Bayesian models using N-gram features (Haug et al, 2014), and linear-chain Conditional Random Fields (CRF) to determine section headers (Dai et al, 2015). Most of these approaches rely heavily on hand-crafted features that are time consuming to develop and may not easily generalize across EHRs from different sources.…”
Section: Related Work (mentioning)
confidence: 99%