2019
DOI: 10.1016/j.eswa.2018.09.034

Clinical text classification research trends: Systematic literature review and open issues

Cited by 94 publications (116 citation statements)
References 79 publications
“…The data sources used in various research studies can be categorised into two types: homogeneous sources and heterogeneous sources, which can further be divided into three subtypes: binary class, multi-class single labeled, multi-class multilabeled datasets (Mujtaba et al, 2019). There are few datasets that are publicly available such as PhysioNet, i2b2 NLP dataset, and OHSUMED.…”
Section: Datasets Available
Citation type: mentioning
confidence: 99%
“…Preprocessing is done to remove meaningless information from the dataset as the clinical narratives may contain high level of noise, sparsity, misspelled words, grammatical errors (Nguyen and Patrick, 2016; Mujtaba et al, 2019). Different preprocessing techniques are applied in research studies including sentence splitting, tokenisation, spell error detection and correction, stemming and lemmatisation, normalisation (Manning et al, 2008), removal of stop words, removal of punctuation or special symbols, abbreviation expansion, chunking, named entity recognition (Bird et al, 2009), negation detection (Chapman et al, 2001).…”
Section: Preprocessing
Citation type: mentioning
confidence: 99%
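
A few of the preprocessing steps listed in the statement above (sentence splitting, tokenisation, punctuation removal, abbreviation expansion, stop-word removal) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, the abbreviation table, and the function names are hypothetical and are not taken from any of the cited papers.

```python
import re

# Hypothetical, deliberately tiny resources; real pipelines use curated lists.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "with", "was", "is", "to", "in"})
ABBREVIATIONS = {"pt": "patient", "hx": "history", "sob": "shortness of breath"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphanumeric runs, which drops punctuation."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def preprocess(text):
    """Return one list of cleaned tokens per sentence."""
    processed = []
    for sentence in split_sentences(text):
        tokens = []
        for token in tokenize(sentence):
            # Abbreviation expansion may yield several tokens.
            expanded = ABBREVIATIONS.get(token, token).split()
            # Stop words are removed after expansion; "no" is kept on purpose,
            # since negation cues matter in clinical text.
            tokens.extend(t for t in expanded if t not in STOP_WORDS)
        processed.append(tokens)
    return processed


result = preprocess("Pt has a hx of SOB. No fever was noted!")
```

Spell correction, stemming, named entity recognition, and negation detection are left out here because they need external resources or dedicated algorithms.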
“…With respect to automated text classification, in this work, we compared the approaches from the two main paradigms: (1) symbolic text classification, in which texts are represented with sparse vectors of TF-IDF weights, used as input features for traditional machine learning algorithms, such as Logistic Regression (LR) or Support Vector Machine (SVM); and (2) a more recent semantic text classification paradigm, in which dense semantic representations of words (word embeddings) are introduced as input to a neural architecture. Different deep learning architectures have been tried in a number of medical text classification tasks [25][26][27], including automated classification of radiology reports [6,28,29]. While recurrent [29,30] and attention-based neural networks [27,31] may present a viable solution, convolutional neural networks (CNN) seem to generally offer an edge in classification performance as well as faster training times [6,29].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
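
The "sparse vectors of TF-IDF weights" mentioned in the statement above can be illustrated with a small standard-library sketch. It assumes one common smoothed IDF variant, ln((1 + N) / (1 + df)) + 1, followed by L2 normalisation; the citing paper does not say which weighting it used, and the function name and toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenised documents into sparse TF-IDF vectors.

    docs: list of token lists.
    Returns (idf, vectors), where each vector is a dict {term: weight}
    holding only the terms present in that document (hence "sparse").
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed smoothed IDF variant; other formulations exist.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = {t: c * idf[t] for t, c in counts.items()}
        # L2-normalise so that document length does not dominate the weights.
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return idf, vectors


docs = [["chest", "pain"], ["chest", "xray", "normal"]]
idf, vectors = tfidf_vectors(docs)
```

In this toy corpus "chest" appears in both documents, so it receives a lower weight than "pain" in the first vector. Vectors of this kind are what a linear classifier such as Logistic Regression or an SVM would take as input features.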