Tak-Sung Heo scite author profile

Survival analyses for malignancies, including renal cell carcinoma (RCC), have primarily been conducted using the Cox proportional hazards (CPH) model. We compared the random survival forest (RSF) and DeepSurv models with the CPH model for predicting recurrence-free survival (RFS) and cancer-specific survival (CSS) in patients with non-metastatic clear cell RCC (nm-cRCC). Our cohort included 2139 nm-cRCC patients who underwent curative-intent surgery at six Korean institutions between 2000 and 2014. Data from patients at the two largest hospitals were assigned to the training and validation datasets, and data from the remaining hospitals were assigned to the external validation dataset. The performance of the RSF and DeepSurv models was compared with that of CPH using Harrell’s C-index. During follow-up, recurrence and cancer-specific death were recorded in 190 (12.7%) and 108 (7.0%) patients, respectively, in the training dataset. Harrell’s C-indices for RFS in the test dataset were 0.794, 0.789, and 0.802 for CPH, RSF, and DeepSurv, respectively. Harrell’s C-indices for CSS in the test dataset were 0.831, 0.790, and 0.834 for CPH, RSF, and DeepSurv, respectively. In predicting RFS and CSS in nm-cRCC patients, DeepSurv outperformed CPH and RSF. Deep learning-based survival prediction may soon be useful for RCC patients.
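For readers unfamiliar with the metric used in this abstract, the following is a minimal sketch of how Harrell's C-index can be computed for right-censored data. It is not the authors' code. The function name `harrell_c_index`, the argument layout, and the choice to skip pairs with tied follow-up times are assumptions made for illustration.

```python
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Fraction of comparable patient pairs whose predicted risks are ordered correctly.

    times  : observed follow-up time per patient
    events : 1 if the event (e.g. recurrence) was observed, 0 if censored
    risks  : predicted risk score (higher means an earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): pairs with tied times are skipped.
        # A pair is also unusable if the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk failed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values (for example 0.802 for DeepSurv on RFS) should be read.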


Prediction of Stroke Outcome Using Natural Language Processing-Based Machine Learning of Radiology Report of Brain MRI

Heo, Kim, Choi et al. 2020, JPM


Brain magnetic resonance imaging (MRI) is useful for predicting the outcome of patients with acute ischemic stroke (AIS). Although deep learning (DL) using brain MRI with certain image biomarkers has shown satisfactory results in predicting poor outcomes, no study has assessed the usefulness of natural language processing (NLP)-based machine learning (ML) algorithms using brain MRI free-text reports of AIS patients. Therefore, we aimed to assess whether NLP-based ML algorithms using brain MRI text reports could predict poor outcomes in AIS patients. This study included only English text reports of brain MRIs performed during the admission of AIS patients, and only the report of the first MRI scan during the admission was used. Poor outcome was defined as a modified Rankin Scale score of 3–6, and the data were captured by trained nurses and physicians. The text dataset was randomly divided into training and test datasets at a 7:3 ratio. Text was vectorized at the word, sentence, and document levels. In the word-level approach, which did not consider the sequence of words, the “bag-of-words” model was used to reflect the number of repetitions of each text token. The “sent2vec” method was used in the sentence-level approach, which considers the sequence of words, and word embedding was used in the document-level approach. In addition to conventional ML algorithms, DL algorithms such as the convolutional neural network (CNN), long short-term memory, and multilayer perceptron were used to predict poor outcomes using 5-fold cross-validation and grid search techniques. The performance of each ML classifier was compared using the area under the receiver operating characteristic curve (AUROC). Among 1840 subjects with AIS, 645 patients (35.1%) had a poor outcome 3 months after stroke onset. With the word-level approach, random forest was the best classifier (AUROC 0.782). Overall, the document-level approach performed better than the word- or sentence-level approaches. Among all the ML classifiers, the multi-CNN algorithm demonstrated the best classification performance (0.805), followed by the CNN algorithm (0.799). When predicting future clinical outcomes using NLP-based ML of radiology free-text reports of brain MRI, DL algorithms outperformed the other ML algorithms. In particular, in document-level NLP DL, multi-CNN and CNN improved the prediction of poor outcomes more than recurrent neural network-based algorithms did. NLP-based DL algorithms can serve as an important digital marker for DL prediction from unstructured electronic health record data.
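The word-level “bag-of-words” representation mentioned in this abstract can be sketched as follows. This is an illustrative sketch only: the function name `bag_of_words`, the whitespace tokenizer, and the example report text in the usage note are assumptions, not details taken from the study.

```python
from collections import Counter


def bag_of_words(docs):
    """Turn each document into a vector of token counts, ignoring word order.

    Returns the sorted vocabulary and one count vector per document.
    Tokenization here is a lowercase whitespace split (an assumption).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)  # a Counter returns 0 for unseen tokens
        vectors.append([counts[tok] for tok in vocab])
    return vocab, vectors
```

For two made-up report fragments such as "acute infarct in left MCA territory" and "no acute infarct", each vector has one entry per vocabulary word. Because only counts are kept, "a b" and "b a" map to the same vector, which is why this approach does not capture the sequence of words.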


Global and Local Information Adjustment for Semantic Similarity Evaluation

et al. 2021


Semantic similarity evaluation is used in various fields such as question answering and plagiarism detection, and many studies have addressed this problem. Previous studies that use neural networks to evaluate semantic similarity have measured similarity using the global information of sentence pairs. However, since a sentence can carry a variety of meanings rather than a single one, using only global information can limit performance. Therefore, in this study, we propose a model that uses global information and local information simultaneously to evaluate the semantic similarity of sentence pairs. The proposed model can adjust whether to focus more on global information or local information through a weight parameter. Experiments show that the proposed model achieves higher accuracy than existing models that use only global information.
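The abstract does not say how the weight parameter combines the two kinds of information. As one plausible reading, the sketch below mixes a global and a local cosine similarity with a convex weight. The names `cosine`, `combined_similarity`, and `alpha`, and the use of cosine similarity itself, are assumptions for illustration and should not be read as the paper's model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def combined_similarity(global_a, global_b, local_a, local_b, alpha):
    """Weighted mix of global and local similarity (hypothetical formulation).

    alpha = 1.0 uses only global information, alpha = 0.0 only local.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * cosine(global_a, global_b) + (1.0 - alpha) * cosine(local_a, local_b)
```

With identical global vectors and orthogonal local vectors, `alpha = 0.7` gives a score of 0.7, showing how the weight shifts the emphasis between the two sources.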


A Novel Hybrid Methodology of Measuring Sentence Similarity

Yoo, Heo, Park et al. 2021, Symmetry


Measuring sentence similarity accurately is an essential problem in natural language processing. Sentence similarity measurement is the task of finding semantic symmetry between two sentences, regardless of word order and the context of the words. There are many approaches to measuring sentence similarity. Deep learning achieves state-of-the-art performance in many natural language processing fields and is widely used in sentence similarity measurement. However, in natural language processing it is also important to consider the structure of the sentence and of the words that make it up. In this study, we propose a methodology that combines deep learning with a method that considers lexical relationships. Our evaluation metrics are the Pearson correlation coefficient and the Spearman correlation coefficient. The proposed method outperforms current approaches on KorSTS, a standard Korean benchmark dataset, and achieves up to a 65% improvement over using deep learning alone. Experiments show that the proposed method generally performs better than approaches that use only a deep learning model.
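The two evaluation metrics named in this abstract can be sketched in plain Python as follows. The function names and the average-rank treatment of ties are choices made for this illustration. Both correlation functions assume inputs that are not constant.

```python
import math


def pearson(x, y):
    """Pearson correlation: linear agreement between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Pearson rewards predictions that are linearly related to the gold similarity scores, while Spearman only checks that the ordering of sentence pairs agrees, so a monotonic but non-linear prediction scores 1.0 on Spearman and less on Pearson.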


Sentence similarity evaluation using Sent2Vec and siamese neural network with parallel structure

Heo, Kim, Park et al. 2021, IFS


Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. Given two sentences, an accurate judgment should be made as to whether their meanings are equivalent even if their words and contexts differ. To this end, existing studies have measured sentence similarity by focusing on the analysis of words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec, a sentence embedding, as well as morpheme word embedding. Vectors representing words are input to a one-dimensional convolutional neural network (1D-CNN) with kernels of various sizes and to a bidirectional long short-term memory (Bi-LSTM) network. Self-attention is applied to the features transformed through the Bi-LSTM. Subsequently, the vectors produced by the 1D-CNN and by self-attention are reduced through global max pooling and global average pooling, respectively, to extract specific values. The resulting vectors are concatenated with the vector generated through Sent2Vec into a single vector. This vector is input to a softmax layer, which determines the similarity between the two sentences. The proposed model improves accuracy by up to 5.42 percentage points compared with conventional sentence similarity estimation models.
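The pooling and concatenation step described in this abstract can be sketched as below. The feature maps are represented as lists of time steps, each holding one value per channel. The function names and the toy numbers in the usage note are assumptions, and the sketch covers only this one step, not the full model.

```python
def global_max_pool(feature_map):
    """Keep the maximum of each channel across all time steps."""
    return [max(channel) for channel in zip(*feature_map)]


def global_avg_pool(feature_map):
    """Keep the mean of each channel across all time steps."""
    return [sum(channel) / len(channel) for channel in zip(*feature_map)]


def fuse_features(cnn_features, attention_features, sentence_vector):
    """Concatenate pooled CNN features, pooled attention features, and a sentence embedding.

    Following the abstract, max pooling is applied to the CNN branch and
    average pooling to the self-attention branch.
    """
    return (
        global_max_pool(cnn_features)
        + global_avg_pool(attention_features)
        + list(sentence_vector)
    )
```

For example, CNN features `[[0.1, 0.5], [0.7, 0.2], [0.3, 0.9]]`, attention features `[[1.0, 2.0], [3.0, 4.0]]`, and a sentence vector `[0.5, -0.5]` fuse into a single six-dimensional vector, whatever the sentence length. Pooling is what makes the output size independent of the number of words.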


Various Approaches for Predicting Stroke Prognosis using Magnetic Resonance Imaging Text Records

Heo, Kim, Choi et al. 2020


Stroke is one of the leading causes of death and disability worldwide. Stroke is treatable, but patients are prone to disability after treatment. To assess the degree of disability caused by stroke, we use magnetic resonance imaging text records to predict stroke prognosis and compare performance between document-level and sentence-level representations. In our experiments, the document-level representation performs better.


Prediction of Atrial Fibrillation Cases: Convolutional Neural Networks Using the Output Texts of Electrocardiography

Heo, Kim, Kim et al. 2021


Atrial fibrillation (AF) is the most common arrhythmia. Since AF can cause strokes if it lasts for a long time, it is important to detect AF early and receive treatment. Electrocardiography is usually used for AF diagnosis. Electrocardiography records the electrical activity of the patient's heart to obtain an electrocardiogram (ECG), which usually consists of waves and a commentary on them. The onset of AF or its likelihood is judged by a comprehensive analysis of an ECG, which requires considerable prior knowledge and clinical experience. In this study, to simplify this process, the output text of ECGs is analyzed by deep learning to predict the possibility of future AF. The proposed model represents words as vectors using FastText and extracts features using one-dimensional convolutional neural networks (CNNs). The model also combines features using global average pooling (GAP) and is trained to calculate the probability of developing AF. In an experiment, the model showed 85.03% accuracy in predicting the presence or absence of AF. We thus demonstrated the possibility of predicting the occurrence of AF in advance using only text analysis, without prior knowledge or clinical experience of AF.
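The pipeline in this abstract (word vectors, one-dimensional convolution, global average pooling, probability output) can be sketched for a single forward pass as follows. This is a toy illustration with hand-written weights. The function names, the ReLU activation, and the sigmoid output layer are assumptions. The FastText embedding step is replaced by plain lists of numbers.

```python
import math


def conv1d(sequence, kernel, bias=0.0):
    """Slide one filter over a sequence of word vectors (no padding), then apply ReLU.

    sequence : list of word vectors, one per token
    kernel   : list of weight vectors, one per position in the filter window
    """
    width = len(kernel)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the kernel")
    outputs = []
    for start in range(len(sequence) - width + 1):
        total = bias
        for offset in range(width):
            word = sequence[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], word))
        outputs.append(max(0.0, total))  # ReLU (assumed activation)
    return outputs


def predict_probability(sequence, kernels, weights, bias=0.0):
    """Convolve with each filter, average each feature map (GAP), then apply a sigmoid."""
    pooled = []
    for kernel in kernels:
        feature_map = conv1d(sequence, kernel)
        pooled.append(sum(feature_map) / len(feature_map))
    logit = bias + sum(w * p for w, p in zip(weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))
```

Each filter looks at a short window of consecutive words, and global average pooling collapses each filter's responses into one number, so ECG output texts of different lengths produce a fixed-size input to the final layer.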


DAGAM: Data Augmentation with Generation And Modification

Jo, Heo, Park et al. 2022, Preprint





Deep learning based prediction of prognosis in nonmetastatic clear cell renal cell carcinoma
