Joongwon Kim scite author profile

Joongwon Kim

5 Publications

55 Citation Statements Received

153 Citation Statements Given

Affiliations

University of Pennsylvania, Daewoo Shipbuilding and Marine Engineering (South Korea), Hankuk University of Foreign Studies

Publications

Extracting seizure frequency from epilepsy clinic notes: a machine reading approach to natural language processing

Xie, Gallagher, Conrad, et al. 2022

Objective: Seizure frequency and seizure freedom are among the most important outcome measures for patients with epilepsy. In this study, we aimed to automatically extract this clinical information from unstructured text in clinical notes. If successful, this could improve clinical decision-making in epilepsy patients and allow for rapid, large-scale retrospective research.

Materials and Methods: We developed a finetuning pipeline for pretrained neural models to classify patients as being seizure-free and to extract text containing their seizure frequency and date of last seizure from clinical notes. We annotated 1000 notes for use as training and testing data and determined how well 3 pretrained neural models, BERT, RoBERTa, and Bio_ClinicalBERT, could identify and extract the desired information after finetuning.

Results: The finetuned models (BERT_FT, Bio_ClinicalBERT_FT, and RoBERTa_FT) achieved near-human performance when classifying patients as seizure-free, with BERT_FT and Bio_ClinicalBERT_FT achieving accuracy scores over 80%. All 3 models also achieved human performance when extracting seizure frequency and date of last seizure, with overall F1 scores over 0.80. The best combination of models was Bio_ClinicalBERT_FT for classification and RoBERTa_FT for text extraction. Most of the gains in performance due to finetuning required roughly 70 annotated notes.

Discussion and Conclusion: Our novel machine reading approach to extracting important clinical outcomes performed at or near human performance on several tasks. This approach opens new possibilities to support clinical practice and conduct large-scale retrospective clinical research. Future studies can use our finetuning pipeline with minimal training annotations to answer new clinical questions.

Development of Parametric Trend Life Cycle Assessment for marine SOx reduction scrubber systems

Jang, Jeong, Zhou, et al. 2020

Journal of Cleaner Production

In response to the impending international maritime regulation, MARPOL Annex VI Reg. 14, to curb sulphur oxides (SOx) arising from shipping activities, this paper aimed to evaluate the environmental impacts over the entire life cycle of three different SOx reduction scrubber systems: (1) 'wet open-loop', (2) 'wet closed-loop', and (3) 'wet hybrid'. To achieve this goal, the paper developed the 'Parametric Trend Life Cycle Assessment (PT-LCA)', which was introduced to carry out extensive analysis across a number of case ship studies and quantify various emissions, such as greenhouse gases (GHG), sulphur oxides (SOx), nitrogen oxides

BiSECT: Learning to Split and Rephrase Sentences with Bitexts

Kim, Maddela, Kriz, et al. 2021

An important task in NLP applications such as sentence simplification is the ability to take a long, complex sentence and split it into shorter sentences, rephrasing as necessary. We introduce a novel dataset and a new model for this 'split and rephrase' task. Our BISECT training data consists of 1 million long English sentences paired with shorter, meaning-equivalent English sentences. We obtain these by extracting 1-2 sentence alignments in bilingual parallel corpora and then using machine translation to convert both sides of the corpus into the same language. BISECT contains higher quality training examples than previous Split and Rephrase corpora, with sentence splits that require more significant modifications. We categorize examples in our corpus, and use these categories in a novel model that allows us to target specific regions of the input sentence to be split and edited. Moreover, we show that models trained on BISECT can perform a wider variety of split operations and improve upon previous state-of-the-art approaches in automatic and human evaluations.

Automated Diagnosis of Lung Cancer with the Use of Deep Convolutional Neural Networks on Chest CT

Kim, Lee, Yoon 2017

Induce, Edit, Retrieve: Language Grounded Multimodal Schema for Instructional Video Retrieval

Yang, Kim, Panagopoulou, et al. 2021

Preprint

Schemata are structured representations of complex tasks that can aid artificial intelligence by allowing models to break down complex tasks into intermediate steps. We propose a novel system that induces schemata from web videos and generalizes them to capture unseen tasks with the goal of improving video retrieval performance. Our system proceeds in three major phases: (1) Given a task with related videos, we construct an initial schema for the task using a joint video-text model to match video segments with text representing steps from wikiHow; (2) We generalize schemata to unseen tasks by leveraging language models to edit the text within existing schemata. Through generalization, we can allow our schemata to cover a more extensive range of tasks with a small amount of learning data; (3) We conduct zero-shot instructional video retrieval with the unseen task names as the queries. Our schema-guided approach outperforms existing methods for video retrieval, and we demonstrate that the schemata induced by our system are better than those generated by other models.
