Machine translation refers to a fully automated process that translates a user's input text into a target language. To improve the accuracy of machine translation, studies usually exploit not only the input text itself but also various background knowledge related to the text, such as visual information or prior knowledge. In this paper, we propose a multimodal neural machine translation system that uses both texts and their related images to translate Korean image captions into English. The experimental data consist of unlabeled images accompanied only by bilingual captions. To train the system with a supervised learning approach, we propose a weak-labeling method that selects a keyword from an image caption using feature selection methods; the keywords are used to roughly determine an image label. We also introduce an improved feature selection method that uses sentence clustering to select keywords that reflect the characteristics of the image captions more accurately. We found that our multimodal system outperforms a text-only neural machine translation baseline. Furthermore, the additional images have a positive impact on the issue of under-translation, where some words in a source sentence are translated incorrectly or not translated at all.

INDEX TERMS Human-computer interaction, multi-layer neural network, natural language processing, image classification, multimodal neural machine translation, weak label.
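The weak-labeling idea above can be illustrated with a minimal sketch: score each caption token by TF-IDF over the caption corpus and take the highest-scoring token as the keyword. This is only a toy stand-in for the paper's feature selection methods; the function name and the tiny corpus are illustrative assumptions, not from the paper.

```python
# Toy weak labeling: pick the highest-TF-IDF token of each caption
# as its keyword. Stands in for the paper's feature selection step.
import math
from collections import Counter

def tfidf_keywords(captions):
    """Return one keyword per caption: its highest-TF-IDF token."""
    tokenized = [c.lower().split() for c in captions]
    n_docs = len(tokenized)
    # Document frequency of each token across all captions.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    keywords = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # TF-IDF score; ties resolve to the earliest token in the caption.
        best = max(tokens, key=lambda t: tf[t] * math.log(n_docs / df[t]))
        keywords.append(best)
    return keywords
```

In a real pipeline the selected keywords would then be mapped to coarse image labels for supervised training of the image branch.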
The analysis of speech acts is important for dialogue understanding systems because the speech act of an utterance is closely associated with the user's intention. This paper proposes a speech act classification model that effectively uses a two-layer hierarchical structure generated from the adjacency-pair information of speech acts. Adding this hierarchical information to speech act classification has two advantages: improved classification accuracy and reduced running time in the testing phase. As a result, the model achieves higher performance than models that do not use the hierarchical structure and runs faster, because Support Vector Machine classifiers can be efficiently arranged on the two-layer hierarchy.
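The two-layer arrangement can be sketched as follows: a single top-level classifier first picks a coarse group (here, first vs. second part of an adjacency pair), and only the classifier for that group is then consulted, so each utterance touches two small classifiers instead of one large multi-class one. The toy cue-word scorers below stand in for the paper's SVMs, and all labels and cue words are illustrative assumptions.

```python
# Toy two-layer hierarchical classifier. Keyword-overlap scorers
# stand in for the SVM classifiers arranged on the hierarchy.

def make_scorer(cues):
    """Return a classifier that scores each label by cue-word overlap."""
    def classify(utterance):
        tokens = set(utterance.lower().split())
        return max(cues, key=lambda label: len(tokens & cues[label]))
    return classify

# Layer 1: coarse group (first vs. second part of an adjacency pair).
top = make_scorer({
    "first_part":  {"can", "could", "what", "where", "please"},
    "second_part": {"yes", "no", "sure", "sorry", "here"},
})

# Layer 2: one fine-grained classifier per coarse group.
fine = {
    "first_part":  make_scorer({"question": {"what", "where"},
                                "request":  {"can", "could", "please"}}),
    "second_part": make_scorer({"accept": {"yes", "sure"},
                                "reject": {"no", "sorry"}}),
}

def predict(utterance):
    group = top(utterance)                 # one top-level decision...
    return group, fine[group](utterance)   # ...then one group-local decision
```

Because only the classifiers along one path of the hierarchy are evaluated per utterance, testing-time cost grows with the depth of the tree rather than the total number of speech act labels.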
Word sense disambiguation (WSD) is the task of determining a reasonable sense of a word in a particular context. Although recent studies have demonstrated progress through neural language models, current approaches can still determine the senses of only a limited set of words in a few domains. It is therefore necessary to move toward a highly scalable process that can address the many senses occurring across various domains. This paper introduces a new large WSD dataset that is automatically constructed from the Oxford Dictionary, which is widely used as a standard source for the meanings of words. We propose a new WSD model that determines the sense of a word individually, in accordance with its part of speech in the context. In addition, we introduce a hybrid sense prediction method that classifies less frequently used senses separately to achieve reasonable performance. We have conducted comparative experiments demonstrating that the proposed method is more reliable than the baseline approaches. We also investigated the adaptation of the method to a realistic environment using news articles.

INDEX TERMS Computational and artificial intelligence, English vocabulary learning, natural language processing, neural networks, word sense disambiguation.

YOONSEOK HEO received the B.S. and M.S. degrees in computer science (major in natural language generation) from Sogang University. He is currently pursuing the Ph.D. degree with the Department of Computer Science, Sogang University. He worked as a Researcher with Gachon University in 2018. He is interested in spoken dialogue systems, machine translation, question answering, machine reading comprehension, and named entity recognition. His current research focuses on exploiting multimodal resources for machine translation and addressing large-scale open-domain texts for machine reading comprehension.
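The hybrid idea of handling rare senses separately can be sketched as a simple routing rule: senses attested at least a threshold number of times in training go through the main (frequent-sense) path, and anything rarer falls back to a separate path. The function names, the threshold, and the toy sense inventory are illustrative assumptions, not the paper's actual model.

```python
# Toy hybrid sense prediction: frequent senses use the main path,
# rare senses are routed to a separate fallback path.
from collections import Counter

def build_hybrid_predictor(training_senses, min_count=2):
    counts = Counter(training_senses)
    frequent = {s for s, c in counts.items() if c >= min_count}

    def predict(candidate_senses):
        """Pick a sense from the candidates for one word occurrence."""
        # Main path: prefer a frequent sense, most attested first.
        for sense in sorted(candidate_senses, key=lambda s: -counts[s]):
            if sense in frequent:
                return sense, "frequent-model"
        # Fallback path: separate handling for rarely seen senses.
        return candidate_senses[0], "rare-model"
    return predict
```

In the paper's setting the two paths would be learned classifiers rather than lookups; the point of the sketch is only the frequency-based routing.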