This study focuses on a method for sequential data augmentation to alleviate data sparseness problems. Specifically, we present corpus expansion techniques for enhancing the coverage of a language model. Recent recurrent neural network studies show that a seq2seq model can be applied to language generation tasks: it can generate new sentences from given input sentences. We present a method of corpus expansion using a sentence-chain-based seq2seq model. To train the seq2seq model, sentence chains are formed as triples of consecutive sentences. The first two sentences in a triple are fed to the encoder of the seq2seq model, while the last sentence becomes the target sequence for the decoder. Using only internal resources, evaluation results show an improvement of approximately 7.6% in relative perplexity over a baseline language model of Korean text. Additionally, in a comparison with a previous study, the sentence-chain approach reduces the size of the training data by 38.4% while generating 1.4 times the number of n-grams, with superior performance on English text.
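The triple construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name and the whitespace-joining of the two encoder sentences are assumptions for clarity.

```python
# Hypothetical sketch of sentence-chain triple construction for seq2seq training.
# A window of three consecutive sentences slides over a document: the first two
# sentences form the encoder input, the third is the decoder target.

def make_sentence_chain_triples(sentences):
    """Return (encoder_input, decoder_target) pairs from consecutive sentence triples."""
    pairs = []
    for i in range(len(sentences) - 2):
        # Joining the two context sentences with a space is an assumption;
        # the actual tokenization/concatenation scheme may differ.
        encoder_input = sentences[i] + " " + sentences[i + 1]
        decoder_target = sentences[i + 2]
        pairs.append((encoder_input, decoder_target))
    return pairs

doc = ["The model reads two sentences.", "It encodes their context.", "Then it generates a new sentence.", "The output expands the corpus."]
for enc, dec in make_sentence_chain_triples(doc):
    print(enc, "->", dec)
```

A document of n sentences thus yields n-2 training pairs, and the decoder's generated outputs become candidate sentences for expanding the corpus.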
Owing to the rising demand for second-language learning and advances in machine learning, there has been an increase in the need for spoken computer-assisted language learning (CALL) applications [1,2]. Moreover, with the spread of Korean popular culture overseas [3], the need for Korean language learning has prompted the development of such CALL applications for non-native Korean learners. Among spoken Korean CALL applications, this paper focuses on automatic speech recognition (ASR)-based proficiency assessment of non-native Korean speech. Non-native speech significantly degrades the performance of the ASR used in a spoken CALL system owing to the pronunciation variabilities in non-native speech [4,5]. Consequently, numerous research results have been reported on automatic proficiency assessment methods for non-native speech that is read aloud [6-13] and for spontaneous speech [14-17]. However, there has been limited research on proficiency assessment of non-native Korean speech [18]. Moreover, most research has focused on the analysis of pronunciation variabilities in non-native Korean speech. For instance, [19,20] analyze the pronunciation variabilities of Korean spoken by Japanese and Chinese learners using contrastive and