Kyoungman Bae scite author profile

Asso for Info Science & Tech

2019

To build an effective community question answering (cQA) service, finding questions similar to an input query question is a significant research issue. The major challenges for question retrieval in cQA are solving the lexical gap problem and estimating the relevance between questions. In this study, we first address the lexical gap problem using a translation-based language model (TRLM). Thereafter, we determine features and methods that are suitable for estimating the relevance between two questions. For this purpose, we explore ways to use the results of a dependency parser and question classification for category information. Head-dependent pairs are first extracted from the dependency parser output as bigram features, called dependency bigrams. The probability of each category is estimated by applying softmax to the scores of the classification results. Subsequently, we propose two retrieval models, the dependency-based model (DM) and the category-based model (CM), and apply them to the previous model, TRLM. The experimental results demonstrate that the proposed methods significantly improve the performance of question retrieval in cQA services.
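The two feature types named in this abstract can be illustrated with a minimal sketch. It assumes classifier scores arrive as a dictionary and parser output as (head, dependent) pairs; the function names, the sample scores, and the sample arcs are hypothetical, not taken from the paper.

```python
import math

def softmax(scores):
    """Convert raw classifier scores into a probability distribution."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - m) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

def dependency_bigrams(arcs):
    """Turn (head, dependent) arcs from a parser into bigram features."""
    return [f"{head}_{dep}" for head, dep in arcs]

# Hypothetical classifier scores for one question (assumed values).
scores = {"travel": 2.0, "health": 1.0, "computers": 0.0}
probs = softmax(scores)

# Hypothetical parser output for "how to book cheap flights".
arcs = [("book", "how"), ("book", "flights"), ("flights", "cheap")]
features = dependency_bigrams(arcs)
```

How DM and CM combine these features with TRLM is not described in the abstract, so the sketch stops at feature extraction.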

Effective Korean Speech-act Classification Using the Classification Priority Application and a Post-correction Rules

Song, Bae

2016

Journal of KIISE

An effective category classification method based on a language model for question category recommendation on a cQA service

2012

Classifying a user's question into topics helps respondents answer the question in a cQA service. To improve category (or topic) classification, the word weighting method must estimate an appropriate weight for each word. In this paper, we propose a novel and effective word weighting method based on a language model for automatic category classification in cQA services. We first calculate the occurrence probability of a word in each category using a language model; the final weight of each word is then estimated as the ratio of its occurrence probability in a category to its occurrence probability in the other categories. As a result, the proposed method significantly improves the performance of category classification.
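The ratio-based weight described above can be sketched as follows. The abstract does not specify the language model or how the "other categories" are aggregated, so this sketch assumes an add-alpha smoothed unigram model and averages the probability over the remaining categories; all names and the toy corpus are hypothetical.

```python
from collections import Counter

def category_language_models(docs_by_category, alpha=1.0):
    """Build one smoothed unigram model per category (assumed model type)."""
    vocab = {w for docs in docs_by_category.values() for d in docs for w in d}
    models = {}
    for cat, docs in docs_by_category.items():
        counts = Counter(w for d in docs for w in d)
        total = sum(counts.values())
        models[cat] = {
            w: (counts[w] + alpha) / (total + alpha * len(vocab)) for w in vocab
        }
    return models

def word_weight(word, category, models):
    """Ratio of P(word | category) to the mean P(word | other category)."""
    p_in = models[category][word]
    others = [m[word] for c, m in models.items() if c != category]
    p_out = sum(others) / len(others)  # averaging is an assumption
    return p_in / p_out

# Toy tokenized questions grouped by category (hypothetical data).
docs = {
    "travel": [["cheap", "flight", "ticket"], ["flight", "hotel"]],
    "health": [["flu", "symptom"], ["flu", "ticket"]],
}
models = category_language_models(docs)
w_flight = word_weight("flight", "travel", models)
w_flu = word_weight("flu", "travel", models)
```

A weight above 1 marks a word as characteristic of the category, and a weight below 1 marks it as more typical of the other categories.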

An Effective Question Expanding Method for Question Classification in cQA services

2014

This paper introduces a new question expanding method for question classification in cQA services. Input questions in cQA services are mostly short texts, and test inputs consist of only a question, whereas training data consist of question-answer pairs. Thus, in many cases the input questions cannot provide enough information for good classification. To solve this problem, we propose a question expanding method based on pseudo relevant feedback and automatic answer generation. For pseudo relevant feedback, we first find question-answer pairs relevant to an input question using the Indri search engine, and then the top relevant words are chosen as expansion words. The automatic answer generation creates pseudo answers by adding question-related words, using translation probabilities from questions to answers estimated by GIZA++. As a result, we obtain significantly improved performance when the two approaches are effectively combined.
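The pseudo relevant feedback step can be illustrated with a small sketch. It does not call Indri: a plain word-overlap ranking stands in for the search engine, and raw term frequency stands in for whatever term scoring the paper uses. Function names, parameters, and the toy question-answer pairs are all hypothetical.

```python
from collections import Counter

def retrieve(query_tokens, qa_pairs, k=2):
    """Rank QA pairs by word overlap with the query (stand-in for a search engine)."""
    q = set(query_tokens)
    ranked = sorted(qa_pairs, key=lambda pair: len(q & set(pair[0])), reverse=True)
    return ranked[:k]

def expand_with_prf(query_tokens, qa_pairs, k=2, n_terms=3):
    """Append the most frequent new words from the top-k retrieved QA pairs."""
    top = retrieve(query_tokens, qa_pairs, k)
    counts = Counter(
        w for question, answer in top for w in question + answer
        if w not in query_tokens
    )
    expansion = [w for w, _ in counts.most_common(n_terms)]
    return list(query_tokens) + expansion

# Toy (question, answer) pairs standing in for the training archive.
qa_pairs = [
    (["cheap", "flight", "paris"], ["book", "early", "airline"]),
    (["flight", "delay", "refund"], ["airline", "refund", "policy"]),
    (["flu", "symptom"], ["rest", "fluids"]),
]
expanded = expand_with_prf(["cheap", "flight"], qa_pairs)
```

The expanded token list would then be passed to the question classifier in place of the original short question.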

How to Combine Translation Probabilities and Question Expansion for Question Classification in cQA Services

IEICE Trans. Inf. & Syst.

2016

This paper presents a new question expansion method for question classification in cQA services. The input questions consist of only a question, whereas the training data consist of question-answer pairs. Thus, the input questions cannot provide enough information for good classification in many cases. Since the answer is strongly associated with the input question, we create a pseudo answer to expand each input question. Translation probabilities between questions and answers and a pseudo relevant feedback technique are used to generate the pseudo answer. As a result, we obtain significantly improved performance when the two approaches are effectively combined.

key words: question classification, cQA service, pseudo relevant feedback (PRF), question expansion, translation probability
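One way to picture pseudo answer generation from translation probabilities is the sketch below. It assumes a precomputed table of P(answer word | question word) and averages those probabilities over the question words; the table values, the averaging rule, and the function name are illustrative assumptions, not the paper's formulation.

```python
def pseudo_answer(question_tokens, trans_prob, n_terms=3):
    """Pick the answer-side words with the highest average translation probability."""
    scores = {}
    for q in question_tokens:
        for a, p in trans_prob.get(q, {}).items():
            # Average over question words (an assumed combination rule).
            scores[a] = scores.get(a, 0.0) + p / len(question_tokens)
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda a: (-scores[a], a))
    return ranked[:n_terms]

# Hypothetical translation table: question word -> {answer word: probability}.
trans_prob = {
    "flight": {"airline": 0.5, "ticket": 0.3},
    "cheap": {"discount": 0.4, "ticket": 0.3},
}
question = ["cheap", "flight"]
answer_words = pseudo_answer(question, trans_prob, n_terms=2)
expanded_question = question + answer_words
```

In this toy table, "ticket" ranks first because both question words translate to it, which is the kind of shared evidence a pseudo answer is meant to surface.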