Data Augmentation and Preparation Process of PerInfEx: A Persian Chatbot With the Ability of Information Extraction
Pegah Safari,
Mehrnoush Shamsfard
Abstract: In this paper, we describe data preparation for our proposed chatbot PerInfEx (Persian Information Extraction chatbot). It aims to chit-chat interactively with users in Persian and, by asking as few direct questions as possible, to extract as much personal information as it can, such as the user's age or occupation. Collecting data of considerable size that is aligned with our system's specifics is a crucial step in training the data-hungry Natural Language Understanding (NLU) and Natural Language Generation (NLG) modules. I…