A reliable clustering algorithm for task-oriented dialogues can help developers analyze dialogue logs and define dialogue tasks efficiently. Existing text clustering algorithms are difficult to apply directly to task-oriented dialogues because of inherent differences such as coreference, omission, and diverse expressions. In this paper, we propose a Dialogue Task Clustering Network (DTCN) model for task-oriented dialogue clustering. The proposed model combines context-aware utterance representations and cross-dialogue utterance cluster representations. An iterative end-to-end training strategy is used to perform dialogue clustering and representation learning jointly. Experiments on three public datasets show that our model significantly outperforms strong baselines on all metrics.
Dialogue management plays a vital role in task-oriented dialogue systems and has become an active area of research in recent years. Despite the promising results brought by deep reinforcement learning, most studies additionally need to develop a manual user simulator. To avoid the time-consuming development of a simulator policy, we propose a multi-agent dialogue model in which an end-to-end dialogue manager and a user simulator are optimized simultaneously. Unlike prior work, we optimize the two agents from scratch and apply reward shaping based on adjacency-pair constraints from conversational analysis, both to speed up learning and to avoid deviation from normal human-human conversation. In addition, we generalize the one-to-one learning strategy to a one-to-many learning strategy, in which a dialogue manager is concurrently optimized with various user simulators, to improve the performance of the trained dialogue manager. The experimental results show that one-to-one agents trained with adjacency-pair constraints converge faster and avoid such deviation. In cross-model evaluation with human users involved, the dialogue manager trained with the one-to-many strategy achieves the best performance. The two agents are optimized from scratch without a supervised initialization process. For the user simulator's reward function, we use the reward shaping technique [11] based on adjacency pairs in conversational analysis [12] so that the simulator learns real user behaviors quickly. We then generalize the one-to-one learning strategy to a one-to-many learning strategy, in which a dialogue manager cooperates with various user simulators to improve the performance of the trained dialogue manager.
We obtain these various user simulators by changing the adjacency-pair settings, and then combine them with a dialogue manager to optimize the cooperative policies via multi-agent reinforcement learning. Compared with the model trained without the constraints, the model trained with adjacency-pair constraints converges faster and avoids deviation from normal human-human conversation. The experimental results also show that the dialogue manager trained with the one-to-many strategy achieves the best performance in cross-model evaluation with human users involved. In summary, our main contributions in this work are three-fold:
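The adjacency-pair constraint described above can be realized as a shaping term added to the simulator's task reward. The following is a minimal, hypothetical sketch of that idea; the pair table, function names, and bonus value are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of reward shaping with adjacency-pair constraints.
# The pair table and bonus value are illustrative assumptions.

# Adjacency pairs: a system act and the user acts that normally follow it
# in human-human conversation.
ADJACENCY_PAIRS = {
    "request": {"inform"},
    "confirm": {"affirm", "deny"},
    "offer": {"accept", "reject"},
}

def shaping_bonus(system_act: str, user_act: str, bonus: float = 0.5) -> float:
    """Extra reward when the simulator's act completes a valid adjacency
    pair; a small penalty otherwise, discouraging deviation from normal
    human-human turn-taking."""
    expected = ADJACENCY_PAIRS.get(system_act)
    if expected is None:
        return 0.0  # no constraint registered for this system act
    return bonus if user_act in expected else -bonus

def shaped_reward(task_reward: float, system_act: str, user_act: str) -> float:
    # The simulator is trained on the task reward plus the shaping term.
    return task_reward + shaping_bonus(system_act, user_act)
```

Changing the entries of the pair table would then yield the different user simulators used in the one-to-many strategy.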
Cross-domain slot filling focuses on using labeled data from source domains to train a slot filling model for target domains, which is of great significance for transferring a dialogue system to new domains. Most existing work has focused on building cross-domain transfer models. Taking the perspective of the slots themselves, this paper proposes a model-agnostic Slot Transferability Measure (STM) that evaluates the transferability from a source slot to a target slot, i.e., the degree to which labeled data of the source slot helps train the slot filling model for the target slot. We also give an STM-based method for selecting helpful source slots and their labeled data for a given target slot. Experimental results on multiple existing models and datasets show that our method significantly outperforms state-of-the-art baselines in cross-domain slot filling.
Slot filling and intent detection are two major tasks in spoken language understanding. Most existing work builds the two tasks as a joint model with multi-task learning, without considering prior linguistic knowledge. In this paper, we propose a novel joint model that applies a graph convolutional network over dependency trees to integrate syntactic structure for learning slot filling and intent detection jointly. Experimental results show that our proposed model achieves state-of-the-art performance on two public benchmark datasets and outperforms existing work. Finally, we apply the BERT model to further improve performance on both slot filling and intent detection. * The work was done when the first author was an intern at Meituan Group. The first two authors contributed equally.
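The core operation of a graph convolution over a dependency tree can be sketched in a few lines: each token is a node, edges follow dependency arcs (here treated as undirected) plus self-loops, and each layer averages neighbor features before a linear map and nonlinearity. This is an illustrative simplification, not the paper's exact architecture.

```python
import numpy as np

# Minimal sketch of one graph-convolution layer over a dependency tree
# (an illustrative simplification, not the paper's exact model).

def gcn_layer(H, edges, W):
    """H: (n, d_in) token features; edges: list of (head, dependent) arcs;
    W: (d_in, d_out) weights. Returns ReLU(D^-1 A H W), where A is the
    adjacency matrix with self-loops and D^-1 row-normalizes it."""
    n = H.shape[0]
    A = np.eye(n)                          # self-loops
    for h, d in edges:
        A[h, d] = A[d, h] = 1.0            # undirected dependency arcs
    A = A / A.sum(axis=1, keepdims=True)   # row-normalize: mean aggregation
    return np.maximum(A @ H @ W, 0.0)      # ReLU

# Toy sentence of 3 tokens: token 1 is the syntactic head of tokens 0 and 2.
H = np.random.randn(3, 4)
W = np.random.randn(4, 8)
out = gcn_layer(H, [(1, 0), (1, 2)], W)    # shape (3, 8)
```

In a joint model, the per-token outputs would feed a slot-tagging head while a pooled representation feeds the intent classifier.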
Dialog management plays an important role in task-oriented dialog systems. Most previous works divide dialog management into a state tracker and an action selector. The two parts are modeled separately and implemented in a pipelined way, which suffers from error accumulation; moreover, the feedback signal from the action selector cannot be propagated back to the state tracker and the natural language understanding module. This paper proposes a word-level dialog management framework based on partially observable Markov decision processes that integrates natural language understanding, state tracking, and action selection into an end-to-end architecture. Our proposed dialog manager takes the words of user utterances as input and produces the optimal action as well as the slot values needed for response generation. To this end, we propose a hybrid learning method that integrates reinforcement learning and supervised learning to optimize the action selector and slot filler jointly. In addition, we develop a high-return prioritized experience replay to speed up the convergence of training. The experimental results show that the proposed dialog manager outperforms four strong baselines across a series of dialog tasks, and a human evaluation shows the same result. The high-return prioritized experience replay accelerates convergence effectively, especially when the dialog manager works on more complex tasks.
INDEX TERMS Recurrent neural networks, multi-layer neural network, supervised learning, reinforcement learning, dialog management, task-oriented dialog system, partially observable Markov decision processes.
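The "high-return" prioritization mentioned above can be sketched as a replay buffer that samples episodes with probability weighted by their total return, so rare successful dialogs are replayed more often. All names and details below are assumptions for illustration, not the paper's implementation.

```python
import random

# Illustrative sketch of high-return prioritized experience replay:
# episodes with higher total return are replayed more often, which can
# speed up convergence when successful dialogs are rare. Class and
# parameter names are assumptions, not the paper's implementation.

class HighReturnReplay:
    def __init__(self, capacity=1000, eps=1e-3):
        self.capacity, self.eps = capacity, eps
        self.episodes, self.returns = [], []

    def add(self, episode, episode_return):
        if len(self.episodes) >= self.capacity:    # drop the oldest episode
            self.episodes.pop(0)
            self.returns.pop(0)
        self.episodes.append(episode)
        self.returns.append(episode_return)

    def sample(self, k):
        # Sampling weight grows with return; eps keeps every episode
        # reachable. random.choices samples with replacement.
        lo = min(self.returns)
        weights = [r - lo + self.eps for r in self.returns]
        return random.choices(self.episodes, weights=weights, k=k)
```

A uniform buffer is recovered by making all weights equal; the `eps` floor is the usual guard against zero-probability episodes.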