In this paper, reinforcement learning (RL) is used to learn an efficient turn-taking management model in a simulated slot-filling task, with the objective of minimising dialogue duration and maximising the task completion ratio. Turn-taking decisions are handled in a separate new module, the Scheduler. Unlike in most dialogue systems, a dialogue turn is split into micro-turns, and the Scheduler makes a decision for each of them. A Fitted Value Iteration algorithm, Fitted-Q, with a linear state representation is used to learn the state-to-action policy. A comparison between non-incremental and incremental handcrafted strategies, taken as baselines, and an incremental RL-based strategy shows the latter to be significantly more efficient, especially in noisy environments.
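The abstract's learning setup can be sketched in a few lines: Fitted-Q iteration with one linear model per action, where each micro-turn decision is an action. The action names (`WAIT`, `SPEAK`), feature map, and discount factor below are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Hypothetical micro-turn Scheduler actions (names are assumptions, not from the paper)
WAIT, SPEAK = 0, 1
ACTIONS = [WAIT, SPEAK]
GAMMA = 0.99  # assumed discount factor

def features(state):
    """Linear state representation: the raw state vector plus a bias term."""
    return np.append(state, 1.0)

def fitted_q(transitions, n_features, n_iters=50):
    """Fitted-Q iteration with one linear model per action.

    transitions: list of (state, action, reward, next_state, done) tuples.
    Returns theta of shape (n_actions, n_features) such that
    Q(s, a) = theta[a] @ features(s).
    """
    theta = np.zeros((len(ACTIONS), n_features))
    for _ in range(n_iters):
        # Build regression targets from the current Q estimate (Bellman backup)
        X = {a: [] for a in ACTIONS}
        y = {a: [] for a in ACTIONS}
        for s, a, r, s2, done in transitions:
            q_next = 0.0 if done else max(theta[b] @ features(s2) for b in ACTIONS)
            X[a].append(features(s))
            y[a].append(r + GAMMA * q_next)
        # Refit each action's linear model by least squares
        for a in ACTIONS:
            if X[a]:
                Xa, ya = np.array(X[a]), np.array(y[a])
                theta[a], *_ = np.linalg.lstsq(Xa, ya, rcond=None)
    return theta
```

In a slot-filling simulation, the reward would penalise dialogue length and reward task completion, matching the objective stated in the abstract.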
The automatic prediction of the quality of a dialogue is useful to keep track of a spoken dialogue system's performance and, if necessary, adapt its behaviour. Classifiers and regression models have been suggested to make this prediction. The parameters of these models are learnt from a corpus of dialogues evaluated by users or experts. In this paper, we propose to model this task as an ordinal regression problem. We apply support vector machines for ordinal regression to a corpus of dialogues in which each system-user exchange was given a rating on a scale of 1 to 5 by experts. Compared to previous models proposed in the literature, the ordinal regression predictor achieves significantly better results according to the following evaluation metrics: Cohen's agreement rate with the experts' ratings, Spearman's rank correlation coefficient, and the Euclidean and Manhattan errors.
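The key idea of ordinal regression, respecting the order of the 1-to-5 ratings rather than treating them as unrelated classes, can be illustrated with the classic threshold reduction: one binary model per question "is the rating greater than k?". The sketch below uses ridge-regression scorers as a lightweight stand-in for the binary margins of true support vector ordinal regression; this substitution, and all names, are assumptions for illustration.

```python
import numpy as np

RANKS = (1, 2, 3, 4, 5)

def fit_ordinal(X, y, ranks=RANKS, l2=1e-2):
    """Threshold reduction for ordinal regression: one binary linear model
    per threshold 'y > k'. Each binary model here is plain ridge regression
    on {0, 1} targets -- a simple stand-in for the binary SVMs of SVOR."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # add a bias column
    models = []
    for k in ranks[:-1]:
        t = (y > k).astype(float)
        w = np.linalg.solve(Xb.T @ Xb + l2 * np.eye(Xb.shape[1]), Xb.T @ t)
        models.append(w)
    return models

def predict_ordinal(models, X, ranks=RANKS):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    # Scores approximating P(y > k), clipped into [0, 1]
    p = np.clip(np.array([Xb @ w for w in models]), 0.0, 1.0)
    # Predicted rating = lowest rank + number of thresholds passed
    return ranks[0] + (p > 0.5).sum(axis=0)
```

Because the thresholds share the order of the rating scale, a prediction that is off tends to be off by one rank rather than arbitrarily wrong, which is exactly what rank-aware metrics such as Spearman's coefficient and the Manhattan error reward.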
In this paper, a taxonomy of turn-taking phenomena is introduced, organised according to the level of information conveyed. It aims to provide a better grasp of the behaviours humans use while talking to each other, so that these behaviours can be methodically replicated in spoken dialogue systems. Five interesting phenomena have been implemented in a simulated environment: system barge-in with three variants (triggered by an unclear, an incoherent, or a sufficient user message), feedback, and user barge-in. The experiments reported in the paper illustrate that how such phenomena are implemented is a delicate choice, as their impact on the system's performance varies.
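The five implemented phenomena lend themselves to a simple enumeration plus a dispatch on the status of the partial user message. The sketch below is only a toy encoding; the identifier names paraphrase the abstract, and the dispatch logic is an assumption, far simpler than the paper's actual simulation.

```python
from enum import Enum, auto

class Phenomenon(Enum):
    """The five simulated turn-taking phenomena (names paraphrase the abstract)."""
    SYSTEM_BARGE_IN_UNCLEAR = auto()     # partial user message is unclear
    SYSTEM_BARGE_IN_INCOHERENT = auto()  # partial user message is incoherent
    SYSTEM_BARGE_IN_SUFFICIENT = auto()  # enough information has been received
    FEEDBACK = auto()                    # short system acknowledgement
    USER_BARGE_IN = auto()               # user interrupts the system

def system_reaction(partial_message_status):
    """Toy dispatch from the status of a partial user message to a system-side
    phenomenon; defaults to a feedback acknowledgement (illustrative only)."""
    table = {
        "unclear": Phenomenon.SYSTEM_BARGE_IN_UNCLEAR,
        "incoherent": Phenomenon.SYSTEM_BARGE_IN_INCOHERENT,
        "sufficient": Phenomenon.SYSTEM_BARGE_IN_SUFFICIENT,
    }
    return table.get(partial_message_status, Phenomenon.FEEDBACK)
```

User barge-in is the one phenomenon initiated by the user rather than the system, so it would be handled on a separate path in any real implementation.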