2014
DOI: 10.1145/2659003
Nonstrict Hierarchical Reinforcement Learning for Interactive Systems and Robots

Abstract: Conversational systems and robots that use reinforcement learning for policy optimization in large domains often face the problem of limited scalability. This problem has been addressed either by using function approximation techniques that estimate the approximate true value function of a policy or by using a hierarchical decomposition of a learning task into subtasks. We present a novel approach for dialogue policy optimization that combines the benefits of both hierarchical control and function approximation…


Cited by 8 publications (12 citation statements)
References 48 publications
“…Although the dialogues currently generated using the trained policies seem reasonable⁶, it is natural to ask: how good are the DRL-based policies? To answer this question we integrated a K-Nearest Neighbour (KNN) baseline [13], which aims to behave like the example demonstration dialogues; see Appendix A.…”
Section: Results
confidence: 99%
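The KNN baseline quoted above can be sketched as follows. This is a minimal illustration under assumed vector-valued dialogue states, not the implementation from [13]: the state featurization, action labels, and majority vote are all illustrative.

```python
import numpy as np

def knn_action(state, demo_states, demo_actions, k=1):
    """Pick the action taken in the k nearest demonstration states.

    state        : 1-D feature vector for the current dialogue state
    demo_states  : (N, D) array of states from example dialogues
    demo_actions : length-N list of the actions taken in those states
    """
    dists = np.linalg.norm(demo_states - state, axis=1)
    nearest = np.argsort(dists)[:k]
    # Majority vote over the k nearest demonstration actions
    votes = [demo_actions[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy demonstration set: two states, two actions
demo_states = np.array([[0.0, 0.0], [1.0, 1.0]])
demo_actions = ["greet", "confirm"]
print(knn_action(np.array([0.9, 0.8]), demo_states, demo_actions))  # "confirm"
```

Such a baseline selects whatever action was taken in the most similar demonstration state, so it "behaves like" the example dialogues without any learned value function.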
“…In contrast to Hierarchical DQNs [5] that follow a strict sequence of agents, an NDQN in our method allows transitions between all DQN agents except for self-transitions. The latter uses a stack-based approach as in [6]. While user responses can motivate transitions to another domain in the network, completing a subdialogue within a domain motivates a transition to the previous domain to resume the interaction.…”
Section: A Network of Deep Q-Networks (NDQN)
confidence: 99%
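The transition logic quoted above can be sketched with a small controller. The class, domain names, and methods are hypothetical, but the constraints follow the text: transitions are allowed to any other domain (no self-transitions), and completing a subdialogue pops a stack to resume the previous domain, as in [6].

```python
class NDQNController:
    """Sketch of a network of per-domain agents with a resume stack.

    Each domain would own its own DQN policy (omitted here); this
    class only models which domain is in control of the dialogue.
    """
    def __init__(self, domains, start):
        self.domains = set(domains)
        self.current = start
        self.stack = []          # domains to resume, most recent last

    def switch_to(self, domain):
        if domain == self.current:
            raise ValueError("self-transitions are not allowed")
        if domain not in self.domains:
            raise ValueError(f"unknown domain: {domain}")
        self.stack.append(self.current)   # remember where to resume
        self.current = domain

    def finish_subdialogue(self):
        # Completing a subdialogue resumes the previous domain
        if self.stack:
            self.current = self.stack.pop()
        return self.current

ctrl = NDQNController({"meta", "restaurants", "hotels"}, start="meta")
ctrl.switch_to("restaurants")     # user asks about restaurants
ctrl.switch_to("hotels")          # mid-dialogue hotel question
print(ctrl.finish_subdialogue())  # back to "restaurants"
```

The stack is what distinguishes this from a strict hierarchy: any agent can hand control to any other, yet the interaction always unwinds back to where it left off.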
“…It remains to be demonstrated how far one can go with such an approach. Future work includes to (a) compare different model architectures, training parameters and reward functions; (b) extend or improve the abilities of the proposed dialogue system; (c) train deep learning agents in other (larger scale) domains [7,8,9]; (d) evaluate end-to-end systems with real users; (e) compare or combine different types of neural nets [10]; and (f) perform fast learning based on parallel computing. Table 1 Example dialogue using the policy from Fig. 2, where states are numerical representations of the last system and noisy user inputs, actions are dialogue acts, and user responses are in brackets…”
Section: Discussion
confidence: 99%
“…This function is known as a classifier when the labels are discrete and as a regressor when the labels are continuous. All articles in this special issue make use of classifiers to predict events during human-machine interactions [Ngo et al. 2014; Benotti et al. 2014; Keizer et al. 2014; Cuayáhuitl et al. 2014]. In contrast to supervised learning that makes use of direct feedback, reinforcement learning makes use of indirect feedback typically based on numerical rewards given during the interaction, and the goal is to maximize them in the long run.…”
Section: Multimodal Interactive Learning Systems: What and Why?
confidence: 99%
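The indirect-feedback loop described above can be illustrated with a single tabular Q-learning update, in which a numerical reward is folded into a long-run value estimate rather than used as a direct supervised label. This is a generic sketch; the states, actions, and reward values are invented for illustration.

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: nudge Q(s, a) toward the reward
    plus the discounted best value of the next state."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]

actions = ["ask", "confirm"]
Q = {}
# Confirming once all slots are filled ends the dialogue with reward +1
print(q_update(Q, "slots_filled", "confirm", 1.0, "end", actions))  # 0.5
```

Repeating the update moves the estimate further toward the long-run return, which is the sense in which the reward signal is "indirect": it shapes values over many interactions rather than labeling each action correct or incorrect.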