1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings
DOI: 10.1109/asru.1997.658989
Learning dialogue strategies within the Markov decision process framework

Cited by 89 publications (60 citation statements). References 4 publications.
“…Humans have a greater propensity to criticize what is wrong than to provide positive proposals. In this context, Reinforcement Learning (RL) [1] appears to be the best solution to this problem; it was first proposed in [2] and further developed in [3][4][5]. The main differences between the approaches lie in the way they model the dialogue manager's environment during the learning process.…”
Section: Introduction (mentioning)
confidence: 99%
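The excerpt above frames dialogue-strategy design as reinforcement learning over a Markov decision process, which is the framing of the cited paper. As a concrete illustration only, the sketch below runs tabular Q-learning on an invented slot-filling dialogue; the state names, actions, and reward values are hypothetical and do not come from any of the cited works.

```python
# Minimal sketch of dialogue-strategy learning as an MDP. States, actions and
# rewards are invented for illustration, not taken from the cited papers.
import random
from collections import defaultdict

STATES = ["no_info", "have_date", "have_date_and_dest", "done"]
ACTIONS = ["ask_date", "ask_dest", "confirm_and_close"]

def step(state, action):
    """Toy transition/reward model: filling a slot advances the state,
    closing too early is penalised, a completed dialogue is rewarded."""
    if action == "ask_date" and state == "no_info":
        return "have_date", -1.0          # small per-turn cost
    if action == "ask_dest" and state == "have_date":
        return "have_date_and_dest", -1.0
    if action == "confirm_and_close":
        return "done", (20.0 if state == "have_date_and_dest" else -10.0)
    return state, -1.0                     # useless question: turn cost only

Q = defaultdict(float)
alpha, gamma, epsilon = 0.1, 0.95, 0.2

for episode in range(5000):
    state = "no_info"
    while state != "done":
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward = step(state, action)
        best_next = 0.0 if next_state == "done" else max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES if s != "done"}
print(policy)   # learned strategy: which system action to take in each state
```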
“…The main differences between the approaches lie in the way they model the dialogue manager's environment during the learning process. In [2] and [5], the environment is modeled as a set of independent modules (i.e. the ASR system, the user) processing information (this is the approach adopted in this paper).…”
Section: Introduction (mentioning)
confidence: 99%
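The modular view of the environment described in this excerpt, a simulated user plus an error-prone ASR channel that the dialogue manager interacts with while learning, can be sketched as follows. The class names, interfaces, and error rate are illustrative assumptions, not the design used in [2] or [5].

```python
# Sketch of the "environment as independent modules" idea: the learning
# dialogue manager never talks to the world directly, it talks to a pipeline
# of simulated components (user model, ASR error channel).
import random

class SimulatedUser:
    """Very crude user model: answers whatever slot the system asks about."""
    def __init__(self, goal):
        self.goal = goal                      # e.g. {"date": "monday", "dest": "paris"}

    def respond(self, system_act):
        slot = system_act.get("ask")
        if slot in self.goal:
            return {"inform": {slot: self.goal[slot]}}
        return {"null": True}

class NoisyASRChannel:
    """Corrupts the user act with a fixed confusion probability."""
    def __init__(self, error_rate=0.2):
        self.error_rate = error_rate

    def transmit(self, user_act):
        if "inform" in user_act and random.random() < self.error_rate:
            return {"null": True}             # value lost / misrecognised
        return user_act

class SimulatedEnvironment:
    """Composes the modules; this is what the RL dialogue manager samples from."""
    def __init__(self, user, channel):
        self.user, self.channel = user, channel

    def step(self, system_act):
        return self.channel.transmit(self.user.respond(system_act))

env = SimulatedEnvironment(SimulatedUser({"date": "monday", "dest": "paris"}),
                           NoisyASRChannel(error_rate=0.2))
print(env.step({"ask": "date"}))   # observation the manager would learn from
```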
“…The statistical optimization of dialogue management in dialogue systems through Reinforcement Learning (RL) has been an active thread of research for more than two decades (Levin et al, 1997; Lemon and Pietquin, 2007; Laroche et al, 2010; Gašić et al, 2012; Daubigney et al, 2012). Dialogue management has been successfully modelled as a Partially Observable Markov Decision Process (POMDP) (Williams and Young, 2007; Gašić et al, 2012), which leads to systems that can learn from data and which are robust to noise.…”
Section: Introduction (mentioning)
confidence: 99%
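The POMDP formulation mentioned in this excerpt replaces the single dialogue state with a belief, i.e. a probability distribution over hidden user goals that is updated from noisy observations. The toy Bayes update below illustrates the idea under an assumed two-goal observation model; the goals, probabilities, and function names are invented for illustration, and a real system such as those in (Williams and Young, 2007) would also include a transition model and belief-based policy optimisation.

```python
# Rough sketch of belief tracking in a POMDP dialogue manager: keep a
# distribution over possible user goals and update it from noisy observations.
# The goals and confusion probabilities below are invented for illustration.

GOALS = ["wants_flight", "wants_hotel"]

# P(observation | goal): assumed observation model for the speech recogniser.
OBS_MODEL = {
    "wants_flight": {"heard_flight": 0.8, "heard_hotel": 0.2},
    "wants_hotel":  {"heard_flight": 0.3, "heard_hotel": 0.7},
}

def belief_update(belief, observation):
    """Bayes rule: b'(g) is proportional to P(obs | g) * b(g), then normalise."""
    unnormalised = {g: OBS_MODEL[g][observation] * belief[g] for g in GOALS}
    total = sum(unnormalised.values())
    return {g: p / total for g, p in unnormalised.items()}

belief = {g: 1.0 / len(GOALS) for g in GOALS}     # uniform prior
for obs in ["heard_flight", "heard_flight", "heard_hotel"]:
    belief = belief_update(belief, obs)
    print(obs, {g: round(p, 3) for g, p in belief.items()})
# A POMDP policy chooses the system action from this belief rather than from a
# single hard state, which is what makes it robust to ASR noise.
```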
“…The great variability of these factors makes rapid design of dialogue strategies, and the reuse of previous work across tasks, very complex. For these reasons, automatic learning of optimal strategies is currently a leading research domain [1][2][3][4]. Yet, the small amount of data generally available for learning and testing dialogue strategies does not contain enough information to explore the whole space of dialogue states (and of strategies).…”
Section: Introduction (mentioning)
confidence: 99%