Interspeech 2011
DOI: 10.21437/interspeech.2011-434

Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system

Abstract: The optimization of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet, it is still the case that the commonly used training algorithms for SDS require a large number of dialogues and hence most systems still rely on artificial data generated by a user simulator. Optimization is therefore performed off-line before releasing the system to real users. Gaussian Processes (GP) for RL have recently been applied to dialogue system…

Cited by 11 publications (1 citation statement)
References 19 publications
“…Due to the need of learning with real users through online interactions, an efficient exploration of the state-action space is critical. The Q-function of each state-action pair can be augmented with an estimate of its uncertainty to guide exploration to achieve higher performance and efficient learning [4]. Uncertainty estimates in the policy allow the system to generalise across different noise levels and mitigate errors incurred by speech recognition, therefore resulting in a more robust dialogue manager.…”
Section: Introduction
confidence: 99%
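
The idea in the citation statement above, a Q-value paired with an uncertainty estimate that steers exploration, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes each action already has a mean and a variance for its Q-value (for example from a GP posterior), and it explores by drawing one Gaussian sample per action and taking the largest. The function name `select_action` and the action labels are hypothetical.

```python
import math
import random


def select_action(q_mean, q_var, rng=None, explore=True):
    """Choose an action from per-action Q-value means and variances.

    Illustrative assumption: Q(b, a) for the current belief state is
    summarised by a mean and a variance for every action a.
    With explore=True, one value is sampled from N(mean, var) per action
    and the action with the largest sample wins, so uncertain actions
    are tried more often. With explore=False the choice is greedy on
    the mean.
    """
    if rng is None:
        rng = random.Random()
    best_action, best_value = None, -math.inf
    for action, mean in q_mean.items():
        if explore:
            value = rng.gauss(mean, math.sqrt(q_var[action]))
        else:
            value = mean
        if value > best_value:
            best_action, best_value = action, value
    return best_action


# Hypothetical summary actions with made-up estimates.
q_mean = {"confirm": 0.2, "request": 0.5, "inform": 0.4}
q_var = {"confirm": 0.0, "request": 0.0, "inform": 1.0}
rng = random.Random(0)
choices = {select_action(q_mean, q_var, rng) for _ in range(200)}
```

In this toy setting only "inform" carries uncertainty, so it is the only action that can displace the greedy choice "request" during exploration; as its variance shrinks with more dialogues, the selection converges to the greedy one.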