2009
DOI: 10.1109/tasl.2008.2012071
The Hidden Agenda User Simulation Model

Abstract: A key advantage of taking a statistical approach to spoken dialogue systems is the ability to formalise dialogue policy design as a stochastic optimization problem. However, since dialogue policies are learnt by interactively exploring alternative dialogue paths, conventional static dialogue corpora cannot be used directly for training and instead, a user simulator is commonly used. This paper describes a novel statistical user model based on a compact stack-like state representation called a user agenda…
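The abstract's central idea, a user state factored into a goal and a stack-like agenda of pending dialogue acts, can be sketched as follows. This is a minimal illustration in a toy slot-filling domain; the act inventory, the update heuristics, and the AgendaUserSimulator class are assumptions for illustration, not the paper's exact probabilistic formulation.

```python
import random

# Minimal sketch of an agenda-based user simulator in the spirit of the
# hidden-agenda model: the user state is a (goal, agenda) pair, where the
# agenda is a stack of pending user dialogue acts. Act names and update
# rules below are illustrative assumptions.

class AgendaUserSimulator:
    def __init__(self, goal, max_acts_per_turn=2):
        self.goal = dict(goal)                       # slot -> desired value
        self.max_acts = max_acts_per_turn
        # Initialise the agenda: convey each constraint, close at the bottom.
        self.agenda = [("bye",)]
        self.agenda += [("inform", s, v) for s, v in goal.items()]

    def receive(self, system_act):
        """Update the agenda in response to the system act (push phase)."""
        kind = system_act[0]
        if kind == "request":                        # system asks for a slot
            slot = system_act[1]
            self.agenda.append(("inform", slot, self.goal.get(slot, "dontcare")))
        elif kind == "confirm":                      # system confirms a value
            slot, value = system_act[1], system_act[2]
            if self.goal.get(slot) == value:
                self.agenda.append(("affirm",))
            else:                                    # correct a misunderstanding
                self.agenda.append(("negate", slot, self.goal.get(slot)))

    def respond(self):
        """Pop up to max_acts items off the agenda (pop phase); in the full
        model the number popped is drawn from a learned distribution."""
        n = random.randint(1, self.max_acts)
        acts = []
        while self.agenda and len(acts) < n:
            acts.append(self.agenda.pop())
        return acts

# Example turn: the system requests the 'area' slot, the user responds.
user = AgendaUserSimulator({"food": "indian", "area": "centre"})
user.receive(("request", "area"))
print(user.respond())        # e.g. [('inform', 'area', 'centre')]
```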

Cited by 79 publications (58 citation statements)
References 23 publications
“…Policy learning was implemented by a Gaussian process via the GP-SARSA algorithm [16]. As policies typically require O(10⁴) dialogues to converge, a simulated user [17] (operating at the semantic level) was used prior to the concluding human user trial. The natural language understanding and generation components used for human user trials were both hand crafted, using a Phoenix grammar [18] and templates (mapping system semantics to natural language) respectively.…”
Section: Dialogue System Description (mentioning)
confidence: 99%
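As a rough illustration of the training setup described in this excerpt, the sketch below pairs an epsilon-greedy policy with a Gaussian-process value model and a simulated-user environment. It is a heavily simplified stand-in: it refits a GP to Monte-Carlo returns rather than performing the sparse online GP-SARSA updates of [16], and the env interface (reset/step at the semantic level) is an assumption.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

N_ACTIONS = 3
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
X, y = [], []                          # (belief ++ one-hot action) -> return

def featurise(belief, action):
    onehot = np.eye(N_ACTIONS)[action]
    return np.concatenate([belief, onehot])

def epsilon_greedy(belief, eps=0.1):
    # Explore randomly until the GP has data, and with probability eps after.
    if not X or np.random.rand() < eps:
        return np.random.randint(N_ACTIONS)
    q = gp.predict(np.stack([featurise(belief, a) for a in range(N_ACTIONS)]))
    return int(np.argmax(q))

def train(env, episodes=10_000, gamma=0.99):
    """env is a hypothetical simulated-user environment with reset()/step()."""
    for ep in range(episodes):
        belief, done, trajectory = env.reset(), False, []
        while not done:
            action = epsilon_greedy(belief)
            next_belief, reward, done = env.step(action)
            trajectory.append((belief, action, reward))
            belief = next_belief
        ret = 0.0                      # accumulate discounted returns backwards
        for b, a, r in reversed(trajectory):
            ret = r + gamma * ret
            X.append(featurise(b, a)); y.append(ret)
        if ep % 100 == 0 and X:        # periodic refit keeps the GP tractable
            gp.fit(np.array(X), np.array(y))
```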
“…The size of this feature implicitly gives an indication of the cardinality ranges of the slots; both SFR and LAP have 6 'informable' (from the user's perspective) slots, but the raw SFR feature is much larger as SFR contains slots with a greater number of possible values. All datasets for training RNN dialogue success models are obtained by training policies from random initialisation with a user simulator [17], producing supervised pairs: (sequence of turn-level dialogue features, objective success/failure target label). The semantic error rate (SER) of the simulated user is set to 15% and the data is balanced with respect to the target labels.…”
Section: Domains and Datasets (mentioning)
confidence: 99%
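The data-generation recipe in this excerpt (random policies, a simulated user at 15% SER, balanced success/failure labels) could look roughly like the following sketch; simulate_dialogue and the corruption model are hypothetical stand-ins for the actual dialogue-system pipeline.

```python
import random

SER = 0.15

def corrupt(semantic_act):
    """Apply the semantic error channel: with probability SER the user act
    is replaced by an erroneous one (placeholder corruption here)."""
    return ("null",) if random.random() < SER else semantic_act

def generate_dataset(simulate_dialogue, n_dialogues=10_000):
    """simulate_dialogue(corrupt) is an assumed interface returning one
    dialogue's (turn-feature sequence, objective success flag)."""
    successes, failures = [], []
    for _ in range(n_dialogues):
        turn_features, success = simulate_dialogue(corrupt)
        (successes if success else failures).append((turn_features, success))
    # Balance the target labels by downsampling the majority class.
    n = min(len(successes), len(failures))
    data = random.sample(successes, n) + random.sample(failures, n)
    random.shuffle(data)
    return data
```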
“…At every turn t, input features f_t are extracted from the belief/action pair and used to update the hidden layer h_t. From dialogues generated by a simulated user (Schatzmann and Young, 2009), supervised training pairs are created which consist of the turn-level sequence of these feature vectors f_t along with the scalar dialogue return as scored by an objective measure of task completion. Whilst the RNN models are trained on dialogue-level supervised targets, we hypothesise that their subsequent turn-level predictions can guide policy exploration by acting as informative reward-shaping potentials.…”
Section: Reward Shaping with RNN Prediction (mentioning)
confidence: 99%
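A possible reading of this scheme in code: an RNN is supervised only by the dialogue-level return, and its per-turn predictions phi_t are turned into potential-based shaping terms F_t = gamma * phi_t - phi_{t-1}. The feature dimension, network size, and training step below are illustrative assumptions, not the cited paper's exact setup.

```python
import torch
import torch.nn as nn

class ReturnPredictor(nn.Module):
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, feats):                  # feats: (1, T, feat_dim)
        h_seq, _ = self.rnn(feats)
        return self.head(h_seq).squeeze(-1)    # per-turn prediction phi_t

def shaping_rewards(model, feats, gamma=0.99):
    """Turn-level shaping terms from a model trained only on the
    dialogue-level return (supervision attached to the final turn)."""
    with torch.no_grad():
        phi = model(feats)[0]                  # (T,)
    phi = torch.cat([torch.zeros(1), phi])     # phi_0 = 0 before the dialogue
    return gamma * phi[1:] - phi[:-1]          # F_t for each turn t

# Training compares only the final-turn prediction with the scored return:
model = ReturnPredictor(feat_dim=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
feats = torch.randn(1, 12, 10)                 # one 12-turn dialogue (dummy)
ret = torch.tensor([8.0])                      # objective task-completion score
opt.zero_grad()
loss = nn.functional.mse_loss(model(feats)[:, -1], ret)
loss.backward(); opt.step()
```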
“…In dialogue management, data-based models use probabilistic models to define the dialogue strategy: the dialogue history (including the last user interaction) and, possibly, other environmental factors are given as inputs to a probabilistic model that outputs the next action to be performed by the system. In the last decade, the most used probabilistic models have been Markov Decision Processes (MDP) [19], [20] and Partially Observable MDPs (POMDP) [21], [22], [23].…”
Section: A. Dialogue Models (mentioning)
confidence: 99%
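To make the MDP formulation concrete, here is a toy slot-filling dialogue cast as an MDP and solved by value iteration; the states, transition probabilities, and rewards are invented for illustration, and a POMDP treatment would additionally track a belief distribution over these states.

```python
import numpy as np

N_SLOTS = 3
STATES = N_SLOTS + 1                   # states 0..N_SLOTS count filled slots
ACTIONS = ["request", "close"]
P_UNDERSTOOD = 0.8                     # chance a requested slot gets filled
GAMMA = 0.95

def step_model(s, a):
    """Return a list of (prob, next_state, reward); next_state None is terminal."""
    if a == "close":
        reward = 20.0 if s == N_SLOTS else -20.0   # success vs. premature close
        return [(1.0, None, reward)]
    if s == N_SLOTS:                               # nothing left to request
        return [(1.0, s, -1.0)]
    return [(P_UNDERSTOOD, s + 1, -1.0),           # per-turn cost of -1
            (1.0 - P_UNDERSTOOD, s, -1.0)]

def q_value(s, a, V):
    return sum(p * (r + (GAMMA * V[s2] if s2 is not None else 0.0))
               for p, s2, r in step_model(s, a))

V = np.zeros(STATES)
for _ in range(200):                               # value iteration
    V = np.array([max(q_value(s, a, V) for a in ACTIONS)
                  for s in range(STATES)])

policy = [max(ACTIONS, key=lambda a: q_value(s, a, V)) for s in range(STATES)]
print(policy)    # expected: ['request', 'request', 'request', 'close']
```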