Predicting User Satisfaction in Spoken Dialog System Evaluation With Collaborative Filtering

Yang, Zhaojun; Levow, G-A; Meng, Helen

doi:10.1109/jstsp.2012.2229965

Cited by 30 publications

(16 citation statements)

References 26 publications

(35 reference statements)

“…Research into these questions is growing though [30], and will continue to given the natural progression towards multi-domain SDS [31,32,33]. A lot of work has looked at methods and metrics for evaluating SDS [34,35,36]. These have generally been considered as aids to system developers to experiment with design choices and recognise those that are leading to certain measures of good performance.…”

Section: Related Work (mentioning)

confidence: 99%

Multi-domain dialogue success classifiers for policy training

Vandyke

Gašić

et al. 2015

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

We propose a method for constructing dialogue success classifiers that are capable of making accurate predictions in domains unseen during training. Pooling and adaptation are also investigated for constructing multi-domain models when data is available in the new domain. This is achieved by reformulating the features input to the recurrent neural network models introduced in [1]. Importantly, on our task of main interest, this enables policy training in a new domain without the dialogue success classifier (which forms the reinforcement learning reward function) ever having seen data from that domain before. This occurs whilst incurring only a small reduction in performance relative to developing and using an in-domain dialogue success classifier. Finally, given the motivation with these dialogue success classifiers is to enable policy training with real users, we demonstrate that these initial policy training results obtained with a simulated user carry over to learning from paid human users. Index Terms: statistical spoken dialogue systems, dialogue success, multi-domain, policy training

“…One method for reward estimation is off-line learning with annotated data [121]. By taking the dialog utterances and intermediate annotations as input features, reward learning can be formulated as a supervised regression or classification task.…”

Section: User Goal Estimation (mentioning)

confidence: 99%

Recent advances and challenges in task-oriented dialog systems

Zhang

Takanobu

Huang

et al. 2020

Sci. China Technol. Sci.

Due to the significance and value in human-computer interaction and natural language processing, task-oriented dialog systems are attracting more and more attention in both academic and industrial communities. In this paper, we survey recent advances and challenges in task-oriented dialog systems. We also discuss three critical topics for task-oriented dialog systems: (1) improving data efficiency to facilitate dialog modeling in low-resource settings, (2) modeling multi-turn dynamics for dialog policy learning to achieve better task-completion performance, and (3) integrating domain ontology knowledge into the dialog model. Besides, we review the recent progresses in dialog evaluation and some widely-used corpora. We believe that this survey, though incomplete, can shed a light on future research in task-oriented dialog systems. Keywords: task-oriented dialog systems, natural language understanding, dialog policy, dialog state tracking, natural language generation

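“…However, when a system actually interacts with people, the degree of task completion is hard to define; moreover, a series of issues, such as the theoretical validity of the generative model, leave the evaluation results of PARADISE unsatisfactory [4]. Data-driven dialogue evaluation models based on annotated corpora have therefore become a widely discussed direction: in 2012, researchers proposed using collaborative filtering to represent user feedback [5]; reshaping the feedback function can also serve to accelerate dialogue policy learning [6]; and the research of Ultes, Minker, and colleagues [7] found that expert satisfaction has a large influence on the response success rate of dialogue systems. All of these methods and attempts indicate that high-quality training data is crucial to the output generated by dialogue systems.…”

Section: Evaluation Methods for Task-Oriented Dialogue Systems (unclassified)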

Survey of evaluation methods for dialogue systems

Zhang

Zhang

Liu

2017

Sci. Sin.-Inf.

This paper introduces the history of dialogue systems and their evaluation methods. The evaluation methods are categorized as either task-oriented dialogue systems or open domain dialogue systems. This paper investigates and summarizes the different methods of evaluating dialogue systems, analyzes the pros and cons of the different methods, discusses the emphasis of each method, and presents the progress of recent research for the two categories. For task-oriented dialogue systems, this paper introduces the recent research results of Steve Young. In addition, this paper sums up several widely used evaluation approaches. The evaluation methods for open domain chatting systems are explored from two angles: objective index evaluation and simulated artificial scoring. The various indices and different research ideas are analyzed and introduced as well. Finally, through summarizing and analyzing classical evaluation methods of dialogue systems as well as the newer evaluation methods based on neural network models, this study aims to predict developmental trends in evaluation methods for dialogue systems.
