2009
DOI: 10.1016/j.specom.2009.06.007
Analysis of a new simulation approach to dialog system evaluation

Cited by 20 publications (10 citation statements)
References 26 publications
“…Recall, precision and balanced F-measure (Engelbrecht et al 2009) were employed to evaluate the accuracy of the participants' stress pattern labelling, which are widely used as classification accuracy criteria. In this paper, recall is defined as the number of syllables correctly labelled by a participant as stressed divided by the total number of stressed syllables.…”
Section: Results (mentioning, confidence: 99%)
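For readers unfamiliar with the metrics named in this excerpt, the following is a minimal sketch of how recall, precision and balanced F-measure could be computed for stress-pattern labelling; the function name and the example labels are illustrative assumptions, not taken from the cited paper.

```python
# Sketch (illustrative only): recall, precision and balanced F-measure
# for binary stress labels, following the definitions in the excerpt above.

def stress_labelling_scores(predicted, gold):
    """Return (recall, precision, F-measure) for stressed-syllable labels.

    predicted, gold: sequences of booleans, True = syllable labelled as stressed.
    """
    true_positives = sum(p and g for p, g in zip(predicted, gold))
    labelled_stressed = sum(predicted)
    actually_stressed = sum(gold)

    # Recall: correctly labelled stressed syllables / all actually stressed syllables.
    recall = true_positives / actually_stressed if actually_stressed else 0.0
    # Precision: correctly labelled stressed syllables / all syllables labelled stressed.
    precision = true_positives / labelled_stressed if labelled_stressed else 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return recall, precision, f_measure


# Example with made-up labels for five syllables:
print(stress_labelling_scores([True, False, True, False, False],
                              [True, False, False, True, False]))
```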
“…Although cognitive modelling is an active research field, so far it has not been received particularly well by usability practitioners and only rarely finds its way into non-academic evaluations (Engelbrecht et al 2009;Kieras 2003). Reasons are their often high complexity (Kieras 2003) and possibly the aforementioned low level of the information possible to gain with cognitive modelling.…”
Section: Model-based Evaluation (mentioning, confidence: 99%)
“…if the user is "satisfied" with the system, and therefore rather utilizes top-down strategies. Here, user models are usually defined based on real user data and are not necessarily linked to cognitive theories (Engelbrecht et al 2009). Most of these methods and algorithms were developed for spoken dialogue systems, with PARADISE (Paradigm for Dialogue System Evaluation) (Walker et al 1997) likely being the most widespread one.…”
Section: Model-based Evaluation (mentioning, confidence: 99%)
“…All models are derived from Levin, Pieraccini and Eckert (2000), but mimic human user behaviors to different degrees. In contrast, the goal of the current paper is to investigate how human-like the simulated behaviors are, which is an important factor to consider when using user simulations for dialog system evaluation (Engelbrecht, Quade and Möller 2009). Since our ultimate goal is to use these simulations to help dialog system development, we would like to keep the simulation models as simple as possible.…”
Section: User Simulations (mentioning, confidence: 99%)