2009
DOI: 10.1016/j.specom.2009.06.007
Analysis of a new simulation approach to dialog system evaluation

Cited by 20 publications (10 citation statements)
References 26 publications
“…Recall, precision and balanced F-measure (Engelbrecht et al 2009) were employed to evaluate the accuracy of the participants' stress pattern labelling, which are widely used as classification accuracy criteria. In this paper, recall is defined as the number of syllables correctly labelled by a participant as stressed divided by the total number of stressed syllables.…”
Section: Results (mentioning, confidence: 99%)
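For readers unfamiliar with the metrics named in this excerpt, the following is a minimal sketch of how recall, precision and balanced F-measure could be computed for stress-pattern labelling; the function name and the example labels are illustrative assumptions, not taken from the cited paper.

```python
# Sketch (illustrative only): recall, precision and balanced F-measure
# for binary stress labels, following the definitions in the excerpt above.

def stress_labelling_scores(predicted, gold):
    """Return (recall, precision, F-measure) for stressed-syllable labels.

    predicted, gold: sequences of booleans, True = syllable labelled as stressed.
    """
    true_positives = sum(p and g for p, g in zip(predicted, gold))
    labelled_stressed = sum(predicted)
    actually_stressed = sum(gold)

    # Recall: correctly labelled stressed syllables / all actually stressed syllables.
    recall = true_positives / actually_stressed if actually_stressed else 0.0
    # Precision: correctly labelled stressed syllables / all syllables labelled stressed.
    precision = true_positives / labelled_stressed if labelled_stressed else 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return recall, precision, f_measure


# Example with made-up labels for five syllables:
print(stress_labelling_scores([True, False, True, False, False],
                              [True, False, False, True, False]))
```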
“…Although cognitive modelling is an active research field, so far it has not been received particularly well by usability practitioners and only rarely finds its way into non-academic evaluations (Engelbrecht et al 2009;Kieras 2003). Reasons are their often high complexity (Kieras 2003) and possibly the aforementioned low level of the information possible to gain with cognitive modelling.…”
Section: Model-based Evaluation (mentioning, confidence: 99%)
“…if the user is "satisfied" with the system, and therefore rather utilizes top-down strategies. Here, user models are usually defined based on real user data and are not necessarily linked to cognitive theories (Engelbrecht et al 2009). Most of these methods and algorithms were developed for spoken dialogue systems, with PARADISE (Paradigm for Dialogue System Evaluation) (Walker et al 1997) likely being the most widespread one.…”
Section: Model-based Evaluation (mentioning, confidence: 99%)
“…All models are derived from Levin, Pieraccini and Eckert (2000), but mimic human user behaviors to different degrees. In contrast, the goal of the current paper is to investigate how human-like the simulated behaviors are, which is an important factor to consider when using user simulations for dialog system evaluation (Engelbrecht, Quade and Möller 2009). Since our ultimate goal is to use these simulations to help dialog system development, we would like to keep the simulation models as simple as possible.…”
Section: User Simulations (mentioning, confidence: 99%)