2017
DOI: 10.26615/issn.1314-9156.2017_003

Evaluating Dialogs based on Grice’s Maxims

Abstract: There is no agreed-upon standard for evaluating conversational dialog systems, which are well known to be hard to evaluate because it is difficult to pin down metrics that correspond to human judgments and because human judgment is itself subjective. We explored the possibility of using Grice's Maxims to evaluate effective communication in conversation. We collected system-generated dialogs from popular conversational chatbots across the spectrum and conducted a survey to see how the hum…

Cited by 5 publications (4 citation statements)
References 14 publications
“…Our approach is different as RUQ is a diagnostic metric. Jwalapuram (2017) propose a Gricean dialog evaluation where humans rate performance on a Likert scale for each category. Qwaider et al (2017) consider the QUANTITY, RELATION, and MANNER maxims for ranking community question answers.…”
Section: Related Work; type: mentioning
confidence: 99%
“…This mimics a setting where a single researcher would like to do model selection, but we do not consider system evaluation, and leave that to future work. Jwalapuram (2017) propose a Gricean human evaluation dialog. The evaluator is asked to rate performance on a Likert scale for each category.…”
Section: Limitations; type: mentioning
confidence: 99%