Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval) 2022
DOI: 10.18653/v1/2022.humeval-1.2

A Methodology for the Comparison of Human Judgments With Metrics for Coreference Resolution

Abstract: We propose a method for investigating the interpretability of metrics used for the coreference resolution task through comparisons with human judgments. We provide a corpus annotated with different error types and human evaluations of their gravity. Our preliminary analysis shows that the metrics considerably overlook several error types and, compared with human judges, overlook errors in general. This study is conducted on French texts, but the methodology should be language-independent.

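To make the kind of comparison described in the abstract concrete, the following is a minimal sketch, not the authors' implementation: it correlates human gravity ratings of annotated coreference errors with the penalty a metric assigns to the same errors. The error types, rating scale, and score values below are hypothetical placeholders.

```python
# Hypothetical illustration of comparing human judgments with a coreference metric.
from scipy.stats import spearmanr

# Each entry: (error type, human gravity rating on a 1-5 scale,
#              drop in metric score caused by introducing that error).
# All values are invented for illustration only.
annotated_errors = [
    ("missing_mention",  4, 0.03),
    ("wrong_antecedent", 5, 0.02),
    ("extra_mention",    2, 0.04),
    ("split_chain",      3, 0.01),
]

human_gravity  = [gravity for _, gravity, _ in annotated_errors]
metric_penalty = [penalty for _, _, penalty in annotated_errors]

# A weak or negative correlation would suggest that the metric
# under-penalizes errors that human judges consider severe.
rho, pval = spearmanr(human_gravity, metric_penalty)
print(f"Spearman rho = {rho:.2f} (p = {pval:.2f})")
```

In practice the metric penalties would come from rescoring each perturbed document with standard coreference metrics (e.g., MUC, B3, CEAF) and the gravity ratings from the annotated corpus the paper introduces.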