Proceedings of the 14th International Conference on Natural Language Generation 2021
DOI: 10.18653/v1/2021.inlg-1.14

Underreporting of errors in NLG output, and what to do about it

Emiel van Miltenburg, Miruna Clinciu, Ondřej Dušek, et al.

Abstract: We observe a severe under-reporting of the different kinds of errors that Natural Language Generation systems make. This is a problem, because mistakes are an important indicator of where systems should still be improved. If authors only report overall performance metrics, the research community is left in the dark about the specific weaknesses that are exhibited by 'state-of-the-art' research. Next to quantifying the extent of error under-reporting, this position paper provides recommendations for error identification…

Citations: Cited by 4 publications (4 citation statements)
References: 41 publications
“…Evaluation in low-resource NLG: In addition to the specific challenges and mitigation strategies for system development above, evaluation has its own challenges in the low-resource setting and is a promising direction for future work in itself. For instance, having less validation and test data reduces the applicability of automated, reference-based evaluations, necessitating alternative evaluation strategies such as an emphasis on error analysis (van Miltenburg et al., 2021) or standardised human evaluations (Howcroft et al., 2020). Methods for maximising the efficiency of input from domain and language experts will also be necessary for human evaluations when access to these persons is more limited than usual.…”
Section: Discussion and Promising Directions
confidence: 99%
“…For a detailed discussion of good practices in error analysis, see e.g. van Miltenburg et al. (2021).…”
Section: Limitations
confidence: 99%
“…sign choices. A suitable interface can also encourage researchers to step away from unreliable automatic metrics (Gehrmann et al., 2022) and focus on manual error analysis (van Miltenburg et al., 2021, 2023).…”
Section: Web Interface
confidence: 99%