Proceedings of the Third Conference on Machine Translation: Shared Task Papers 2018
DOI: 10.18653/v1/w18-6435

A Pronoun Test Suite Evaluation of the English–German MT Systems at WMT 2018

Abstract: We evaluate the output of 16 English-to-German MT systems with respect to the translation of pronouns in the context of the WMT 2018 competition. We work with a test suite specifically designed to assess system quality in various fine-grained categories known to be problematic. The main evaluation scores come from a semi-automatic process, combining automatic reference matching with extensive manual annotation of uncertain cases. We find that current NMT systems are good at translating pronouns with intra-sent…

Cited by 32 publications (28 citation statements)
References 19 publications
“…The PROTEST evaluation confirms the findings of the WMT18 evaluation (Guillou et al, 2018). In both of these evaluations the pleonastic and event categories are the least problematic.…”
Section: Test Suite Metrics (supporting)
confidence: 74%
“…Throughout, however, particular discourse phenomena are consistently targeted, as they are indeed indicators of globally good, cohesive and coherent texts. Pronouns (Hardmeier and Federico, 2010; Guillou, 2012; Hardmeier et al, 2013; Guillou et al, 2018) have been largely at the center of attention, and more recently the translation of pronouns in the context of their coreferential chains has been looked at (Lapshinova-Koltunski and Hardmeier, 2017; Lapshinova-Koltunski et al, 2019). Other devices studied are verbal tenses (Gong et al, 2012; Loáiciga et al, 2014; Ramm and Fraser, 2016) and connectives, although not using neural models.…”
Section: Related Work 21 Discourse (mentioning)
confidence: 99%
“…Recently, it has been claimed that sentence-level NMT generates document-level errors, e.g. wrong coreference of pronouns/articles or inconsistent translations throughout a document (Guillou et al, 2018;Läubli et al, 2018).…”
Section: Introduction (mentioning)
confidence: 99%
“…Existing test suites focus e.g. on morphosyntactic and syntactic divergences between source and target language (Burchardt et al, 2017;Burlot and Yvon, 2017;Isabelle et al, 2017;Sennrich, 2017;Burlot et al, 2018;Macketanz et al, 2018) or on discourse phenomena (Guillou and Hardmeier, 2016;Bawden et al, 2018;Guillou et al, 2018).…”
Section: Introduction (mentioning)
confidence: 99%