Proceedings of the Third Conference on Machine Translation: Research Papers 2018
DOI: 10.18653/v1/w18-6307
A Large-Scale Test Set for the Evaluation of Context-Aware Pronoun Translation in Neural Machine Translation

Abstract: The translation of pronouns presents a special challenge to machine translation to this day, since it often requires context outside the current sentence. Recent work on models that have access to information across sentence boundaries has seen only moderate improvements in terms of automatic evaluation metrics such as BLEU. However, metrics that quantify the overall translation quality are ill-equipped to measure gains from additional context. We argue that a different kind of evaluation is needed to assess ho…


Citations: cited by 129 publications (147 citation statements). References: 21 publications.
“…The third rule that we conform to is to 1) create two contrastive source sentences for each lexical or syntactic ambiguity point, where each source sentence corresponds to one reasonable interpretation of the ambiguity point, and 2) provide two contrastive translations for each created source sentence. This is similar to other linguistic evaluation by contrastive examples in the MT literature (Avramidis et al., 2019; Bawden et al., 2018; Müller et al., 2018; Sennrich, 2017). The two contrastive translations have similar wording: one is correct, and the other is incorrect in that it translates the ambiguous part into the translation corresponding to the contrastive source sentence.…”
Section: Test Suite Design (supporting)
Confidence: 83%
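The contrastive test-suite design quoted above can be illustrated with a small scoring routine. The sketch below is not taken from the cited papers; it assumes a hypothetical score(source, translation) function standing in for an NMT model's log-probability of a target sentence given a source sentence, and it computes the fraction of examples in which the correct translation outscores its contrastive counterpart.

```python
# Minimal sketch of contrastive evaluation as described above: each ambiguous
# source sentence is paired with a correct translation and a contrastive
# (incorrect) translation, and the model is credited when it assigns a higher
# score to the correct one. The `score` callable is a hypothetical stand-in
# for an NMT model's log-probability of a target sentence given a source.
from typing import Callable, Iterable, Tuple

def contrastive_accuracy(
    examples: Iterable[Tuple[str, str, str]],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of (source, correct, contrastive) triples in which the
    correct translation receives the higher model score."""
    total = 0
    wins = 0
    for source, correct, contrastive in examples:
        total += 1
        if score(source, correct) > score(source, contrastive):
            wins += 1
    return wins / total if total else 0.0
```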
“…As a result of this process, a translation pair can consist of multiple sentences, as shown in Example (c) of Figure 1. We do not split them into single sentences, considering the recent trend toward context-sensitive machine translation (Bawden et al., 2018; Müller et al., 2018; Zhang et al., 2018; Miculicich et al., 2018). One can use split sentences for training a model, but an important caveat is that there is no guarantee that all the internal sentences are perfectly aligned.…”
Section: Extracting Parallel Text Segments (mentioning)
Confidence: 99%
“…Examples of widely-used datasets are those included in WMT (Bojar et al., 2018) and LDC 1 , while new evaluation datasets are being actively created (Michel and Neubig, 2018; Bawden et al., 2018; Müller et al., 2018). These existing datasets have mainly focused on translating plain text.…”
Section: Introduction (mentioning)
Confidence: 99%
“…Lexical ambiguity as a challenge for machine translation has received a lot of attention in recent years. Rios Gonzales et al. (2017) and Rios Gonzales et al. (2018) focus on ambiguous German nouns, while Guillou et al. (2018) and Müller et al. (2018) investigate ambiguous English pronouns. Broader linguistic evaluations presented in Burchardt et al. (2017) and Klubička et al. (2018) also include ambiguity, but conjunctions are not mentioned in any context.…”
Section: Related Work (mentioning)
Confidence: 99%