Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) 2019
DOI: 10.18653/v1/w19-5354
The MuCoW Test Suite at WMT 2019: Automatically Harvested Multilingual Contrastive Word Sense Disambiguation Test Sets for Machine Translation

Abstract: Supervised Neural Machine Translation (NMT) systems currently achieve impressive translation quality for many language pairs. One of the key features of a correct translation is the ability to perform word sense disambiguation (WSD), i.e., to translate an ambiguous word with its correct sense. Existing evaluation benchmarks on the WSD capabilities of translation systems rely heavily on manual work and cover only a few language pairs and word types. We present MuCoW, a multilingual contrastive test suite that co…
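As a rough illustration of the contrastive WSD evaluation named in the title, the sketch below checks whether a translation model assigns a higher score to the reference translation than to contrastive variants in which the ambiguous word is rendered with a wrong sense. The scoring callback `score_pair`, the helper `contrastive_accuracy`, and the toy German-English homograph example are assumptions made for illustration only; the released MuCoW suite defines its own data format and evaluation scripts.

```python
# Minimal sketch of contrastive WSD scoring (assumed interface, not the
# official MuCoW tooling). `score_pair(src, tgt)` is expected to return a
# model score such as a length-normalized log-probability.
from typing import Callable, Iterable, List, Tuple

def contrastive_accuracy(
    examples: Iterable[Tuple[str, str, List[str]]],
    score_pair: Callable[[str, str], float],
) -> float:
    """Fraction of examples where the reference translation outscores
    every contrastive (wrong-sense) variant."""
    correct = total = 0
    for src, reference, variants in examples:
        ref_score = score_pair(src, reference)
        if all(ref_score > score_pair(src, v) for v in variants):
            correct += 1
        total += 1
    return correct / total if total else 0.0

# Hypothetical toy example with the German homograph "Schloss"
# (castle vs. lock); real MuCoW sentences come from parallel corpora.
examples = [(
    "Das Schloss liegt auf einem Hügel.",
    "The castle sits on a hill.",
    ["The lock sits on a hill."],
)]
# accuracy = contrastive_accuracy(examples, score_pair=my_model.score)  # model-specific scorer
```

The appeal of the contrastive setup is that it needs only a scoring interface, not decoding: any NMT system that can score a given source-target pair can be probed for sense errors.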


Cited by 28 publications (26 citation statements)
References 30 publications
“…Table 4 gives the best accuracy on the in-domain and out-of-domain test sets. The accuracy of both models drops substantially on the out-of-domain test set, which is consistent with the finding of Raganato et al. (2019). The drop for char-d7 is even bigger than that for bpe-d7, which indicates that CHAR models are not more robust to domain mismatch than BPE-based models when learning word senses.…”
Section: Robustness to Domain-mismatch (supporting)
confidence: 82%
“…For the WSD probing task, we use the FI-EN part of the MuCoW (Raganato et al., 2019) test set, which is a multilingual test suite for WSD in the WMT19 shared task. It has 2,117 annotated sentences.…”
Section: Data (mentioning)
confidence: 99%
“…In this respect, a model with predefined fixed patterns may struggle to encode global semantic features. To this end, we evaluate our models on two German-English WSD test suites, ContraWSD (Rios Gonzales et al., 2017) and MuCoW (Raganato et al., 2019). Table 6 shows the performance of our models on the WSD benchmarks.…”
Section: Word Sense Disambiguation (mentioning)
confidence: 99%
“…We next compile a parallel lexicon of homograph translations, prioritizing a high coverage of all possible senses. Similar to Raganato et al. (2019), we obtain sense-specific translations from crosslingual BabelNet (Navigli and Ponzetto, 2010) synsets. Since BabelNet entries vary in their granularity, we iteratively merge related synsets as long as they have at least three German translations in common or share at least one definition.…”
Section: Resource Collection (mentioning)
confidence: 99%
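The merging heuristic quoted above (merge related synsets as long as they share at least three German translations or at least one definition) reduces to a simple fixed-point loop. The `SenseCluster` container and the functions below are hypothetical stand-ins for BabelNet synset records, sketched only to make the stated criterion concrete.

```python
# Sketch of the iterative synset-merging criterion described in the excerpt
# above: keep merging clusters while they share >= 3 German translations or
# at least one definition. Data structures are illustrative, not BabelNet's API.
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class SenseCluster:
    translations: Set[str] = field(default_factory=set)  # German lemmas
    definitions: Set[str] = field(default_factory=set)   # gloss strings

def should_merge(a: SenseCluster, b: SenseCluster) -> bool:
    return len(a.translations & b.translations) >= 3 or bool(a.definitions & b.definitions)

def merge_clusters(clusters: List[SenseCluster]) -> List[SenseCluster]:
    """Merge any pair of clusters meeting the criterion until no pair does."""
    changed = True
    while changed:
        changed = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if should_merge(clusters[i], clusters[j]):
                    clusters[i].translations |= clusters[j].translations
                    clusters[i].definitions |= clusters[j].definitions
                    del clusters[j]
                    changed = True
                    break
            if changed:
                break
    return clusters

# Hypothetical usage: two castle-like senses sharing three German translations.
a = SenseCluster({"Schloss", "Burg", "Festung", "Kastell"}, {"a fortified building"})
b = SenseCluster({"Burg", "Festung", "Kastell"}, set())
merged = merge_clusters([a, b])  # -> one merged cluster
```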