Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) 2019
DOI: 10.18653/v1/w19-5337

English-Czech Systems in WMT19: Document-Level Transformer

Abstract: We describe our NMT systems submitted to the WMT19 shared task in English→Czech news translation. Our systems are based on the Transformer model implemented in either the Tensor2Tensor (T2T) or Marian framework. We aimed at improving the adequacy and coherence of translated documents by enlarging the context of the source and target. Instead of translating each sentence independently, we split the document into possibly overlapping multi-sentence segments. In the case of the T2T implementation, this "document-level" tra…
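The segmentation idea from the abstract can be sketched as a sliding window over the document's sentences. This is an illustrative reconstruction, not the authors' actual code; the function name and parameters (`seg_len`, `stride`) are hypothetical.

```python
def make_segments(sentences, seg_len=3, stride=1):
    """Return possibly overlapping windows of `seg_len` consecutive
    sentences, advancing the window start by `stride` each step."""
    segments = []
    for start in range(0, max(len(sentences) - seg_len + 1, 1), stride):
        segments.append(sentences[start:start + seg_len])
    return segments

doc = ["S1", "S2", "S3", "S4", "S5"]
print(make_segments(doc, seg_len=3, stride=1))
# each consecutive window shares seg_len - stride sentences with its neighbour
```

With `stride < seg_len`, each sentence appears in several segments, so the model sees it with both left and right context.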

Cited by 19 publications (22 citation statements)
References 8 publications
“…(no associated paper) CAIRE CUNI Charles University (Popel et al., 2019; Kocmi and Bojar, 2019) and (Kvapilíková et al., 2019)…”
Section: AFRL
confidence: 99%
“…and CUNI-TRANSFORMER-T2T2019 (Popel et al., 2019) are trained in the T2T framework following last year's submission (Popel, 2018), but trained on WMT19 document-level parallel and monolingual data. During decoding, each document is split into overlapping multi-sentence segments, where only the "middle" sentences in each segment are used for the final translation.…”
Section: CUNI-DocTransformer-T2T2019
confidence: 99%
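The decoding strategy quoted above — translating overlapping segments but keeping only each segment's "middle" sentence, with the first and last segments also contributing their border sentences — can be sketched as follows. This is a hedged reconstruction for illustration, not the CUNI implementation; it assumes stride-1 segments of odd length.

```python
def stitch_middle(translated_segments, seg_len=3):
    """Assemble a document from stride-1 overlapping translated segments,
    taking the middle sentence of every segment and the remaining border
    sentences from the first and last segments."""
    mid = seg_len // 2
    out = list(translated_segments[0][:mid])        # left border from first segment
    for seg in translated_segments:
        out.append(seg[mid])                        # middle sentence of each segment
    out.extend(translated_segments[-1][mid + 1:])   # right border from last segment
    return out

segs = [["T1", "T2", "T3"], ["T2", "T3", "T4"], ["T3", "T4", "T5"]]
print(stitch_middle(segs))  # → ['T1', 'T2', 'T3', 'T4', 'T5']
```

The point of discarding the edge sentences is that a segment's middle sentence is the one translated with the most surrounding context on both sides.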
“…We use the Transformer architecture by Vaswani et al. (2017) implemented in the Marian framework (Junczys-Dowmunt et al., 2018) to train an NMT model on the synthetic corpus produced by the PBMT model. The model setup, training and decoding hyperparameters are identical to the CUNI Marian systems in the English-to-Czech news translation task in WMT19 (Popel et al., 2019), but in this case, due to smaller and noisier training data, we set the dropout between Transformer layers to 0.3. We use 8 Quadro P5000 GPUs with 16GB memory.…”
Section: Model and Training
confidence: 99%
“…Our other comparison system, Benchmark-TransferEN, was first trained as an English-to-Czech NMT system (see CUNI Transformer Marian for the English-to-Czech news translation task in WMT19 by Popel et al. (2019)) and then fine-tuned for 6 days on the SynthCorpus-noCzechreordered-NER. The vocabulary remained unchanged; it was trained on the English-Czech training corpus.…”
Section: Benchmarks
confidence: 99%