Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) 2019
DOI: 10.18653/v1/w19-5333

Facebook FAIR’s WMT19 News Translation Task Submission

Abstract: This paper describes Facebook FAIR's submission to the WMT19 shared news translation task. We participate in four language directions, English ↔ German and English ↔ Russian in both directions. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the FAIRSEQ sequence modeling toolkit. This year we experiment with different bitext data filtering schemes, as well as with adding filtered back-translated data. We also ensemble and fine-tune our models on…

Cited by 229 publications (169 citation statements) | References 19 publications
“…Whereas several recent papers have demonstrated that the noisy channel decomposition has benefits when translating sentences one-by-one (Yu et al., 2017; Ng et al., 2019), in this paper we show that this decomposition is particularly suited to tackling the problem of translating complete documents. Although using cross-sentence context and maintaining cross-document consistency has long been recognized as essential to the translation problem (Tiedemann and Scherrer, 2017; Bawden et al., 2018, inter alia), operationalizing this in models has been challenging for several reasons.…”
Section: Introduction
confidence: 59%
“…We remove prompts only containing stopwords/punctuation or longer than 10 words to reduce noise. We use the round-trip English-German neural machine translation models pre-trained on WMT'19 (Ng et al., 2019) for back-translation, as English-German is one of the most highly resourced language pairs. 7 When optimizing ensemble parameters, we use Adam (Kingma and Ba, 2015) with default parameters and a batch size of 32.…”
Section: Experimental Settings
confidence: 99%
“…Uchendu et al. (2020) provide human-written news articles in the politics category. Using the title of each human-written news article as a prompt, the authors generate the corresponding machine-generated article from eight TGMs, which include CTRL (Keskar et al., 2019), GPT-1 (Radford et al., 2018), GPT-2 (Radford et al., 2019), GROVER (Zellers et al., 2019), XLM (Conneau and Lample, 2019), XLNet, PPLM (Dathathri et al., 2020), and FAIR (Ng et al., 2019). 11 Similar to the tweets dataset, this news dataset lets us study the generalizability of the detector with respect to the TGM that produced the text.…”
Section: Future Research Directions
confidence: 99%