Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) 2019
DOI: 10.18653/v1/w19-5333

Facebook FAIR’s WMT19 News Translation Task Submission

Abstract: This paper describes Facebook FAIR's submission to the WMT19 shared news translation task. We participate in four language directions, English ↔ German and English ↔ Russian in both directions. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the FAIRSEQ sequence modeling toolkit. This year we experiment with different bitext data filtering schemes, as well as with adding filtered back-translated data. We also ensemble and fine-tune our models on…

Cited by 229 publications (169 citation statements) | References 19 publications
“…Whereas several recent papers have demonstrated that the noisy channel decomposition has benefits when translating sentences one-by-one (Yu et al., 2017; Ng et al., 2019), in this paper we show that this decomposition is particularly suited to tackling the problem of translating complete documents. Although using cross-sentence context and maintaining cross-document consistency has long been recognized as essential to the translation problem (Tiedemann and Scherrer, 2017; Bawden et al., 2018, inter alia), operationalizing this in models has been challenging for several reasons.…”
Section: Introduction
confidence: 59%
“…We remove prompts only containing stopwords/punctuation or longer than 10 words to reduce noise. We use the round-trip English-German neural machine translation models pre-trained on WMT'19 (Ng et al., 2019) for back-translation, as English-German is one of the most highly resourced language pairs. 7 When optimizing ensemble parameters, we use Adam (Kingma and Ba, 2015) with default parameters and a batch size of 32.…”
Section: Experimental Settings
confidence: 99%
“…Uchendu et al. (2020) provide human-written news articles in the politics category. Using the title of each human-written news article as a prompt, the authors generate the corresponding machine-generated article from eight TGMs, which include CTRL (Keskar et al., 2019), GPT-1 (Radford et al., 2018), GPT-2 (Radford et al., 2019), GROVER (Zellers et al., 2019), XLM (Conneau and Lample, 2019), XLNet, PPLM (Dathathri et al., 2020), and FAIR (Ng et al., 2019). 11 Similar to the tweets dataset, this news dataset lets us study the generalizability of the detector with respect to the TGM that produced the text.…”
Section: Future Research Directions
confidence: 99%