Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1331

On NMT Search Errors and Model Errors: Cat Got Your Tongue?

Abstract: We report on search errors and model errors in neural machine translation (NMT). We present an exact inference procedure for neural sequence models based on a combination of beam search and depth-first search. We use our exact search to find the global best model scores under a Transformer base model for the entire WMT15 English-German test set. Surprisingly, beam search fails to find these global best model scores in most cases, even with a very large beam size of 100. For more than 50% of the sentences, the …
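The exact inference procedure the abstract describes pairs beam search (which supplies a lower bound on the best attainable model score) with depth-first search over the space of output sequences: since every token extension adds a log-probability of at most zero, a prefix's score upper-bounds the score of all its completions, so any prefix scoring below the bound can be pruned without risking a search error. A minimal sketch of that idea follows, assuming a hypothetical model interface log_probs(prefix) that returns per-token conditional log-probabilities, and a hypothetical EOS token id; it is an illustration of the pruning argument, not the authors' implementation.

    import math

    EOS = 0  # hypothetical end-of-sequence token id

    def exact_search(log_probs, vocab_size, lower_bound=-math.inf, max_len=100):
        """Depth-first search for the single best-scoring sequence.

        Because each extension adds a log-probability <= 0, a prefix's score
        upper-bounds the score of every completion, so branches scoring below
        the incumbent best can be pruned safely (no search errors introduced).
        """
        best_score, best_seq = lower_bound, None

        def dfs(prefix, score):
            nonlocal best_score, best_seq
            if len(prefix) >= max_len:
                return
            scores = log_probs(prefix)  # log P(token | source, prefix), one per token
            # Expand high-probability tokens first so the bound tightens early.
            for tok in sorted(range(vocab_size), key=lambda t: -scores[t]):
                new_score = score + scores[tok]
                if new_score <= best_score:
                    break  # remaining tokens score even lower: prune them all
                if tok == EOS:
                    best_score, best_seq = new_score, prefix + [tok]
                else:
                    dfs(prefix + [tok], new_score)

        dfs([], 0.0)
        return best_seq, best_score

Seeding lower_bound with the score of the best beam-search hypothesis gives the depth-first search a tight incumbent from the start, which is what makes exact enumeration tractable on a real test set.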

Citation Types: 10 supporting, 141 mentioning, 2 contrasting

Year Published: 2020–2024

Cited by 100 publications (153 citation statements)
References 15 publications (21 reference statements)

“…This problem has also been reported in other conditional generation tasks (Sountsov and Sarawagi, 2016; Stahlberg and Byrne, 2019); we leave it for future work.…”
supporting
confidence: 55%

“…Neural sequence models trained with maximum likelihood estimation (MLE) have become a standard approach to modeling sequences in a variety of natural language applications such as machine translation (Bahdanau et al., 2015), dialogue modeling (Vinyals et al., 2015), and language modeling (Radford et al., 2019). Despite this success, MLE-trained neural sequence models have been shown to exhibit issues such as length bias (Sountsov and Sarawagi, 2016; Stahlberg and Byrne, 2019) and degenerate repetition (Holtzman et al., 2019).…”
Section: Introduction
mentioning
confidence: 99%

“…Koehn and Knowles (2017) raise six challenges for machine translation, including degrading performance for longer sentences and degrading performance for larger beam sizes. Stahlberg and Byrne (2019) distinguish model errors (high probabilities of bad sequences) from search errors (failing to find sequences preferred by the model). They show that the globally optimal translations (according to likelihood) are considerably worse than the translations found by beam search.…”
Section: Related Work
mentioning
confidence: 99%