Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2022.emnlp-main.754

Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation

Abstract: In NMT we search for the mode of the model distribution to form predictions. The mode and other high-probability translations found by beam search have been shown to often be inadequate in a number of ways. This prevents improving translation quality through better search, as these idiosyncratic translations end up selected by the decoding algorithm, a problem known as the beam search curse. Recently, an approximation to minimum Bayes risk (MBR) decoding has been proposed as an alternative decision rule that w…
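To make the decision rule concrete, here is a minimal sketch of sampling-based MBR decoding as the abstract describes it. The names `samples` and `utility` are illustrative placeholders, not the paper's API: in practice the samples are drawn from the NMT model and the utility is a translation metric such as BLEU or ChrF.

```python
from typing import Callable, List

def mbr_decode(samples: List[str],
               utility: Callable[[str, str], float]) -> str:
    """Return the sample with the highest Monte Carlo estimate of
    expected utility, using the sample pool itself as pseudo-references."""
    def expected_utility(hyp: str) -> float:
        return sum(utility(hyp, ref) for ref in samples) / len(samples)
    return max(samples, key=expected_utility)
```

Note that this requires O(N^2) utility evaluations for N samples, which motivates the efficiency work discussed in the citation statements below.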

Cited by 5 publications (11 citation statements); references 43 publications (76 reference statements).
“…For MBR, we generate 1024 samples per segment using epsilon sampling and re-use the same samples as references. While this approach does not guarantee that the estimation of the expected utility is unbiased (Eikema and Aziz, 2022), it has empirically been found to work well (Freitag et al., 2023).…”
Section: Methods
confidence: 99%
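The epsilon sampling referred to above truncates the model's next-token distribution at a probability threshold before sampling. A rough sketch follows; the threshold value and function name are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def epsilon_sample(probs: np.ndarray, eps: float = 0.02) -> int:
    """Sample a token id after zeroing out all tokens whose probability
    falls below eps and renormalizing. eps=0.02 is an illustrative value."""
    truncated = np.where(probs >= eps, probs, 0.0)
    if truncated.sum() == 0.0:
        # Every token fell below the threshold; fall back to the argmax.
        return int(np.argmax(probs))
    truncated = truncated / truncated.sum()
    return int(np.random.choice(len(probs), p=truncated))
```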
“…A line of research has focused on improving the efficiency of sampling-based MBR. Eikema and Aziz (2022) propose coarse-to-fine MBR, which prunes the hypotheses based on a cheaper metric, and N-by-S MBR, which uses fewer references than hypotheses. Cheng and Vlachos (2023) propose confidence-based pruning, where the number of hypotheses is iteratively reduced based on an increasing number of references.…”
Section: Background and Related Work
confidence: 99%
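As a rough illustration of the N-by-S scheme described above, the N hypotheses can be scored against only S randomly drawn references (S < N), reducing the utility evaluations from N^2 to N*S. Function and parameter names here are hypothetical.

```python
import random
from typing import Callable, List

def n_by_s_mbr(hypotheses: List[str],
               utility: Callable[[str, str], float],
               num_refs: int) -> str:
    """Score every hypothesis against a random subset of S references
    drawn from the sample pool, instead of against all N samples."""
    refs = random.sample(hypotheses, k=min(num_refs, len(hypotheses)))
    def expected_utility(hyp: str) -> float:
        return sum(utility(hyp, ref) for ref in refs) / len(refs)
    return max(hypotheses, key=expected_utility)
```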