Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, 2017
DOI: 10.18653/v1/e17-2058

Neural Machine Translation by Minimising the Bayes-risk with Respect to Syntactic Translation Lattices

Abstract: We present a novel scheme to combine neural machine translation (NMT) with traditional statistical machine translation (SMT). Our approach borrows ideas from linearised lattice minimum Bayes-risk decoding for SMT. The NMT score is combined with the Bayes-risk of the translation according to the SMT lattice. This makes our approach much more flexible than n-best list or lattice rescoring, as the neural decoder is not restricted to the SMT search space. We show an efficient and simple way to integrate risk estimation…
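As a rough illustration of the combination the abstract describes, the sketch below interpolates an NMT log-probability with a linearised MBR gain in which each n-gram of the hypothesis contributes its posterior probability under the SMT lattice. The function names, the posterior dictionary, and the weight lam are illustrative assumptions, not the authors' implementation.

# A rough sketch (not the authors' implementation) of combining an NMT
# score with a linearised lattice-MBR gain: n-grams that are probable
# under the SMT lattice raise the hypothesis score (lower Bayes-risk).
from typing import Dict, List, Tuple

def ngrams(tokens: List[str], max_order: int = 4):
    """Yield every n-gram of the hypothesis up to max_order."""
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])

def combined_score(tokens: List[str],
                   nmt_logprob: float,
                   ngram_posteriors: Dict[Tuple[str, ...], float],
                   lam: float = 0.5) -> float:
    """Interpolate the NMT score with the linearised MBR gain."""
    gain = sum(ngram_posteriors.get(u, 0.0) for u in ngrams(tokens))
    return lam * nmt_logprob + (1.0 - lam) * gain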

Cited by 35 publications (41 citation statements). References 27 publications.
“…Finally, we map the word-level FSTs to the subword-level by composition with a mapping transducer T that applies byte pair encoding (BPE; Sennrich et al., 2016c) to the full words. Word-to-BPE mapping transducers have been used in prior work to combine word-level models with subword-level neural sequence models (Stahlberg et al., 2017a, 2017b, 2018b).…”
Section: FST-based Grammatical Error Correction (mentioning)
confidence: 99%
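The word-to-BPE composition this snippet describes can be sketched with pynini. The BPE segmentations below are made up for illustration; a real T would be derived from the learned BPE merge operations, and the final composition step is shown as a comment since it needs an actual word lattice.

# A minimal pynini sketch (illustrative, not the cited authors' code) of a
# word-to-BPE mapping transducer T. Composing a word-level lattice with T
# yields a subword-level lattice that a BPE-based neural model can score.
import pynini

# Hypothetical BPE segmentations; a real system derives these from the
# learned BPE merges (Sennrich et al., 2016).
bpe_map = {
    "translation": "trans@@ lation",
    "lattice": "latt@@ ice",
    "the": "the",
}

# T maps each full word to its space-separated subword sequence.
T = pynini.union(
    *(pynini.cross(word, subwords) for word, subwords in bpe_map.items())
).optimize()

# Given a word-level lattice (an acceptor over full words), composition
# with T produces the subword-level lattice:
# subword_lattice = pynini.compose(word_lattice, T).optimize()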
“…Restricts the search space to a bag of words, with or without repetition (Hasler et al., 2017). • consume(token): Update the internal predictor state by adding token to the current history.…”
Section: Predictors (mentioning)
confidence: 99%
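The snippet describes the predictor abstraction of the SGNMT decoder. Below is a hedged sketch of such an interface; only consume(token) is named in the source, so the remaining method names and the bag-of-words example are illustrative assumptions.

# A sketch of a predictor interface as described in the snippet above.
from abc import ABC, abstractmethod
from typing import Dict, List

class Predictor(ABC):
    @abstractmethod
    def initialize(self, src_sentence: List[int]) -> None:
        """Reset the internal state for a new source sentence."""

    @abstractmethod
    def predict_next(self) -> Dict[int, float]:
        """Return scores over target tokens given the history so far."""

    @abstractmethod
    def consume(self, token: int) -> None:
        """Update the internal predictor state by adding `token` to the
        current history."""

class BagOfWordsPredictor(Predictor):
    """Restricts the search space to a bag of words, with or without
    repetition (cf. Hasler et al., 2017)."""

    def __init__(self, bag: List[int], allow_repetition: bool = False):
        self.bag = list(bag)
        self.allow_repetition = allow_repetition

    def initialize(self, src_sentence: List[int]) -> None:
        self.remaining = list(self.bag)

    def predict_next(self) -> Dict[int, float]:
        allowed = self.bag if self.allow_repetition else self.remaining
        # Uniform score over the tokens still allowed by the bag.
        return {tok: 0.0 for tok in set(allowed)}

    def consume(self, token: int) -> None:
        if not self.allow_repetition and token in self.remaining:
            self.remaining.remove(token)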
“…• nmt,ngramc,wc: MBR-based NMT following Stahlberg et al. (2017), with n-gram posteriors extracted from an SMT lattice (ngramc) and a simple word penalty (wc).…”
Section: Example Predictor Constellations (mentioning)
confidence: 99%
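How such a constellation might combine its three predictors at each decoding step can be sketched as a weighted interpolation; the weights and the function name below are illustrative assumptions, not SGNMT's actual configuration.

# A hedged sketch of per-step score combination for nmt,ngramc,wc:
# NMT log-probabilities, n-gram posterior scores from the SMT lattice,
# and a constant word penalty, mixed with illustrative weights.
from typing import Dict, Iterable

def combine_step_scores(nmt: Dict[int, float],
                        ngramc: Dict[int, float],
                        vocab: Iterable[int],
                        w_nmt: float = 1.0,
                        w_ngramc: float = 0.5,
                        w_wc: float = -0.1) -> Dict[int, float]:
    """Combined per-token score for one decoding step; wc contributes a
    fixed amount per emitted token, acting as a length penalty/bonus."""
    return {tok: (w_nmt * nmt.get(tok, float("-inf"))
                  + w_ngramc * ngramc.get(tok, 0.0)
                  + w_wc)
            for tok in vocab}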
“…• Even though the performance gap between NMT and traditional statistical machine translation (SMT) is growing rapidly on the task at hand, SMT can still improve very strong NMT ensembles. To combine NMT and SMT we follow Stahlberg et al. (2017a, 2018b) and build a specialized n-gram LM for each sentence that computes the risk of hypotheses relative to SMT lattices.…”
Section: Introduction (mentioning)
confidence: 99%
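A sentence-specific n-gram LM of the kind this last snippet mentions can be sketched as follows; the class and method names are illustrative, and the posterior table is assumed to have been extracted from the SMT lattice for the current sentence.

# A sketch (illustrative, not the cited authors' code) of a per-sentence
# "n-gram LM" that scores hypotheses by their n-gram posteriors under the
# SMT lattice: matching probable lattice n-grams means low Bayes-risk.
from typing import Dict, List, Tuple

class LatticeRiskLM:
    def __init__(self,
                 ngram_posteriors: Dict[Tuple[str, ...], float],
                 max_order: int = 4):
        # ngram_posteriors[u] ~ P(u | SMT lattice), for this sentence only.
        self.post = ngram_posteriors
        self.max_order = max_order

    def score_token(self, history: List[str], token: str) -> float:
        """Sum the posteriors of all n-grams ending in `token`: the higher
        the sum, the lower the hypothesis' risk relative to the lattice."""
        ctx = history + [token]
        return sum(self.post.get(tuple(ctx[-n:]), 0.0)
                   for n in range(1, min(self.max_order, len(ctx)) + 1))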