Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016)
DOI: 10.18653/v1/d16-1162

Incorporating Discrete Translation Lexicons into Neural Machine Translation

Abstract: Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence. We propose a method to alleviate this problem by augmenting NMT systems with discrete translation lexicons that efficiently encode translations of these low-frequency words. We describe a method to calculate the lexicon probability of the next word in the translation candidate by using the attention vector of the NMT model to select which source word …
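The truncated abstract already sketches the core mechanism: the attention weights over source positions act as a soft selector over per-source-word lexicon distributions. Below is a minimal sketch of that computation, assuming a toy lexicon and hand-picked attention weights; none of the names or numbers come from the authors' code.

```python
# Sketch: mix per-source-word lexicon probabilities using attention weights.
# p_lex(y) = sum_j attention[j] * P_lex(y | source_words[j])
import numpy as np

# Toy lexicon P_lex(target_word | source_word), e.g. learned from alignments.
lexicon = {
    "Katze":   {"cat": 0.9, "feline": 0.1},
    "schläft": {"sleeps": 0.8, "naps": 0.2},
}
target_vocab = ["cat", "feline", "sleeps", "naps"]

def lexicon_probability(source_words, attention):
    """Attention-weighted mixture of the lexicon rows of the source words."""
    p = np.zeros(len(target_vocab))
    for j, src in enumerate(source_words):
        row = lexicon.get(src, {})
        for i, y in enumerate(target_vocab):
            p[i] += attention[j] * row.get(y, 0.0)
    return p

attn = np.array([0.7, 0.3])  # attention over source positions at one decoding step
print(lexicon_probability(["Katze", "schläft"], attn))
# [0.63 0.07 0.24 0.06] -> most lexicon mass on "cat"
```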

Cited by 165 publications (142 citation statements). References 29 publications.
“…For instance, prior work used the alignments provided by the attention model to interpolate word translation decisions with traditional probabilistic dictionaries (Arthur et al., 2016), for the introduction of coverage and fertility models (Tu et al., 2016), etc. But is the attention model in fact the proper means? To examine this, we compare the soft alignment matrix (the sequence of attention vectors) with word alignments obtained by traditional word alignment methods.…”
Section: Word Alignment (mentioning)
confidence: 99%
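One common way to make the comparison this excerpt describes is to reduce the attention matrix to hard alignments by taking the argmax over source positions at each target position, then score against gold alignments with alignment error rate (AER). The sketch below is a hypothetical protocol for illustration, not necessarily the citing paper's exact procedure.

```python
# Hypothetical sketch: attention matrix -> hard alignments -> AER score.
import numpy as np

def attention_to_alignments(attn):
    """attn[i, j] = attention of target position i on source position j."""
    return {(i, int(np.argmax(row))) for i, row in enumerate(attn)}

def aer(hypothesis, sure, possible):
    """AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|); possible should contain sure."""
    a_s = len(hypothesis & sure)
    a_p = len(hypothesis & possible)
    return 1.0 - (a_s + a_p) / (len(hypothesis) + len(sure))

attn = np.array([[0.8, 0.1, 0.1],
                 [0.2, 0.7, 0.1],
                 [0.1, 0.2, 0.7]])
hyp = attention_to_alignments(attn)        # {(0, 0), (1, 1), (2, 2)}
sure = {(0, 0), (1, 1)}
print(aer(hyp, sure, sure | {(2, 2)}))     # 0.0 when all links are covered
```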
“…A straightforward application of word alignments is to generate bilingual lexica from parallel corpora. Word alignments have also been used for external dictionary-assisted translation (Chatterjee et al., 2017; Alkhouli et al., 2018; Arthur et al., 2016) to improve the translation of low-frequency words or to comply with certain terminology guidelines. Documents and webpages often contain word annotations such as formatting styles and hyperlinks, which need to be preserved in the translation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Our work is closely related to recent work on injecting prior knowledge into NMT (Arthur et al., 2016; Cohn et al., 2016; Tang et al., 2016; Feng et al., 2016). The major difference is that our approach aims to provide a general framework for incorporating arbitrary prior knowledge sources while keeping the neural translation model unchanged.…”
Section: Related Work (mentioning)
confidence: 97%
“…It is natural to leverage a bilingual dictionary D to improve neural machine translation. Arthur et al. (2016) propose to incorporate discrete translation lexicons into NMT by using the attention vector to select the lexical probabilities to focus on.…”
Section: Bilingual Dictionary (mentioning)
confidence: 99%
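As this last excerpt notes, the lexicon probability still has to be combined with the NMT model's own distribution. Two natural combination strategies, and to our knowledge the ones the full paper evaluates, are a log-domain bias on the pre-softmax scores and linear interpolation of the two distributions. A rough sketch follows, with illustrative epsilon and mixing weight (the actual values would be tuned or learned).

```python
# Sketch: two ways to combine a lexicon distribution with NMT scores.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def combine_bias(nmt_logits, p_lex, epsilon=1e-6):
    # Bias: add log lexicon probabilities to the pre-softmax scores, so words
    # the lexicon supports get boosted; epsilon keeps log(0) finite.
    return softmax(nmt_logits + np.log(p_lex + epsilon))

def combine_interpolate(nmt_logits, p_lex, lam=0.5):
    # Linear interpolation of the lexicon distribution with the NMT softmax;
    # lam = 0.5 is illustrative, not a value from the paper.
    return lam * p_lex + (1.0 - lam) * softmax(nmt_logits)

logits = np.array([1.0, 0.5, 0.2, 0.1])     # toy NMT scores over 4 words
p_lex = np.array([0.63, 0.07, 0.24, 0.06])  # from the lexicon sketch above
print(combine_bias(logits, p_lex))
print(combine_interpolate(logits, p_lex))
```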