Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016)
DOI: 10.18653/v1/d16-1162

Incorporating Discrete Translation Lexicons into Neural Machine Translation

Abstract: Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence. We propose a method to alleviate this problem by augmenting NMT systems with discrete translation lexicons that efficiently encode translations of these low-frequency words. We describe a method to calculate the lexicon probability of the next word in the translation candidate by using the attention vector of the NMT model to select which source word …
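The truncated abstract already sketches the core mechanism: the attention weights over source positions act as a soft selector over per-source-word lexicon distributions. Below is a minimal sketch of that computation, assuming a toy lexicon and hand-picked attention weights; none of the names or numbers come from the authors' code.

```python
# Sketch: mix per-source-word lexicon probabilities using attention weights.
# p_lex(y) = sum_j attention[j] * P_lex(y | source_words[j])
import numpy as np

# Toy lexicon P_lex(target_word | source_word), e.g. learned from alignments.
lexicon = {
    "Katze":   {"cat": 0.9, "feline": 0.1},
    "schläft": {"sleeps": 0.8, "naps": 0.2},
}
target_vocab = ["cat", "feline", "sleeps", "naps"]

def lexicon_probability(source_words, attention):
    """Attention-weighted mixture of the lexicon rows of the source words."""
    p = np.zeros(len(target_vocab))
    for j, src in enumerate(source_words):
        row = lexicon.get(src, {})
        for i, y in enumerate(target_vocab):
            p[i] += attention[j] * row.get(y, 0.0)
    return p

attn = np.array([0.7, 0.3])  # attention over source positions at one decoding step
print(lexicon_probability(["Katze", "schläft"], attn))
# [0.63 0.07 0.24 0.06] -> most lexicon mass on "cat"
```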

Cited by 165 publications (142 citation statements). References 29 publications.
“…For instance, prior work used the alignments provided by the attention model to interpolate word translation decisions with traditional probabilistic dictionaries (Arthur et al., 2016), for the introduction of coverage and fertility models (Tu et al., 2016), etc. But is the attention model in fact the proper means? To examine this, we compare the soft alignment matrix (the sequence of attention vectors) with word alignments obtained by traditional word alignment methods.…”
Section: Word Alignment (mentioning)
confidence: 99%
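One common way to make the comparison this excerpt describes is to reduce the attention matrix to hard alignments by taking the argmax over source positions at each target position, then score against gold alignments with alignment error rate (AER). The sketch below is a hypothetical protocol for illustration, not necessarily the citing paper's exact procedure.

```python
# Hypothetical sketch: attention matrix -> hard alignments -> AER score.
import numpy as np

def attention_to_alignments(attn):
    """attn[i, j] = attention of target position i on source position j."""
    return {(i, int(np.argmax(row))) for i, row in enumerate(attn)}

def aer(hypothesis, sure, possible):
    """AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|); possible should contain sure."""
    a_s = len(hypothesis & sure)
    a_p = len(hypothesis & possible)
    return 1.0 - (a_s + a_p) / (len(hypothesis) + len(sure))

attn = np.array([[0.8, 0.1, 0.1],
                 [0.2, 0.7, 0.1],
                 [0.1, 0.2, 0.7]])
hyp = attention_to_alignments(attn)        # {(0, 0), (1, 1), (2, 2)}
sure = {(0, 0), (1, 1)}
print(aer(hyp, sure, sure | {(2, 2)}))     # 0.0 when all links are covered
```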
“…A straightforward application of word alignments is to generate bilingual lexica from parallel corpora. Word alignments have also been used for external dictionary-assisted translation (Chatterjee et al., 2017; Alkhouli et al., 2018; Arthur et al., 2016) to improve the translation of low-frequency words or to comply with certain terminology guidelines. Documents and webpages often contain word annotations such as formatting styles and hyperlinks, which need to be preserved in the translation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Our work is closely related to recent work on injecting prior knowledge into NMT (Arthur et al., 2016; Cohn et al., 2016; Tang et al., 2016; Feng et al., 2016). The major difference is that our approach aims to provide a general framework for incorporating arbitrary prior knowledge sources while keeping the neural translation model unchanged.…”
Section: Related Work (mentioning)
confidence: 97%
“…It is natural to leverage a bilingual dictionary D to improve neural machine translation. Arthur et al. (2016) propose to incorporate discrete translation lexicons into NMT by using the attention vector to select the lexical probabilities to focus on.…”
Section: Bilingual Dictionary (mentioning)
confidence: 99%
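As this last excerpt notes, the lexicon probability still has to be combined with the NMT model's own distribution. Two natural combination strategies, and to our knowledge the ones the full paper evaluates, are a log-domain bias on the pre-softmax scores and linear interpolation of the two distributions. A rough sketch follows, with illustrative epsilon and mixing weight (the actual values would be tuned or learned).

```python
# Sketch: two ways to combine a lexicon distribution with NMT scores.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def combine_bias(nmt_logits, p_lex, epsilon=1e-6):
    # Bias: add log lexicon probabilities to the pre-softmax scores, so words
    # the lexicon supports get boosted; epsilon keeps log(0) finite.
    return softmax(nmt_logits + np.log(p_lex + epsilon))

def combine_interpolate(nmt_logits, p_lex, lam=0.5):
    # Linear interpolation of the lexicon distribution with the NMT softmax;
    # lam = 0.5 is illustrative, not a value from the paper.
    return lam * p_lex + (1.0 - lam) * softmax(nmt_logits)

logits = np.array([1.0, 0.5, 0.2, 0.1])     # toy NMT scores over 4 words
p_lex = np.array([0.63, 0.07, 0.24, 0.06])  # from the lexicon sketch above
print(combine_bias(logits, p_lex))
print(combine_interpolate(logits, p_lex))
```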