Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers 2016
DOI: 10.18653/v1/w16-2206

Alignment-Based Neural Machine Translation

Abstract: Neural machine translation (NMT) has emerged recently as a promising statistical machine translation approach. In NMT, neural networks (NN) are directly used to produce translations, without relying on a pre-existing translation framework. In this work, we take a step towards bridging the gap between conventional word alignment models and NMT. We follow the hidden Markov model (HMM) approach that separates the alignment and lexical models. We propose a neural alignment model and combine it with a lexical neura…
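The abstract's HMM-style decomposition can be made explicit. Below is a sketch of the standard factorization into separate lexical and alignment models, with notation assumed here (target sentence e_1^I, source sentence f_1^J, alignment path b_1^I, where b_i is the source position aligned to target position i):

    % HMM-style factorization into a lexical model and an alignment model
    % (notation assumed: target e_1^I, source f_1^J, alignment path b_1^I)
    p(e_1^I \mid f_1^J) = \sum_{b_1^I} \prod_{i=1}^{I}
        \underbrace{p(e_i \mid b_1^i, e_1^{i-1}, f_1^J)}_{\text{lexical model}}
        \cdot
        \underbrace{p(b_i \mid b_1^{i-1}, e_1^{i-1}, f_1^J)}_{\text{alignment model}}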

Cited by 37 publications (42 citation statements). References 25 publications.

Citation statements, ordered by relevance:
“…W_m projects the features of x_k to a relatively low dimension for reducing computational overhead, and W_m projects the aggregated features back to the same dimension as y_q. For 2-d image data, we separately encode the x-axis relative position R^X_{k−q} and the y-axis relative position R^Y_{k−q}, and concatenate them to be the final encoding…”
Section: Transformer Attention (mentioning, confidence: 99%)
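The scheme in this quote can be made concrete. The Python sketch below encodes the x-axis and y-axis offsets between a key position k and a query position q separately and concatenates the two vectors, mirroring R^X_{k−q} and R^Y_{k−q}; the sinusoidal mapping and the even split of the dimension are assumptions, since the quoted passage does not specify how offsets are turned into vectors:

    import numpy as np

    def sinusoidal_encoding(delta, dim):
        """Map a scalar relative offset to a fixed vector (assumed scheme)."""
        freqs = 1.0 / (10000.0 ** (np.arange(0, dim, 2) / dim))
        angles = delta * freqs
        return np.concatenate([np.sin(angles), np.cos(angles)])

    def relative_position_encoding_2d(key_pos, query_pos, dim):
        """Encode x- and y-axis relative positions separately, then
        concatenate them, as the quoted passage describes."""
        dx = key_pos[0] - query_pos[0]  # x-axis relative position
        dy = key_pos[1] - query_pos[1]  # y-axis relative position
        return np.concatenate([
            sinusoidal_encoding(dx, dim // 2),  # R^X_{k-q}
            sinusoidal_encoding(dy, dim // 2),  # R^Y_{k-q}
        ])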
“…(2). Following (Alkhouli et al., 2016), the alignment model predicts the relative jump Δ_i = b_i − b_{i−1} from the previous source position b_{i−1} to the current source position b_i. This model has a bidirectional source encoder consisting of two recurrent layers (yellow), and a recurrent layer maintaining the target state (red).…”
Section: Recurrent Alignment Model (mentioning, confidence: 99%)
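As a rough illustration of the architecture this quote describes, the Python/PyTorch sketch below combines a bidirectional recurrent source encoder with a recurrent target state and classifies the relative jump Δ_i over a bounded window; the cell type (LSTM), layer sizes, and jump bound are assumptions, not details taken from the paper:

    import torch
    import torch.nn as nn

    class RecurrentAlignmentModel(nn.Module):
        """Sketch: bidirectional source encoder (two recurrent layers)
        plus a recurrent target state, predicting the relative jump
        delta_i = b_i - b_{i-1} as a class in [-max_jump, +max_jump]."""

        def __init__(self, src_vocab, tgt_vocab, dim=256, max_jump=10):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, dim)
            self.tgt_emb = nn.Embedding(tgt_vocab, dim)
            # bidirectional source encoder with two recurrent layers
            self.encoder = nn.LSTM(dim, dim, num_layers=2,
                                   bidirectional=True, batch_first=True)
            # recurrent layer maintaining the target state
            self.decoder = nn.LSTM(dim, dim, batch_first=True)
            self.jump_out = nn.Linear(3 * dim, 2 * max_jump + 1)

        def forward(self, src, tgt_prefix, b_prev):
            enc, _ = self.encoder(self.src_emb(src))         # (B, J, 2*dim)
            dec, _ = self.decoder(self.tgt_emb(tgt_prefix))  # (B, i-1, dim)
            # encoder state at the previous aligned source position b_{i-1}
            src_state = enc[torch.arange(src.size(0)), b_prev]
            tgt_state = dec[:, -1]  # current target state
            logits = self.jump_out(torch.cat([src_state, tgt_state], dim=-1))
            return logits.log_softmax(dim=-1)  # distribution over delta_i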
“…Nowadays, the HMM is used with the IBM models to generate word alignments, which are needed to train phrase-based systems. Alkhouli et al. (2016) and Wang et al. (2017) apply the hidden Markov model decomposition using feed-forward lexical and alignment neural network models. In this work, we are interested in using more expressive models.…”
Section: Introduction (mentioning, confidence: 99%)
“…Our feed-forward alignment model has the same architecture (Figure 1) as the one proposed in (Alkhouli et al., 2016). Thus the alignment probability can be modeled by:…”
Section: Definition of Neural Network-based HMM (mentioning, confidence: 99%)
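The quoted equation is truncated here, but the shape of such a feed-forward alignment model can still be sketched: it predicts the jump Δ_i from a window of source words around the previous aligned position b_{i−1} together with the most recent target words. In the Python sketch below, the window size, target context length, and hidden layer are assumptions rather than the cited configuration:

    import torch
    import torch.nn as nn

    class FeedForwardAlignmentModel(nn.Module):
        """Sketch: predict delta_i from embedded source words around
        b_{i-1} and the previous target words, via one hidden layer."""

        def __init__(self, src_vocab, tgt_vocab, dim=128,
                     src_window=5, tgt_context=3, max_jump=10):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, dim)
            self.tgt_emb = nn.Embedding(tgt_vocab, dim)
            in_dim = dim * (2 * src_window + 1 + tgt_context)
            self.hidden = nn.Sequential(nn.Linear(in_dim, dim), nn.Tanh())
            self.out = nn.Linear(dim, 2 * max_jump + 1)

        def forward(self, src_window_ids, tgt_context_ids):
            # src_window_ids: (B, 2*src_window+1) source words around b_{i-1}
            # tgt_context_ids: (B, tgt_context) previous target words
            feats = torch.cat([self.src_emb(src_window_ids).flatten(1),
                               self.tgt_emb(tgt_context_ids).flatten(1)],
                              dim=-1)
            return self.out(self.hidden(feats)).log_softmax(-1)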