Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d18-1335

Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation

Abstract: This work investigates an alternative model for neural machine translation (NMT) and proposes a novel architecture, where we employ a multi-dimensional long short-term memory (MDLSTM) for translation modeling. In the state-of-the-art methods, source and target sentences are treated as one-dimensional sequences over time, while we view translation as a two-dimensional (2D) mapping using an MDLSTM layer to define the correspondence between source and target words. We extend beyond the current sequence to sequenc…
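The 2D view described in the abstract can be sketched as a grid of states, one per (target position, source position) pair, where each state depends on its left and below neighbors and the target word at position i is read off after the full source row. The sketch below uses a plain tanh recurrence in place of the paper's MDLSTM layer; all names and the gate-free recurrence are illustrative assumptions, not the authors' implementation.

```python
# Conceptual sketch of 2D translation modeling: a state s[i][j] for every
# (target position i, source position j) pair. A tanh recurrence stands in
# for the MDLSTM layer; names and shapes here are assumptions.
import numpy as np

def translate_grid(src_emb, tgt_emb, Wh, Wv, Wx, W_out):
    """src_emb: (J, d) source embeddings; tgt_emb: (I, d) target embeddings.
    Returns (I, V) unnormalized scores, one row per target position."""
    I, J = tgt_emb.shape[0], src_emb.shape[0]
    d = Wh.shape[0]
    s = np.zeros((I + 1, J + 1, d))  # row/col 0 hold zero boundary states
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            x = src_emb[j - 1] + tgt_emb[i - 1]  # (source, target) pair input
            # Each state combines the state to its left, the state below,
            # and the current (source, target) input.
            s[i, j] = np.tanh(Wh @ s[i, j - 1] + Wv @ s[i - 1, j] + Wx @ x)
    # Predict target word i from the state after reading the whole source row.
    return s[1:, J] @ W_out.T

rng = np.random.default_rng(1)
d, V, I, J = 4, 10, 3, 5
Wh, Wv, Wx = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
W_out = 0.1 * rng.standard_normal((V, d))
scores = translate_grid(rng.standard_normal((J, d)),
                        rng.standard_normal((I, d)), Wh, Wv, Wx, W_out)
print(scores.shape)  # (3, 10): one score vector per target position
```

The point of the sketch is only the data flow: every target position sees every source position through the grid, which is the "2D mapping" the abstract contrasts with one-dimensional sequence models.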

Cited by 25 publications (25 citation statements). References 16 publications.
“…When an LSTM cell is recursively utilized in a 1D array… As shown in Fig. 5, a 2D LSTM network can be realized when the LSTM cell is recursively utilized in a 2D mesh form [13]. Each LSTM cell utilizes the hidden and cell states from the two neighboring cells in the left and below positions in the mesh, and its states are delivered to its neighboring cells in the right and top positions.…”
Section: B. Framework Architecture and Methods (mentioning, confidence: 99%)
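The mesh recursion quoted above — each cell consuming hidden and cell states from its left and below neighbors and passing its own states to the right and top — can be sketched directly. The gate layout (one forget gate per predecessor, as in Graves-style multi-dimensional LSTM) and all names are assumptions, not the cited paper's code.

```python
# Minimal sketch of the 2D LSTM mesh recursion: each cell reads states from
# the left and below neighbors and emits states toward the right and top.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def md_lstm_cell(x, h_left, c_left, h_below, c_below, W):
    """One 2D LSTM cell with a separate forget gate per predecessor."""
    z = np.concatenate([x, h_left, h_below])
    i = sigmoid(W["i"] @ z)      # input gate
    f_l = sigmoid(W["fl"] @ z)   # forget gate for the left neighbor's cell
    f_b = sigmoid(W["fb"] @ z)   # forget gate for the below neighbor's cell
    o = sigmoid(W["o"] @ z)      # output gate
    g = np.tanh(W["g"] @ z)      # candidate cell update
    c = f_l * c_left + f_b * c_below + i * g
    h = o * np.tanh(c)
    return h, c

def run_mesh(X, W, d):
    """Sweep an (I, J) grid of inputs from the bottom-left corner outward."""
    I, J = X.shape[0], X.shape[1]
    H = np.zeros((I + 1, J + 1, d))  # row/col 0 hold zero boundary states
    C = np.zeros((I + 1, J + 1, d))
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            H[i, j], C[i, j] = md_lstm_cell(
                X[i - 1, j - 1], H[i - 1, j], C[i - 1, j],
                H[i, j - 1], C[i, j - 1], W)
    return H[1:, 1:]  # one hidden state per grid position

rng = np.random.default_rng(0)
d, dx = 4, 3
z_dim = dx + 2 * d  # input plus two predecessor hidden states
W = {k: 0.1 * rng.standard_normal((d, z_dim))
     for k in ("i", "fl", "fb", "o", "g")}
H = run_mesh(rng.standard_normal((5, 6, dx)), W, d)
print(H.shape)  # (5, 6, 4)
```

Each cell's dependence on exactly two predecessors is what lets the grid be evaluated in a single left-to-right, bottom-to-top sweep, as the quoted description implies.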
“…As given in Fig. 5, a 2D LSTM network can be realized when the LSTM cell is recursively utilized in a 2D mesh form [13]. Each LSTM cell utilizes the hidden and cell states from the two neighboring cells in the left and below positions in the mesh.…”
Section: B. Framework Architecture and Methods (mentioning, confidence: 99%)
“…The 2D alternating RNN is a novel translation architecture in development by the MLLP group. This architecture approaches the machine translation problem with a two-dimensional view, much in the same manner as Kalchbrenner et al (2015); Bahar et al (2018) and Elbayad et al (2018). This view is based on the premise that translation is fundamentally a two-dimensional problem, where each word of the target sentence can be explained in some way by all the words in the source sentence.…”
Section: 2D Alternating RNN (mentioning, confidence: 99%)