Proceedings of the 22nd Conference on Computational Natural Language Learning 2018
DOI: 10.18653/v1/k18-1010
Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

Abstract: Current state-of-the-art machine translation systems are based on encoder-decoder architectures that first encode the input sequence and then generate an output sequence based on the input encoding. The two are interfaced with an attention mechanism that recombines a fixed encoding of the source tokens based on the decoder state. We propose an alternative approach which instead relies on a single 2D convolutional neural network across both sequences. Each layer of our network recodes source tokens on the basis …
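The abstract describes a single 2D CNN applied jointly over the source and target sequences, with each layer recoding source tokens conditioned on the output produced so far. Below is a minimal PyTorch sketch of one such layer under that reading; the class and variable names (Conv2DSeq2SeqLayer, grid) and the grid construction are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of one "pervasive attention"-style layer: source and target
# embeddings are combined into a grid of shape
# (batch, channels, target_len, source_len), and a 2D convolution recodes
# every cell jointly over both axes, causally along the target axis.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Conv2DSeq2SeqLayer(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # Pad fully, then crop so target position t only sees positions <= t.
        self.crop = kernel_size - 1
        self.conv = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size - 1)

    def forward(self, grid: torch.Tensor) -> torch.Tensor:
        # grid: (batch, channels, target_len, source_len)
        t_len, s_len = grid.shape[2], grid.shape[3]
        out = self.conv(grid)
        # Keep the first t_len rows (causal crop on the target axis) and a
        # centered window on the source axis (source context is symmetric).
        out = out[:, :, :t_len, self.crop // 2 : self.crop // 2 + s_len]
        return F.relu(out) + grid  # residual connection

# Build the 2D grid by concatenating source and target embeddings at every
# (target, source) position; sizes here are arbitrary toy values.
batch, s_len, t_len, dim = 2, 7, 5, 16
src = torch.randn(batch, s_len, dim)   # source token embeddings
tgt = torch.randn(batch, t_len, dim)   # target token embeddings
grid = torch.cat(
    [src.unsqueeze(1).expand(-1, t_len, -1, -1),
     tgt.unsqueeze(2).expand(-1, -1, s_len, -1)], dim=-1
).permute(0, 3, 1, 2)                  # (batch, 2*dim, t_len, s_len)

layer = Conv2DSeq2SeqLayer(channels=2 * dim)
print(layer(grid).shape)               # torch.Size([2, 32, 5, 7])
```

Stacking such layers gives every position an effective receptive field over both sequences, which is what makes attention-like behavior "pervasive" rather than confined to a single interface module.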

Cited by 22 publications (6 citation statements)
References 18 publications (21 reference statements)
“…Although researchers have proposed various new NMT architectures, they usually evaluate their models only in terms of overall translation quality and rarely mention how the translation has changed (Gehring et al., 2017; Kalchbrenner et al., 2016; Vaswani et al., 2017). Only a few studies analyze translation quality in terms of sentence length (Elbayad et al., 2018; Zhang et al., 2019). The robustness of recent NMT models on very long sentences remains to be assessed.…”
Section: Related Work
confidence: 99%
“…Only a few studies analyze translation quality in terms of sentence length (Elbayad et al., 2018; Zhang et al., 2019). The robustness of recent NMT models on very long sentences remains to be assessed.…”
Section: Related Work
confidence: 99%
“…The use of LSTMs may also be of concern, as newer methods are emerging in the domain (Bai et al., 2018; Elbayad et al., 2018). We proceeded with LSTMs because of their simplicity, performance, and ability to run quickly on a CPU.…”
Section: Limitations and Future Work
confidence: 99%
“…So, multiple frames must be processed to find the diffusion coefficient. There are various ways of processing multiple frames, such as LSTMs combined with CNNs [33] (which can be difficult to train and parallelize [34]) or CNNs with 3D convolutions (which have more parameters than their 2D counterparts). These architectural elements could be an acceptable choice, but for the current problem the input space can be simplified further.…”
Section: Deep Particle Diffusometry
confidence: 99%
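The parameter comparison in the last statement is easy to verify: for the same kernel width k and channel counts, a 3D convolution carries k times as many weights as its 2D counterpart. A quick illustrative check (channel and kernel sizes here are arbitrary, not taken from the cited paper):

```python
# Parameter counts of matched 2D and 3D convolutions in PyTorch.
import torch.nn as nn

c_in, c_out, k = 64, 64, 3
conv2d = nn.Conv2d(c_in, c_out, k)  # k*k kernel per (in, out) channel pair
conv3d = nn.Conv3d(c_in, c_out, k)  # k*k*k kernel per (in, out) channel pair

n_params = lambda m: sum(p.numel() for p in m.parameters())
print(n_params(conv2d))  # 64*64*3*3   + 64 bias = 36,928
print(n_params(conv3d))  # 64*64*3*3*3 + 64 bias = 110,656
```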