2016
DOI: 10.48550/arxiv.1611.02344
Preprint
A Convolutional Encoder Model for Neural Machine Translation

Cited by 41 publications (50 citation statements)
References 0 publications
“…CNNs are good at processing data that has a grid-like topology. Two-dimensional CNNs have achieved great success in computer vision [29,30,31,32], while one-dimensional CNNs are commonly used for sequential data [33,34,35]. Among these models, TCNs, which use causal convolutions with skewed connections, attempt to capture temporal interactions and have been applied to various regression tasks, such as action segmentation and detection [36,37], lip-reading [38,39], and ENSO prediction [40].…”
Section: Related Work (citation type: mentioning)
Confidence: 99%
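To make the causal-convolution idea in this passage concrete, here is a minimal sketch, not taken from any of the cited papers and written in PyTorch as an assumed framework: a 1D causal convolution whose output at time step t depends only on inputs at steps up to t.

```python
# Minimal sketch of a 1D causal convolution (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalConv1d(nn.Module):
    """1D convolution left-padded so it never looks at future time steps."""

    def __init__(self, channels: int, kernel_size: int, dilation: int = 1):
        super().__init__()
        # Pad only on the left by (kernel_size - 1) * dilation time steps.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))  # left padding only keeps the convolution causal
        return self.conv(x)


if __name__ == "__main__":
    layer = CausalConv1d(channels=8, kernel_size=3, dilation=2)
    out = layer(torch.randn(4, 8, 50))  # (batch=4, channels=8, time=50)
    print(out.shape)  # torch.Size([4, 8, 50])
```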
“…At prediction time, either greedy or beam search is used to generate the target sentence from left to right. Various architectures have been proposed to improve the quality of neural machine translation. These include recurrent networks [3], convolutional networks [9], and transformer networks [28]. Attention has proven very helpful for these neural architectures, including self-attention [26], multi-hop attention [12], and multi-head attention [2].…”
Section: Related Work (citation type: mentioning)
Confidence: 99%
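As a rough illustration of the greedy left-to-right generation this passage mentions, a small sketch follows; `decoder_step`, `BOS`, and `EOS` are hypothetical names introduced here, not taken from the cited work.

```python
# Greedy left-to-right decoding sketch (illustrative only).
import torch
import torch.nn.functional as F

BOS, EOS = 1, 2  # assumed special token ids


def greedy_decode(decoder_step, max_len: int = 50):
    """decoder_step(prefix) -> logits over the vocabulary for the next token."""
    tokens = [BOS]
    for _ in range(max_len):
        logits = decoder_step(torch.tensor(tokens))
        next_token = int(torch.argmax(logits))  # greedy: take the single most probable token
        tokens.append(next_token)
        if next_token == EOS:
            break
    return tokens[1:]  # drop BOS


if __name__ == "__main__":
    # Toy decoder that always predicts EOS, just to show the call pattern.
    toy = lambda prefix: F.one_hot(torch.tensor(EOS), 10).float()
    print(greedy_decode(toy))  # [2]
```

Beam search generalizes this loop by keeping the k highest-scoring prefixes at each step instead of only the single arg-max continuation.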
“…Unlike recurrent networks, CNNs enable parallelization and faster processing. Encoder-decoder models using CNNs proved effective at translating phrases in the source sentence into suitable target sentences [6,7]. CNN-based NMT models could not, however, match the performance of state-of-the-art recurrent neural network based NMT models [3].…”
Section: Introduction (citation type: mentioning)
Confidence: 99%
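For context, here is a minimal sketch of a convolutional sentence encoder in the spirit of this passage; it is not the authors' implementation, and the layer count, dimensions, and residual connections are illustrative assumptions. All source positions are processed in parallel, which is the parallelization advantage the passage refers to.

```python
# Convolutional sentence encoder sketch (illustrative only).
import torch
import torch.nn as nn


class ConvEncoder(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 256, layers: int = 3, kernel: int = 3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel, padding=kernel // 2) for _ in range(layers)
        )

    def forward(self, src: torch.Tensor) -> torch.Tensor:
        # src: (batch, src_len) of token ids
        x = self.embed(src).transpose(1, 2)  # (batch, dim, src_len)
        for conv in self.convs:
            x = torch.relu(conv(x)) + x      # residual connection around each layer
        return x.transpose(1, 2)             # (batch, src_len, dim) encoder states


if __name__ == "__main__":
    enc = ConvEncoder(vocab_size=1000)
    states = enc(torch.randint(0, 1000, (2, 12)))
    print(states.shape)  # torch.Size([2, 12, 256])
```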