Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1078

Tree-to-Sequence Attentional Neural Machine Translation

Abstract: Most of the existing Neural Machine Translation (NMT) models focus on the conversion of sequential data and do not directly use syntactic information. We propose a novel end-to-end syntactic NMT model, extending a sequence-to-sequence model with the source-side phrase structure. Our model has an attention mechanism that enables the decoder to generate a translated word while softly aligning it with phrases as well as words of the source sentence. Experimental results on the WAT'15 English-to-Japanese dataset dem…
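The attention described in the abstract can be made concrete with a short sketch: the decoder scores every word vector and every phrase vector from the source encoder under a single softmax, so a target word can softly align with either granularity. The NumPy code below is a minimal illustration under assumed shapes and dot-product scoring; it is not the authors' implementation (the paper uses learned attention parameters and Tree-LSTM encoder states).

```python
# Minimal sketch (NumPy) of attention over both word- and phrase-level
# source representations, in the spirit of tree-to-sequence NMT.
# Shapes, names, and the dot-product scorer are illustrative assumptions.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def tree_attention(decoder_state, word_states, phrase_states):
    """Attend over the concatenation of word and phrase vectors.

    decoder_state : (d,)           current decoder hidden state
    word_states   : (n_words, d)   encoder states for source words (leaves)
    phrase_states : (n_phrases, d) encoder states for source phrases
                                   (internal parse-tree nodes)
    Returns a context vector of shape (d,).
    """
    # One attention distribution spans words and phrases jointly,
    # so the decoder can softly align with either granularity.
    states = np.vstack([word_states, phrase_states])  # (n_words + n_phrases, d)
    scores = states @ decoder_state                   # dot-product scoring (assumption)
    weights = softmax(scores)
    return weights @ states                           # weighted sum -> context vector

# Toy usage: 4 source words, 3 phrase nodes, hidden size 8.
rng = np.random.default_rng(0)
ctx = tree_attention(rng.normal(size=8),
                     rng.normal(size=(4, 8)),
                     rng.normal(size=(3, 8)))
print(ctx.shape)  # (8,)
```

Placing words and phrases in one attention distribution, rather than two separate ones, is the key idea: the model is free to attend to a whole subtree when a single target word translates a multi-word source phrase.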

Cited by 226 publications (200 citation statements). References 19 publications.
“…Some effort has been made to incorporate source syntax into NMT. Eriguchi et al. (2016) proposed a tree-to-sequence attentional NMT model in which a source-side parse tree was used, and achieved a promising improvement. Intuitively, adding source syntactic information to [Source] Only when the safety of the construction workers is guaranteed can construction continue.…”
Section: Related Work (mentioning)
confidence: 99%
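For readers unfamiliar with how a source-side parse tree becomes vectors the decoder can attend to, the sketch below encodes a binarized tree bottom-up: each internal node's vector is computed from its two children, yielding one phrase vector per node. The single tanh combiner is a simplified stand-in for the Tree-LSTM units Eriguchi et al. (2016) actually use; the weights, shapes, and helper names are assumptions for illustration.

```python
# Minimal sketch (NumPy) of bottom-up phrase encoding over a binarized
# parse tree. A tanh layer replaces the paper's Tree-LSTM units; all
# parameters here are illustrative assumptions.
import numpy as np

def encode_tree(node, leaf_vecs, W, b):
    """node is either an int (index into leaf_vecs) or a (left, right)
    pair of sub-trees. Returns (node_vector, list_of_phrase_vectors)."""
    if isinstance(node, int):
        return leaf_vecs[node], []
    left, right = node
    lv, lp = encode_tree(left, leaf_vecs, W, b)
    rv, rp = encode_tree(right, leaf_vecs, W, b)
    # Combine the two child vectors into a phrase vector for this node.
    parent = np.tanh(W @ np.concatenate([lv, rv]) + b)
    return parent, lp + rp + [parent]

# Toy usage: 4 source words, tree ((0,1),(2,3)), hidden size 8.
rng = np.random.default_rng(0)
d = 8
leaves = rng.normal(size=(4, d))
W, b = rng.normal(size=(d, 2 * d)) * 0.1, np.zeros(d)
root, phrases = encode_tree(((0, 1), (2, 3)), leaves, W, b)
print(len(phrases), root.shape)  # 3 phrase vectors, (8,)
```

The phrase vectors produced here are exactly what the attention sketch above stacks alongside the word vectors.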
“…For example, Part-Of-Speech (POS) tags are used for syntactic parsers. The parsers are in turn used to improve higher-level tasks, such as natural language inference (Chen et al., 2016) and machine translation (Eriguchi et al., 2016). These systems are often pipelines and are not trained end-to-end.…”
Section: Introduction (mentioning)
confidence: 99%
“…The experimental results show that our best model outperforms the best single NMT model reported in WAT '16 (Eriguchi et al., 2016b).…”
Section: Introduction (mentioning)
confidence: 84%
“…Eriguchi et al. (2016a)'s baseline system (the first line in Table 3) was the best single (i.e., without ensembling) word-based NMT system reported in WAT '16. For a fairer evaluation, we also reimplemented a standard attention-based NMT system that uses exactly the same encoder, training procedure, and hyperparameters as our proposed models, but has a word-based decoder.…”
Section: Baseline Systems (mentioning)
confidence: 99%