2016
DOI: 10.48550/arxiv.1609.04186
Preprint

Neural Machine Translation with Supervised Attention

Lemao Liu,
Masao Utiyama,
Andrew Finch
et al.

Abstract: The attention mechanism is appealing for neural machine translation, since it is able to dynamically encode a source sentence by generating an alignment between a target word and source words. Unfortunately, it has been proved to be worse than conventional alignment models in alignment accuracy. In this paper, we analyze and explain this issue from the point of view of reordering, and propose a supervised attention which is learned with guidance from conventional alignment models. Experiments on two Chinese-to-Eng…
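The abstract describes supervising the attention weights with alignments from a conventional aligner in addition to the usual translation loss. The following is a minimal sketch of that idea, not the authors' code: the tensor shapes, the loss form (cross-entropy against reference alignments), the interpolation weight lambda_align, and the function name are all illustrative assumptions.

```python
# Sketch of attention supervision (assumed formulation, not the paper's exact objective):
# add a penalty on the divergence between the model's attention weights and "gold"
# alignments produced by a conventional aligner (e.g. one-best links from GIZA++).
import torch
import torch.nn.functional as F

def supervised_attention_loss(attn_weights, gold_alignments, eps=1e-8):
    """Cross-entropy between predicted attention and reference alignments.

    attn_weights:    (batch, tgt_len, src_len), each row sums to 1 (softmax output).
    gold_alignments: (batch, tgt_len, src_len), reference alignment distribution,
                     e.g. one-hot links or smoothed counts from a conventional aligner.
    """
    return -(gold_alignments * torch.log(attn_weights + eps)).sum(-1).mean()

# Toy example: one sentence pair, 3 target words attending over 4 source words.
attn = F.softmax(torch.randn(1, 3, 4), dim=-1)            # model's attention weights
gold = F.one_hot(torch.tensor([[0, 2, 3]]), 4).float()    # aligner's one-best links

translation_loss = torch.tensor(2.5)   # placeholder for the usual NLL translation loss
lambda_align = 0.5                     # assumed interpolation weight, not from the paper
total_loss = translation_loss + lambda_align * supervised_attention_loss(attn, gold)
print(total_loss)
```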

Cited by 6 publications (7 citation statements)
References 10 publications
“…Lastly, the deliberate training of attention weights has been studied in several papers in which the goal is not to study the explanatory power of attention weights but rather to achieve better predictive performance by introducing an additional source of supervision. In some of these papers, attention weights are guided by known word alignments in machine translation (Liu et al., 2016; Chen et al., 2016), or by aligning human eye gaze with a model's attention for sequence classification (Barrett et al., 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…This research has largely been conducted by visualizing or analyzing the learned attention weights of a whole attention module on only NLP tasks [17,40,18,25]. Many works [17,40,18] suggest that attention weight assignment in encoder-decoder attention plays a role similar to word alignment in traditional approaches [1,8,30,6]. The implicit underlying assumption in these works is that the input elements accorded high attention weights are responsible for the model outputs.…”
Section: Analysis of Spatial Attention Mechanisms (mentioning)
confidence: 99%
“…Nevertheless, further refining attention by extra supervision has been shown to be beneficial. Examples include using word alignments to learn attention in neural machine translation (Liu et al., 2016), employing argument words to supervise attention in event detection (Liu et al., 2017), and utilizing linguistically-motivated annotations to guide attention in constituency parsing (Kamigaito et al., 2017). These supervision mechanisms are tailored to specific applications.…”
Section: Related Work (mentioning)
confidence: 99%