Alignment-Enhanced Transformer for Constraining NMT with Pre-Specified Translations

Song, Kai; Wang, Kun; Yu, Heng; Zhang, Yue; Huang, Zhongqiang; Luo, Weihua; Duan, Xiangyu; Zhang, Min

doi:10.1609/aaai.v34i05.6418

Cited by 39 publications

(44 citation statements)

References 16 publications

(37 reference statements)


“…In addition to AER, we compare the performance of NAIVE-ATT, SHIFT-ATT and SHIFT-AET on dictionary-guided machine translation (Song et al, 2020), which is an alignment-based downstream task. Given source and target constraint pairs from dictionary, the NMT model is encouraged to translate with provided constraints via word alignments (Alkhouli et al, 2018; Hasler et al, 2018; Hokamp and Liu, 2017; Song et al, 2020). More specifically, at each decoding step, the last token of the candidate translation will be revised with target constraint if it is aligned to the corresponding source constraint according to the alignment induction method.…”

Section: Downstream Task Results (mentioning)

confidence: 99%
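The revision step described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the function name `revise_last_token`, the use of a single attention row with an argmax as the alignment induction method, and the restriction to single-token constraints are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def revise_last_token(
    candidate: List[str],
    attention_row: Sequence[float],
    source_tokens: Sequence[str],
    constraints: Dict[str, str],
) -> List[str]:
    """Return the candidate with its last token replaced by the target
    constraint when that token aligns to a constrained source token.

    Assumption: alignment is induced by taking the source position with the
    highest weight in `attention_row` (a hypothetical stand-in for whichever
    alignment induction method is actually used).
    """
    if not candidate:
        return list(candidate)

    # Induce the alignment of the newest target token (argmax over source positions).
    aligned_pos = max(range(len(source_tokens)), key=lambda j: attention_row[j])
    aligned_word = source_tokens[aligned_pos]

    # Revise only if the aligned source word has a pre-specified translation.
    if aligned_word in constraints:
        return candidate[:-1] + [constraints[aligned_word]]
    return list(candidate)


source = ["das", "weiß", "ich", "."]
dictionary = {"weiß": "understand"}  # source constraint -> target constraint
revised = revise_last_token(["i", "know"], [0.1, 0.7, 0.1, 0.1], source, dictionary)
print(revised)
```

In a real decoder this check would run once per beam hypothesis per step, and multi-token constraints would need extra bookkeeping that this sketch leaves out.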

Accurate Word Alignment Induction from Neural Machine Translation

Chen, Liu, Chen et al. 2020

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)


Despite its original goal to jointly learn to align and translate, prior research suggests that the Transformer captures poor word alignments through its attention mechanism. In this paper, we show that attention weights DO capture accurate word alignments and propose two novel word alignment induction methods, SHIFT-ATT and SHIFT-AET. The main idea is to induce alignments at the step when the to-be-aligned target token is the decoder input rather than the decoder output as in previous work. SHIFT-ATT is an interpretation method that induces alignments from the attention weights of the Transformer and does not require parameter updates or architecture changes. SHIFT-AET extracts alignments from an additional alignment module which is tightly integrated into the Transformer and trained in isolation with supervision from symmetrized SHIFT-ATT alignments. Experiments on three publicly available datasets demonstrate that both methods perform better than their corresponding neural baselines and that SHIFT-AET significantly outperforms GIZA++ by 1.4-4.8 AER points.

* Corresponding author. Part of the work was done when Yun was in Huawei Noah's Ark Lab.

Code can be found at https://github.com/sufe-nlp/transformer-alignment.

Source: das weiß ich . Dec. input: i understand this . Dec. output: i understand this .
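The difference between aligning a target token at the step where it is the decoder output and the step where it is the decoder input can be illustrated with a small sketch. It assumes a single, already extracted attention matrix `attn` with one row per decoding step (row t has decoder input y_{t-1} and output y_t, with a start symbol as the first input) and uses a plain argmax per row. The choice of layer or head and the function name `induce_alignments` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple


def argmax(row: Sequence[float]) -> int:
    """Index of the largest value in a row (first index wins on ties)."""
    return max(range(len(row)), key=lambda j: row[j])


def induce_alignments(
    attn: Sequence[Sequence[float]],
    num_target_tokens: int,
    shift: bool = True,
) -> List[Tuple[int, int]]:
    """Return (source_position, target_position) pairs.

    shift=False: align target token t using the step that outputs it (row t).
    shift=True:  align target token t using the next step, where token t is
                 the decoder input (row t + 1).
    """
    offset = 1 if shift else 0
    if len(attn) < num_target_tokens + offset:
        raise ValueError("attention matrix has too few decoding steps")
    return [(argmax(attn[t + offset]), t) for t in range(num_target_tokens)]


# Toy attention: 4 decoding steps (3 target tokens plus the end-of-sentence step),
# 4 source positions. The values are made up for illustration.
attn = [
    [0.1, 0.2, 0.6, 0.1],  # step 0: input <bos>, output y0
    [0.1, 0.1, 0.7, 0.1],  # step 1: input y0,    output y1
    [0.2, 0.6, 0.1, 0.1],  # step 2: input y1,    output y2
    [0.7, 0.1, 0.1, 0.1],  # step 3: input y2,    output <eos>
]
print(induce_alignments(attn, 3, shift=False))
print(induce_alignments(attn, 3, shift=True))
```

The shifted variant needs one extra decoding step, since the last target token only becomes a decoder input at the step that emits the end-of-sentence symbol.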


“…For downstream tasks, word alignment can be used to improve dictionary-guided NMT (Song et al, 2020; Chen et al, 2020). Specifically, at each decoding step in NMT, Chen et al (2020) used a SHIFT-AET method to compute word alignment for the newly generated target word and then revised the newly generated target word by encouraging the pre-specified translation from the dictionary.…”

Section: Dictionary-guided NMT Via Word Alignment (mentioning)

confidence: 99%

A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment

Zhang, Genabith 2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer


Word alignment and machine translation are two closely related tasks. Neural translation models, such as RNN-based and Transformer models, employ a target-to-source attention mechanism which can provide rough word alignments, but with a rather low accuracy. High-quality word alignment can help neural machine translation in many different ways, such as missing word detection, annotation transfer and lexicon injection. Existing methods for learning word alignment include statistical word aligners (e.g. GIZA++) and, more recently, neural word alignment models. This paper presents a bidirectional Transformer based alignment (BTBA) model for unsupervised learning of the word alignment task. Our BTBA model predicts the current target word by attending to the source context and both left-side and right-side target context to produce accurate target-to-source attention (alignment). We further fine-tune the target-to-source attention in the BTBA model to obtain better alignments using a full context based optimization method and self-supervised training. We test our method on three word alignment tasks and show that our method outperforms both previous neural word alignment approaches and the popular statistical word aligner GIZA++.


“…Previously, some NMT with terminology constraints have been studied (Hasler et al, 2018; Alkhouli et al, 2018; Dinu et al, 2019; Chen et al, 2020; Song et al, 2020). For example, Song et al (2020) proposed a dedicated head in a multi-head Transformer architecture to learn explicit word alignment and use it to guide the constrained decoding process. When the source-aligned word matches a dictionary, the model outputs the corresponding target word.…”

Section: Related Work (mentioning)

confidence: 99%

“…Since the emergence of neural machine translation (NMT) models (Sutskever et al, 2014; Bahdanau et al, 2015; Vaswani et al, 2017), several studies have been conducted to explore NMT systems capable of decoding translations under terminological constraints (Hasler et al, 2018; Dinu et al, 2019; Chen et al, 2020; Song et al, 2020). However, these previous studies were conducted under the condition that a bilingual dictionary is given.…”

Section: Introduction (mentioning)

confidence: 99%

Machine Translation with Pre-specified Target-side Words Using a Semi-autoregressive Model

Kondo, Koyama, Kiyuna et al. 2021

Proceedings of the 8th Workshop on Asian Translation (WAT2021)


We introduce our TMU Japanese-to-English system, which employs a semi-autoregressive model, to tackle the WAT 2021 (Nakazawa et al., 2021) restricted translation task. In this task, we translate an input sentence with the constraint that some words, called restricted target vocabularies (RTVs), must be contained in the output sentence. To satisfy this constraint, we use a semi-autoregressive model, namely, RecoverSAT (Ran et al., 2020), due to its ability (known as "forced translation") to insert specified words into the output sentence. When using "forced translation," the order of inserting RTVs is a critical problem. In our system, we obtain word alignment between a source sentence and the corresponding RTVs and then sort the RTVs in the order of their corresponding words or phrases in the source sentence. Using the model with sorted order RTVs, we succeeded in inserting all the RTVs into output sentences in more than 96% of the test sentences. Moreover, we confirmed that sorting RTVs improved the BLEU score compared with random order RTVs.
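The ordering step in the abstract above, sorting the restricted target vocabularies (RTVs) by the position of their aligned source words, can be sketched as below. The abstract does not say how the alignments are represented, so the dictionary `aligned_source_pos` (RTV to source token index) and the choice to place unaligned RTVs last are assumptions made for this example.

```python
from typing import Dict, List


def sort_rtvs_by_source_order(
    rtvs: List[str],
    aligned_source_pos: Dict[str, int],
) -> List[str]:
    """Order RTVs by the index of the source token each one is aligned to.

    Assumption: RTVs without an alignment are placed at the end, keeping
    their original relative order (Python's sort is stable).
    """
    return sorted(rtvs, key=lambda w: aligned_source_pos.get(w, float("inf")))


# Hypothetical RTVs and alignments for illustration only.
rtvs = ["model", "translation", "alignment"]
aligned_source_pos = {"model": 5, "translation": 1}
print(sort_rtvs_by_source_order(rtvs, aligned_source_pos))
```

For Japanese-to-English, source order and target order often differ, so this ordering is only a heuristic starting point for insertion.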
