Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence 2020
DOI: 10.24963/ijcai.2020/544

Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating

Abstract: Existing Neural Machine Translation (NMT) systems are generally trained on a large amount of sentence-level parallel data, and at prediction time sentences are translated independently, ignoring cross-sentence contextual information. This leads to inconsistency between translated sentences. To address this issue, context-aware models have been proposed. However, document-level parallel data constitutes only a small part of the parallel data available, and many approaches build context-aware mod…
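The abstract is cut off above, but the title names the paper's two mechanisms: layer-wise weighting and input-aware gating. Purely as an illustrative sketch (module names, shapes, and the exact fusion rule are assumptions, not the paper's actual implementation), gated context fusion with a learnable per-layer weight might look like:

```python
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    """Input-aware gate: each token decides how much document context to admit."""
    def __init__(self, d_model: int):
        super().__init__()
        # Gate computed from the current input's state and the context vector.
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, hidden: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # hidden, context: (batch, seq_len, d_model)
        g = torch.sigmoid(self.proj(torch.cat([hidden, context], dim=-1)))
        return g * hidden + (1.0 - g) * context

class LayerWiseContextFusion(nn.Module):
    """One gate plus one learnable scalar weight per layer, so shallow and
    deep layers can rely on cross-sentence context to different degrees."""
    def __init__(self, d_model: int, n_layers: int):
        super().__init__()
        self.gates = nn.ModuleList(ContextGate(d_model) for _ in range(n_layers))
        self.layer_weights = nn.Parameter(torch.zeros(n_layers))

    def forward(self, layer_idx: int, hidden, context):
        w = torch.sigmoid(self.layer_weights[layer_idx])  # per-layer weight in (0, 1)
        return self.gates[layer_idx](hidden, w * context)
```

Here `hidden` would be a layer's representation of the current sentence and `context` an attention summary over neighbouring sentences; the gate lets each token, and the scalar lets each layer, decide how much context to admit.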

Cited by 9 publications (13 citation statements) | References 12 publications

“…Researchers have proposed various context-aware networks that use contextual information to improve DocNMT models in translation quality (Jean et al., 2017; Tu et al., 2018; Kuang et al., 2018) or discourse phenomena (Bawden et al., 2018; Voita et al., 2019b,a). However, most methods leverage all context sentences within a fixed-size window that is tuned on development sets (Wang et al., 2017; Miculicich et al., 2018; Yang et al., 2019; Xu et al., 2020), or the full context of the entire document (Maruf and Haffari, 2018; Tan et al., 2019; Kang and Zong, 2020; Zheng et al., 2020). They ignore the individualized need for context when translating different source sentences.…”
Section: Related Work (mentioning)
confidence: 99%

“…The majority of existing DocNMT models fix the context size or scope. They utilize all of the previous k context sentences (Miculicich et al., 2018; Voita et al., 2019b; Yang et al., 2019; Xu et al., 2020), or the full context of the entire document (Maruf and Haffari, 2018; Tan et al., 2019; Zheng et al., 2020). As a result, inadequate or redundant contextual information is almost inevitable.…”
Section: Introduction (mentioning)
confidence: 99%

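The fixed-window strategy this quote criticizes is easy to state concretely. A hypothetical helper (not code from any cited paper) that always returns the previous k sentences, relevant or not:

```python
def previous_k_context(doc_sentences: list[str], i: int, k: int = 3) -> list[str]:
    """Fixed-size context: the k sentences preceding sentence i
    (fewer at the start of the document), chosen irrespective of relevance."""
    return doc_sentences[max(0, i - k):i]
```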
“…Many methods have been proposed to improve document-level neural machine translation (DNMT). Among them, mainstream studies focus on modifying the model architecture, including hierarchical attention (Wang et al., 2017; Miculicich et al., 2018; Tan et al., 2019), additional context-extraction encoders or query layers (Jean et al., 2017; Bawden et al., 2017; Maruf et al., 2019; Jiang et al., 2019; Zheng et al., 2020; Yun et al., 2020; Xu et al., 2020), and cache-like memory networks (Maruf and Haffari, 2018; Tu et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%

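Of the architecture families listed here, hierarchical attention is the easiest to sketch: attend over the words of each context sentence, then attend over the resulting per-sentence summaries. The following is a hedged PyTorch sketch under assumed shapes, not the implementation of any cited paper:

```python
import torch
import torch.nn as nn

class HierarchicalContextAttention(nn.Module):
    """Word-level attention inside each context sentence, then
    sentence-level attention over the per-sentence summaries."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.word_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.sent_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, query: torch.Tensor, ctx_sentences: list) -> torch.Tensor:
        # query: (batch, tgt_len, d_model) states for the current sentence
        # ctx_sentences: list of (batch, src_len_j, d_model), one per context sentence
        summaries = []
        for ctx in ctx_sentences:
            # Word level: attend over the tokens of one context sentence.
            s, _ = self.word_attn(query, ctx, ctx)
            summaries.append(s)
        # Sentence level: attend over the per-sentence summaries.
        stacked = torch.stack(summaries, dim=2)       # (b, tgt, n_ctx, d)
        b, t, n, d = stacked.shape
        flat = stacked.view(b * t, n, d)
        q = query.reshape(b * t, 1, d)
        out, _ = self.sent_attn(q, flat, flat)
        return out.view(b, t, d)                      # fused context per token
```

The two-level structure is the point: the sentence-level attention can down-weight whole context sentences that the word-level pass found uninformative.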