Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence 2020
DOI: 10.24963/ijcai.2020/544

Efficient Context-Aware Neural Machine Translation with Layer-Wise Weighting and Input-Aware Gating

Abstract: Existing Neural Machine Translation (NMT) systems are generally trained on a large amount of sentence-level parallel data, and at prediction time sentences are translated independently, ignoring cross-sentence contextual information. This leads to inconsistency between translated sentences. To address this issue, context-aware models have been proposed. However, document-level parallel data constitutes only a small part of the parallel data available, and many approaches build context-aware mod…
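The abstract is cut off above, but the title names the paper's two mechanisms: layer-wise weighting and input-aware gating. Purely as an illustrative sketch (module names, shapes, and the exact fusion rule are assumptions, not the paper's actual implementation), gated context fusion with a learnable per-layer weight might look like:

```python
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    """Input-aware gate: each token decides how much document context to admit."""
    def __init__(self, d_model: int):
        super().__init__()
        # Gate computed from the current input's state and the context vector.
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, hidden: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # hidden, context: (batch, seq_len, d_model)
        g = torch.sigmoid(self.proj(torch.cat([hidden, context], dim=-1)))
        return g * hidden + (1.0 - g) * context

class LayerWiseContextFusion(nn.Module):
    """One gate plus one learnable scalar weight per layer, so shallow and
    deep layers can rely on cross-sentence context to different degrees."""
    def __init__(self, d_model: int, n_layers: int):
        super().__init__()
        self.gates = nn.ModuleList(ContextGate(d_model) for _ in range(n_layers))
        self.layer_weights = nn.Parameter(torch.zeros(n_layers))

    def forward(self, layer_idx: int, hidden, context):
        w = torch.sigmoid(self.layer_weights[layer_idx])  # per-layer weight in (0, 1)
        return self.gates[layer_idx](hidden, w * context)
```

Here `hidden` would be a layer's representation of the current sentence and `context` an attention summary over neighbouring sentences; the gate lets each token, and the scalar lets each layer, decide how much context to admit.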

Cited by 9 publications (13 citation statements) | References 12 publications

“…Researchers have proposed various context-aware networks that use contextual information to improve DocNMT models in translation quality (Jean et al., 2017; Tu et al., 2018; Kuang et al., 2018) or discourse phenomena (Bawden et al., 2018; Voita et al., 2019b,a). However, most methods leverage all context sentences within a fixed-size window that is tuned on development sets (Wang et al., 2017; Miculicich et al., 2018; Yang et al., 2019; Xu et al., 2020), or the full context of the entire document (Maruf and Haffari, 2018; Tan et al., 2019; Kang and Zong, 2020; Zheng et al., 2020). They ignore the individualized need for context when translating different source sentences.…”
Section: Related Work (mentioning)
confidence: 99%

“…The majority of existing DocNMT models fix the context size or scope. They utilize all of the previous k context sentences (Miculicich et al., 2018; Voita et al., 2019b; Yang et al., 2019; Xu et al., 2020), or the full context of the entire document (Maruf and Haffari, 2018; Tan et al., 2019; Zheng et al., 2020). As a result, inadequate or redundant contextual information is almost inevitable.…”
Section: Introduction (mentioning)
confidence: 99%

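The fixed-window strategy this quote criticizes is easy to state concretely. A hypothetical helper (not code from any cited paper) that always returns the previous k sentences, relevant or not:

```python
def previous_k_context(doc_sentences: list[str], i: int, k: int = 3) -> list[str]:
    """Fixed-size context: the k sentences preceding sentence i
    (fewer at the start of the document), chosen irrespective of relevance."""
    return doc_sentences[max(0, i - k):i]
```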
“…Many methods have been proposed to improve document-level neural machine translation (DNMT). Among them, mainstream studies focus on modifying the model architecture, including hierarchical attention (Wang et al., 2017; Miculicich et al., 2018; Tan et al., 2019), additional context-extraction encoders or query layers (Jean et al., 2017; Bawden et al., 2017; Maruf et al., 2019; Jiang et al., 2019; Zheng et al., 2020; Yun et al., 2020; Xu et al., 2020), and cache-like memory networks (Maruf and Haffari, 2018; Tu et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%

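Of the architecture families listed here, hierarchical attention is the easiest to sketch: attend over the words of each context sentence, then attend over the resulting per-sentence summaries. The following is a hedged PyTorch sketch under assumed shapes, not the implementation of any cited paper:

```python
import torch
import torch.nn as nn

class HierarchicalContextAttention(nn.Module):
    """Word-level attention inside each context sentence, then
    sentence-level attention over the per-sentence summaries."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.word_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.sent_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, query: torch.Tensor, ctx_sentences: list) -> torch.Tensor:
        # query: (batch, tgt_len, d_model) states for the current sentence
        # ctx_sentences: list of (batch, src_len_j, d_model), one per context sentence
        summaries = []
        for ctx in ctx_sentences:
            # Word level: attend over the tokens of one context sentence.
            s, _ = self.word_attn(query, ctx, ctx)
            summaries.append(s)
        # Sentence level: attend over the per-sentence summaries.
        stacked = torch.stack(summaries, dim=2)       # (b, tgt, n_ctx, d)
        b, t, n, d = stacked.shape
        flat = stacked.view(b * t, n, d)
        q = query.reshape(b * t, 1, d)
        out, _ = self.sent_attn(q, flat, flat)
        return out.view(b, t, d)                      # fused context per token
```

The two-level structure is the point: the sentence-level attention can down-weight whole context sentences that the word-level pass found uninformative.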