Enhancing Context Modeling with a Query-Guided Capsule Network for Document-level Translation

Yang, Zhengxin; Zhang, Jinchao; Meng, Fandong; Gu, Shuhao; Feng, Yang; Zhou, Jie

doi:10.18653/v1/d19-1164

Cited by 46 publications

(24 citation statements)

References 25 publications


“…Researchers propose various context-aware networks to utilize contextual information to improve the performance of DocNMT models on the translation quality (Jean et al, 2017; Tu et al, 2018; Kuang et al, 2018) or discourse phenomena (Bawden et al, 2018; Voita et al, 2019b,a). However, most methods roughly leverage all context sentences in a fixed size that is tuned on development sets (Wang et al, 2017; Miculicich et al, 2018; Yang et al, 2019; Xu et al, 2020), or full context in the entire document (Maruf and Haffari, 2018; Tan et al, 2019; Kang and Zong, 2020; Zheng et al, 2020). They ignore the individualized needs for context when translating different source sentences.…”

Section: Related Work (mentioning)

confidence: 99%

“…Majority of existing DocNMT models set the context size or scope to be fixed. They utilize all of the previous k context sentences (Miculicich et al, 2018; Voita et al, 2019b; Yang et al, 2019; Xu et al, 2020), or the full context in the entire document (Maruf and Haffari, 2018; Tan et al, 2019; Zheng et al, 2020). As a result, the inadequacy or redundancy of contextual information is almost inevitable.…”

Section: Introduction (mentioning)

confidence: 99%


Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning

Kang, Zhao, Zhang

2020

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)


Document-level neural machine translation has yielded attractive improvements. However, majority of existing methods roughly use all context sentences in a fixed scope. They neglect the fact that different source sentences need different sizes of context. To address this problem, we propose an effective approach to select dynamic context so that the document-level translation model can utilize the more useful selected context sentences to produce better translations. Specifically, we introduce a selection module that is independent of the translation module to score each candidate context sentence. Then, we propose two strategies to explicitly select a variable number of context sentences and feed them into the translation module. We train the two modules end-to-end via reinforcement learning. A novel reward is proposed to encourage the selection and utilization of dynamic context sentences. Experiments demonstrate that our approach can select adaptive context sentences for different source sentences, and significantly improves the performance of document-level translation methods.



“…Tu et al (2018) augment the translation model with a cache-like memory network that stores recent hidden representations as translation history. Yang et al (2019) introduce a query-guided capsule network into document-level translation to capture high-level capsules related to the current source sentence. proposes a unified encoder to process the concatenated source information that only attends to the source sentence at the top of encoder blocks.…”

Section: Related Work (mentioning)

confidence: 99%

“…Transformer (Vaswani et al, 2017) performs context-agnostic sent-level translation and HAN (Werlen et al, 2018) employs hierarchical attention to capture extra contexts. SAN (Maruf et al, 2019) utilizes top-down attention to selectively focus on relevant sentences and QCN (Yang et al, 2019) uses query-guided capsule networks to capture the related capsules.…”

Section: Baselines (mentioning)

confidence: 99%

Context-Interactive Pre-Training for Document Machine Translation

Yang, Zhang, Chen et al. 2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua


Document machine translation aims to translate the source sentence into the target language in the presence of additional contextual information. However, it typically suffers from a lack of doc-level bilingual data. To remedy this, here we propose a simple yet effective context-interactive pre-training approach, which targets benefiting from external large-scale corpora. The proposed model performs inter sentence generation to capture the cross-sentence dependency within the target document, and cross sentence translation to make better use of valuable contextual information. Comprehensive experiments illustrate that our approach can achieve state-of-the-art performance on three benchmark datasets, which significantly outperforms a variety of baselines.


“…Cache/Memory-based approaches (Tu et al, 2018; Kuang et al, 2018; Maruf and Haffari, 2018; Wang et al, 2017) store word/sentence translation in previous sentences for future sentence translation. Various approaches with extra context encoders are proposed to model either local context, e.g., previous sentences (Wang et al, 2017; Bawden et al, 2018; Voita et al, 2018, 2019b; Yang et al, 2019; Huo et al, 2020), or entire document (Maruf and Haffari, 2018; Mace and Servan, 2019; Maruf et al, 2019; Tan et al, 2019; Zheng et al, 2020; Kang et al, 2020).…”

Section: Context-aware NMT (mentioning)

confidence: 99%

Breaking the Corpus Bottleneck for Context-Aware Neural Machine Translation with Cross-Task Pre-training

Chen, Li, Gong et al. 2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer


Context-aware neural machine translation (NMT) remains challenging due to the lack of large-scale document-level parallel dataset. To break the corpus bottleneck, in this paper we aim to improve context-aware NMT by taking the advantage of the availability of both large-scale sentence-level parallel dataset and source-side monolingual documents. To this end, we propose two pre-training tasks. One learns to translate a sentence from source language to target language on the sentence-level parallel dataset while the other learns to translate a document from deliberately noised to original on the monolingual documents. Importantly, the two pre-training tasks are jointly and simultaneously learned via the same model, thereafter fine-tuned on scale-limited parallel documents from both sentence-level and document-level perspectives. Experimental results on four translation tasks show that our approach significantly improves translation performance. One nice property of our approach is that the fine-tuned model can be used to translate both sentences and documents.

