2021
DOI: 10.1145/3441691
A Survey on Document-level Neural Machine Translation

Abstract: Machine translation (MT) is an important task in natural language processing (NLP), as it automates the translation process and reduces the reliance on human translators. With the resurgence of neural networks, translation quality has surpassed that of translations obtained using statistical techniques for most language pairs. Until a few years ago, almost all neural translation models translated sentences independently, without incorporating the wider documen…

Cited by 71 publications (52 citation statements)
References 96 publications
“…Context-aware Machine Translation There have been many works in the literature that try to incorporate context into NMT systems. Tiedemann and Scherrer (2017) first proposed the simple approach of concatenating the previous sentences, on both the source and the target side, to the input of the system; Jean et al. (2017) and Bawden et al. (2018) used an additional context-specific encoder to extract contextual features from the previous sentences; Maruf and Haffari (2018) and Tu et al. (2018b) proposed further context-integration mechanisms. For a more detailed overview, Maruf et al. (2019b) extensively describe the different approaches and how they leverage context. While these models lead to improvements with small training sets, Lopes et al. (2020) showed that the improvements are negligible compared with the concatenation baseline when using larger datasets.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
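To make the concatenation approach mentioned in the statement above concrete, here is a minimal sketch in the spirit of Tiedemann and Scherrer (2017): previous sentences are prepended to the current sentence on both the source and the target side. The separator token `<SEP>`, the one-sentence context window, and the function names are assumptions for illustration, not the exact setup of any surveyed system.

```python
# Illustrative sketch of concatenation-based context-aware NMT data
# preparation (in the spirit of Tiedemann and Scherrer, 2017).
# The "<SEP>" marker and the default context window are assumptions.

from typing import List, Tuple

SEP = "<SEP>"  # assumed boundary marker between context and current sentence


def build_concat_examples(
    src_doc: List[str], tgt_doc: List[str], context_size: int = 1
) -> List[Tuple[str, str]]:
    """Prepend up to `context_size` previous sentences to each source and
    target sentence, separated by SEP, yielding document-aware pairs."""
    examples = []
    for i, (src, tgt) in enumerate(zip(src_doc, tgt_doc)):
        src_ctx = src_doc[max(0, i - context_size):i]
        tgt_ctx = tgt_doc[max(0, i - context_size):i]
        src_in = f" {SEP} ".join(src_ctx + [src])
        tgt_out = f" {SEP} ".join(tgt_ctx + [tgt])
        examples.append((src_in, tgt_out))
    return examples


if __name__ == "__main__":
    src = ["Der Hund schläft .", "Er träumt ."]
    tgt = ["The dog is sleeping .", "It is dreaming ."]
    for s, t in build_concat_examples(src, tgt):
        print(s, "=>", t)
```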
“…According to length, sentences are divided into three categories: short sentences (1–9 words), medium-length sentences (10–25 words), and long sentences (more than 25 words). In constructed corpora, the average sentence length of English text is 26.26 words in the FLOB Corpus and 32.48 words in the Brown Corpus, both of which already fall within the definition of a long sentence [5]. Taking long news sentences as an example, the book News Reporting and Writing holds that short sentences carry too little information and easily cause ambiguity, while long news sentences are difficult to understand, so it is best to keep the introductory (lead) sentence of a news item within 35 words.…”
Section: Definition Of An English Long Statement (citation type: mentioning)
confidence: 99%
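A minimal sketch of the length categories quoted above (short: 1 to 9 words, medium: 10 to 25 words, long: more than 25 words); the whitespace-based word count is an assumption made for illustration only.

```python
# Minimal sketch of the sentence-length categories quoted above.
# Whitespace tokenization is an assumption; real corpora use proper tokenizers.

def length_category(sentence: str) -> str:
    """Classify an English sentence by its word count."""
    n_words = len(sentence.split())
    if n_words <= 9:
        return "short"
    if n_words <= 25:
        return "medium"
    return "long"


if __name__ == "__main__":
    example = ("The committee agreed that the proposal, despite several "
               "reservations raised during the earlier meetings, should "
               "nevertheless be forwarded to the executive board for a final "
               "decision at the next quarterly meeting.")
    print(length_category(example))  # expected: "long" (more than 25 words)
```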
“…We also evaluate the models from the viewpoint of interpolation, which we define as the ability to generate sequences whose lengths are seen during training. Specifically, we evaluate interpolation using long sequences since, first, the generation of long sequences is an important research topic in NLP (Zaheer et al., 2020; Maruf et al., 2021) and, second, in datasets with long sequences the position distribution of each token becomes increasingly sparse. In other words, tokens in the validation and test sets become unlikely to be observed in the training set at the corresponding positions; we expect that shift invariance is crucial for addressing such position sparsity.…”
Section: (iii) Interpolate (citation type: mentioning)
confidence: 99%
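The position-sparsity argument in the statement above can be made concrete with a small sketch: count how many (token, position) pairs in a held-out set never occur at the same absolute position in the training data. This is an illustrative reading of the quoted intuition, not the evaluation code of the cited work; the toy data and function names are assumptions.

```python
# Illustrative sketch of "position sparsity": the fraction of held-out
# (token, position) pairs that were never observed at that position in training.

from typing import Iterable, List, Set, Tuple


def seen_token_positions(train: Iterable[List[str]]) -> Set[Tuple[str, int]]:
    """Collect every (token, absolute position) pair occurring in training."""
    seen = set()
    for seq in train:
        for pos, tok in enumerate(seq):
            seen.add((tok, pos))
    return seen


def position_sparsity(train: Iterable[List[str]],
                      held_out: Iterable[List[str]]) -> float:
    """Fraction of held-out (token, position) pairs never seen in training."""
    seen = seen_token_positions(train)
    total = unseen = 0
    for seq in held_out:
        for pos, tok in enumerate(seq):
            total += 1
            unseen += (tok, pos) not in seen
    return unseen / total if total else 0.0


if __name__ == "__main__":
    train = [["the", "dog", "sleeps"], ["a", "cat", "sleeps", "too"]]
    valid = [["the", "cat", "sleeps", "here", "now"]]
    # Longer held-out sequences reach positions rarely (or never) seen in
    # training, so sparsity grows with sequence length.
    print(f"position sparsity: {position_sparsity(train, valid):.2f}")
```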