Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL) 2019
DOI: 10.18653/v1/k19-1031

On the Relation between Position Information and Sentence Length in Neural Machine Translation

Abstract: Long sentences have been one of the major challenges in neural machine translation (NMT). Although some approaches such as the attention mechanism have partially remedied the problem, we found that the current standard NMT model, Transformer, has difficulty in translating long sentences compared to the former standard, Recurrent Neural Network (RNN)-based model. One of the key differences of these NMT models is how the model handles position information which is essential to process sequential data. In this st…
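The abstract refers to how the Transformer injects position information into its inputs. As context, here is a minimal NumPy sketch of the sinusoidal absolute positional encoding of the standard Transformer (Vaswani et al., 2017); the function name and the chosen dimensions are illustrative, not taken from the paper.

```python
# Minimal sketch (not the authors' code): sinusoidal absolute positional
# encoding as defined for the standard Transformer.
import numpy as np

def sinusoidal_position_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Return an array of shape (max_len, d_model) with
    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model)),
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))."""
    positions = np.arange(max_len)[:, None]        # (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]       # (1, d_model/2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# Example: encodings for the first 512 positions of a 512-dimensional model.
pe = sinusoidal_position_encoding(max_len=512, d_model=512)
```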

Cited by 36 publications (42 citation statements). References 17 publications.
“…They stated that this degradation in quality was caused by the short length of the translations. Additionally, Neishi and Yoshinaga (2019) propose using relative position information instead of absolute position information to mitigate the performance drop of NMT models on long sentences. They conducted an analysis of translation quality and sentence length on length-controlled English-to-Japanese parallel data and showed that absolute positional information sharply reduces the BLEU score of the Transformer model (Vaswani et al., 2017) when translating sentences longer than those in the training data.…”
Section: Related Work
confidence: 99%
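A minimal sketch of the intuition behind this finding, under assumed lengths and a Shaw et al. (2018)-style clipping window (the constants are illustrative, not from the cited papers): absolute position indices of a long test sentence fall outside the range seen during training, whereas clipped relative offsets never do.

```python
# Illustrative sketch: why relative position information extrapolates to
# longer sentences while absolute position indices do not.
import numpy as np

TRAIN_MAX_LEN = 50     # assumed maximum sentence length seen during training
CLIP_DISTANCE = 16     # assumed clipping window for relative offsets

test_len = 120         # a test sentence longer than anything seen in training

# Absolute position indices: positions 50..119 were never observed in
# training, so their encodings are effectively untrained.
absolute_ids = np.arange(test_len)
unseen = absolute_ids >= TRAIN_MAX_LEN
print(f"{unseen.sum()} of {test_len} absolute positions are outside the training range")

# Relative offsets between query position i and key position j, clipped to a
# fixed window: every value already occurred in training, regardless of length.
i = np.arange(test_len)[:, None]
j = np.arange(test_len)[None, :]
relative_ids = np.clip(j - i, -CLIP_DISTANCE, CLIP_DISTANCE)
print(f"relative offsets range from {relative_ids.min()} to {relative_ids.max()}")
```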
“…They stated that this degradation in quality was caused by the short length of the translations. Additionally, Neishi and Yoshinaga (2019) propose using relative position information instead of absolute position information to mitigate the performance drop of NMT models on long sentences. They conducted an analysis of translation quality and sentence length on length-controlled English-to-Japanese parallel data and showed that absolute positional information sharply reduces the BLEU score of the Transformer model when translating sentences longer than those in the training data.…”
Section: Discussion
confidence: 99%
“…RPE outperforms APE on data that is out of distribution in terms of sequence length owing to its innate shift invariance (Rosendahl et al., 2019; Neishi and Yoshinaga, 2019; Narang et al., 2021; Wang et al., 2021). However, the self-attention mechanism of RPE involves more computation than that of APE.…”
Section: Relative Position Embedding (RPE)
confidence: 99%
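The extra computation mentioned here comes from the per-pair position term inside attention. The sketch below contrasts single-head attention with APE (position added to the inputs once) against an RPE variant that adds a bias indexed by the clipped offset j − i to every attention logit; a T5-style scalar bias per offset is used for brevity rather than the vector formulation of Shaw et al. (2018), and all shapes and names are assumptions for illustration.

```python
# Illustrative sketch (assumed shapes, not a specific library's API):
# single-head self-attention with APE vs. with a relative position bias (RPE).
import numpy as np

rng = np.random.default_rng(0)
L, d = 8, 16                              # sequence length, head dimension
x = rng.normal(size=(L, d))               # token representations

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# APE: position information is added to the inputs once; attention is unchanged.
ape = rng.normal(size=(L, d))             # stand-in for absolute position encodings
q = k = v = x + ape
logits_ape = q @ k.T / np.sqrt(d)

# RPE: every attention logit gets an extra bias indexed by the clipped offset
# j - i, i.e. an additional (L x L) lookup on top of the dot products.
max_dist = 4
rel_bias_table = rng.normal(size=(2 * max_dist + 1,))   # one learned scalar per offset
offsets = np.clip(np.arange(L)[None, :] - np.arange(L)[:, None], -max_dist, max_dist)
logits_rpe = (x @ x.T) / np.sqrt(d) + rel_bias_table[offsets + max_dist]

out_ape = softmax(logits_ape) @ v
out_rpe = softmax(logits_rpe) @ x
```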
“…RPE outperforms APE on sequence-to-sequence tasks (Narang et al., 2021; Neishi and Yoshinaga, 2019) due to extrapolation, i.e., the ability to generalize to sequences that are longer than those observed during training (Newman et al., 2020). Wang et al. (2021) reported that one of the key properties contributing to RPE's superior performance is shift invariance, the property of a function not to change its output even if its input is shifted.…”
Section: Introduction
confidence: 99%
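A toy demonstration of the shift invariance referred to above (not code from the cited papers): relative offsets j − i are unchanged when the whole window of positions is shifted by k, while the absolute indices themselves are not.

```python
# Shift invariance of relative offsets vs. absolute positions (toy example).
import numpy as np

L, k = 6, 100                                       # window length and shift (arbitrary)
i = np.arange(L)
rel = i[None, :] - i[:, None]                       # offsets within the original window
rel_shifted = (i + k)[None, :] - (i + k)[:, None]   # offsets after shifting by k

assert np.array_equal(rel, rel_shifted)             # relative offsets: shift-invariant
assert not np.array_equal(i, i + k)                 # absolute positions: not invariant
```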