2019
DOI: 10.3390/app9102036

Corpus Augmentation for Neural Machine Translation with Chinese-Japanese Parallel Corpora

Abstract: The translation quality of Neural Machine Translation (NMT) systems depends strongly on the training data size. Sufficient amounts of parallel data are, however, not available for many language pairs. This paper presents a corpus augmentation method, which has two variations: one is for all language pairs, and the other is for the Chinese-Japanese language pair. The method uses both source and target sentences of the existing parallel corpus and generates multiple pseudo-parallel sentence pairs from a long par…

Cited by 27 publications (22 citation statements)
References: 17 publications
“…Taking into account the types of English songs used in this article, the lyrics of the songs include English and other different types of languages. In order to ensure the compatibility of the system with multiple languages, this article cuts the words of each character in English songs according to special characters, which guarantees the versatility and fluency of the word-slicing algorithm [32][33][34][35]. The word vector expresses the adaptive expression method of dynamic lyrics recognition and segmentation.…”
Section: Word Vector Pre-representation Of Lyrics Text (mentioning)
confidence: 99%
“…However, it does not solve the problems of the single acquisition method of data resources in the existing marketing strategy, the acquisition of fewer resources, and the inability to achieve deep learning processing of data resources, and the setting and placement of advertisements based on data resources. For this reason, we propose a marketing strategy based on deep machine learning algorithms [24][25][26][27][28].…”
Section: Introduction (mentioning)
confidence: 99%
“…The application of deep learning theory can better solve these problems of statistical machine translation. Existing research methods mainly include two kinds: one uses deep learning technology to improve key modules and built the model to achieve direct source language to target language [23][24][25][26][27].…”
Section: Introduction (mentioning)
confidence: 99%