2019 International Conference on Asian Language Processing (IALP) 2019
DOI: 10.1109/ialp48816.2019.9037677

Character Decomposition for Japanese-Chinese Character-Level Neural Machine Translation

Cited by 6 publications (4 citation statements)
References 12 publications
“…BPE denotes a subword based NMT with a joint vocabulary of 32K BPE tokens † † . Different to NMT for alphabetic languages, we can observe that the BPE approach in Japanese-Chinese NMT is inferior to the baseline using characters, which is consistent with the conclusions in [12], [13]. One reason is that the words in logographic languages are far shorter than words in alphabetic languages, which causes BPE failing to solve the low frequency words' problem.…”
Section: Results
type: supporting
confidence: 77%
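
The reasoning in the statement above (short logographic words leave BPE little to share between rare and frequent words) can be illustrated with a toy sketch. This is not the cited paper's implementation: `learn_bpe` is a hypothetical helper written for this illustration, and the example words and frequencies are made up.

```python
from collections import Counter


def learn_bpe(word_freqs, num_merges):
    """Toy BPE learner: returns (merges, segmented vocabulary).

    word_freqs is a hypothetical {word: frequency} dict; each word
    starts out as a sequence of single characters.
    """
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single token
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the chosen merge to every word.
        new_vocab = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            key = tuple(out)
            new_vocab[key] = new_vocab.get(key, 0) + freq
        vocab = new_vocab
    return merges, vocab


# Two-character words (assumed examples): one merge each, then nothing is left.
ja_merges, ja_vocab = learn_bpe({"翻訳": 5, "機械": 3}, num_merges=10)
# Longer alphabetic words (assumed examples) end up sharing a subword.
en_merges, en_vocab = learn_bpe({"lower": 5, "lowest": 3}, num_merges=3)
print(ja_vocab)
print(en_vocab)
```

In the two-character case each word collapses into a single whole-word token after one merge, so a rare word stays a rare token. In the alphabetic case the two words share a learned prefix, which is the kind of sharing BPE relies on for low-frequency words.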
“…For Japanese-Chinese translation, Zhang et al proposed the following three data augmentation methods to improve the quality of Japanese-Chinese NMT: (1) radicals as an additional input feature [25]; (2) the created Chinese character decomposition table [26]; (3) a corpus augmentation approach [27], considering the lack of resources in bilingual corpora.…”
Section: Guokun Et Al Automatically Built a Corpus By Crawling Langua...
type: mentioning
confidence: 99%
“…Corpus linguistics [3] Japanese-Chinese bilingual corpora [1], TED talks, [4,5] Web-crawled corpora [7-11,13,14,16,17,19,24] Other corpora [6,12,18,20-22] Corpus augmentation [15,23,25-27] The above related research showed that corpora play an important role in improving translation accuracy and in other directions of language processing. Thus, the construction of a Japanese-Chinese bilingual corpus for NMT has significant implications for the resource scarcity problem.…”
Section: Classification Related Work
type: mentioning
confidence: 99%
“…Recently, statistical machine translation (SMT) and Neural Machine Translation (NMT) systems have been the leading machine translation paradigms [1, 2, 3]. Standard SMT techniques do not depend on any linguistic information, and do not apply any pre-processing procedures to generate the translation [4, 5].…”
Section: Introduction
type: mentioning
confidence: 99%