2020
DOI: 10.1017/s1351324920000303

Neural machine translation of low-resource languages using SMT phrase pair injection

Abstract: Neural machine translation (NMT) has recently shown promising results on publicly available benchmark datasets and is being rapidly adopted in various production systems. However, it requires high-quality large-scale parallel corpus, and it is not always possible to have sufficiently large corpus as it requires time, money, and professionals. Hence, many existing large-scale parallel corpus are limited to the specific languages and domains. In this paper, we propose an effective approach to improve an NMT syst…

citations
Cited by 23 publications
(15 citation statements)
references
References 31 publications
(66 reference statements)
“…This score was improved later in the task of WAT2020 (Laskar et al, 2020c) and utilizes pre-train word embeddings of the monolingual corpus and additional parallel data of IITB. This work attempts to utilize phrase pairs (Sen et al, 2020) to enhance the translational performance of the WAT2021: English to Hindi multimodal translation task.…”
Section: Related Work (mentioning)
confidence: 99%
“…The attention-based NMT yields substantial performance for Indian language translation (Laskar et al, 2019a,b, 2020a, 2021b). Moreover, NMT performance can be enhanced by utilizing monolingual data (Sennrich et al, 2016; Zhang and Zong, 2016; Laskar et al, 2020b) and phrase pair injection (Sen et al, 2020), effective in low resource language pair translation. This paper aims English to Hindi translation using the multimodal concept by taking advantage of monolingual data and phrase pair injections to improve the translation quality at the WAT2021 translation task.…”
Section: Introduction (mentioning)
confidence: 99%
“…In (Sen et al, 2020), authors used SMT-based phrase pairs to augment with the original parallel data to improve low-resource language pairs translation. In SMT, Giza++ word alignment tool is used to extract phrase pair.…”
Section: Data Augmentation (mentioning)
confidence: 99%
“…In SMT, Giza++ word alignment tool is used to extract phrase pair. Inspired by the work (Sen et al, 2020), we have extracted phrase pairs using Giza++. Then after removing duplicates and blank lines, the obtained phrase pairs are augmented to the original parallel data.…”
Section: Data Augmentation (mentioning)
confidence: 99%
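
The cleanup and augmentation step described in the statement above (remove duplicates and blank lines, then append the phrase pairs to the original parallel data) might look roughly like the following. This is a minimal sketch, not the citing authors' code: it assumes the phrase pairs have already been extracted by the SMT pipeline and loaded as (source, target) string tuples, and the function names `clean_phrase_pairs` and `augment_corpus` are hypothetical.

```python
from typing import Iterable, List, Tuple

# A (source, target) sentence or phrase pair. Assumed representation.
Pair = Tuple[str, str]


def clean_phrase_pairs(pairs: Iterable[Pair]) -> List[Pair]:
    """Drop pairs with a blank side and exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned: List[Pair] = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt:
            continue  # blank line on either side
        if (src, tgt) in seen:
            continue  # exact duplicate of an earlier pair
        seen.add((src, tgt))
        cleaned.append((src, tgt))
    return cleaned


def augment_corpus(parallel: Iterable[Pair], phrase_pairs: Iterable[Pair]) -> List[Pair]:
    """Append the cleaned phrase pairs to the original parallel data (hypothetical helper)."""
    return list(parallel) + clean_phrase_pairs(phrase_pairs)
```

Whether duplicates should also be checked against the original parallel sentences is not stated in the quoted text; the sketch deduplicates only within the extracted phrase pairs.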