Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: St 2018
DOI: 10.18653/v1/n18-4016

Neural Machine Translation for Low Resource Languages using Bilingual Lexicon Induced from Comparable Corpora

Abstract: Resources for non-English languages are scarce. This paper addresses the problem in the context of machine translation by automatically extracting parallel sentence pairs from multilingual articles available on the Internet. We use an end-to-end Siamese bidirectional recurrent neural network to generate parallel sentences from comparable multilingual articles in Wikipedia. Subsequently, we show that using the harvested dataset improved BLEU scores on both NMT and phrase-…

citations
Cited by 16 publications
(1 citation statement)
references
References 12 publications
“…However, the Transformer suffers in low-resource translation tasks [7][8][9][10][11] where there is not a large-scale parallel corpus available. The core of this problem is the mismatch between the big model capacity and the small training parallel data available.…”
Section: Introduction; citation type: mentioning
confidence: 99%