2019
DOI: 10.1007/978-981-13-9282-5_47

OdiEnCorp: Odia–English and Odia-Only Corpus for Machine Translation

Abstract: The preparation of parallel corpora is a challenging task, particularly for languages that suffer from under-representation in the digital world. In a multi-lingual country like India, the need for such parallel corpora is stringent for several low-resource languages. In this work, we provide an extended English-Odia parallel corpus, OdiEnCorp 2.0, aiming particularly at Neural Machine Translation (NMT) systems which will help translate English↔Odia. OdiEnCorp 2.0 includes existing English-Odia corpora and we …

Cited by 12 publications (6 citation statements)
References 18 publications
“…The machine translation evaluation matrices BLEU (Papineni et al, 2002) and ChrF (Popović, 2017) used by the organizers to evaluate the submissions. Based on our observation, the statistical approach performed well as compared to NMT for many language pairs as shown in the Table 2 (Parida et al, 2019). Also, among NMT model settings one-to-one and one-to-many perform well based on the language pairs.…”
Section: Results (mentioning)
confidence: 66%
“…This leads to depending entirely on accurate document pair retrieval for extracting a corpus of reasonable alignment accuracy. The preliminary efforts used to retrieve and align data in Siripragada et al [2020] comprises of Bible [Parida et al 2020] 2 and 4 we observe increase in BLEU scores when translating to English. This leads to better retrieval and sentence level accuracies.…”
Section: Discussion (mentioning)
confidence: 70%