Cost-Effective Training in Low-Resource Neural Machine Translation
Preprint, 2022. DOI: 10.48550/arxiv.2201.05700

Abstract: While Active Learning (AL) techniques are explored in Neural Machine Translation (NMT), only a few works focus on tackling low annotation budgets, where only a limited number of sentences can be translated. Such situations are especially challenging and can occur for endangered languages with few human annotators, or where cost constraints prevent labeling large amounts of data. Although AL is shown to be helpful with large budgets, it is not enough to build high-quality translation systems in these low-resource conditions…
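The abstract frames AL as choosing which sentences to have translated under a fixed annotation budget. As a minimal, hypothetical sketch of that selection step (the function names and the length-based stand-in score below are illustrative assumptions, not the paper's method):

```python
import random

def select_for_annotation(pool, budget, score=None, seed=0):
    """Pick `budget` sentences from an unlabeled pool for human translation.

    With no scoring function this is plain random sampling; with one, it
    greedily takes the highest-scoring sentences (uncertainty, density, ...).
    """
    if score is None:
        return random.Random(seed).sample(pool, budget)
    return sorted(pool, key=score, reverse=True)[:budget]

# Toy usage: sentence length as an illustrative stand-in for a real score.
pool = ["a short one", "a somewhat longer sentence", "the longest sentence of them all"]
print(select_for_annotation(pool, budget=2, score=len))
```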

Cited by 2 publications (3 citation statements). References 17 publications (29 reference statements).

Citation statements:
“…Random sampling is often surprisingly powerful (Kendall and Smith, 1938; Knuth, 1991; Sennrich et al., 2016a). There is extensive research on beating random sampling with methods based on entropy (Koneru et al., 2022), coverage and uncertainty (Peris and Casacuberta, 2018; Zhao et al., 2020), clustering (Gangadharaiah et al., 2009), consensus, syntactic parsing (Miura et al., 2016), density and diversity (Koneru et al., 2022; Ambati et al., 2011), and learning to learn active learning strategies (Liu et al., 2018)…”
Section: Active Learning in Machine Translation (citation type: mentioning, confidence: 99%)
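The quoted passage contrasts random sampling with acquisition scores such as entropy. As one hedged illustration of the idea, not the specific method of any cited work, the sketch below greedily picks the sentence whose tokens are rarest under the unigram statistics of what has been selected so far, so redundant sentences lose value after each pick:

```python
import math
from collections import Counter

def token_surprisal(sentence, counts, total):
    """Mean add-one-smoothed surprisal of a sentence's tokens under the
    unigram statistics of the sentences selected so far; rarer tokens
    make a sentence look more informative. Assumes non-empty sentences."""
    tokens = sentence.split()
    return sum(-math.log((counts[t] + 1) / (total + 1)) for t in tokens) / len(tokens)

def entropy_select(pool, budget):
    """Greedily pick the highest-surprisal sentence, updating the
    statistics after every pick."""
    counts, total, chosen = Counter(), 0, []
    for _ in range(min(budget, len(pool))):
        best = max(pool, key=lambda s: token_surprisal(s, counts, total))
        pool.remove(best)
        chosen.append(best)
        counts.update(best.split())
        total += len(best.split())
    return chosen

# After "a b" is taken, its duplicate scores low and "c d" wins the second slot.
print(entropy_select(["a b", "a b", "c d"], budget=2))
```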
“…Let I_l(·) and I_r(·) be indicator functions that show whether a sentence belongs to the left or the right. We aim to maximize the diversity H_c and optimize density by adjusting H_l and H_r (Koneru et al., 2022)…”
(Citation type: mentioning, confidence: 99%)
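The snippet is too terse to recover the exact definitions, so the following is only a speculative sketch: it assumes H_l, H_r, and H_c are Shannon entropies of n-gram distributions over the left split, the right split, and their union, one common way such diversity and density terms are computed; it is not necessarily the construction of Koneru et al. (2022).

```python
import math
from collections import Counter

def ngram_entropy(sentences, n=1):
    """Shannon entropy (bits) of the n-gram distribution over a sentence set."""
    counts = Counter()
    for s in sentences:
        toks = s.split()
        counts.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical left/right split of candidate sentences.
left = ["the cat sat", "the dog ran"]
right = ["rivers erode valleys", "quantum fields interact"]
H_l, H_r = ngram_entropy(left), ngram_entropy(right)
H_c = ngram_entropy(left + right)  # diversity of the combined selection
print(H_l, H_r, H_c)
```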
“…Another work that utilizes back-translation for effective NMT training is that of Dou et al. (2020). Koneru et al. (2022) propose a cost-effective training procedure that improves NMT performance using a small number of annotated sentences and dictionary entries. Park et al. (2020) looked into decoding strategies for low-resource languages in an attempt to improve training…”
Section: Introduction (citation type: mentioning, confidence: 99%)
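Back-translation, mentioned above in connection with Dou et al. (2020), augments training data by translating monolingual target-language text back into the source language. A generic, hypothetical sketch follows; the tgt2src_model callable is a stand-in, not any cited system's API:

```python
def back_translate(mono_target, tgt2src_model):
    """Generic back-translation: run monolingual target-language sentences
    through a reverse (target-to-source) model and pair each synthetic
    source with its authentic target to get extra parallel data."""
    return [(tgt2src_model(t), t) for t in mono_target]

# Toy usage with a stand-in "model" that just reverses word order.
fake_reverse = lambda s: " ".join(reversed(s.split()))
print(back_translate(["das ist ein test"], fake_reverse))
```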