2021
DOI: 10.1145/3448215

Plan Optimization to Bilingual Dictionary Induction for Low-resource Language Families

Abstract: Creating a bilingual dictionary is the first crucial step in enriching low-resource languages. Especially for closely related languages, the constraint-based approach has been shown to be useful for inducing bilingual lexicons from two bilingual dictionaries via a pivot language. However, if no machine-readable dictionaries are available as input, we need to consider manual creation by bilingual native speakers. To reach the goal of comprehensively creating multiple bilingual dictionaries, even if we alr…

Citations: cited by 7 publications (5 citation statements)
References: 23 publications (26 reference statements)
“…In order to achieve even better results, many BLI methods also apply a self-learning loop where training dictionaries are iteratively (and gradually) refined, and improved mappings are then learned in each iteration (Artetxe et al, 2018; Karan et al, 2020). However, there is still ample room for improvement, especially for lower-resource languages and dissimilar language pairs Nasution et al, 2021).…”
Section: Introduction (mentioning)
confidence: 99%
“…However, implementing the constraint-based approach on a large scale to create multiple bilingual dictionaries is still challenging, particularly in determining the constraint-based approach's execution order to reduce the total costs. Plan optimization using the Markov decision process is essential when composing the order of creation for bilingual dictionaries, considering the methods and their costs [6,9].…”
Section: Bilingual Lexicon Induction (mentioning)
confidence: 99%
“…Indonesian ethnic languages are low-resource languages with a limited amount of language resources, such as bilingual dictionaries. We chose Minangkabau, Malay, Palembang, Javanese, and Sundanese as the languages to implement the proposed method in this study due to the availability of the bilingual dictionaries obtained from the results of our previous study [6]. Moreover, the Indonesian and Minangkabau languages have significant lexical similarities; thus, we presume they have several phonetic transformation rules, from Indonesian to Minangkabau and vice versa.…”
Section: Introduction (mentioning)
confidence: 99%
“…As an example, Banjarese is similar to Malay [39] and has a 73% lexical similarity with Indonesian [12], both languages included in mC4. Minangkabau and Indonesian also have commonalities in their vocabulary and syntax [37].…”
Section: Indobart V2 (mentioning)
confidence: 99%