Proceedings of the 1st Workshop on Multilingual Representation Learning 2021
DOI: 10.18653/v1/2021.mrl-1.4

Do not neglect related languages: The case of low-resource Occitan cross-lingual word embeddings

Abstract: Cross-lingual word embeddings (CLWEs) have proven indispensable for various natural language processing tasks, e.g., bilingual lexicon induction (BLI). However, the lack of data often impairs the quality of representations. Various approaches requiring only weak crosslingual supervision were proposed, but current methods still fail to learn good CLWEs for languages with only a small monolingual corpus. We therefore claim that it is necessary to explore further datasets to improve CLWEs in low-resource setups. …

Citations: cited by 5 publications (14 citation statements)
References: 22 publications
“…Across all language pairs, we see substantial gains from our method as compared to mapping, joint, and other hybrid baselines. (Woller et al, 2021) outperform our approach on two language pairs (Ne-En, Gu-En) which we suspect is due to Joint-Align's performance in comparison to regular Joint training. However, these improvements are not consistent.…”
Section: BLI (mentioning)
confidence: 76%
“…Although our proposed framework does not make any significant changes to the mapping and joint components, the combination of the two cross-lingual approaches leads to better embeddings both in terms of quality, shown by the performance in BLI, as well as structure, shown by the eigenvalue similarity scores. In addition to this, MGPA as well as (Woller et al, 2021)'s method attains good eigenvalue similarity scores suggesting that the incorporation of a related language is indeed helpful…”
Section: Eigenvalue Similarity (mentioning)
confidence: 80%