Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1070
How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions

Abstract: Cross-lingual word embeddings (CLEs) enable multilingual modeling of meaning and facilitate cross-lingual transfer of NLP models. Despite their ubiquitous usage in downstream tasks, recent increasingly popular projection-based CLE models are almost exclusively evaluated on a single task only: bilingual lexicon induction (BLI). Even BLI evaluations vary greatly, hindering our ability to correctly interpret performance and properties of different CLE models. In this work, we make the first step towards a comprehe…

Cited by 141 publications (147 citation statements)
References 55 publications
“…For en-id, we used English (100M lines) and Indonesian (77M lines) Common Crawl corpora. We then mapped the word embeddings into a BWE space using VECMAP, one of the best and most robust methods for unsupervised mapping (Glavaš et al., 2019). The resulting BWE were used as baselines in our evaluation tasks and also to bootstrap our USMT system.…”
Section: Settings for Training BWE
confidence: 99%
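The projection step quoted above (mapping monolingual embeddings into a shared bilingual space) can be illustrated, in its simplest supervised form, as an orthogonal Procrustes problem. This is a hedged sketch of that core idea, not VecMap's actual (iterative, optionally unsupervised) procedure; all names here are illustrative.

```python
import numpy as np

def procrustes_map(X, Y):
    """Learn an orthogonal matrix W minimizing ||XW - Y||_F.

    X and Y are aligned source/target embedding matrices (rows are
    translation pairs). The SVD of the cross-covariance X^T Y gives
    the optimal orthogonal map.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy check: if Y is an exact rotation of X, the map recovers it.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # random orthogonal matrix
Y = X @ Q
W = procrustes_map(X, Y)
print(np.allclose(X @ W, Y))  # True
```

In practice the seed dictionary rows of X and Y come from a bilingual lexicon, and self-learning methods iterate between inducing such a dictionary and re-solving this mapping.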
“…Bilingual lexicon induction (BLI) is by far the most popular evaluation task for BWE used by previous work, in spite of its limits (Glavaš et al., 2019). In contrast to previous work, we used much larger test sets for each language pair.…”
Section: Task 1: Bilingual Lexicon Induction
confidence: 99%
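The BLI evaluation this excerpt refers to amounts to nearest-neighbor retrieval over a test lexicon: for each source word, retrieve the closest target word in the shared space and score precision@1 against the gold translation. A minimal sketch under simplifying assumptions (single gold translation per word, plain cosine retrieval; real evaluations often use CSLS and allow multiple references):

```python
import numpy as np

def bli_precision_at_1(src_emb, tgt_emb, test_pairs):
    """Precision@1 for bilingual lexicon induction.

    src_emb, tgt_emb: embedding matrices in a shared bilingual space.
    test_pairs: list of (source_index, gold_target_index) tuples.
    """
    # L2-normalize so the dot product equals cosine similarity
    S = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    T = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    hits = 0
    for s_idx, gold_t_idx in test_pairs:
        pred = int(np.argmax(S[s_idx] @ T.T))  # nearest target word
        hits += (pred == gold_t_idx)
    return hits / len(test_pairs)

# Sanity check: identical spaces give perfect precision on identity pairs.
rng = np.random.default_rng(1)
E = rng.normal(size=(20, 8))
pairs = [(i, i) for i in range(20)]
print(bli_precision_at_1(E, E, pairs))  # 1.0
```

The paper's point is that scores from this task alone, with varying test sets and retrieval choices, do not transfer reliably to downstream performance.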
“…Furthermore, unlike Caliskan et al. (2017), we test whether biases depend on the selection of the similarity metric. Finally, given the ubiquitous adoption of cross-lingual embeddings (Ruder et al., 2017; Glavaš et al., 2019), we investigate biases in a variety of bilingual embedding spaces.…”
Section: Methods
confidence: 99%
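The metric-dependence question raised in this excerpt can be made concrete with a WEAT-style association score computed under two different similarity measures. This is an illustrative sketch of the general idea, not the cited authors' exact procedure; the function and variable names are assumptions.

```python
import numpy as np

def association(w, A, B, metric="cosine"):
    """WEAT-style association of word vector w with attribute sets A, B:
    mean similarity to A minus mean similarity to B. The metric switch
    lets one check whether the bias score depends on the similarity
    measure chosen."""
    def sim(u, v):
        if metric == "cosine":
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
        # alternative: negative Euclidean distance as a similarity
        return -float(np.linalg.norm(u - v))
    return np.mean([sim(w, a) for a in A]) - np.mean([sim(w, b) for b in B])

# Toy example: w points in A's direction, so both metrics agree on sign.
w = np.array([1.0, 0.0])
A = [np.array([1.0, 0.1]), np.array([0.9, -0.1])]
B = [np.array([-1.0, 0.0]), np.array([-0.8, 0.2])]
print(association(w, A, B) > 0)  # True
```

In degenerate cases the two metrics can disagree in magnitude or even sign, which is exactly why checking metric sensitivity matters.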
“…We use the word2vec skip-gram model [16] with the following parameters: embedding dimensionality of 300, a window size of 10 words, a minimal corpus frequency of 10, negative sampling with 10 samples, no downsampling, and 20 iterations over the corpus. Then we use the vecmap framework [17, 18] to learn a transformation matrix that maps representations in one language to the representations of the …”
Section: Cross-lingual Embeddings
confidence: 99%
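For readers reproducing a setup like the one quoted above, the listed hyperparameters map onto gensim's `Word2Vec` keyword arguments roughly as follows (keyword names per gensim 4.x; this is an assumed mapping, not the cited authors' actual script):

```python
# Quoted hyperparameters expressed as gensim 4.x Word2Vec keyword arguments.
skipgram_params = {
    "vector_size": 300,  # embedding dimensionality of 300
    "window": 10,        # context window of 10 words
    "min_count": 10,     # minimal corpus frequency of 10
    "sg": 1,             # skip-gram (as opposed to CBOW)
    "negative": 10,      # negative sampling with 10 samples
    "sample": 0,         # no downsampling of frequent words
    "epochs": 20,        # 20 iterations over the corpus
}

# Usage (requires gensim installed):
#   from gensim.models import Word2Vec
#   model = Word2Vec(sentences, **skipgram_params)
print(skipgram_params["vector_size"], skipgram_params["epochs"])
```

The resulting monolingual vectors would then be fed to vecmap to learn the cross-lingual transformation.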