Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.94
Revisiting the Context Window for Cross-lingual Word Embeddings

Abstract: Existing approaches to mapping-based cross-lingual word embeddings are based on the assumption that the source and target embedding spaces are structurally similar. The structures of the embedding spaces largely depend on the co-occurrence statistics of each word, which the choice of context window determines. Despite this obvious connection between the context window and mapping-based cross-lingual embeddings, their relationship has been underexplored in prior work. In this work, we provide a thorough evaluation, …
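To make the setting concrete, here is a minimal sketch (not the paper's implementation) of the mapping-based approach the abstract refers to: given two sets of monolingual vectors, whose co-occurrence statistics are fixed by the context window chosen at training time, and a small seed dictionary, the source space is aligned to the target space with an orthogonal (Procrustes) mapping. The names `src_vecs`, `trg_vecs`, and `seed_pairs` are illustrative assumptions.

```python
# Minimal sketch of mapping-based cross-lingual embeddings (illustrative only):
# align a source embedding space to a target space with an orthogonal map
# learned from seed translation pairs. `src_vecs`/`trg_vecs` are assumed to be
# dicts (or gensim KeyedVectors) mapping a word to a numpy vector.
import numpy as np

def procrustes_map(src_vecs, trg_vecs, seed_pairs):
    """Return the orthogonal matrix W that maps source vectors into the target space."""
    X = np.stack([src_vecs[s] for s, t in seed_pairs])  # source seed vectors (n x d)
    Y = np.stack([trg_vecs[t] for s, t in seed_pairs])  # target seed vectors (n x d)
    U, _, Vt = np.linalg.svd(Y.T @ X)                    # SVD of the cross-covariance
    return U @ Vt                                        # W such that W @ x ≈ y

# Usage: mapped = procrustes_map(src_vecs, trg_vecs, seeds) @ src_vecs["dog"]
# If the two spaces are not structurally similar (e.g. trained with very
# different context windows), no orthogonal W can align them well.
```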

Cited by 8 publications (4 citation statements) · References 24 publications
“…84,85 Word embeddings are created by identifying the words that occur within a “Context Window” - defined by a string of words before and after a “centre” word. 86,87 The centre word and context words are represented as a vector of numbers (word2vec) to evaluate the presence or absence of unique words in the dataset. We use word2vec v0.3.4 in R to perform training and computations.…”
Section: Methods (mentioning, confidence: 99%)
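As a small illustration of the quoted description (purely schematic, not code from the cited work), the following shows how a symmetric context window determines which (centre, context) pairs a word2vec-style model is trained on; the function name `context_pairs` and the toy sentence are assumptions.

```python
# Illustrative only: enumerate the (centre, context) pairs defined by a
# symmetric context window, i.e. the co-occurrence statistics that the
# window size controls.
def context_pairs(tokens, window):
    """Yield (centre, context) pairs for a symmetric window of the given size."""
    for i, centre in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield centre, tokens[j]

# With window=2, each word pairs with at most the two words on either side;
# a larger window would add more distant, topical neighbours.
print(list(context_pairs("the cat sat on the mat".split(), window=2)))
```
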
“…The difference between the Nesting and Flat languages is striking in Figure 4f. The Nesting encoders are consistently better at capturing the local contextual information (at positions −2 ∼ 2) than their flat counterparts, which may explain the better performance of the Nesting encoders in dependency parsing (Figure 3), given that the local contextual information is particularly important to predict the syntactic characteristics of words (Levy and Goldberg, 2014;Ri and Tsuruoka, 2020).…”
Section: Results (mentioning, confidence: 98%)
“…VecMap does not scale well without the use of a GPU, and hence hyperparameter searching was not done for this work. However, using vectors with 128 dimensions and a larger window size of 10 as suggested by Ri and Tsuruoka (2020) resulted in a performance decrease, even for the English news articles.…”
Section: Limitations (mentioning, confidence: 99%)
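For reference, here is a hedged sketch of the configuration the excerpt mentions (128-dimensional vectors with a window size of 10), written with gensim's word2vec rather than the citing authors' exact pipeline; corpus loading and the subsequent VecMap alignment step are assumed and omitted.

```python
# Hedged sketch, not the citing authors' code: train monolingual vectors with
# 128 dimensions and a window of 10, the setting they report as suggested by
# Ri and Tsuruoka (2020), as input to a VecMap-style alignment.
from gensim.models import Word2Vec

def train_for_mapping(sentences, window=10, dim=128):
    """Train word2vec vectors intended as input to a mapping/alignment step."""
    model = Word2Vec(
        sentences,
        vector_size=dim,   # 128-dimensional vectors, as in the excerpt above
        window=window,     # larger window (10): more topical co-occurrence statistics
        min_count=5,
        sg=1,              # skip-gram
        workers=4,
    )
    # Save in word2vec text format, which VecMap can read for alignment.
    model.wv.save_word2vec_format("src.emb")
    return model.wv
```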