2017
DOI: 10.1371/journal.pone.0184544
Learning linear transformations between counting-based and prediction-based word embeddings

Abstract: Despite the growing interest in prediction-based word embedding learning methods, it remains unclear as to how the vector spaces learnt by the prediction-based methods differ from those of the counting-based methods, or whether one can be transformed into the other. To study the relationship between counting-based and prediction-based embeddings, we propose a method for learning a linear transformation between two given sets of word embeddings. Our proposal contributes to the word embedding learning research in…
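For intuition, the sketch below illustrates the general idea the abstract describes: fitting a linear map from one embedding space to another over a shared vocabulary. It is an ordinary least-squares illustration, not the paper's actual training procedure or loss function, and the matrix names and sizes are invented for the example.

```python
# Illustrative sketch (not the paper's exact method): fit a linear map M
# that sends counting-based vectors to prediction-based vectors for a
# shared vocabulary, by minimising the Frobenius norm ||C @ M - P||_F^2.
import numpy as np

rng = np.random.default_rng(0)
n_words, d_count, d_pred = 1000, 300, 300   # hypothetical sizes

C = rng.normal(size=(n_words, d_count))     # counting-based embeddings (rows = words)
P = rng.normal(size=(n_words, d_pred))      # prediction-based embeddings (same row order)

# Closed-form least-squares solution for the linear transformation.
M, *_ = np.linalg.lstsq(C, P, rcond=None)

# Map the counting-based vectors into the prediction-based space.
projected = C @ M
rel_error = np.linalg.norm(projected - P) / np.linalg.norm(P)
print(f"relative reconstruction error: {rel_error:.3f}")
```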

Cited by 7 publications (8 citation statements)
References 20 publications
“…A couple of very recent papers propose methods to align embeddings after their construction, but focus on affine transformations, as opposed to the more restrictive but distance-preserving rotations of our method. Bollegala et al. [4] use gradient descent, for parameter γ, to directly optimize…”
Section: Related Approaches (mentioning)
confidence: 99%
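As background for the distinction drawn in this excerpt, the sketch below contrasts an unconstrained linear map found by least squares with the distance-preserving rotation obtained from orthogonal Procrustes. The data and variable names are illustrative and are not taken from either paper.

```python
# Contrast sketch: unconstrained linear alignment vs. orthogonal rotation.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 100))                 # source embeddings (toy data)
Y = rng.normal(size=(500, 100))                 # target embeddings (toy data)

# Unconstrained linear map, as in the affine-style alignment approaches.
W_lin, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Orthogonal Procrustes: restrict the map to a rotation W = U V^T, taken
# from the SVD of X^T Y, so pairwise distances in the source space are kept.
U, _, Vt = np.linalg.svd(X.T @ Y)
W_rot = U @ Vt

print(np.allclose(W_rot @ W_rot.T, np.eye(100)))  # True: W_rot is orthogonal
```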
“…GloVe: [4] The GloVe model is a log-bilinear model based on ratios of word-word co-occurrence frequencies. The training objective is for the dot product of the vectors learned for words to equal the logarithm of their co-occurrence frequency.…”
Section: Different Word Embeddings (mentioning)
confidence: 99%
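A compact sketch of the weighted least-squares objective this excerpt describes is given below. The weighting constants follow the published GloVe defaults; the array names and toy shapes are illustrative assumptions.

```python
# Sketch of the GloVe objective: drive w_i . w~_j + b_i + b~_j towards
# log(X_ij), weighted by f(X_ij), over observed co-occurrence counts X.
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    # f(x) = min((x / x_max)^alpha, 1), the standard GloVe weighting.
    return np.minimum((x / x_max) ** alpha, 1.0)

def glove_loss(W, W_ctx, b, b_ctx, X):
    """Weighted squared error between dot products (+ biases) and log counts.

    W, W_ctx : (V, d) word and context vectors
    b, b_ctx : (V,) bias terms
    X        : (V, V) co-occurrence counts
    """
    i, j = np.nonzero(X)                       # only observed co-occurrences
    pred = np.sum(W[i] * W_ctx[j], axis=1) + b[i] + b_ctx[j]
    return np.sum(glove_weight(X[i, j]) * (pred - np.log(X[i, j])) ** 2)
```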
“…Linear transformations are regularly used to transfer between different embeddings or to adapt to a new domain (Bollegala et al., 2017; Arora et al., 2018b). The linear transformation can encode contextual information, an idea utilized recently by Khodak et al. (2018), who applied a linear transformation to the DisC embedding scheme to construct a new embedding scheme (referred to as à la carte embedding), and empirically showed that it outperforms many other popular word sequence embedding schemes.…”
Section: Introduction (mentioning)
confidence: 99%
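The à la carte construction referenced here can be sketched, under simplified assumptions, as learning a matrix that maps a word's average context vector to its own embedding and then reusing that matrix to induce vectors for unseen items. The names and toy data below are illustrative, not the cited authors' code.

```python
# Simplified sketch of the à la carte idea (Khodak et al., 2018).
import numpy as np

rng = np.random.default_rng(2)
V, d = 2000, 100
word_vecs = rng.normal(size=(V, d))          # pretrained word embeddings (toy data)
avg_ctx_vecs = rng.normal(size=(V, d))       # average context vector per word (toy data)

# Fit A by least squares so that avg_ctx_vec @ A approximates word_vec.
A, *_ = np.linalg.lstsq(avg_ctx_vecs, word_vecs, rcond=None)

def induce_vector(context_vectors, A):
    """Embed an unseen word or phrase from the vectors of its context words."""
    return np.mean(context_vectors, axis=0) @ A

new_vec = induce_vector(word_vecs[:5], A)    # toy example: five context vectors
print(new_vec.shape)                         # (100,)
```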
“…One promising approach is the use of other information such as multimodal information (Bruni et al., 2014; Kiela et al., 2014; Kiela and Clark, 2015; Kiela et al., 2015a; Silberer et al., 2017) and language resources (Kiela et al., 2015b; Rothe and Schütze, 2017; Yu and Dredze, 2014). Other refinement methods include task-specific embeddings (Bolukbasi et al., 2016; Yu et al., 2017) and the selective use of multiple embeddings (Bollegala et al., 2017; Kiela et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%