Proceedings of the 13th International Conference on Computational Semantics - Long Papers 2019
DOI: 10.18653/v1/w19-0408
Words are Vectors, Dependencies are Matrices: Learning Word Embeddings from Dependency Graphs

Abstract: Distributional Semantic Models (DSMs) construct vector representations of word meanings based on their contexts. Typically, the contexts of a word are defined as its closest neighbours, but they can also be retrieved from its syntactic dependency relations. In this work, we propose a new dependency-based DSM. The novelty of our model lies in associating an independent meaning representation, a matrix, with each dependency label. This allows it to capture specifics of the relations between words and contexts, le…
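The central idea of the abstract (words as vectors, dependency labels as matrices) can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: it shows a skip-gram-style score in which a context word's vector is first transformed by the matrix of the dependency label linking it to the target word. The toy vocabulary, dimensionality, initialisation, and the score function are all hypothetical.

import numpy as np

rng = np.random.default_rng(0)
dim = 50  # embedding dimensionality (hypothetical choice)

# One vector per word in a toy vocabulary.
vocab = ["dog", "barks", "loud"]
word_vec = {w: rng.normal(scale=0.1, size=dim) for w in vocab}

# One matrix per dependency label, initialised near the identity.
dep_labels = ["nsubj", "amod"]
dep_mat = {d: np.eye(dim) + rng.normal(scale=0.01, size=(dim, dim))
           for d in dep_labels}

def score(target, context, label):
    # Skip-gram-style compatibility of a (target, label, context) triple:
    # the context vector is passed through the label's matrix before the
    # dot product with the target vector.
    return float(word_vec[target] @ (dep_mat[label] @ word_vec[context]))

# Example: compatibility of "dog" as an nsubj context of "barks".
print(score("barks", "dog", "nsubj"))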

Cited by 10 publications (17 citation statements). References 25 publications.
“…For both context-based and hybrid few-shot learning, we have achieved a new state of the art on 4 out of the 6 evaluation tasks used, showing that a careful, optimised approach can be the key to success in few-shot learning. Future work could explore other distributional models, such as dependency embeddings (Czarnowska et al., 2019), but it is clear from our results that careful optimisation will be required to adapt other models to the few-shot setting.…”
Section: Discussion (mentioning; confidence: 99%)
“…In both domains, the shared goals are: i) map entities v ∈ V to embeddings e_v, where e ∈ ℝ^{|V|×n}, n being the dimensionality of the vectors; ii) map relations r ∈ R into one (or more) space ℝ^{|R|×*}. In this work, we focus on constructing a syntactic dataset of positive training triples from a corpus as in Czarnowska et al. (2019). All of the models we investigate rely on a negative sampling mechanism that generates a dataset D′ of false triples.…”
Section: Theoretical Approach (mentioning; confidence: 99%)
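To make the negative-sampling step mentioned in this excerpt concrete, here is a minimal, hypothetical sketch (not code from the citing paper): given positive (head, dependency label, tail) triples, it corrupts the tail of each triple with a random word to build a false-triple set D′. The corpus, corruption scheme, and function names are assumptions.

import random

def negative_samples(positive_triples, vocab, k=1, seed=0):
    # For each positive (head, label, tail) triple, generate k corrupted
    # triples by replacing the tail with a random word, skipping any
    # corruption that happens to coincide with a real positive triple.
    rng = random.Random(seed)
    positives = set(positive_triples)
    negatives = []
    for head, label, tail in positive_triples:
        for _ in range(k):
            corrupt = rng.choice(vocab)
            if (head, label, corrupt) not in positives:
                negatives.append((head, label, corrupt))
    return negatives

# Toy usage with hypothetical dependency triples.
triples = [("dog", "nsubj", "barks"), ("loud", "amod", "dog")]
vocab = ["dog", "barks", "loud", "cat", "sleeps"]
print(negative_samples(triples, vocab, k=2))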
“…Representing words in terms of their syntactic co-occurrences has long been proposed, both for count-based (Padó and Lapata, 2007; Weir et al., 2016) and neural (Hermann and Blunsom, 2013; Levy and Goldberg, 2014; Komninos and Manandhar, 2016; Czarnowska et al., 2019; Vashishth et al., 2019) models of word meaning. Tested on benchmark word similarity tasks, such models often compare favourably to models based on proximal co-occurrence, particularly when the similarity or substitutability of two words is considered rather than their relatedness (Levy and Goldberg, 2014).…”
Section: Introduction (mentioning; confidence: 99%)
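As an illustration of what "syntactic co-occurrences" means in practice, the hypothetical sketch below extracts (word, dependency-labelled context) pairs from a toy dependency parse, in the spirit of Levy and Goldberg (2014). The parse representation, label format, and helper names are assumptions, not code from any of the cited papers.

# A toy dependency parse: (dependent, label, head) arcs.
parse = [
    ("dog", "nsubj", "barks"),
    ("loud", "amod", "dog"),
]

def dependency_contexts(arcs):
    # Turn each arc into two (word, labelled-context) training pairs:
    # the head sees the dependent through the label, and the dependent
    # sees the head through the inverse label.
    pairs = []
    for dependent, label, head in arcs:
        pairs.append((head, f"{label}_{dependent}"))
        pairs.append((dependent, f"{label}-inv_{head}"))
    return pairs

for word, context in dependency_contexts(parse):
    print(word, "->", context)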
“…
Vector addition (Rimell et al., 2016): .496 / .472
Simplified Practical Lexical Function (Rimell et al., 2016): .496 / .497
Vector addition (Czarnowska et al., 2019): .485 / .475
Dependency vector addition (Czarnowska et al., 2019): .497 / .439
Semantic functions (Emerson and Copestake, 2017b): .20 / .16
Sem-func & vector ensemble (Emerson and Copestake, 2017b): …
Previous work has shown that vector addition performs well on this task (Rimell et al., 2016; Czarnowska et al., 2019). I have trained a Skipgram model (Mikolov et al., 2013) using the Gensim library (Řehůřek and Sojka, 2010), tuning weighted addition on the dev set.…”
Section: Previous Work (mentioning; confidence: 99%)
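The excerpt above mentions training a Skipgram model with Gensim and tuning weighted vector addition on a dev set. The sketch below is a hypothetical illustration of that setup (assuming Gensim ≥ 4.0), not the cited author's code: the toy corpus, hyperparameters, and composition weights are placeholders.

import numpy as np
from gensim.models import Word2Vec

# Toy corpus; in practice this would be a large tokenised corpus.
sentences = [["the", "dog", "barks"], ["a", "loud", "dog"]]

# Skip-gram model (sg=1); hyperparameter values are placeholders.
model = Word2Vec(sentences, vector_size=50, window=5, sg=1, min_count=1)

def compose(words, weights=None):
    # Weighted vector addition: sum the (optionally weighted) word vectors
    # of a phrase; the weights would be tuned on a development set.
    if weights is None:
        weights = [1.0] * len(words)
    return np.sum([w * model.wv[t] for w, t in zip(weights, words)], axis=0)

phrase_vec = compose(["loud", "dog"], weights=[0.6, 1.0])
print(phrase_vec[:5])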