Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2015
DOI: 10.3115/v1/P15-1014
Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations

Abstract: Vector space representation of words has been widely used to capture fine-grained linguistic regularities, and has proven successful in various natural language processing tasks in recent years. However, existing models for learning word representations focus on either syntagmatic or paradigmatic relations alone. In this paper, we argue that it is beneficial to jointly model both relations, so that we can not only encode different types of linguistic properties in a unified way, but also boost the represen…
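The distinction the abstract draws can be made concrete with a toy example. The sketch below is not the paper's model (PDC/HDC are not reproduced here); it is a minimal illustration, under the usual distributional reading, of syntagmatic association (words that co-occur in the same context, e.g. "wolf" and "fierce") versus paradigmatic similarity (words that share contexts and can substitute for each other, e.g. "wolf" and "tiger").

```python
# Minimal sketch (not the paper's model): first-order co-occurrence as a
# proxy for syntagmatic association, and similarity of context-count
# vectors as a proxy for paradigmatic similarity, on a toy corpus.
from collections import Counter, defaultdict

corpus = [
    "the fierce wolf chased the deer",
    "the fierce tiger chased the deer",
    "the wolf howled at the moon",
]

cooc = defaultdict(Counter)
for sent in corpus:
    words = sent.split()
    for i, w in enumerate(words):
        for j, c in enumerate(words):
            if i != j:
                cooc[w][c] += 1  # direct (syntagmatic) co-occurrence counts

def syntagmatic(w1, w2):
    """How often the two words appear together in the same sentence."""
    return cooc[w1][w2]

def paradigmatic(w1, w2):
    """Cosine similarity of context-count vectors: shared contexts."""
    v1, v2 = cooc[w1], cooc[w2]
    dot = sum(v1[c] * v2[c] for c in set(v1) & set(v2))
    norm = lambda v: sum(x * x for x in v.values()) ** 0.5
    return dot / (norm(v1) * norm(v2) + 1e-12)

print(syntagmatic("wolf", "fierce"))   # high: they co-occur directly
print(paradigmatic("wolf", "tiger"))   # high: they appear in shared contexts
```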

Cited by 42 publications (37 citation statements)
References 19 publications

“…Results of the word analogy test are often accompanied by a visualisation of word vectors projected onto the two-dimensional plane using Principal Component Analysis (PCA) (e.g., Mikolov et al., 2013a; Sun et al., 2015). Though these visualisations are generally not claimed to be part of the evaluation, they are included to convince the reader of the quality of the word embeddings with respect to word analogies: the line connecting a₁ and b₁ is approximately parallel to the line through a₂ and b₂ whenever word analogy recovery is optimal (as in Equation (2)).…”
Section: PCA to two dimensions from dimension d can be misleading
confidence: 99%
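As a rough illustration of the visual check described in this excerpt, the sketch below projects the four words of one analogy to two dimensions with PCA and compares the projected offsets. Here `emb` is a hypothetical word-to-vector dictionary (e.g., loaded from pre-trained embeddings), not an artifact of the cited work; and, as the section title warns, near-parallel offsets in the 2D projection need not reflect the geometry of the full d-dimensional space.

```python
# Sketch of the usual analogy visualisation: project a1, b1, a2, b2 to 2D
# with PCA and check whether the offsets a1->b1 and a2->b2 look parallel.
import numpy as np
from sklearn.decomposition import PCA

def projected_offset_cosine(emb, a1, b1, a2, b2):
    X = np.stack([emb[w] for w in (a1, b1, a2, b2)])   # (4, d) matrix
    X2 = PCA(n_components=2).fit_transform(X)          # project d -> 2 dims
    off1 = X2[1] - X2[0]                               # projected a1 -> b1
    off2 = X2[3] - X2[2]                               # projected a2 -> b2
    return float(off1 @ off2 /
                 (np.linalg.norm(off1) * np.linalg.norm(off2)))
    # close to 1.0 when the two lines look parallel in the 2D plot

# e.g., projected_offset_cosine(emb, "king", "queen", "man", "woman")
```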
“…To approximate novelty, we use word embeddings (computed over the OMCS corpus) to calculate the distance d(a, b) = ||head(a) − head(b)||_2 + ||tail(a) − tail(b)||_2, where head and tail are represented by the average of word embeddings. Such a formulation is related to the concept of paradigmatic similarity (Sahlgren, 2006), and word embedding-based distance can approximate paradigmatic similarity (Sun et al., 2015). Two words are paradigmatically similar if one can be substituted for the other while maintaining the syntactic correctness of the sentence (e.g.…”
Section: Automatically measuring novelty
confidence: 99%
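The quoted distance is simple to compute once phrase vectors are formed. Below is a minimal sketch of that formula under the stated assumption that head and tail phrases are represented by the average of their word embeddings; `emb` is a hypothetical word-to-vector lookup, and out-of-vocabulary handling is omitted.

```python
# Sketch of d(a, b) = ||head(a) - head(b)||_2 + ||tail(a) - tail(b)||_2,
# with each phrase represented as the mean of its word embeddings.
import numpy as np

def phrase_vec(emb, phrase):
    """Average the word embeddings of a head or tail phrase."""
    return np.mean([emb[w] for w in phrase.split()], axis=0)

def novelty_distance(emb, head_a, tail_a, head_b, tail_b):
    return (np.linalg.norm(phrase_vec(emb, head_a) - phrase_vec(emb, head_b))
            + np.linalg.norm(phrase_vec(emb, tail_a) - phrase_vec(emb, tail_b)))
```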
“…For some tasks, the evaluation is conducted directly over the embedding (e.g., measuring the cosine similarity between word vectors); for others, a classifier is trained on top of it. Pre-trained Embedding. We perform experiments with the GloVe embedding (Pennington, Socher, and Manning 2014) and the HDC embedding (Sun et al. 2015). The GloVe embedding is trained from 42B tokens of Common Crawl data.…”
Section: Results
confidence: 99%
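For the first kind of evaluation mentioned in this excerpt, cosine similarity can be computed directly on the pre-trained vectors. The sketch below assumes embeddings stored in the plain-text GloVe format (each line holds a word followed by its vector components); the file path is a placeholder, not a resource taken from the cited papers.

```python
# Sketch of intrinsic evaluation over pre-trained vectors: load a
# text-format embedding file and compute cosine similarity between words.
import numpy as np

def load_embeddings(path):
    emb = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            emb[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return emb

def cosine(emb, w1, w2):
    v1, v2 = emb[w1], emb[w2]
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))

# emb = load_embeddings("glove.42B.300d.txt")  # placeholder path
# cosine(emb, "king", "queen")
```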