Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015
DOI: 10.3115/v1/n15-1165
Random Walks and Neural Network Language Models on Knowledge Bases

Abstract: Random walks over large knowledge bases like WordNet have been successfully used in word similarity, relatedness and disambiguation tasks. Unfortunately, those algorithms are relatively slow for large repositories, with significant memory footprints. In this paper we present a novel algorithm which encodes the structure of a knowledge base in a continuous vector space, combining random walks and neural net language models in order to produce novel word representations. Evaluation in word relatedness and simila…

Cited by 52 publications (51 citation statements). References 13 publications.
“…Our assumption is that such a sequence can be considered a context of its starting node: a set of words that are related to, and can appear together in real texts with, the word sense represented by that node, thus emulating real text sentences; to what extent this assumption holds depends of course on the structure of the LKB we are using. Previous efforts in building word embeddings have shown the plausibility of this approach (Goikoetxea et al, 2015).…”
Section: Random Walks As Contexts
confidence: 99%
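The excerpt above treats the sequence of nodes visited by a random walk as a context of its starting node. A minimal sketch of that idea, using a hypothetical toy graph in place of a real LKB like WordNet (the graph, node names, and walk length are illustrative assumptions, not the authors' setup):

```python
import random

# Hypothetical mini lexical knowledge base: nodes are words/senses,
# adjacency lists are undirected semantic relations (not real WordNet).
GRAPH = {
    "dog":    ["canine", "pet", "bark"],
    "canine": ["dog", "wolf"],
    "pet":    ["dog", "cat"],
    "bark":   ["dog", "tree"],
    "wolf":   ["canine"],
    "cat":    ["pet"],
    "tree":   ["bark"],
}

def random_walk(graph, start, length, rng):
    """Emit the node sequence visited by a uniform random walk.

    The sequence starting at `start` is read as a context of `start`:
    words that plausibly co-occur with it in real text, emulating a
    sentence. How well this holds depends on the graph's structure.
    """
    node, walk = start, [start]
    for _ in range(length - 1):
        node = rng.choice(graph[node])  # uniform step to a neighbor
        walk.append(node)
    return walk

rng = random.Random(0)
print(" ".join(random_walk(GRAPH, "dog", 5, rng)))
```

Each emitted sequence can then be handed to any standard embedding trainer as if it were an ordinary sentence.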
“…As baselines, we evaluated two text-corpus-based word embeddings that are freely available on the web, as well as the best result of Goikoetxea et al. (2015), available from the UKB web page. Thus, the pseudocorpus-based embeddings have been compared with text-based embeddings.…”
Section: Experiments Results
confidence: 99%
“…We reuse the sets of relations developed in these works to generate our Pseudo Corpora LC. Goikoetxea et al. (2015) describe an architecture in which a run of the Random Walk algorithm (Agirre et al., 2014) produces an artificial corpus from WordNet. The graph that is fed to the algorithm is composed of WordNet synsets (the graph nodes) and of different types of relations between them (the graph arcs; some relation types are antonymy, hypernymy, derivation, etc.).…”
Section: Related Work
confidence: 99%
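The pipeline described above — repeated random walks over a synset graph emitting an artificial corpus — can be sketched as follows. This is a simplified assumption-laden illustration: the mini graph stands in for WordNet, and the 0.85 continuation probability is the common PageRank-style damping value, not necessarily the exact parameter used by the authors:

```python
import random

# Hypothetical stand-in for a WordNet-like graph: synset-style node
# names, adjacency lists for relations (hypernymy, meronymy, ...).
GRAPH = {
    "car.n.01":     ["vehicle.n.01", "wheel.n.01"],
    "vehicle.n.01": ["car.n.01", "truck.n.01"],
    "wheel.n.01":   ["car.n.01", "tire.n.01"],
    "truck.n.01":   ["vehicle.n.01"],
    "tire.n.01":    ["wheel.n.01"],
}

def walk_with_restart(graph, start, rng, damping=0.85, max_len=20):
    """One pseudo-sentence: walk until a random restart event ends it.

    At each step the walk continues with probability `damping` and
    stops otherwise, so sentence lengths follow a geometric
    distribution, loosely mimicking natural sentence lengths.
    """
    node, sentence = start, [start]
    while len(sentence) < max_len and rng.random() < damping:
        node = rng.choice(graph[node])
        sentence.append(node)
    return sentence

def pseudo_corpus(graph, n_sentences, rng):
    """Generate an artificial corpus of pseudo-sentences."""
    nodes = sorted(graph)
    return [walk_with_restart(graph, rng.choice(nodes), rng)
            for _ in range(n_sentences)]

rng = random.Random(42)
for sent in pseudo_corpus(GRAPH, 3, rng):
    print(" ".join(sent))
```

The printed lines form the artificial corpus, which would then be fed to a skip-gram trainer (e.g. word2vec) exactly like ordinary text.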
“…Complementary to this, a plethora of works in Natural Language Processing (NLP) has recently focused on combining knowledge bases with distributional information from text. These include approaches that modify Word2Vec [15] to learn sense embeddings [5], methods to enrich WordNet with embeddings for synsets and lexemes [21], to acquire continuous word representations by combining random walks over knowledge bases and neural language models [11], or to produce joint lexical and semantic vectors for sense representation from text and knowledge bases [4]. In this paper, we follow this line of research and take it one step forward by producing a hybrid knowledge resource, which combines symbolic and statistical meaning representations while i) staying purely on the lexical-symbolic level, ii) explicitly distinguishing word senses, and iii) being human readable. Far from being technicalities, such properties are crucial to be able to embed a resource of this kind into the Semantic Web ecosystem, where human-readable distributional representations are explicitly linked to URIfied semantic resources.…”
Section: Introduction
confidence: 99%