Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning 2016
DOI: 10.18653/v1/k16-1026
Entity Disambiguation by Knowledge and Text Jointly Embedding

Abstract: For most entity disambiguation systems, the secret recipes are the feature representations for mentions and entities, most of which are based on Bag-of-Words (BoW) representations. BoW has several drawbacks: (1) it ignores the intrinsic meaning of words/entities; (2) it often results in high-dimensional vector spaces and expensive computation; (3) methods for designing handcrafted representations can differ widely across applications, lacking a general guideline. In this paper, we propose a …
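To make the contrast drawn in the abstract concrete, here is a minimal sketch comparing a sparse Bag-of-Words vector with a dense embedding for the same mention context. The toy vocabulary, dimensions, and averaging scheme are illustrative assumptions, not taken from the paper.

```python
# Illustrative only: contrasts a sparse Bag-of-Words vector with a dense
# embedding for a mention context. Vocabulary and dimensions are hypothetical.
import numpy as np

vocab = {"jordan": 0, "played": 1, "basketball": 2, "chicago": 3, "bulls": 4}
context = ["jordan", "played", "basketball"]

# Bag-of-Words: one dimension per vocabulary word, mostly zeros,
# and the vector grows with the vocabulary.
bow = np.zeros(len(vocab))
for w in context:
    bow[vocab[w]] += 1.0

# Dense embedding: average of low-dimensional word vectors (random stand-ins
# for pretrained embeddings), capturing word meaning in few dimensions.
dim = 8
rng = np.random.default_rng(0)
word_emb = {w: rng.standard_normal(dim) for w in vocab}
context_emb = np.mean([word_emb[w] for w in context], axis=0)

print(bow.shape, context_emb.shape)  # (5,) vs (8,); real BoW spaces are far larger
```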

Cited by 66 publications (58 citation statements)
References 11 publications
Citing publications: 2016–2023

Citation statements, ordered by relevance:
“…Such embedding models enable us to design NED models that capture the contextual information required to address NED. These models are typically based on conventional word embedding models (e.g., skip-gram (Mikolov et al, 2013)) that assign a fixed embedding to each word and entity (Yamada et al, 2016;Fang et al, 2016;Tsai and Roth, 2016;Cao et al, 2017;Ganea and Hofmann, 2017). In this study, we aim to test the effectiveness of the pretrained contextualized embeddings for NED.…”
Section: Background and Related Work
confidence: 99%
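As a rough illustration of the fixed-embedding NED models the statement above refers to, the sketch below scores candidate entities by cosine similarity between an averaged mention-context vector and per-entity vectors. All names, vectors, and the averaging and scoring choices are hypothetical stand-ins, not the cited systems' actual code.

```python
# A minimal sketch (not the cited systems' code) of NED with fixed embeddings:
# every word and entity gets a single vector, as in skip-gram-style models,
# and the candidate entity closest to the mention context wins.
import numpy as np

dim = 8
rng = np.random.default_rng(0)
word_emb = {w: rng.standard_normal(dim) for w in ["jordan", "scored", "points"]}
entity_emb = {e: rng.standard_normal(dim) for e in ["Michael_Jordan", "Jordan_(country)"]}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def disambiguate(context_words, candidates):
    # Represent the mention context as the mean of its (fixed) word vectors.
    ctx = np.mean([word_emb[w] for w in context_words if w in word_emb], axis=0)
    # Rank candidate entities by cosine similarity to the context vector.
    return max(candidates, key=lambda e: cosine(ctx, entity_emb[e]))

print(disambiguate(["jordan", "scored", "points"],
                   ["Michael_Jordan", "Jordan_(country)"]))
```

A contextualized model, by contrast, would recompute the word vectors for each sentence rather than looking them up from a fixed table.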
“…Although the learning of the embeddings might seem straightforward, as it uses the standard skip-gram model, we see this as an advantage. On one hand, it allows our training to scale efficiently to huge vocabulary of words and concepts without the need for a lot of preprocessing (e.g., removing low frequent words and phrases as in Wang et al (2014); Fang et al (2016)). On the other hand, to learn from the knowledge graph contexts, we propose simple adaption to the skip-gram model (cf.…”
Section: Related Work
confidence: 99%
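The statement above describes adapting the skip-gram model to knowledge-graph contexts. The sketch below shows one plausible reading of that idea: graph neighbours are emitted as (target, context) training pairs, analogous to word-context pairs drawn from a text window. The triples and the pairing scheme are assumptions for illustration, not the cited paper's method.

```python
# Hypothetical illustration: knowledge-graph neighbours are treated as
# skip-gram "context" items, so the same (target, context) training pairs
# can be fed to a standard skip-gram objective.

triples = [
    ("Michael_Jordan", "member_of", "Chicago_Bulls"),
    ("Chicago_Bulls", "located_in", "Chicago"),
]

def kg_skipgram_pairs(triples):
    pairs = []
    for head, _relation, tail in triples:
        # Each entity predicts its graph neighbour, analogous to a word
        # predicting a surrounding word inside a text window.
        pairs.append((head, tail))
        pairs.append((tail, head))
    return pairs

print(kg_skipgram_pairs(triples))
```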
“…This is a simpler and more computationally efficient function than the scoring function proposed by previous approaches which learn from knowledge graphs (cf. Fang et al (2016)'s equation 1).…”
Section: Related Work
confidence: 99%
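To illustrate the efficiency contrast the statement above draws, the sketch below compares a plain dot-product score with a richer bilinear score that costs O(d²) per candidate. Both functions are illustrative stand-ins; neither is the exact equation from Fang et al. (2016) or from the citing paper.

```python
# Sketch of the cost contrast: a plain dot product between a context vector
# and an entity vector is cheap, while a scoring function with an extra
# bilinear interaction term is quadratic in the embedding dimension.
import numpy as np

def simple_score(ctx, ent):
    return float(ctx @ ent)                     # one dot product: O(d)

def richer_score(ctx, ent, W):
    return float(ctx @ W @ ent + ctx @ ent)     # extra bilinear term: O(d^2)

d = 8
rng = np.random.default_rng(1)
ctx, ent, W = rng.standard_normal(d), rng.standard_normal(d), rng.standard_normal((d, d))
print(simple_score(ctx, ent), richer_score(ctx, ent, W))
```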
“…We represent entities on three levels: entity, word and character. Our entity-level representation is similar to work on relation extraction (Wang et al, 2014;, entity linking (Yamada et al, 2016;Fang et al, 2016), and entity typing (Yaghoobzadeh and Schütze, 2015). Our word-level representation with distributional word embeddings is similarly used to represent entities for entity linking and relation extraction (Socher et al, 2013;Wang et al, 2014).…”
Section: Related Work
confidence: 99%
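A hypothetical sketch of a three-level entity representation like the one described above: an entity-level vector, a word-level average over the entity name, and a character-level n-gram vector, concatenated into a single feature. Dimensions and the composition choices are assumptions, not the cited papers' architectures.

```python
# Toy three-level entity representation: entity-level lookup, word-level
# average, and a character-trigram vector, concatenated. Illustrative only.
import numpy as np

rng = np.random.default_rng(2)
d = 8

def char_ngram_vector(name, dim):
    # Toy character-level encoding: hash character trigrams into dim buckets.
    v = np.zeros(dim)
    for i in range(len(name) - 2):
        v[hash(name[i:i + 3]) % dim] += 1.0
    return v

def entity_representation(entity_id, name_words, entity_emb, word_emb):
    e_level = entity_emb[entity_id]                                  # entity level
    w_level = np.mean([word_emb[w] for w in name_words], axis=0)     # word level
    c_level = char_ngram_vector("_".join(name_words), d)             # character level
    return np.concatenate([e_level, w_level, c_level])

entity_emb = {"Michael_Jordan": rng.standard_normal(d)}
word_emb = {"michael": rng.standard_normal(d), "jordan": rng.standard_normal(d)}
rep = entity_representation("Michael_Jordan", ["michael", "jordan"], entity_emb, word_emb)
print(rep.shape)  # (24,)
```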