Design Challenges for Entity Linking

Ling, Xiao; Singh, Sameer; Weld, Daniel S.

doi:10.1162/tacl_a_00141

Cited by 180 publications

(172 citation statements)

References 14 publications

(24 reference statements)



“…We also assume that the KB is accompanied by an entity candidate selector that takes as input some text and returns a list of C potential entity links, each consisting of the start and end indices of the potential mention span and M_m candidate entities in the KG. In practice, these are often implemented using precomputed dictionaries (e.g., CrossWikis; Spitkovsky and Chang, 2012), KB specific rules (e.g., a WordNet lemmatizer), or other heuristics (e.g., string match; Mihaylov and Frank, 2018). Ling et al (2015) showed that incorporating candidate priors into entity linkers can be a powerful signal, so we optionally allow for the candidate selector to return an associated prior probability for each entity candidate. In some cases, it is beneficial to over-generate potential candidates and add a special NULL entity to each candidate list, thereby allowing the linker to discriminate between actual links and false positive candidates.…”

Section: Knowledge Bases (mentioning)

confidence: 99%
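The candidate selector described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the implementation from either paper: the alias table, the entity identifiers, the `CandidateLink` class, and the `select_candidates` function are all hypothetical, and the prior of 0.0 assigned to the NULL entity is an arbitrary placeholder.

```python
from dataclasses import dataclass
from typing import Dict, List

NULL_ENTITY = "NULL"  # hypothetical sentinel meaning "this span has no link"


@dataclass
class CandidateLink:
    start: int             # index of the first token in the span (inclusive)
    end: int               # index of the last token in the span (inclusive)
    entities: List[str]    # candidate entity ids, highest prior first
    priors: List[float]    # prior probability for each candidate


# Hypothetical precomputed dictionary: surface string -> {entity id: prior}.
ALIAS_TABLE: Dict[str, Dict[str, float]] = {
    "liverpool": {"Liverpool_F.C.": 0.6, "Liverpool_(city)": 0.4},
    "paris": {"Paris_(France)": 0.9, "Paris_Hilton": 0.1},
}


def select_candidates(tokens: List[str],
                      alias_table: Dict[str, Dict[str, float]] = ALIAS_TABLE,
                      max_span_len: int = 3,
                      add_null: bool = True) -> List[CandidateLink]:
    """Return one CandidateLink per token span found in the alias table."""
    links = []
    for start in range(len(tokens)):
        last = min(start + max_span_len, len(tokens))
        for end in range(start, last):
            surface = " ".join(tokens[start:end + 1]).lower()
            entry = alias_table.get(surface)
            if entry is None:
                continue
            ranked = sorted(entry.items(), key=lambda kv: kv[1], reverse=True)
            entities = [entity for entity, _ in ranked]
            priors = [prior for _, prior in ranked]
            if add_null:
                # Over-generate, then let the downstream linker reject
                # false positives by choosing NULL (placeholder prior).
                entities.append(NULL_ENTITY)
                priors.append(0.0)
            links.append(CandidateLink(start, end, entities, priors))
    return links
```

Calling `select_candidates("Liverpool beat Paris".split())` yields one link for each of the two matching spans, each with its candidates sorted by prior and a trailing NULL entry.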

Knowledge Enhanced Contextual Word Representations

Peters, Neumann, Logan et al. 2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen

Self Cite


Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert's runtime is comparable to BERT's and it scales to large KBs.


“…Incorporating the fine-grained types of a mention m can help rank entities of the appropriate type higher than others (Ling et al, 2015; Gupta et al, 2017; Raiman and Raiman, 2018). For instance, knowing the correct type of mention [Liverpool] as sports_team and constraining linking to entities with the relevant type encourages disambiguation to the correct entity.…”

Section: Including Type Information (mentioning)

confidence: 99%
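The type constraint described in the statement above can be illustrated with a small sketch. It assumes the mention type has already been predicted by some upstream typing model; the type inventory, the candidate scores, and the `rerank_with_type` function are hypothetical and are not taken from the cited papers. Both a hard filter and a soft score bonus are shown, since the statement does not say which variant is used.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical type inventory: entity id -> set of fine-grained types.
ENTITY_TYPES: Dict[str, Set[str]] = {
    "Liverpool_F.C.": {"sports_team", "organization"},
    "Liverpool_(city)": {"city", "location"},
}


def rerank_with_type(candidates: List[Tuple[str, float]],
                     mention_type: str,
                     entity_types: Dict[str, Set[str]] = ENTITY_TYPES,
                     hard: bool = True,
                     bonus: float = 0.5) -> List[Tuple[str, float]]:
    """Re-rank (entity, score) pairs using the predicted mention type.

    hard=True keeps only entities carrying the mention type (falling back
    to the full list if none match); hard=False adds `bonus` to the score
    of every type-compatible entity instead.
    """
    def compatible(entity: str) -> bool:
        return mention_type in entity_types.get(entity, set())

    if hard:
        kept = [(e, s) for e, s in candidates if compatible(e)]
        pool = kept or candidates  # do not return an empty ranking
    else:
        pool = [(e, (s + bonus) if compatible(e) else s)
                for e, s in candidates]
    return sorted(pool, key=lambda pair: pair[1], reverse=True)
```

With candidates `[("Liverpool_(city)", 0.7), ("Liverpool_F.C.", 0.3)]` and mention type `sports_team`, both variants place `Liverpool_F.C.` first even though the city has the higher initial score.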

Joint Multilingual Supervision for Cross-lingual Entity Linking

Upadhyay, Gupta, Roth 2018

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing


Cross-lingual Entity Linking (XEL) aims to ground entity mentions written in any language to an English Knowledge Base (KB), such as Wikipedia. XEL for most languages is challenging, owing to limited availability of resources as supervision. We address this challenge by developing the first XEL approach that combines supervision from multiple languages jointly. This enables our approach to: (a) augment the limited supervision in the target language with additional supervision from a high-resource language (like English), and (b) train a single entity linking model for multiple languages, improving upon individually trained models for each language. Extensive evaluation on three benchmark datasets across 8 languages shows that our approach significantly improves over the current state-of-the-art. We also provide analyses in two limited resource settings: (a) the zero-shot setting, when no supervision in the target language is available, and (b) the low-resource setting, when some supervision in the target language is available. Our analysis provides insights into the limitations of zero-shot XEL approaches in realistic scenarios, and shows the value of joint supervision in low-resource settings.


“…EL [5] is similar to WSD [28,29], but it is about linking "potentially partial" entity mentions to a target KB, that has an encyclopaedic nature [30,29]. The EL problem is presented in several variants and focussing on different types of data [31,32,33,34,35,36], and it has been the subject of task-oriented evaluation procedures and benchmarks [37,38]. A few EL systems work in an unsupervised way [39,33], but the KB is still given.…”

Section: Related Work (mentioning)

confidence: 99%

“…Named Entity Recognition (NER) focusses on discovering mentions to entities, and it is also a basic module of several EL systems [40]. However NER is about proper nouns, as frequently is EL [31], while here we also consider common nouns. Moreover, NER systems output the entity type (person, location, etc.)…”

Section: Related Work (mentioning)

confidence: 99%

Learning in Text Streams: Discovery and Disambiguation of Entity and Relation Instances

Maggini, Marra, Melacci et al. 2020

IEEE Trans. Neural Netw. Learning Syst.


We consider a scenario where an artificial agent is reading a stream of text composed of a set of narrations, and it is informed about the identity of some of the individuals that are mentioned in the text portion that is currently being read. The agent is expected to learn to follow the narrations, thus disambiguating mentions and discovering new individuals. We focus on the case in which individuals are entities and relations, and we propose an end-to-end trainable memory network that learns to discover and disambiguate them in an online manner, performing one-shot learning, and dealing with a small number of sparse supervisions. Our system builds a not-given-in-advance knowledge base, and it improves its skills while reading unsupervised text. The model deals with abrupt changes in the narration, taking into account their effects when resolving co-references. We showcase the strong disambiguation and discovery skills of our model on a corpus of Wikipedia documents and on a newly introduced dataset, that we make publicly available.

Our work faces the problem of learning to extract information while reading a text stream, with the aim of identifying entities and relations in the text portion that is currently being read. This problem is commonly tackled by assuming the existence of a Knowledge Base (KB) of entities and relations, where several entity/relation instances are paired with additional information, such as the common ways of referring to them or sentences/facts in which they are involved. Then, once an input sentence is provided for reading, sub-portions of text must be linked to entity or relation instances of the KB. The linking process introduces the challenging issue of dealing with multiple distinct entities (relations) that are mentioned with the same text, and thus the system has to disambiguate which is the "right" entity or relation instance of the KB for the considered text fragment. In particular, the context around the fragment or, if needed, information that was provided in the previous sentences of the text stream can be used to perform the disambiguation. As a very simple example, consider the sentence "Clyde went to the office", where "Clyde" and "the office" are two text fragments that indicate entities, while "went to" is text that is about a relation. "Clyde" could be the mention that is used to indicate different people in the KB, and several offices could be mentioned by the expression "the office" (mentions to relations follow the same logic).

At a first glance, this problem shares basic principles and intuitions with several existing methods, such as Entity Linking [5], Word Sense Disambiguation [6], Named Entity Recogn...

