Entity linking based on the co-occurrence graph and entity probability

Eckhardt, Alan; Hreško, Juraj; Procházka, Jan; Smrž, Otakar

doi:10.1145/2633211.2634349

Cited by 12 publications

(9 citation statements)

References 7 publications

“…The winner of the short-text track is the SMAPH team [3] followed by NTUNLP [2] and Seznam Research [7]. The winner of the long-text track is the MS MLI team [4], followed by MLNS [17] and Seznam Research [7].…”

Section: Results (mentioning)

confidence: 99%

Erd'14

et al. 2014

In this paper we overview the 2014 Entity Recognition and Disambiguation Challenge (ERD'14), which took place from March to June 2014 and was summarized in a dedicated workshop at SIGIR 2014. The main goal of the ERD challenge was to promote research in recognition and disambiguation of entities in unstructured text. Unlike many past entity linking challenges, no mention segmentations were given to the participating systems for a given document. Participants were asked to implement a web service for their system to minimize human involvement during evaluation and to enable measuring the processing times. The challenge has attracted a lot of interest (over 100 teams registered, and 27 of those submitted final results). In this paper we cover the task definition, issues encountered during annotation, and provide a detailed analysis of all the participating systems. Specifically, we show how we adapted the pooling technique to address the difficulties of gathering annotations for the entity linking task. We also summarize the ERD workshop that followed the challenge, including the oral and poster presentations as well as the invited talks.

“…Finally, Jiri Materna from Seznam Research described their system [7], which participated in both tracks. Their entity disambiguation process is based on co-occurrence analysis using a graph of candidate entities, with links between entities extracted from Wikipedia articles.…”

Section: Paper Presentation (mentioning)

confidence: 99%

Erd'14

et al. 2014

“…Entities drawn from Source 3, our largest source of candidates, are associated with a set of features (9–24) relative to the process of snippet annotation performed by WAT. Feature freq (how many snippets mention the entity) is an obvious indicator of an entity's correctness.…”

Section: Entity Features (mentioning)

confidence: 99%

Smaph

Cornolti

Ferragina

Ciaramita

et al. 2018

ACM Trans. Inf. Syst.

We study the problem of linking the terms of a web-search query to a semantic representation given by the set of entities (a.k.a. concepts) mentioned in it. We introduce SMAPH, a system that performs this task using the information coming from a web search engine, an approach we call "piggybacking." We employ search engines to alleviate the noise and irregularities that characterize the language of queries. Snippets returned as search results also provide a context for the query that makes it easier to disambiguate the meaning of the query. From the search results, SMAPH builds a set of candidate entities with high coverage. This set is filtered by linking back the candidate entities to the terms occurring in the input query, ensuring high precision. A greedy disambiguation algorithm performs this filtering; it maximizes the coherence of the solution by iteratively discovering the pertinent entities mentioned in the query. We propose three versions of SMAPH that outperform state-of-the-art solutions on the known benchmarks and on the GERDAQ dataset, a novel dataset that we have built specifically for this problem via crowd-sourcing and that we make publicly available.

“…POS tags and rules. A couple of authors use part of speech (POS) taggers and/or several rules in order to identify named entities [19–34]. The rules range from simple rules such as "capitalized letter" (if a word contains a capitalized letter the word will be treated as a spot), stop word lists, "At Least One Noun Selector"-rule to complex, combined rules.…”

Section: State Of the Art In Entity Detection (Spotting) (mentioning)

confidence: 99%

“…Dictionary based techniques. The majority of approaches leverage techniques based on dictionaries [6, 19, 31, 35–45]. The structure of Wikipedia provides useful features for generating dictionaries: ─ Entity pages: Each page in Wikipedia contains a title (e.g.…”

Section: State Of the Art In Entity Detection (Spotting) (mentioning)

confidence: 99%

Improving Language-Dependent Named Entity Detection

Petz

Wetzlinger

Nedbal

2017

Lecture Notes in Computer Science

Named Entity Recognition (NER) and Named Entity Linking (NEL) are two research areas that have shown big advancements in recent years. The majority of this research is based on the English language. Hence, some of these improvements are language-dependent and do not necessarily lead to better results when applied to other languages. Therefore, this paper discusses TOMO, an approach to language-aware named entity detection and evaluates it for the German language. This also required the development of a German gold standard dataset, which was based on the English dataset used by the OKE 2016 challenge. An evaluation of the named entity detection task using the web-based platform GERBIL was undertaken and results show that our approach produced higher F1 values than the other annotators did. This indicates that language-dependent features do improve the overall quality of the spotter.
