Proceedings of the First International Workshop on Entity Recognition & Disambiguation - ERD '14 2014
DOI: 10.1145/2633211.2634349

Entity linking based on the co-occurrence graph and entity probability

Abstract: This paper describes our system for the Entity Recognition and Disambiguation Challenge 2014. There are two tasks: one to find entities in queries (Short Track), the other to find entities in texts from web pages (Long Track). We participated in both tracks with the same system, tuned to each task. On the final test set, we reached an F-measure of 71.9% on the Long Track and 66.9% on the Short Track. We describe our system and its components in depth, together with their influence on performance…

Cited by 12 publications (9 citation statements)
References 7 publications

“…The winner of the short-text track is the SMAPH team [3] followed by NTUNLP [2] and Seznam Research [7]. The winner of the long-text track is the MS MLI team [4], followed by MLNS [17] and Seznam Research [7].…”
Section: Results (mentioning)
confidence: 99%
“…Finally, Jiri Materna from Seznam Research described their system [7], which participated in both tracks. Their entity disambiguation process is based on co-occurrence analysis using a graph of candidate entities, with links between entities extracted from Wikipedia articles.…”
Section: Paper Presentation (mentioning)
confidence: 99%
“…Entities drawn from Source 3, our largest source of candidates, are associated with a set of features (9–24) relative to the process of snippet annotation performed by WAT. Feature freq (how many snippets mention the entity) is an obvious indicator of an entity's correctness.…”
Section: Entity Features (mentioning)
confidence: 99%
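The freq feature quoted above can be illustrated with a minimal sketch. It assumes each snippet has already been annotated with a list of entity names; the function name `entity_freq` and the sample data are hypothetical and do not come from the cited paper or from WAT.

```python
from collections import Counter


def entity_freq(snippet_annotations):
    """Count, for each entity, how many snippets mention it at least once."""
    freq = Counter()
    for entities in snippet_annotations:
        # set() so that repeated mentions inside one snippet count once
        freq.update(set(entities))
    return freq


# Hypothetical annotator output: one list of entity names per snippet.
snippets = [
    ["Prague", "Czech Republic"],
    ["Prague", "Prague"],
    ["Seznam.cz"],
]
freq = entity_freq(snippets)
```

Here "Prague" is counted for two snippets even though it is mentioned three times, which matches the "how many snippets" reading of the feature.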
“… POS tags and rules. A couple of authors use part of speech (POS) taggers and/or several rules in order to identify named entities [19–34]. The rules range from simple rules such as "capitalized letter" (if a word contains a capitalized letter the word will be treated as a spot), stop word lists, "At Least One Noun Selector"-rule to complex, combined rules.…”
Section: State Of the Art In Entity Detection (Spotting) (mentioning)
confidence: 99%
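The "capitalized letter" rule and the stop word list described in the statement above can be sketched as follows. This is only an illustration under simple assumptions (whitespace tokenization, a small set of stripped punctuation marks); the function `spot_capitalized` is a hypothetical name, not the API of any system surveyed.

```python
def spot_capitalized(text, stop_words=frozenset()):
    """Return words that contain at least one capital letter (candidate spots)."""
    spots = []
    for token in text.split():
        # Strip surrounding punctuation; the set of marks is an assumption.
        word = token.strip(".,;:!?\"'()")
        if not word or word.lower() in stop_words:
            continue
        # Rule: a word containing any capitalized letter is treated as a spot.
        if any(ch.isupper() for ch in word):
            spots.append(word)
    return spots


spots = spot_capitalized(
    "The iPhone was announced by Apple in San Francisco.",
    stop_words={"the"},
)
```

With the stop word "the" filtered out, the sentence-initial "The" is not reported, while "iPhone" is kept because the rule checks for a capital anywhere in the word.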
“… Dictionary based techniques. The majority of approaches leverage techniques based on dictionaries [6, 19, 31, 35–45]. The structure of Wikipedia provides useful features for generating dictionaries: ─ Entity pages: Each page in Wikipedia contains a title (e.g.…”
Section: State Of the Art In Entity Detection (Spotting) (mentioning)
confidence: 99%