Bootstrapping Wikipedia to answer ambiguous person name queries

Gruetze, Toni; Kasneci, Gjergji; Zuo, Zhe; Naumann, Felix

doi:10.1109/icdew.2014.6818303

Cited by 4 publications

(2 citation statements)

References 30 publications


“…Balog et al () to compare a VSM representation with respect to using probabilistic Latent Semantic Indexing (pLSI), a topic model representation, showing that the first option reaches significantly better results. Gruetze, Kasneci, Zuo, and Naumann () also compare a VSM representation with respect to a probabilistic model obtaining the same conclusion. Artiles, Amigó, and Gonzalo () study the impact of the NEs in this problem and conclude that these features do not provide a substantial competitive advantage when they are compared with a combination of simple features that do not require linguistic preprocessing (local and global tokens, snippets, n‐grams, and so on).…”

Section: Related Work (mentioning)

confidence: 71%


Person Name Disambiguation in the Web Using Adaptive Threshold Clustering

Delgado, Martínez, Montalvo, et al. 2017

Asso for Info Science & Tech


In this article, we present a new clustering algorithm for Person Name Disambiguation in web search results. The algorithm groups web results according to the individuals they refer to. The best state‐of‐the‐art approaches require training data in order to learn thresholds for deciding when to group the webpages. However, the ambiguity level of person names on the web could not be previously estimated and the results of those methods strongly depend on the thresholds obtained with the training collections. We present the concept of adaptive threshold, which avoids the need of a previous supervised learning process, depending only on the content of the compared documents to decide if they refer to the same person. We evaluated our approach using three datasets reaching close results to those obtained by the most successful algorithms in the state‐of‐the‐art that require such a learning process, and outperforming the results of those obtained by algorithms that do not require it.



“…In 2012, the Second CIPS‐SIGHAN Joint Conference proposed an EL task focused on Chinese person names. Recently, Gruetze et al () also presented an EL corpus that just included person names.…”

Section: Related Work (mentioning)

confidence: 99%