Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing Volume 2 - EMNLP '09 2009
DOI: 10.3115/1699571.1699582

The role of named entities in web people search

Abstract: The ambiguity of person names in the Web has become a new area of interest for NLP researchers. This challenging problem has been formulated as the task of clustering Web search results (returned in response to a person name query) according to the individual they mention. In this paper we compare the coverage, reliability and independence of a number of features that are potential information sources for this clustering task, paying special attention to the role of named entities in the texts to be clustered.…

Citations: cited by 32 publications (25 citation statements)
References: 19 publications
“…it focuses on the area where both clusters are closest to each other). A drawback of this method is that clusters may be merged due to single noisy elements being close to each other, but in practice it seems to be the best choice in problems related to ours [23,28,5]. …”
Section: Learning a Similarity Function (mentioning)
confidence: 99%
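
The linkage criterion quoted above (merging on the closest pair of elements) can be illustrated with a small sketch. This is not the citing paper's code: the function name `single_link_clusters`, the toy similarity function and the threshold value are all assumptions made for illustration. It relies on the fact that cutting a single-link hierarchy at a similarity threshold yields the connected components of the graph that links every pair at or above that threshold.

```python
def single_link_clusters(items, similarity, threshold):
    """Group items so that two clusters are merged whenever their closest
    pair of elements has similarity >= threshold (single-link criterion).
    All names here are hypothetical, chosen for this sketch."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent pointers to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if similarity(items[i], items[j]) >= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, item in enumerate(items):
        clusters.setdefault(find(i), []).append(item)
    return list(clusters.values())


# Toy data: similarity is the negated distance between numbers.
points = [1.0, 1.2, 1.4, 5.0, 5.1]
result = single_link_clusters(points, lambda a, b: -abs(a - b), threshold=-0.25)
```

In this toy run, 1.0 and 1.4 end up in the same cluster although their own similarity is below the threshold, because 1.2 is close to both. That chaining effect is the drawback the statement describes: a single noisy element lying near two clusters is enough to merge them.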
“…Following the methodology proposed in [5] for a different clustering problem, we model the problem as a binary classification task: given a pair of tweets d1, d2, the system must decide whether the tweets belong to the same topic (true) or not (false). Each pair of tweets is represented as a set of features (for instance, term overlap between both tweets), which are used to feed a machine learning algorithm that learns a similarity function.…”
Section: Modeling Similarity As a Classification Task (mentioning)
confidence: 99%
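
A minimal sketch of how such pairwise training instances might be built is given below. It only covers the data preparation step; the Jaccard coefficient used for term overlap, the whitespace tokenization, and the names `term_overlap` and `pair_instances` are assumptions for illustration, not details taken from the citing paper.

```python
from itertools import combinations


def term_overlap(text_a, text_b):
    """Jaccard overlap between the lowercase word sets of two texts
    (an assumed stand-in for the 'term overlap' feature)."""
    terms_a, terms_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not terms_a and not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def pair_instances(tweets, topics):
    """Turn a labeled collection into (feature_vector, same_topic) rows,
    one row per unordered pair of tweets."""
    rows = []
    for i, j in combinations(range(len(tweets)), 2):
        features = [term_overlap(tweets[i], tweets[j])]
        rows.append((features, topics[i] == topics[j]))
    return rows


tweets = ["new phone launch", "phone launch today", "rainy day"]
topics = ["a", "a", "b"]
rows = pair_instances(tweets, topics)
```

Each row can then be passed to any binary classifier; the classifier's confidence that a pair is "true" plays the role of the learned similarity function.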
“…As named entity recognition (NER) is used in most approaches, Artiles et al. investigated which document features contribute to person name disambiguation and reported that NER only makes a small contribution [4].…”
Section: Related Work and Discussion (mentioning)
confidence: 99%
“…Two evaluation metrics are employed during the unsupervised evaluation in order to estimate the quality of the clustering solutions, the V-measure [24] and the paired F-Score [25]. V-Measure assesses the quality of a clustering by measuring its homogeneity (h) and its completeness (c).…”
Section: Evaluation Measures (mentioning)
confidence: 99%
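
For readers unfamiliar with the V-measure mentioned above, the following sketch computes homogeneity, completeness and their harmonic mean from conditional entropies, following the commonly cited definition. The function names and the convention of scoring 1.0 when an entropy is zero are assumptions of this sketch, not details given in the statement.

```python
from collections import Counter
from math import log


def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given), estimated from co-occurrence counts."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(gold, predicted):
    """Return (homogeneity, completeness, V) for two label sequences."""
    h_gold, h_pred = _entropy(gold), _entropy(predicted)
    # Homogeneity: each cluster should contain members of a single class.
    homogeneity = 1.0 if h_gold == 0 else 1 - _conditional_entropy(gold, predicted) / h_gold
    # Completeness: each class should be assigned to a single cluster.
    completeness = 1.0 if h_pred == 0 else 1 - _conditional_entropy(predicted, gold) / h_pred
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    v = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v


gold = [0, 0, 1, 1]
perfect = v_measure(gold, [1, 1, 0, 0])      # same partition, renamed clusters
singletons = v_measure(gold, [0, 1, 2, 3])   # every instance in its own cluster
```

Putting every instance in its own cluster keeps homogeneity at its maximum but halves completeness in this example, which is why the two components are combined into a single score.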
“…In the paired F-Score [25] evaluation, the clustering problem is transformed into a classification problem [4]. A set of instance pairs is generated from the automatically induced clusters (F(K)), which comprises pairs of the instances found in each cluster.…”
Section: Evaluation Measures (mentioning)
confidence: 99%
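
The pair-based evaluation described above can be sketched as follows. The statement is truncated after the definition of F(K); the gold-standard pair set, the precision and recall formulas, and all function names below are assumptions based on the usual definition of the paired F-Score, not text from the citing paper.

```python
from itertools import combinations


def same_group_pairs(labels):
    """All unordered index pairs (i, j), i < j, that share a label."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    return {pair for members in groups.values() for pair in combinations(members, 2)}


def paired_f_score(gold, predicted):
    """Return (precision, recall, F) over instance pairs.
    F(K): pairs placed in the same induced cluster.
    F(S): pairs sharing a gold-standard class (assumed counterpart)."""
    f_k = same_group_pairs(predicted)
    f_s = same_group_pairs(gold)
    common = len(f_k & f_s)
    precision = common / len(f_k) if f_k else 0.0
    recall = common / len(f_s) if f_s else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


gold = [0, 0, 0, 1, 1]
predicted = [0, 0, 1, 1, 1]
scores = paired_f_score(gold, predicted)
```

Here each labeling yields four same-group pairs, two of which are shared, so precision, recall and F are all 0.5.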