2012
DOI: 10.5715/jnlp.19.89

Entity Set Expansion based on Bootstrapping Methods using Topic Information

Abstract: This paper proposes three modules based on latent topics of documents for alleviating "semantic drift" in bootstrapping entity set expansion. These new modules are added to a discriminative bootstrapping algorithm to realize topic feature generation, negative example selection and positive example disambiguation. In this study, we model latent topics with LDA (Latent Dirichlet Allocation) in an unsupervised way. Experiments show that the accuracy of the extracted entities is improved by 6.7 to 28.2% depending …
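
To make the idea of topic feature generation more concrete, the following is a minimal sketch of how context features and document-level topic features might be combined for one candidate entity. It is an illustration under stated assumptions, not the authors' implementation: the function name `entity_features`, the context window size, and the choice to average topic distributions over the documents containing the entity are all hypothetical, and the per-document topic distributions are assumed to come from an LDA model trained separately. The negative example selection and positive example disambiguation modules are not covered.

```python
from collections import Counter


def entity_features(entity, documents, doc_topics, window=2):
    """Build a feature dict for one candidate entity (hypothetical helper).

    documents:  list of tokenized documents (each a list of strings)
    doc_topics: list of topic distributions, one per document, assumed to be
                produced beforehand by an LDA model (not computed here)
    window:     number of tokens on each side used as context (assumption)
    """
    context = Counter()
    topic_sum = None
    n_docs = 0

    for tokens, topics in zip(documents, doc_topics):
        positions = [i for i, tok in enumerate(tokens) if tok == entity]
        if not positions:
            continue  # the entity does not occur in this document
        n_docs += 1

        # Context features: counts of words around each mention.
        for i in positions:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    context[f"ctx={tokens[j]}"] += 1

        # Accumulate the topic distribution of each document that
        # mentions the entity.
        if topic_sum is None:
            topic_sum = [0.0] * len(topics)
        for k, p in enumerate(topics):
            topic_sum[k] += p

    features = dict(context)
    if n_docs:
        # Topic features: average topic probability over those documents.
        for k, total in enumerate(topic_sum):
            features[f"topic={k}"] = total / n_docs
    return features
```

The resulting dict could be passed to any classifier that accepts sparse named features. Averaging is only one plausible way to summarize topic information for an entity; the paper may use a different formulation.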

Cited by 4 publications (6 citation statements)
References 16 publications
“…Many researchers have presented various approaches to reduce the effects of semantic drift. The approaches range from refinement of the seed set , applying classifier (Bellare et al, 2007;Sadamitsu et al, 2011;Pennacchiotti and Pantel, 2011), using human judges , to using relationships between semantic categories (Curran et al, 2007;Carlson et al, 2010). investigated the influence of seed instances on bootstrapping algorithms.…”
Section: Approaches To Semantic Drift (mentioning)
confidence: 99%
“…The classifier approach can use multiple features to select instances. Sadamitsu et al (2011) extended the method of Bellare et al (2007) to use topic information estimated using Latent Dirichlet Allocation (LDA). They use not only contexts but also topic information as features of the classifier.…”
Section: Approaches To Semantic Drift (mentioning)
confidence: 99%
“…Coupling the learning of category extractors by using positive examples of one category as negative examples for others has been shown to help limiting such a decline in accuracy [4]. Also, entity set expansion using topic information can alleviate semantic drift in bootstrapping entity set expansion [8].…”
Section: Related Work (mentioning)
confidence: 99%
“…It can be considered a Bayesian inference method that, when applied to exponential family models with conjugate priors, can be implemented using exact algorithms that tend to be computationally efficient. Recent studies [8,9] have shown, however, that the direct application of Bayesian Sets may produce poor results in tasks such as information extraction from text. In addition, when Bayesian Sets are applied to problems in which the number of labeled examples is too small, the induced results tend to be deteriorated.…”
Section: Introduction (mentioning)
confidence: 99%
“…There are a number of papers on various aspects of "set expansion," often for completing lists of entities from structured lists, like those extracted from Wikipedia (Sarmento et al 2007), using rules from natural language processing or topic models (Tran et al 2010;Sadamitsu et al 2011), or from opinion corpora (Zhang and Liu 2011). The task we explore here is web-based set expansion and methods developed for other set expansion tasks are not directly applicable.…”
Section: Related Work (mentioning)
confidence: 99%