2017
DOI: 10.1007/978-3-319-71249-9_18

SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble

Abstract: Corpus-based set expansion (i.e., finding the "complete" set of entities belonging to the same semantic class, based on a given corpus and a tiny set of seeds) is a critical task in knowledge discovery. It may facilitate numerous downstream applications, such as information extraction, taxonomy induction, question answering, and web search. To discover new entities in an expanded set, previous approaches either make one-time entity ranking based on distributional similarity, or resort to iterative pattern-base…

Cited by 82 publications (106 citation statements)
References 21 publications (29 reference statements)
“…Meanwhile, they refine the context feature pool by including only those features which are commonly shared by entities in the expanded set. Based on this philosophy, SetExpan [19] develops a context feature selection module to select quality skip-gram features and designs a rank ensemble module to select quality entities. Similarly, SetExpander [13] captures distributional similarity on five different context types and learns a classifier to combine multiple contexts using an additional labeled dataset.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
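
The rank ensemble idea mentioned in the statement above can be illustrated with a small sketch. The quoted text only says that a rank ensemble module selects quality entities; combining several ranked lists by mean reciprocal rank, as done here, is an assumption made for illustration, and `rank_ensemble` and the example entities are hypothetical.

```python
from collections import defaultdict


def rank_ensemble(rankings, top_k=None):
    """Merge several ranked entity lists by mean reciprocal rank.

    Hypothetical helper: each inner list is one ranking (best first),
    e.g. produced from a different subset of context features.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, entity in enumerate(ranking, start=1):
            # An entity missing from a ranking simply contributes 0 for it.
            scores[entity] += 1.0 / position
    n = len(rankings)
    # Sort by descending mean reciprocal rank; break ties alphabetically.
    ordered = sorted(scores, key=lambda e: (-scores[e] / n, e))
    return ordered[:top_k] if top_k is not None else ordered


rankings = [
    ["Texas", "Ohio", "Austin"],
    ["Ohio", "Texas", "Florida"],
    ["Ohio", "Florida", "Texas"],
]
ensemble = rank_ensemble(rankings, top_k=2)
print(ensemble)
```

An entity that ranks well in only one list (here "Austin") is outscored by entities that rank consistently well across lists, which is the usual motivation for ensembling rankings.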
“…Therefore, context dependent similarity benefits set expansion tasks in that it only captures the type-indicative features of entities. We adopt the context dependent similarity function $\mathrm{Sim}(e_i, e_j \mid F)$ defined in [19] using the weighted Jaccard similarity measure:…”
Section: Algorithm 1: Cross-seed Parallel Relations Clustering
Citation type: mentioning
confidence: 99%
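
The statement above is cut off before the formula, so the following is only a sketch of the standard weighted Jaccard measure (sum of element-wise minima over sum of element-wise maxima), restricted to a selected feature set F. Whether [19] uses exactly this form is an assumption; `weighted_jaccard`, the feature strings, and the weights are hypothetical.

```python
def weighted_jaccard(weights_i, weights_j, features):
    """Weighted Jaccard similarity of two entities over a feature set.

    weights_i / weights_j map a context feature to a non-negative weight
    for each entity; only features in `features` are compared.
    """
    numerator = 0.0
    denominator = 0.0
    for feature in features:
        a = weights_i.get(feature, 0.0)
        b = weights_j.get(feature, 0.0)
        numerator += min(a, b)
        denominator += max(a, b)
    # If neither entity has weight on any selected feature, return 0.
    return numerator / denominator if denominator > 0 else 0.0


e1 = {"city of __": 3.0, "__ mayor": 1.0, "visited __": 2.0}
e2 = {"city of __": 1.0, "__ mayor": 1.0, "__ said": 5.0}

selected = {"city of __", "__ mayor"}       # assumed type-indicative features
all_features = set(e1) | set(e2)

sim_selected = weighted_jaccard(e1, e2, selected)
sim_all = weighted_jaccard(e1, e2, all_features)
print(sim_selected, sim_all)
```

In this toy example, restricting the comparison to the selected features gives a higher similarity than using every feature, since generic contexts such as "__ said" no longer dilute the score.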
“…We focus on corpus-based approaches based on the distributional similarity hypothesis (Harris, 1954). State-of-the-art techniques return the k nearest neighbors around the seed terms as the expanded set, where terms are represented by their co-occurrence or embedding vectors in a training corpus according to different context types, such as linear window context (Pantel et al., 2009; Shi et al., 2010; Rong et al., 2016; Zaheer et al., 2017; Gyllensten and Sahlgren, 2018; Zhao et al., 2018), explicit lists (Roark and Charniak, 1998; Sarmento et al., 2007; He and Xin, 2011), coordinational patterns (Sarmento et al., 2007) and unary patterns (Rong et al., 2016; Shen et al., 2017). In this work, we generalize coordinational patterns, look at additional context types and combine multiple context-type embeddings.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Only the n top scoring contexts will have non-zero values in W, and these get the value f ρ . This notion of weighting contexts is similar to that used in the SetExpan framework (Shen et al., 2017), although the way they use it is different (they use weighted Jaccard similarity based on context weights). Their algorithm for calculating context weights is a special case of our algorithm, with no notion of limited support penalty, that is, they use ρ = 0.…”
Section: Details Of Calculating W
Citation type: mentioning
confidence: 99%