TAXI at SemEval-2016 Task 13: a Taxonomy Induction Method based on Lexico-Syntactic Patterns, Substrings and Focused Crawling

Panchenko, Alexander; Faralli, Stefano; Ruppert, Eugen; Remus, Steffen; Naets, Hubert; Fairon, Cédrick; Ponzetto, Simone Paolo; Biemann, Chris

doi:10.18653/v1/s16-1206

Cited by 58 publications

(83 citation statements)

References 22 publications


“…TAXI: The methods for hypernym identification used in the TAXonomy Induction system (TAXI) rely on two sources of evidence: substring matching and Hearst-like patterns (Panchenko et al, 2016). The Hearst patterns for all languages are extracted from Wikipedia and from focused crawls with seed pages that are Wikipedia pages.…”

[Table 4: Manual evaluation of 100 (at most) randomly selected novel relations based on precision for English]

Section: Participants and Results (mentioning)

confidence: 99%

SemEval-2016 Task 13: Taxonomy Extraction Evaluation (TExEval-2)

Bordea, Lefever, Buitelaar

2016

Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)


This paper describes the second edition of the shared task on Taxonomy Extraction Evaluation organised as part of SemEval 2016. This task aims to extract hypernym-hyponym relations between a given list of domain-specific terms and then to construct a domain taxonomy based on them. TExEval-2 introduced a multilingual setting for this task, covering four different languages including English, Dutch, Italian and French from domains as diverse as environment, food and science. A total of 62 runs submitted by 5 different teams were evaluated using structural measures, by comparison with gold standard taxonomies and by manual quality assessment of novel relations.
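The citation statement above names two sources of evidence for hypernym identification: substring matching and Hearst-like patterns. The following is a minimal sketch of both ideas, not the TAXI implementation. The function names, the single "such as" pattern, and the toy vocabulary are all assumptions made for illustration.

```python
import re


def substring_hypernyms(terms):
    """Propose (hyponym, hypernym) pairs where the hypernym is a
    word-level suffix of the hyponym, e.g. "pork sausage" -> "sausage".
    Hypothetical helper; a real system would likely score candidates."""
    vocab = set(terms)
    pairs = set()
    for term in terms:
        words = term.split()
        # Drop leading modifiers one at a time and look up the remainder.
        for i in range(1, len(words)):
            head = " ".join(words[i:])
            if head in vocab:
                pairs.add((term, head))
    return pairs


def hearst_hypernyms(sentence, terms):
    """Match one Hearst-like pattern, "<hypernym> such as <hyponym list>",
    keeping only pairs whose members are both in the vocabulary.
    Hypothetical helper covering a single English pattern."""
    vocab = set(terms)
    pairs = set()
    left, sep, right = sentence.lower().rstrip(".").partition(" such as ")
    if not sep:
        return pairs
    # Hypernym: the longest vocabulary term that ends the left context.
    left_words = left.split()
    hypernym = None
    for i in range(len(left_words)):
        candidate = " ".join(left_words[i:])
        if candidate in vocab:
            hypernym = candidate
            break
    if hypernym is None:
        return pairs
    # Hyponyms: items of the enumeration that follows "such as".
    for candidate in re.split(r",\s*|\s+and\s+|\s+or\s+", right):
        candidate = candidate.strip()
        if candidate in vocab and candidate != hypernym:
            pairs.add((candidate, hypernym))
    return pairs


# Toy vocabulary (assumed for this example only).
TERMS = ["food", "meat", "sausage", "pork sausage", "soup"]
print(substring_hypernyms(TERMS))
print(hearst_hypernyms("We sell meat such as sausage, ham and bacon.", TERMS))
```

Restricting both extractors to the given term list mirrors the task setting, where relations are sought between a fixed list of domain-specific terms.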


“…In both tasks, two pattern-based methods (i.e., INRIASAC (Grefenstette, 2015) in TExEval and TAXI (Panchenko et al, 2016) in TExEval-2) consistently outperform others. INRIASAC uses frequency-based co-occurrence statistics and substring inclusion heuristics to extract a set of hypernyms for hyponyms.…”

Section: Results Analysis and Discussion (mentioning)

confidence: 99%

A Short Survey on Taxonomy Learning from Text Corpora: Issues, Resources and Recent Advances

Wang, He, Zhou

2017

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing


A taxonomy is a semantic hierarchy, consisting of concepts linked by is-a relations. While a large number of taxonomies have been constructed from human-compiled resources (e.g., Wikipedia), learning taxonomies from text corpora has received growing interest and is essential for long-tailed and domain-specific knowledge acquisition. In this paper, we overview recent advances on taxonomy construction from free texts, reorganizing relevant subtasks into a complete framework. We also overview resources for evaluation and discuss challenges for future research.


“…• Normalized Frequency Diff (n_d): Similar to [28], this feature is an asymmetric hypernymy score based on frequency counts. We compute n_d(x_i, x_j) by first normalizing the frequency counts obtained (i.e., the counts in E_k(x_i)) for term x_i as follows:…”

Section: Initial Subsequences (examples: Mortadella→sausage→meat→food; Laksa→soup) (mentioning)

confidence: 99%

“…Past approaches to taxonomy induction from scratch either assume the availability of a clean input vocabulary [28] or employ a time-consuming manual cleaning step over a noisy input vocabulary [38]. For example, Figure 1 shows the pipeline of a typical taxonomy induction approach from a domain corpus [38].…”

Section: Introduction (mentioning)

confidence: 99%

Taxonomy Induction Using Hypernym Subsequences

Gupta, Lebret, Harkous et al.

2017

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management


We propose a novel, semi-supervised approach towards domain taxonomy induction from an input vocabulary of seed terms. Unlike all previous approaches, which typically extract direct hypernym edges for terms, our approach utilizes a novel probabilistic framework to extract hypernym subsequences. Taxonomy induction from extracted subsequences is cast as an instance of the minimum-cost flow problem on a carefully designed directed graph. Through experiments, we demonstrate that our approach outperforms state-of-the-art taxonomy induction approaches across four languages. Importantly, we also show that our approach is robust to the presence of noise in the input vocabulary. To the best of our knowledge, this robustness has not been empirically proven in any previous approach.
