Preemptive information extraction using unrestricted relation discovery

Shinyama, Yusuke; Sekine, Satoshi

doi:10.3115/1220835.1220874

Cited by 148 publications

(91 citation statements)

References 7 publications


“…The open information extraction paradigm, simultaneously proposed by Shinyama and Sekine (2006) and Banko et al (2007), does not rely on any labeled data or even existing relations. Instead, open information extraction systems only use an unlabeled corpus, and output a set of extracted relations.…”

Section: Related Work (mentioning)

confidence: 99%


A convex relaxation for weakly supervised relation extraction

Grave

2014

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)


A promising approach to relation extraction, called weak or distant supervision, exploits an existing database of facts as training data, by aligning it to an unlabeled collection of text documents. Using this approach, the task of relation extraction can easily be scaled to hundreds of different relationships. However, distant supervision leads to a challenging multiple instance, multiple label learning problem. Most of the proposed solutions to this problem are based on non-convex formulations, and are thus prone to local minima. In this article, we propose a new approach to the problem of weakly supervised relation extraction, based on discriminative clustering and leading to a convex formulation. We demonstrate that our approach outperforms state-of-the-art methods on the challenging dataset introduced by Riedel et al. (2010).
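The alignment step described in this abstract can be illustrated with a minimal sketch. It assumes that facts are `(entity1, relation, entity2)` triples and that each sentence already comes with its two entity mentions resolved; the function name `build_bags` and the toy data layout are hypothetical and not taken from the paper. The output shows why the resulting problem is multiple instance, multiple label: each entity pair gets a bag of sentences and a set of candidate relations, with no indication of which sentence expresses which relation.

```python
from collections import defaultdict


def build_bags(facts, sentences):
    """Align knowledge-base facts to sentences (illustrative sketch only).

    facts:     iterable of (entity1, relation, entity2) triples
    sentences: iterable of (text, entity1, entity2) tuples
    Returns a dict mapping each aligned entity pair to its bag of
    sentences and its set of candidate relation labels.
    """
    # Index the knowledge base by entity pair.
    relations_by_pair = defaultdict(set)
    for e1, relation, e2 in facts:
        relations_by_pair[(e1, e2)].add(relation)

    # Every sentence mentioning a known pair inherits all of that
    # pair's relations as noisy labels.
    bags = defaultdict(lambda: {"sentences": [], "labels": set()})
    for text, e1, e2 in sentences:
        pair = (e1, e2)
        if pair in relations_by_pair:
            bags[pair]["sentences"].append(text)
            bags[pair]["labels"] |= relations_by_pair[pair]
    return dict(bags)
```

Sentences whose entity pair does not occur in the database are simply left out of the training bags in this sketch.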



“…Instead, open information extraction systems only use an unlabeled corpus, and output a set of extracted relations. Such systems are based on clustering (Shinyama and Sekine, 2006) or self-supervision (Banko et al, 2007). One of the limitations of these systems is the fact that they extract uncanonicalized relations.…”

Section: Related Work (mentioning)

confidence: 99%

A convex relaxation for weakly supervised relation extraction

Grave

2014

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)


“…One way to reason about KB and surface relations is to cluster the relations: whenever two relations appear in the same cluster, they are treated as synonymous (Hasegawa et al, 2004; Shinyama and Sekine, 2006; Yao et al, 2011; Takamatsu et al, 2011; Min et al, 2012; Akbik et al, 2012; de Lacalle and Lapata, 2013). For example, if "criticizes" and "hates" are clustered together, then we may predict "hates"("Dante", "Catholic Church") from the above fact (which is actually not true).…”

Section: Related Work (mentioning)

confidence: 99%

CORE: Context-Aware Open Relation Extraction with Factorization Machines

Petroni, Corro, Gemulla

2015

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing


We propose CORE, a novel matrix factorization model that leverages contextual information for open relation extraction. Our model is based on factorization machines and integrates facts from various sources, such as knowledge bases or open information extractors, as well as the context in which these facts have been observed. We argue that integrating contextual information (such as metadata about extraction sources, lexical context, or type information) significantly improves prediction performance. Open information extractors, for example, may produce extractions that are unspecific or ambiguous when taken out of context. Our experimental study on a large real-world dataset indicates that CORE has significantly better prediction performance than state-of-the-art approaches when contextual information is available.
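The cluster-based reasoning criticized in the citation statement above (relations in the same cluster are treated as synonymous) can be sketched as follows. This is an illustration of that general idea only, not of CORE or of any cited system; the function name `predict_from_clusters` is hypothetical, and the seed fact used in the usage note is an assumption, since the excerpt does not show it.

```python
def predict_from_clusters(facts, clusters):
    """Predict new facts by treating clustered relations as synonyms.

    facts:    set of (relation, subject, object) triples
    clusters: iterable of sets of relation names
    Returns the set of triples that are implied by cluster membership
    but not already present in `facts`.
    """
    # Map every relation to the cluster that contains it.
    cluster_of = {}
    for cluster in clusters:
        for relation in cluster:
            cluster_of[relation] = cluster

    predicted = set()
    for relation, subj, obj in facts:
        # Copy the fact to every other relation in the same cluster.
        for synonym in cluster_of.get(relation, ()):
            candidate = (synonym, subj, obj)
            if synonym != relation and candidate not in facts:
                predicted.add(candidate)
    return predicted
```

Given an assumed seed fact `("criticizes", "Dante", "Catholic Church")` and a cluster `{"criticizes", "hates"}`, the function returns the triple for "hates", which is exactly the kind of over-generalization the statement points out.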


“…Following the idea of preemptive Information Extraction (Shinyama and Sekine, 2006), we pre-extract and store all subtrees and entity types from a given corpus for each sentence with at least two named entities. This allows not only fast retrieval of matching entity pairs for a given set of subtrees and type restrictions, but also allows us to compute pattern correlations over the entire dataset for the presently selected setup.…”

Section: Preemptive Pattern Extraction (mentioning)

confidence: 99%

SCHNÄPPER: A Web Toolkit for Exploratory Relation Extraction

Michael, Akbik

2015

Proceedings of ACL-IJCNLP 2015 System Demonstrations


We present SCHNÄPPER, a web toolkit for Exploratory Relation Extraction (ERE). The tool allows users to identify relations of interest in a very large text corpus in an exploratory and highly interactive fashion. With this tool, we demonstrate the ease-of-use and intuitive nature of ERE, as well as its applicability to large corpora. We show how users can formulate exploratory, natural language-like pattern queries that return relation instances. We also show how automatically computed suggestions are used to guide the exploration process. Finally, we demonstrate how users create extractors with SCHNÄPPER once a relation of interest is identified.
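The preemptive pattern extraction mentioned in the citation statement for this paper (store patterns and entity types up front for every sentence with at least two named entities, then retrieve entity pairs by pattern and type restriction) can be sketched as a small inverted index. This is a simplified illustration, not the toolkit's implementation: the class `PreemptiveIndex`, its method names, the use of plain strings in place of dependency subtrees, and the choice to match any of the queried patterns are all assumptions.

```python
from collections import defaultdict


class PreemptiveIndex:
    """Inverted index from pattern to typed entity pairs (sketch)."""

    def __init__(self):
        self.pairs_by_pattern = defaultdict(set)

    def add_sentence(self, entities, patterns):
        """entities: list of (name, type) tuples found in the sentence;
        patterns: strings standing in for the extracted subtrees."""
        if len(entities) < 2:
            return  # only sentences with at least two named entities are stored
        subj, obj = entities[0], entities[1]
        for pattern in patterns:
            self.pairs_by_pattern[pattern].add((subj, obj))

    def query(self, patterns, subj_type=None, obj_type=None):
        """Return sorted (subject, object) name pairs that match any of
        the given patterns and satisfy the optional type restrictions."""
        hits = set()
        for pattern in patterns:
            for subj, obj in self.pairs_by_pattern.get(pattern, ()):
                if subj_type is not None and subj[1] != subj_type:
                    continue
                if obj_type is not None and obj[1] != obj_type:
                    continue
                hits.add((subj[0], obj[0]))
        return sorted(hits)
```

Because all extraction work happens in `add_sentence`, a query only touches the stored sets, which is what makes retrieval fast enough for interactive exploration in this sketch.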
