Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2014
DOI: 10.3115/v1/d14-1208

Noisy Or-based model for Relation Extraction using Distant Supervision

Abstract: Distant supervision, a paradigm of relation extraction where training data is created by aligning facts in a database with a large unannotated corpus, is an attractive approach for training relation extractors. Various models are proposed in recent literature to align the facts in the database to their mentions in the corpus. In this paper, we discuss and critically analyse a popular alignment strategy called the "at least one" heuristic. We provide a simple, yet effective relaxation to this strategy. We formu…
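The noisy-OR model named in the title aggregates mention-level evidence: an entity pair expresses a relation unless *every* mention fails to express it. A minimal sketch of that aggregation, with hypothetical probabilities (not from the paper):

```python
# Noisy-OR aggregation over mention-level probabilities (illustrative sketch;
# the function name and example values are hypothetical, not from the paper).
def noisy_or(mention_probs):
    """P(relation holds for an entity pair) under a noisy-OR model:
    the relation fails only if every mention fails to express it."""
    p_all_fail = 1.0
    for p in mention_probs:
        p_all_fail *= (1.0 - p)
    return 1.0 - p_all_fail

# Three mentions with weak individual evidence combine into stronger evidence.
print(round(noisy_or([0.4, 0.5, 0.3]), 3))  # 0.79
```

Note how this contrasts with the "at least one" heuristic, which makes a hard assumption that at least one mention expresses the relation; the noisy-OR formulation softens that assumption probabilistically.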

Cited by 7 publications (6 citation statements) | References 4 publications (7 reference statements)
“…While supervised entity recognition systems [14,34] focus on a few common entity types, weakly-supervised methods [18,36] and distantly-supervised methods [41,54,26] use a large text corpus and a small set of seeds (or a knowledge base) to induce patterns or to train models, and thus can be applied to different domains without additional human annotation effort. For relation extraction, similarly, weak supervision [6,13] and distant supervision [35,53,49,21,43,31] approaches have been proposed to address the domain-restriction issue in traditional supervised systems [2,33,17]. However, such a "pipeline" paradigm ignores the dependencies between different subtasks and may suffer from error propagation between them.…”
Section: Related Work
confidence: 99%
“…In addition, Fan et al. [24] presented a novel framework integrating active learning and weakly supervised learning. Nagesh et al. [25] solved the label-assignment problem with integer linear programming (ILP) and improved on the baselines. There are also deep learning based methods that use convolutional neural networks for feature modeling and multi-instance learning (MIL) for distant supervision [26].…”
Section: Distant Supervision for Relation Extraction
confidence: 99%
“…Most aforementioned work used SIL, MIL, or MIML to train classifiers, which set strong baselines in this field. In addition, recent research also includes embedding-based models that transform the relation extraction problem into a translation model of the form h + r ≈ t [22][23][24], non-negative matrix factorization (NMF) models [8,9] with the characteristic of training and testing jointly, integrating active learning and weakly supervised learning [25], integer linear programming (ILP) [26], and so on.…”
Section: Distant Supervision for Relation Extraction
confidence: 99%
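The translation model h + r ≈ t mentioned above treats a relation as a vector that translates the head-entity embedding toward the tail. A minimal sketch of that scoring idea, with toy untrained embeddings (values are illustrative, not from any cited model):

```python
# Sketch of translation-based scoring (the h + r ≈ t idea): a relation vector r
# should translate head embedding h onto tail embedding t. Smaller distance
# means a more plausible triple. Embeddings are toy values, not trained.
def translation_score(h, r, t):
    """L2 distance ||h + r - t||."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

h = [0.1, 0.2]   # head entity
r = [0.3, 0.1]   # relation
t = [0.4, 0.3]   # tail entity

# A perfectly translated triple scores near 0; a mismatched tail scores higher.
print(translation_score(h, r, t))            # ~0
print(translation_score(h, r, [1.0, 1.0]))   # larger distance
```

In a trained system these embeddings are learned so that observed triples score low and corrupted triples score high; the sketch only shows the scoring function itself.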
“…We test on the KBP dataset, one of the benchmark datasets in this literature, constructed by Surdeanu et al. [4]. The resources come mainly from the TAC KBP 2010 and 2011 slot filling shared tasks [25,26], which contain 183,062 and 3,334 entity pairs for training and testing, respectively. The free texts come from the collection provided by the shared task, which contains approximately 1.5 million documents from a variety of sources, including newswire, blogs, and telephone conversation transcripts.…”
Section: Dataset Description
confidence: 99%