2012
DOI: 10.1007/978-3-642-33486-3_20

Coupled Bayesian Sets Algorithm for Semi-supervised Learning and Information Extraction

Abstract: Our inspiration comes from NELL (Never Ending Language Learning), a computer program running at Carnegie Mellon University that extracts structured information from unstructured web pages. We consider a semi-supervised learning approach to extracting category instances (e.g. country(USA), city(New York)) from web pages, starting with a handful of labeled training examples of each category or relation, plus hundreds of millions of unlabeled web documents. Semi-supervised approaches using a small number of…

Cited by 10 publications (13 citation statements)
References 9 publications
“…CBS algorithm by Verma and Hruschka [11] also addresses the problem of extracting noun phrases to populate category instances. Likewise, it adapts a semi-supervised approach and it uses the co-occurrence statistics between nouns and contexts.…”
Section: Nell Cpl Cbs and Beyond (mentioning)
confidence: 99%
“…Those candidates are then ranked and the uppermost ones are promoted to be the trusted category instances, which will then be used as seeds in the following iterations. An in-depth discussion on Bayesian Sets and Coupled Bayesian Sets is available in [11].…”
Section: Nell Cpl Cbs and Beyond (mentioning)
confidence: 99%
“…Systems for learning categories and relations of entities on the web, like the Never-Ending Language Learner (NELL) system (Carlson et al 2010a,b; Verma and Hruschka 2012), or KnowItAll (Etzioni et al 2005) can be used to construct lists but require extensive preprocessing. We do not preprocess, instead we perform information extraction online, deterministically, and virtually instantaneously given access to a search engine.…”
Section: Related Work (mentioning)
confidence: 99%