1999
DOI: 10.1007/10704656_11

Extracting Patterns and Relations from the World Wide Web

Abstract: The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of extracting a relation for such a data type from all of these sources automatically. We present a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a…
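
The pattern/relation duality mentioned in the abstract can be illustrated with a minimal sketch: known pairs are used to find textual patterns, and those patterns are then used to find more pairs. This is not the paper's implementation. The function names, the toy corpus, the (author, title) relation, the fixed title-before-author order, and the two-character context window are all assumptions made for illustration.

```python
import re


def induce_patterns(corpus, relation, context=2):
    """Collect (prefix, middle, suffix) contexts around known (author, title) pairs."""
    patterns = set()
    for text in corpus:
        for author, title in relation:
            # Simplifying assumption: the title precedes the author,
            # separated by a short "middle" string of at most 10 characters.
            regex = re.escape(title) + r"(.{1,10}?)" + re.escape(author)
            for m in re.finditer(regex, text):
                prefix = text[max(0, m.start() - context):m.start()]
                suffix = text[m.end():m.end() + context]
                if prefix and suffix:
                    patterns.add((prefix, m.group(1), suffix))
    return patterns


def apply_patterns(corpus, patterns):
    """Use each pattern to extract new (author, title) pairs from the corpus."""
    pairs = set()
    for prefix, middle, suffix in patterns:
        regex = (re.escape(prefix) + r"(.+?)" + re.escape(middle)
                 + r"(.+?)" + re.escape(suffix))
        for text in corpus:
            for m in re.finditer(regex, text):
                # Group 1 is the title, group 2 the author (assumed order).
                pairs.add((m.group(2), m.group(1)))
    return pairs


def bootstrap(corpus, seeds, rounds=2):
    """Alternate between inducing patterns and growing the relation."""
    relation = set(seeds)
    for _ in range(rounds):
        patterns = induce_patterns(corpus, relation)
        relation |= apply_patterns(corpus, patterns)
    return relation


# Hypothetical toy corpus: three list items that share one layout.
corpus = [
    "<li><b>The Robots of Dawn</b> by Isaac Asimov (1983)</li>",
    "<li><b>Foundation</b> by Isaac Asimov (1951)</li>",
    "<li><b>Dune</b> by Frank Herbert (1965)</li>",
]
seeds = {("Isaac Asimov", "The Robots of Dawn")}

found = bootstrap(corpus, seeds)
for author, title in sorted(found):
    print(f"{author}: {title}")
```

On this toy corpus the single seed pair yields one pattern, and that pattern recovers the other two pairs. A realistic version would also need to score patterns for specificity so that an overly general pattern does not flood the relation with false pairs; the sketch omits that step.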

Cited by 613 publications
(424 citation statements)
References 1 publication
“…In traditional relation extraction, the sources of entities usually come from terms in unstructured documents such as Web pages or structured documents such as relational databases. A wide variety of data sources have been used in relation extraction research, e.g., Web pages (Brin, 1998), corpus (Bunescu & Mooney, 2007), and socially generated Wikipedia articles (Nguyen et al, 2007). The semantic and linguistic sources for exploring relations can be a corpus containing the context of entities, and this context information can serve as the basis of relation assignment.…”
Section: Literature Review (mentioning)
confidence: 99%
“…Machine learning algorithms are implemented to learn features of relations, and assign relations to entities whose relations are not yet known. Text relation extraction also involves entity extraction for identifying entities or concepts (Brin, 1998; Iria & Ciravegna, 2005; Nguyen et al., 2007; Roth & Yih, 2002).…”
Section: Literature Review (mentioning)
confidence: 99%
“…Instead of a single deterministic run, the algorithm runs continuously exploring more and more sites. In [19], the extraction of structured data is achieved from information offered by unstructured data on the web. The example used is to search for books in the web starting from a small sample of books from which a pattern is extracted.…”
Section: Mining the Web (mentioning)
confidence: 99%
“…Brin proposed a method called Dual Iterative Pattern Relation Extraction (DIPRE) in his paper from 1998 [2]. He tested the method on part of his Google corpus, which at the time consisted of about 24 million web pages, to learn patterns that link authors to titles of their books.…”
Section: Related Work (mentioning)
confidence: 99%
“…In practice different data sources are often annotated with different ontologies. In order to provide integrated access using multiple ontologies, some form of ontology mapping needs to be done.…”
Section: Introduction (mentioning)
confidence: 99%