2013
DOI: 10.1145/2516633.2516638

The parallel path framework for entity discovery on the web

Abstract: It has been a dream of the database and Web communities to reconcile the unstructured nature of the World Wide Web with the neat, structured schemas of the database paradigm. Even though databases are currently used to generate Web content in some sites, the schemas of these databases are rarely consistent across a domain. This makes the comparison and aggregation of information from different domains difficult. We aim to make an important step towards resolving this disparity by using the structural and relat…

Citations: cited by 8 publications (3 citation statements)
References: 36 publications (46 reference statements)
“…This is the case of hyperlinks used to enforce the Web page authority in a link-based ranking scenario, short-cut hyperlinks, etc. The solution we propose, based on the usage of Web lists, has a twofold effect: on the one hand, it guarantees that only hyperlinks which may belong to potential navigation systems are considered; on the other hand, it allows the method to identify hyperlinks by implicitly taking into account the Web page structure codified in the Web lists available in the Web pages [5,22,37,42], even if the hyperlinks do not belong to the navigation system. The crawling algorithm is described in Algorithm 1.…”
Section: Website Crawling (mentioning)
confidence: 99%
“…Also, there are solutions that attempt to induce ontologies from natural language text [20,37,49]. The examples of IE for semi-structured texts are described in [26,32,34,50,55,61,76]. In the case of databases […] this reason, the author assumes that it is important to formally define the term IS to better understand the rest of this article and its role in IES.…”
Section: State of the Art and Related Work (mentioning)
confidence: 99%
“…Hyperlinks are typically assumed to be intentional and reflective of some organizational structure - a structure typically decided by the organization and implemented by a Web administrator (Weninger, Johnston, and Han 2013). But what happens when information structures and communities are defined by the users themselves?…”
Section: Introduction (mentioning)
confidence: 99%