Proceedings of the 23rd International Conference on World Wide Web 2014
DOI: 10.1145/2567948.2576947

Reachable subwebs for traversal-based query execution

Abstract: Traversal-based approaches to execute queries over data on the Web have recently been studied. These approaches make use of up-to-date data from initially unknown data sources and, thus, enable applications to tap the full potential of the Web. While existing work focuses primarily on implementation techniques, a principled analysis of subwebs that are reachable by such approaches is missing. Such an analysis may help to gain new insight into the problem of optimizing the response time of traversal-based query …

Citations: cited by 6 publications (9 citation statements)
References: 13 publications (25 reference statements)
“…The reason for the negative performance of this approach, as well as any other possible purely graph-based approach, is that the applied vertex scoring method rates document and URI vertices only based on graph-specific properties, whereas the result-relevance of reachable documents is independent of such properties. In fact, in our earlier work we show empirically that there does not exist a correlation between the result-relevance (or irrelevance) of reachable documents and the indegree of the corresponding document vertices in the Web graph model (similarly, for the PageRank, the HITS scores, the k-step Markov score, and the betweenness centrality) [9].…”
Section: Evaluation Of the Purely Graph-based Approaches
mentioning
confidence: 89%
“…In an earlier, more detailed analysis of these queries we make the same observation for the other 13 test Webs [9]. Moreover, if we consider each query in separation and compare its reachable …”
Section: Test Queries
mentioning
confidence: 95%
“…SQUIN 22 is an iterator-based implementation of the LTQBE paradigm, while LiDaQ [Umbrich et al 2014] also considers lightweight reasoning extensions to find additional sources on-the-fly and generate further answers. Recently, an analysis of subwebs that are reachable by LTQBE engines was performed [Hartig and Ozsu 2014] giving some hints about why results obtained with these engines are not complete. We discuss the differences between LTQBE engines and swget both in terms of scope and expressiveness.…”
Section: Related Work
mentioning
confidence: 99%
“…It is now well-known that the cost of data retrieval dominates the cost of link-traversal query execution [25]. That is why the query engine's ability to entail that it does not need to exhaustively explore the entire reachable subweb is essential for efficient query execution.…”
Section: Examples
mentioning
confidence: 99%
“…At each iteration, we will experimentally validate the performance of the optimized engine against the baseline version of the same engine. For this purpose, we have extended the experimental setup used in [25]. 4 https://jena.apache.org 5 http://squin.org…”
mentioning
confidence: 99%