Answering Imprecise Queries over Autonomous Web Databases

Nambiar, Ullas; Kambhampati, Subbarao

doi:10.1109/icde.2006.20

Cited by 44 publications

(22 citation statements)

References 18 publications


“…In this context, a related issue is handling query imprecision: most users of online databases tend to pose imprecise queries which admit answers with varying degrees of relevance (cf. [25]). In our ongoing work, we are investigating the issues of simultaneously handling data incompleteness and query imprecision [16].…”

Section: Results (mentioning)

confidence: 99%


Query processing over incomplete autonomous databases: query rewriting using learned data dependencies

et al. 2009


Incompleteness due to missing attribute values (aka "null values") is very common in autonomous web databases, on which user accesses are usually supported through mediators. Traditional query processing techniques that focus on the strict soundness of answer tuples often ignore tuples with critical missing attributes, even if they wind up being relevant to a user query. Ideally we would like the mediator to retrieve such possible answers and gauge their relevance by assessing their likelihood of being pertinent answers to the query. The autonomous nature of web databases poses several challenges in realizing this objective. Such challenges include the restricted access privileges imposed on the data, the limited support for query patterns, and the bounded pool of database and network resources in the web environment. We introduce a novel query rewriting and optimization framework, QPIAD, that tackles these challenges. Our technique involves reformulating the user query based on mined correlations among the database attributes. The reformulated queries are aimed at retrieving the relevant possible answers in addition to the certain answers. QPIAD is able to gauge the relevance of such queries, allowing tradeoffs in reducing the costs of database query processing and answer transmission. To support this framework, we develop methods for mining attribute correlations (in terms of Approximate Functional Dependencies), value distributions (in the form of Naïve Bayes Classifiers), and selectivity estimates. We present empirical studies to demonstrate that our approach is able to effectively retrieve relevant possible answers with high precision, high recall, and manageable cost.
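The abstract above mentions mining attribute correlations as Approximate Functional Dependencies (AFDs) and using them to reformulate queries. The following is only a minimal sketch of that general idea, not the QPIAD implementation: the column names (`model`, `body`), the sample rows, and both function names are hypothetical, and plain conditional frequencies stand in for the Naïve Bayes classifiers the abstract refers to.

```python
from collections import Counter, defaultdict


def _group_counts(rows, lhs, rhs):
    """Count rhs values per lhs value, skipping rows where either is missing."""
    groups = defaultdict(Counter)
    for row in rows:
        if row[lhs] is not None and row[rhs] is not None:
            groups[row[lhs]][row[rhs]] += 1
    return groups


def afd_confidence(rows, lhs, rhs):
    """Fraction of rows that agree with lhs -> rhs when every lhs value
    is mapped to its most frequent rhs value (one common AFD measure)."""
    groups = _group_counts(rows, lhs, rhs)
    total = sum(sum(counts.values()) for counts in groups.values())
    if total == 0:
        return 0.0
    kept = sum(max(counts.values()) for counts in groups.values())
    return kept / total


def rewrite_candidates(rows, lhs, rhs, target):
    """Rank lhs values by the estimated chance that rhs equals target.
    Each returned value could seed a rewritten query that looks for
    tuples with that lhs value and a missing rhs value (an assumption)."""
    ranked = []
    for value, counts in _group_counts(rows, lhs, rhs).items():
        if counts[target] > 0:
            ranked.append((value, counts[target] / sum(counts.values())))
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Hypothetical sample, as might be obtained by probing a source.
SAMPLE = [
    {"model": "Z4", "body": "convertible"},
    {"model": "Z4", "body": "convertible"},
    {"model": "Z4", "body": "coupe"},
    {"model": "Civic", "body": "sedan"},
    {"model": "Civic", "body": "sedan"},
    {"model": "Civic", "body": "coupe"},
    {"model": "Boxster", "body": "convertible"},
]

if __name__ == "__main__":
    print(afd_confidence(SAMPLE, "model", "body"))
    print(rewrite_candidates(SAMPLE, "model", "body", "convertible"))
```

Under these assumptions, a user query for convertibles could be supplemented with queries on the highest-ranked `model` values, so that tuples whose `body` value is missing are retrieved as possible answers and ordered by the estimated probability.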



“…Our work has some relations to both query reformulation and query relaxation [25,24] approaches. An important difference is our focus on retrieving tuples with missing values on constrained attributes.…”

Section: Query Reformulation and Relaxation (mentioning)

confidence: 99%

Query processing over incomplete autonomous databases: query rewriting using learned data dependencies

et al. 2009


“…Pre-processing 1) The Definition of Attribute Importance: We learn AFDs [8] of a sample dataset to estimate attribute importance. The sample dataset is obtained via query-based sampling from each data source.…”

Section: Introduction (mentioning)

confidence: 99%

An Effective and High-quality Query Relaxation Solution on the Deep Web

Shan, Shen, Nie, et al. 2010

2010 12th International Asia-Pacific Web Conference


Because the amount of information contained on the Deep Web is much larger than that on the surface web, how to use it well has become a popular research problem. When a query is sent to a deep web resource and the data sources return few results or even no result, a proper query relaxation solution should be adopted to return more satisfactory results to users. In this paper, such a query relaxation solution is presented. First, it solves the problem of relaxing attributes that contain multiple keywords by value. That is, such attributes are not simply removed in the relaxation; instead, the query values of the attributes are modified. Second, when a data source returns many result pages, instead of getting all the pages, it evaluates the quality of the results in the current page to decide whether to send another query to fetch the next page. Thus, the number of queries is reduced. Finally, the experimental results demonstrate that both the result quality and the query efficiency are improved.

I. INTRODUCTION

With the rapid expansion of the World Wide Web, there are a large number of web databases which contain a large amount of information hidden behind the web sites. Sending queries to these data sources has become a widely used way for people to get information. However, sometimes users may not be able to send queries that return satisfying results, or that return any result at all. Such queries are called failed queries, that is, queries that get few results or no result.

When the case mentioned above occurs, query relaxation should be adopted in order to get more results. Query relaxation aims to modify the original query and change the constraints to avoid failed queries. Although these results may not satisfy users' needs best, users may still accept them as a compromise, and the results can be filtered in the results integration phase.

Because results are returned as pages, fetching each further result page actually sends another query to the data source. It is a good idea to evaluate the quality of the result pages already fetched, and if the results are good enough, it is not necessary to get the remaining pages. Thus, the number of queries would be greatly reduced.

In this paper, we focus on query relaxation on the deep web and propose a solution which is both effective and of high quality. When a failed query occurs, a query relaxation is executed automatically to ensure that some results are returned to users.

The contributions of this paper are as follows:
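The two ideas described in this abstract, relaxing a multi-keyword attribute by value rather than dropping it, and deciding page by page whether another query is worth sending, could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: `relax_by_value`, `fetch_until_satisfied`, the scoring function, and the thresholds are all hypothetical, and "modifying the query value" is approximated here by trying progressively smaller keyword subsets.

```python
from itertools import combinations


def relax_by_value(keywords):
    """Yield relaxed keyword lists for one attribute, largest subsets first,
    so the attribute is weakened instead of being removed from the query."""
    for size in range(len(keywords) - 1, 0, -1):
        for subset in combinations(keywords, size):
            yield list(subset)


def fetch_until_satisfied(fetch_page, score, needed, min_score=0.5, max_pages=10):
    """Fetch result pages one at a time and stop as soon as enough
    sufficiently good results have been collected.

    fetch_page(n) returns the list of results on page n (empty when exhausted);
    score(result) returns a quality value in [0, 1]. Both are caller-supplied.
    """
    good = []
    pages_fetched = 0
    for page_no in range(max_pages):
        page = fetch_page(page_no)
        pages_fetched += 1
        if not page:
            break  # the source has no more results
        good.extend(r for r in page if score(r) >= min_score)
        if len(good) >= needed:
            break  # results so far are good enough; skip the remaining pages
    return good, pages_fetched


# Hypothetical source: each "result" is simply its own quality score.
PAGES = [[0.9, 0.2], [0.8, 0.7], [0.6]]


def demo_fetch(page_no):
    return PAGES[page_no] if page_no < len(PAGES) else []


if __name__ == "__main__":
    print(list(relax_by_value(["deep", "web", "query"])))
    print(fetch_until_satisfied(demo_fetch, lambda r: r, needed=3))
```

In the demo, the third page is never requested because the first two already supply three results above the threshold, which is the kind of saving in query count the abstract describes.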


“…Rank of the match is gauged by the degree of tree structure match, as well as the degree of corresponding node tag and node value match. Existing approaches [9,3,6] can be plugged in XFinder to measure similarity of tags and values. In this paper, we focus discussion on approximate structural matching between ordered XML data and queries, the unique challenge in XML data processing.…”

Section: Introduction (mentioning)

confidence: 99%

Approximate Structural Matching over Ordered XML Documents

Agarwal, Oliveras, Chen 2007

11th International Database Engineering and Applications Symposium (IDEAS 2007)


There is an increasing need for an XML query engine that not only searches for exact matches to a query but also returns "query-like" structures. We have designed and developed XFinder, an efficient top K tree pattern query evaluation system, which reduces the problem of approximate tree structural matching to a simpler problem of subsequence matching. However, since not all subsequences correspond to valid tree structures, it is expensive to enumerate common subsequences between XML data and query and then filter the invalid ones. XFinder addresses this challenge by detecting and pruning structurally irrelevant subsequence matches as early as possible. Experiments show the efficiency of XFinder on various data and query sets.
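To illustrate what reducing tree matching to subsequence matching can look like, the sketch below flattens two ordered trees into tag sequences and measures their longest common subsequence. It is a simplified illustration, not XFinder itself: the preorder encoding, the tuple-based tree representation, and the function names are assumptions, and the sketch does no pruning of subsequences that fail to correspond to valid tree structures, which is the challenge the abstract says XFinder addresses.

```python
def preorder(tree):
    """Flatten an ordered tree, given as (tag, [children]), into a tag list."""
    tag, children = tree
    sequence = [tag]
    for child in children:
        sequence.extend(preorder(child))
    return sequence


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences,
    computed row by row with the standard dynamic program."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


# Hypothetical document and query trees.
DOC = ("book", [("title", []), ("author", [("name", [])]), ("year", [])])
QUERY = ("book", [("author", []), ("year", [])])

if __name__ == "__main__":
    doc_seq = preorder(DOC)
    query_seq = preorder(QUERY)
    print(doc_seq, query_seq, lcs_length(doc_seq, query_seq))
```

A score such as the matched length divided by the query length could then serve as a crude ranking signal, with the caveat that a common subsequence alone does not guarantee a structurally valid match.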

