We consider the problem of efficiently indexing Disjunctive Normal Form (DNF) and Conjunctive Normal Form (CNF) Boolean expressions over a high-dimensional multi-valued attribute space. The goal is to rapidly find the set of Boolean expressions that evaluate to true for a given assignment of values to attributes. A solution to this problem has applications in online advertising (where a Boolean expression represents an advertiser's user targeting requirements, and an assignment of values to attributes represents the characteristics of a user visiting an online page) and in general any publish/subscribe system (where a Boolean expression represents a subscription, and an assignment of values to attributes represents an event). All existing solutions that we are aware of can only index a specialized sub-set of conjunctive and/or disjunctive expressions, and cannot efficiently handle general DNF and CNF expressions (including NOTs) over multi-valued attributes.In this paper, we present a novel solution based on the inverted list data structure that enables us to index arbitrarily complex DNF and CNF Boolean expressions over multi-valued attributes. An interesting aspect of our solution is that, by virtue of leveraging inverted lists traditionally used for ranked information retrieval, we can efficiently return the top-N matching Boolean expressions. This capability enables emerging applications such as ranked publish/subscribe systems [16], where only the top subscriptions that match an event are desired. For example, in online advertising there is a limit on the number of advertisements that can be shown on a given page and only the "best" advertisements can be displayed. We have evaluated our proposed technique based on data from an online advertising application, and the results show a dramatic performance improvement over prior techniques.
Warehouse views need to be updated when source data changes.
tsimmis 1 OverviewIn order to access information from a variety of heterogeneous information sources, one has to be able to translate queries and data from one data model into another.This functionality is provided by so-called (source) wrappers [4,8] which convert queries into one or more commands/queries understandable by the underlying source and transform the native results into a format understood by the application. As part of the TSIMMISproject [1,6] we have developed hard-coded wrappers for a variety of sources (e.g., Sybase DBMS, W WW pages, etc.) including legacy systems (Folio). However, anyone who has built a wrapper before can attest that a lot of effort goos into developing and writing such a wrapper. In situations where it is important or desirable to gain access to new sources quicldy, this is a major drawback. Furthermore, we have also observed that only a relatively small part of the code deals with the specific access details of the source. The rest of the code is either common among wrappers or implements query and data transformation that could be expressed in a high level, declarative fashion.Based on these observations, we have developed a wrapper implementation toolkit [7] for quickly building wrappers. The toolkit contains a library for commonly used functions, such as for receiving queries from the application and packaging results. It also ' Permission to make digitellhard copy of part or all this work for personal or clacsroom use is granted without fee provided that contains a facility for translating queries into sourcespecific commands, and for translating results into a model useful to the application.The philosophy behind our "template-baaed" translation methodology is as follows. The wrapper implementor specifies a set of templates (rules) written in a high level declarative language that describe the queries accepted by the wrapper as well as the objects that it returns. If an application query matches a template, an implementorprovided action associated with the template is executed to rovide the native query for the underly-F ing source . When the source returns the result of the query, the wrapper transforms the answer which is represented in the data model of the source into a representation that is used by the application. Using this toolkit one can quicldy design a simple wrapper with a few templates that cover some of the desired functionality, probably the one that is most urgently needed. However, templates can be added gradually as more functionality is required later on.Another important use of wrappers is in extending the query capabilities of a source. For instance, some sources may not be capable of answering queries that have multiple predicates. In such cases, it is necessary to pose a native query to such a source using only predicates that the source is capable of handling. The rest of the predicates are automatically separated from the user query and form a jilter query.When the wrapper receives the results, a poet-processing engine applies the filter query, ...
Existing data-integration systems based on the mediation architecture employ a v ariety of mechanisms to describe the query-processing capabilities of sources. However, these systems do not compute the capabilities of the mediators based on the capabilities of the sources they integrate. In this paper, we propose a framework to capture a rich v ariety of query-processing capabilities of data sources and mediators. We present algorithms to compute the set of supported queries of a mediator, based on the capability limitations of its sources. Our algorithms take i n to consideration a variety of query-processing techniques employed by mediators to enhance the set of supported queries.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.