In the selective dissemination of information (or publish/subscribe) paradigm, clients subscribe to a server with continuous queries (or profiles) that express their information needs. Clients can also publish documents to servers. Whenever a document is published, the continuous queries that the document satisfies are found and notifications are sent to the appropriate clients. This paper deals with the filtering problem that each server must solve efficiently: given a database of continuous queries db and a document d, find all queries q ∈ db that match d. We present data structures and indexing algorithms that solve the filtering problem efficiently for large databases of queries expressed in the model AWP, which is based on named attributes with values of type text and on word proximity operators.
In the information filtering paradigm, clients subscribe to a server with continuous queries or profiles that express their information needs. Clients can also publish documents to servers. Whenever a document is published, the continuous queries that the document satisfies are found and notifications are sent to the appropriate clients. This article deals with the filtering problem that each server must solve efficiently: given a database of continuous queries db and a document d, find all queries q ∈ db that match d. We present data structures and indexing algorithms that solve the filtering problem efficiently for large databases of queries expressed in the model AWP. AWP is based on named attributes with values of type text, and its query language includes Boolean and word proximity operators.
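To make the filtering problem concrete, the following is a minimal brute-force sketch, not the indexing algorithms the abstract refers to. It assumes a simplified AWP-like model in which a document maps attribute names to text, and a query is a conjunction of required words plus ordered word-proximity constraints. The names `Query`, `matches`, and `filter_queries`, and the exact proximity semantics (the second word follows the first with at most `max_gap` words in between), are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical, simplified continuous query over text-valued attributes."""
    qid: str
    # attribute name -> list of words that must all occur in that attribute
    required: dict = field(default_factory=dict)
    # list of (attribute, first_word, second_word, max_gap) constraints:
    # second_word must follow first_word with at most max_gap words in between
    proximity: list = field(default_factory=list)


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for the sketch.
    return text.lower().split()


def matches(query, document):
    """Return True if the document satisfies every condition of the query."""
    tokens = {attr: tokenize(text) for attr, text in document.items()}
    for attr, words in query.required.items():
        present = set(tokens.get(attr, []))
        if not all(word in present for word in words):
            return False
    for attr, first, second, max_gap in query.proximity:
        words = tokens.get(attr, [])
        starts = [i for i, w in enumerate(words) if w == first]
        ends = [j for j, w in enumerate(words) if w == second]
        if not any(0 < j - i <= max_gap + 1 for i in starts for j in ends):
            return False
    return True


def filter_queries(db, document):
    """Naive filtering: test every query in db against the document."""
    return [q.qid for q in db if matches(q, document)]


db = [
    Query("q1", required={"title": ["dissemination"]},
          proximity=[("abstract", "filtering", "queries", 3)]),
    Query("q2", required={"title": ["retrieval"]}),
    Query("q3", proximity=[("abstract", "filtering", "queries", 1)]),
]
d = {
    "title": "Selective dissemination of information",
    "abstract": "efficient filtering algorithms for continuous queries",
}
matched = filter_queries(db, d)
```

This baseline scans the whole query database for each published document, so its cost grows linearly with the number of queries; that linear scan is what an index over the queries is meant to avoid.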
Abstract. This paper presents P2P-DIET, an implemented resource sharing system that unifies one-time and continuous query processing in super-peer networks. P2P-DIET offers a simple data model for the description of network resources based on attributes with values of type text and a query language based on concepts from Information Retrieval. The focus of this paper is on the main modelling concepts of P2P-DIET (metadata, advertisements and queries), the routing algorithms (inspired by the publish/subscribe system SIENA) and the scalable indexing of resource metadata and queries.
Abstract. Over the past few years, Peer-to-Peer (P2P) systems have become very popular for constructing overlay networks of many nodes (peers) that allow geographically distributed users to share data and resources. One non-trivial question is how to distribute the data among the peers in a fair and fully decentralized manner. This is important because it can improve resource usage, minimize network latencies and reduce the volume of unnecessary traffic incurred in large-scale P2P systems. In this paper we present a technique for fair resource allocation in unstructured Peer-to-Peer systems. Our technique uses the Fairness Index of a distribution as a measure of fairness and shows how to optimize the fairness of the distribution using only local decisions. Load balancing is achieved by replicating documents across multiple nodes in the system. Our experimental results demonstrate that our technique is scalable, has low overhead and achieves good load balance even under skewed demand.
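The abstract does not define the Fairness Index. As an illustration only, the sketch below assumes it refers to Jain's fairness index, (Σxᵢ)² / (n · Σxᵢ²), computed over per-node loads; this choice is an assumption and is not confirmed by the text above. The function name `fairness_index` and the sample loads are hypothetical.

```python
def fairness_index(loads):
    """Jain's fairness index of a load distribution (assumed definition).

    Returns 1.0 when every node carries the same load and 1/n when a
    single node carries all of it.
    """
    if not loads:
        raise ValueError("loads must not be empty")
    total = sum(loads)
    sum_of_squares = sum(x * x for x in loads)
    if sum_of_squares == 0:
        # No load anywhere: treat the distribution as perfectly fair.
        return 1.0
    return (total * total) / (len(loads) * sum_of_squares)


# One node serves all requests for a popular document.
skewed = fairness_index([10, 0, 0, 0])
# Replicating the document on a second node splits the demand.
after_replication = fairness_index([5, 5, 0, 0])
# Demand spread evenly over all four nodes.
balanced = fairness_index([5, 5, 5, 5])
```

Under this assumed definition, replicating a heavily requested document onto another node moves the index toward 1, which is the direction a load-balancing scheme based on replication would aim for.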
We study the problem of selective dissemination of information in P2P networks. We present our work on data models and languages for textual information dissemination and discuss a relevant P2P architecture that motivates our efforts. We also survey our results on the computational complexity of three related algorithmic problems (query satisfiability, entailment and filtering) and present efficient algorithms for the most crucial of these problems (filtering). Finally, we discuss the features of P2P-DIET, a super-peer system we have implemented at the Technical University of Crete, that realizes our vision and is able to support both ad-hoc querying and selective information dissemination scenarios in a P2P framework.