Event matching is the process of checking high volumes of events against large numbers of subscriptions and is a fundamental issue for the overall performance of a large-scale distributed publish/subscribe system. Most existing algorithms are based on counting the satisfied component constraints in each subscription. As the scale of a system grows, these algorithms inevitably suffer from performance degradation. We present REIN (REctangle INtersection), a fast event matching approach for large-scale content-based publish/subscribe systems. The idea behind REIN is to quickly filter out subscriptions that are unlikely to match. In REIN, the event matching problem is first transformed into the rectangle intersection problem. Then, an efficient index structure is designed to solve that problem using bit operations. Experimental results show that REIN achieves better matching performance than its counterparts. In particular, event matching is faster by an order of magnitude when the selectivity of subscriptions is high and the number of subscriptions is large.
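The filtering idea can be illustrated with a minimal Python sketch. It is not the REIN index itself, which the abstract does not describe in detail. The sketch assumes that each subscription is a set of closed integer intervals (one per constrained attribute), keeps two sorted lists per attribute (an assumption made here for illustration), and uses a Python integer as a bitset in which a set bit marks a subscription as unmatched. The names `EliminationIndex`, `add`, and `match` are hypothetical.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict


class EliminationIndex:
    """Marks subscriptions that cannot match, then returns the rest."""

    def __init__(self):
        self.count = 0
        self.lows = defaultdict(list)   # attr -> sorted list of (lo, sid)
        self.highs = defaultdict(list)  # attr -> sorted list of (hi, sid)

    def add(self, constraints):
        """constraints: {attr: (lo, hi)} with closed integer intervals."""
        sid = self.count
        self.count += 1
        for attr, (lo, hi) in constraints.items():
            insort(self.lows[attr], (lo, sid))
            insort(self.highs[attr], (hi, sid))
        return sid

    def match(self, event):
        """event: {attr: value}. Returns ids of matching subscriptions."""
        unmatched = 0  # bit i set => subscription i is ruled out
        for attr, lows in self.lows.items():
            highs = self.highs[attr]
            if attr not in event:
                # A constrained attribute missing from the event rules out
                # every subscription that constrains it.
                for _, sid in lows:
                    unmatched |= 1 << sid
                continue
            value = event[attr]
            # Rule out constraints whose lower bound lies above the value.
            start = bisect_right(lows, (value, float("inf")))
            for _, sid in lows[start:]:
                unmatched |= 1 << sid
            # Rule out constraints whose upper bound lies below the value.
            end = bisect_left(highs, (value, -1))
            for _, sid in highs[:end]:
                unmatched |= 1 << sid
        return [sid for sid in range(self.count)
                if not (unmatched >> sid) & 1]
```

Marking the subscriptions that fail, rather than counting the constraints that succeed, means a subscription is touched only for the constraints it violates. The more selective the subscriptions are, the more of them a single pass over the sorted lists rules out.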
Content-based publish/subscribe systems have been employed to deal with complex distributed information flows in many applications. It is well recognized that event matching is a fundamental component of such large-scale systems. Event matching searches a space composed of all subscriptions. As the scale and complexity of a system grow, the efficiency of event matching becomes more critical to system performance. However, most existing methods suffer significant performance degradation when the system has large numbers of both subscriptions and their component constraints. In this paper, we present H-Tree (Hash Tree), a highly efficient index structure for event matching. H-Tree is in essence a hash table that combines hash lists and hash chaining. A hash list is built on an indexed attribute by means of novel overlapping divisions of the attribute's value domain, which makes space consumption more efficient. Multiple hash lists are then combined into a hash tree. The basic idea behind H-Tree is that matching efficiency improves when the search space is substantially reduced by pruning most of the subscriptions that do not match. We have implemented H-Tree and conducted extensive experiments in different settings. Experimental results demonstrate that H-Tree outperforms its counterparts by a large margin. In particular, matching is faster by three orders of magnitude than its counterparts when the numbers of both subscriptions and their component constraints are huge.
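The pruning idea can be sketched in Python under several simplifying assumptions that are not taken from the paper: attribute values are non-negative integers, every subscription constrains exactly the indexed attributes with closed intervals, and the chained hash lists are flattened into a single dictionary keyed by a tuple of cell ids. A constraint is filed under the cell that contains its center, so each cell effectively reaches half a cell width into its neighbours, which stands in for the overlapping divisions. Constraints wider than one cell go to an overflow list that is always checked. The names `CellHashIndex`, `CELL`, and `ATTRS` are hypothetical.

```python
from collections import defaultdict
from itertools import product

CELL = 100           # assumed cell width of each attribute's value domain
ATTRS = ("x", "y")   # assumed indexed attributes


class CellHashIndex:
    def __init__(self, attrs=ATTRS, cell=CELL):
        self.attrs = attrs
        self.cell = cell
        self.subs = []                    # sid -> {attr: (lo, hi)}
        self.buckets = defaultdict(list)  # tuple of cell ids -> [sid]
        self.overflow = []                # sids with a too-wide constraint

    def add(self, constraints):
        sid = len(self.subs)
        self.subs.append(constraints)
        key = []
        for attr in self.attrs:
            lo, hi = constraints[attr]
            if hi - lo > self.cell:
                # Too wide for a single cell: keep it outside the buckets.
                self.overflow.append(sid)
                return sid
            # File the constraint under the cell holding its center.
            key.append(((lo + hi) // 2) // self.cell)
        self.buckets[tuple(key)].append(sid)
        return sid

    def match(self, event):
        # A narrow interval containing the value has its center in the
        # value's own cell or in one of the two adjacent cells.
        per_attr = []
        for attr in self.attrs:
            j = event[attr] // self.cell
            per_attr.append((j - 1, j, j + 1))
        candidates = list(self.overflow)
        for key in product(*per_attr):
            candidates.extend(self.buckets.get(key, ()))
        # Verify the surviving candidates exactly.
        return sorted(
            sid for sid in candidates
            if all(lo <= event[a] <= hi
                   for a, (lo, hi) in self.subs[sid].items())
        )
```

With two indexed attributes, a lookup visits at most nine buckets plus the overflow list, regardless of how many subscriptions are stored elsewhere. Only the subscriptions in those buckets are verified against the event.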