Abstract. XML message filtering systems sift through real-time messages to support business data mining and reporting. An XML message filtering system needs to (a) process registered filter predicates on multiple distributed real-time streams and (b) match and validate the filter results against local data to identify the relevant data that can be used for higher-level processing. Although efficient real-time filtering schemes exist, the matching phase, in which filter results are matched against local data to select those relevant to the particular task, remains expensive because it requires costly join operations. In this paper, we present an efficient middleware (FMware) for filtering and matching XML messages against locally available data. The proposed operator relies on a novel cluster-domain matching scheme to reduce the cost of the process. We analytically study the cost of the proposed middleware and experimentally show that it adaptively reduces the number of local data accesses and provides large savings in matching time with respect to cluster-unaware matching.
As part of our iCare efforts, we are developing mechanisms that provide guidance to individuals who are blind in diverse contexts. A fundamental challenge in this context is to represent and index experiences that can be used to provide recommendations. In this paper, we address the challenge of indexing experiences so that they can be retrieved based on their popularity. In particular, we model experiences as sequences of propositional statements from a particular domain (daily life, web browsing, etc.). We then show that knowledge about domain constraints (such as commutativity between possible statements) needs to be used for clustering and indexing experiences for popularity search. We also highlight that don't cares (propositional statements not relevant to the user's query) make the task of popularity indexing challenging. Thus, we develop a canonical-sequence-based approach that significantly reduces the experience sequence retrieval time in the presence of commutations. We introduce rule compression, which helps achieve further reductions in the retrieval cost. We propose a novel two-level index structure, EXPdex, to efficiently answer wildcard (don't care) queries. We compare the proposed approach analytically and experimentally to a don't-care-unaware solution, which does not take wildcards in queries into account while constructing the popularity index. Experiments show that the proposed approach provides large savings in retrieval time when commutations between the elements of sequences are allowed.
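To illustrate why a canonical sequence helps with popularity indexing under commutativity, here is a minimal sketch. It assumes that commutativity rules are given as unordered pairs of statements and that the canonical form is the lexicographically smallest sequence reachable by swapping adjacent commuting statements. Both assumptions are ours, as are the names `commutes`, `canonical`, and `popularity_index`; the abstract does not specify how the canonical sequence is actually defined.

```python
from collections import Counter


def commutes(a, b, rules):
    """Two distinct statements commute if the pair is listed in `rules`."""
    return a != b and frozenset((a, b)) in rules


def canonical(seq, rules):
    """Return the lexicographically smallest sequence equivalent to `seq`.

    A statement may be moved to the front only if it commutes with every
    statement currently ahead of it. At each step we pick the smallest
    such statement (hypothetical choice of canonical form).
    """
    remaining = list(seq)
    out = []
    while remaining:
        best = 0  # the first statement is always movable
        for i, s in enumerate(remaining):
            movable = all(commutes(s, t, rules) for t in remaining[:i])
            if movable and s < remaining[best]:
                best = i
        out.append(remaining.pop(best))
    return tuple(out)


def popularity_index(experiences, rules):
    """Count experiences per canonical form, so equivalent ones share a key."""
    return Counter(canonical(e, rules) for e in experiences)


# Illustrative data: "a" and "b" commute, "c" commutes with nothing.
rules = {frozenset(("a", "b"))}
experiences = [["a", "b", "c"], ["b", "a", "c"], ["c", "a", "b"]]
index = popularity_index(experiences, rules)
```

With these inputs, the first two experiences collapse to the same key, so a popularity lookup needs one probe instead of one per permutation. The third stays separate because `c` does not commute with the statements that follow it.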