In this paper we focus on Complex Event Processing (CEP) applications where the data is generated by sites that are geographically dispersed across large regions. This geographic distribution, combined with the size of the collected data, imposes severe communication and computation challenges. To address these challenges, we propose a novel approach for geographically distributed CEP, which combines algorithmic and systems contributions. At an algorithmic level, our work combines an in-network processing approach, which pushes parts of the processing (i.e., CEP operators) towards the sources of their input events, with a push-pull paradigm, in order to reduce the number of communicated events. We present optimal (but computationally expensive) solutions that seek to minimize the maximum bandwidth consumption given input latency constraints for detecting events, as well as efficient greedy and heuristic algorithmic variations for our problem. At a systems level, we explain how existing CEP engines can support our algorithms with minimal modifications. Our experimental evaluation, using mainly real data sets and network topologies, demonstrates that the power of our techniques lies in the combination of in-network processing with the push-pull paradigm, thus allowing our algorithms to significantly outperform related centralized push-pull or conventional in-network processing approaches.
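To make the communication saving behind a push-pull paradigm concrete, the following is a minimal toy sketch, not the paper's algorithms. It assumes a hypothetical two-site setting in which one site emits rare events, another emits frequent events, and a coordinator looks for a frequent event within `window` time units of a rare one. The function names, the message-counting model (one message per event or pull request), and the offline timestamp lists are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def push_only_cost(rare_ts, frequent_ts):
    """Count messages when every event is pushed to the coordinator."""
    return len(rare_ts) + len(frequent_ts)


def push_pull_cost(rare_ts, frequent_ts, window):
    """Count messages when rare events are pushed and frequent ones pulled.

    Hypothetical cost model: each rare event costs one push message plus
    one pull request, and each frequent event returned by a pull costs one
    message. Returns (message_count, matches).
    """
    frequent_ts = sorted(frequent_ts)
    messages = 0
    matches = []
    for t in sorted(rare_ts):
        messages += 1  # rare event pushed to the coordinator
        messages += 1  # pull request sent to the frequent-event site
        lo = bisect_left(frequent_ts, t - window)
        hi = bisect_right(frequent_ts, t + window)
        pulled = frequent_ts[lo:hi]  # only events inside the window travel
        messages += len(pulled)
        matches.extend((t, f) for f in pulled)
    return messages, matches


if __name__ == "__main__":
    rare = [10, 50]
    frequent = list(range(100))
    print("push-only messages:", push_only_cost(rare, frequent))
    cost, found = push_pull_cost(rare, frequent, window=2)
    print("push-pull messages:", cost, "matches:", len(found))
```

In this toy model the saving grows with the gap between the two event rates. A real deployment would also have to account for the extra latency of the pull round trip, which is the kind of trade-off the latency constraints in the abstract refer to.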
Many Big Data technologies were built to enable the processing of human-generated data, setting aside the enormous amount of data generated from Machine-to-Machine (M2M) interactions. M2M interactions create real-time data streams that are much more structured, often in the form of series of event occurrences. In this paper, we provide an overview of the main research issues confronted by existing Complex Event Processing (CEP) techniques, as a starting point for Big Data applications that enable the monitoring of complex event occurrences in M2M interactions.
In this demo, we present FERARI, a prototype that enables real-time Complex Event Processing (CEP) for large-volume event data streams over distributed topologies. Our prototype constitutes, to our knowledge, the first complete, multi-cloud-based, end-to-end CEP solution incorporating: (a) a user-friendly, web-based query authoring tool, (b) a powerful CEP engine implemented on top of a streaming cloud platform, (c) a CEP optimizer that chooses the best query execution plan with respect to low latency and/or reduced inter-cloud communication burden, and (d) a query analytics dashboard encompassing graph and map visualization tools to give final stakeholders a holistic picture of the detected complex events. As a proof of concept, we apply FERARI to enable mobile fraud detection over real, properly anonymized telecommunication data from the T-Hrvatski Telekom network in Croatia.