Data stream processing applications have a long running nature (24hr/7d) with workload conditions that may exhibit wide variations at run-time. Elasticity is the term coined to describe the capability of applications to change dynamically their resource usage in response to workload fluctuations. This paper focuses on strategies for elastic data stream processing targeting multicore systems. The key idea is to exploit Model Predictive Control, a control-theoretic method that takes into account the system behavior over a future time horizon in order to decide the best reconfiguration to execute. We design a set of energy-aware proactive strategies, optimized for throughput and latency QoS requirements, which regulate the number of used cores and the CPU frequency through the Dynamic Voltage and Frequency Scaling (DVFS) support offered by modern multicore CPUs. We evaluate our strategies in a high-frequency trading application fed by synthetic and real-world workload traces. We introduce specific properties to effectively compare different elastic approaches, and the results show that our strategies are able to achieve the best outcome
High-level parallel programming is an active research topic aimed at promoting parallel programming methodologies that provide the programmer with high-level abstractions to develop complex parallel software with reduced time to solution. Pattern-based parallel programming is based on a set of composable and customizable parallel patterns used as basic building blocks in parallel applications. In recent years, a considerable effort has been made in empowering this programming model with features able to overcome shortcomings of early approaches concerning flexibility and performance. In this article, we demonstrate that the approach is flexible and efficient enough by applying it on 12 out of 13 PARSEC applications. Our analysis, conducted on three different multicore architectures, demonstrates that pattern-based parallel programming has reached a good level of maturity, providing comparable results in terms of performance with respect to both other parallel programming methodologies based on pragma-based annotations (i.e., OpenMP and OmpSs) and native implementations (i.e., Pthreads). Regarding the programming effort, we also demonstrate a considerable reduction in lines of code and code churn compared to Pthreads and comparable results with respect to other existing implementations. We extended this work by (i) modeling and implementing an additional seven applications by using parallel patterns (there were five in Danelutto et al. [21]), (ii) implementing four of these applications with another framework (SkePU), (iii) comparing our approach to the OmpSs programming model, (iv) adding other programmability metrics to our analysis, and (v) testing our work on two additional and completely different architectures.
The topic of Data Stream Processing is a recent and highly active research area dealing with the in-memory, tuple-by-tuple analysis of streaming data. Continuous queries typically consume huge volumes of data received at a great velocity. Solutions that persistently store all the input tuples and then perform off-line computation are impractical. Rather, queries must be executed continuously as data cross the streams. The goal of this paper is to present parallel patterns for window-based stateful operators, which are the most representative class of stateful data stream operators. Parallel patterns are presented “à la” Algorithmic Skeleton, by explaining the rationale of each pattern, the preconditions to safely apply it, and the outcome in terms of throughput, latency and memory consumption. The patterns have been implemented in the framework targeting off-the-shelf multicores. To the best of our knowledge this is the first time that a similar effort to merge the Data Stream Processing domain and the field of Structured Parallelism has been made
Adaptiveness in distributed parallel applications is a key feature to provide satisfactory performance results in the face of unexpected events such as workload variations and time-varying user requirements. The adaptation process is based on the ability to change specific characteristics of parallel components (e.g., their parallelism degree) and to guarantee that such modifications of the application configuration are effective and durable. Reconfigurations often incur a cost on the execution (a performance overhead and/or an economic cost). For this reason advanced adaptation strategies have become of paramount importance. Effective strategies must achieve properties like control optimality (making decisions that optimize the global application QoS), reconfiguration stability expressed in terms of the average time between consecutive reconfigurations of the same component, and optimizing the reconfiguration amplitude (number of allocated/deallocated resources). To control such parameters, in this article we propose a method based on a Cooperative Model-based Predictive Control approach in which application controllers cooperate to make optimal reconfigurations and taking account of the durability and amplitude of their control decisions. The effectiveness and the feasibility of the methodology is demonstrated through experiments performed in a simulation environment and by comparing it with other existing techniques
Distributed data stream processing applications are structured as graphs of interconnected modules able to ingest high-speed data and to transform them in order to generate results of interest. Elasticity is one of the most appealing features of stream processing applications. It makes it possible to scale up/down the allocated computing resources on demand in response to fluctuations of the workload. On clouds, this represents a necessary feature to keep the operating cost at affordable levels while accommodating user-defined QoS requirements. In this article, we study this problem from a game-theoretic perspective. The control logic driving elasticity is distributed among local control agents capable of choosing the right amount of resources to use by each module. In a first step, we model the problem as a noncooperative game in which agents pursue their self-interest. We identify the Nash equilibria and we design a distributed procedure to reach the best equilibrium in the Pareto sense. As a second step, we extend the noncooperative formulation with a decentralized incentive-based mechanism in order to promote cooperation by moving the agreement point closer to the system optimum. Simulations confirm the results of our theoretical analysis and the quality of our strategies
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.