Abstract: Stream applications have gained significant popularity over the last years, which has led to the development of specialized stream engines. These systems are designed from scratch with a different philosophy than today's database engines in order to cope with the requirements of stream applications. However, this means that they lack the power and sophisticated techniques of a full-fledged database system that exploits techniques and algorithms accumulated over many years of database research. In this paper, we take the op…
“…The potential of database systems in efficient processing of continuous queries over streaming data has been explored in [19,23,10]. Authors in [19] showed that the performance of stream processing in a standard relational database can be improved significantly by appropriate tuning and use of existing features like indices and temporary tables.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Authors in [19] showed that the performance of stream processing in a standard relational database can be improved significantly by appropriate tuning and use of existing features like indices and temporary tables. The work of [23,10] presented how to extend the MonetDB and PostgreSQL database systems to support stream processing, respectively. The major motivation of these works is that, by building SPEs separately from database systems, the opportunity of leveraging the existing sophisticated algorithms and techniques of databases is lost.…”
Over the last few years, the increasing demand for processing streaming data with high throughput and low latency has led to the development of specialized stream processing engines (SPEs). Although existing SPEs show high performance in evaluating stateless operations and stateful operations with small windows, their performance degrades significantly when calculating exact answers for complex aggregate queries with very large windows. Examples include correlated aggregations and the computation of quantiles and order statistics. Meanwhile, modern database systems have demonstrated the ability to process complex analytical tasks efficiently over very large datasets, using technologies such as vertical storage and vectorized query execution. This suggests the feasibility of leveraging database systems to assist SPEs in processing complex aggregate queries and thereby reduce their evaluation latency. The goal of this thesis is to investigate the potential of combining database systems with SPEs in the context of stream processing so as to improve the overall query evaluation performance. To this end, the following two major topics will be addressed in this thesis: (1) dynamic migration of complex aggregate operations between the SPE and the database in response to varying system load and (2) efficient evaluation of continuous queries over streaming data that is migrated to the database.
“…In the DataCell project [14] we take a different route by designing a stream engine on top of an existing relational database kernel [15]. This includes reuse of both its storage/execution engine and its optimizer infrastructure.…”
“…Following this policy, Gigascope regularly generates punctuations ("heartbeats") in order to unblock operators like aggregation in query plans. Recently, a stream engine built on top of a column-oriented DBMS was presented in [20], proposing transitory storage of incoming tuples in suitable system tables called baskets. Items are then propagated to operators after filtering by means of predicate windows, which apply simple selection conditions on the basket data, even irrespective of their timestamp order.…”
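The basket mechanism described in the statement above can be illustrated with a minimal Python sketch. It is not the implementation of the cited engine: the `Basket` class, its `take` method, and the row layout (`ts`, `value`) are hypothetical names chosen for illustration. It only shows the idea that incoming tuples are held in a transitory table and handed to an operator once they satisfy a simple selection predicate, regardless of timestamp order.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class Basket:
    """Hypothetical transitory holding table for incoming stream tuples."""

    rows: List[Row] = field(default_factory=list)

    def append(self, row: Row) -> None:
        # Incoming tuples are stored in arrival order.
        self.rows.append(row)

    def take(self, predicate: Callable[[Row], bool]) -> List[Row]:
        """Remove and return every row that satisfies the predicate window.

        Rows that do not qualify stay in the basket for later consumers.
        """
        selected: List[Row] = []
        kept: List[Row] = []
        for row in self.rows:
            (selected if predicate(row) else kept).append(row)
        self.rows = kept
        return selected


# Tuples arrive out of timestamp order (assumed sample data).
basket = Basket()
for ts, value in [(3, 10), (1, 25), (2, 7), (4, 40)]:
    basket.append({"ts": ts, "value": value})

# Predicate window: a plain selection condition on the basket contents,
# applied without regard to timestamp order.
batch = basket.take(lambda r: r["value"] >= 10)

# A downstream operator (here a simple sum) consumes the selected batch.
total = sum(r["value"] for r in batch)
```

In this sketch the selected tuples are removed from the basket, so a later predicate window sees only what remains; whether the cited system consumes tuples in this way is an assumption of the example.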