Abstract: Modern real-time business analytics consist of heterogeneous workloads (e.g., database queries, graph processing, and machine learning). These analytic applications need programming environments that can capture all aspects of the constituent workloads (including the data models they work on and the movement of data across processing engines). Polystore systems suit such applications; however, these systems currently execute on CPUs and the slowdown of Moore's Law means they cannot meet the performance and efficiency r…
“…On the other hand, Polyphony-DB [Vogt et al 2018] conceptualizes a self-adaptive system with data replication and partitioning. [Singhal et al 2019] also presents the building blocks of Polystore++, which envisions a highly performance-oriented polystore solution.…”
Multi-model, federated and polystore architectures allow for querying data from different sources through a unified interface, providing interoperability for databases. However, support for blockchain-based databases remains scarce. MOON is a middleware designed to enable cross-model querying of data in relational and blockchain databases through standard SQL syntax. This paper aims to promote the interoperability of blockchain-based and relational database systems through a new approach, called Inter-MOON. Through experimentation, Inter-MOON was found to offer near-total support for SQL DML query syntax, be up to 10x faster than MOON, and show comparable performance to similar tools.
“…It appears that BigDAWG [38] is the first polystore system in the literature; nevertheless, it is not an isolated case and the work on polystores is going on. For example, [39] reports about Polystore++, a polystore system for analytic applications. However, the definition of polystore systems requires the development of techniques for building query plans (as in [40]), as well as to define extensions of the classical relational algebra (as in [41]).…”
Internet technology and mobile technology have enabled producing and diffusing massive data sets concerning almost every aspect of day-by-day life. Remarkable examples are social media and apps for volunteered information production, as well as Open Data portals on which public administrations publish authoritative and (often) geo-referenced data sets. In this context, JSON has become the most popular standard for representing and exchanging possibly geo-referenced data sets over the Internet. Analysts, wishing to manage, integrate and cross-analyze such data sets, need a framework that allows them to access possibly remote storage systems for JSON data sets, to retrieve and query data sets by means of a unique query language (independent of the specific storage technology), by exploiting possibly-remote computational resources (such as cloud servers), comfortably working on their PC in their office, more or less unaware of real location of resources. In this paper, we present the current state of the J-CO Framework, a platform-independent and analyst-oriented software framework to manipulate and cross-analyze possibly geo-tagged JSON data sets. The paper presents the general approach behind the J-CO Framework, by illustrating the query language by means of a simple, yet non-trivial, example of geographical cross-analysis. The paper also presents the novel features introduced by the re-engineered version of the execution engine and the most recent components, i.e., the storage service for large single JSON documents and the user interface that allows analysts to comfortably share data sets and computational resources with other analysts possibly working in different places of the Earth globe. Finally, the paper reports the results of an experimental campaign, which show that the execution engine actually performs in a more than satisfactory way, proving that our framework can be actually used by analysts to process JSON data sets.
Multistores are data management systems that enable query processing across different and heterogeneous databases; besides the distribution of data, complexity factors like schema heterogeneity and data replication must be resolved through integration and data fusion activities. Our multistore solution relies on a dataspace to provide the user with an integrated view of the available data and enables the formulation and execution of GPSJ queries. In this paper, we propose a technique to optimize the execution of GPSJ queries by formulating and evaluating different execution plans on the multistore. In particular, we outline different strategies to carry out joins and data fusion by relying on different schema representations; then, a self-learning black-box cost model is used to estimate execution times and select the most efficient plan. The experiments assess the effectiveness of the cost model in choosing the best execution plan for the given queries and exploit multiple multistore benchmarks to investigate the factors that influence the performance of different plans.