“…It stores only the input facts (input relations) and the conflict sets, and does not store partial pattern matches. Another improvement of Rete is the LEAPS algorithm [31], which aims to provide better space-time complexity. Rete itself has many improved versions (e.g.…”
Section: Incremental Approaches (mentioning)
confidence: 99%
“…Some authors of this paper designed and implemented a distributed Rete-based incremental graph query engine, IncQuery-D [31], which relies on the optimizer of EMF-IncQuery and is able to scale for large models.…”
Keywords: graph queries, relational algebra, query optimization
Introduction
The key components of Big Data are often defined as variety, velocity and volume [28] of data. Applications operating on continuously changing graphs are a prime example: the semi-structured graph-like nature introduces a high variety, changes happen at high velocity, and datasets are often high-volume. Such applications include fraud detection in financial transactions [27], validation of engineering models [3], and static analysis of source code repositories [35]. These use cases provide a set of complex queries that need to be evaluated continuously on each change of the underlying graph.

Traditional approaches need to reevaluate each query upon each change, which often takes minutes on a large dataset. In contrast, incremental query evaluation caches interim results, hence it only requires reevaluation on a small fragment of the dataset impacted by the change. This leads to significant speedup for large and continuously changing data. Although several approaches exist for incremental query evaluation [9,20] in the context of expert systems, incremental query evaluation is not in widespread use in graph databases.

In order to predict query performance at runtime, relational databases synthesize and evaluate different query plans which impose a certain ordering on relational algebraic operations prescribed by the query. Optimizing query plans is a challenging task, since a wide variety of query plans may exist even for simple queries with different costs. Database engines use heuristics-based optimization techniques and evaluate a cost function for the different query plans [10].

Query plans have been adapted for graph query engines using a local-search based query evaluation strategy where it is called the search plan. Optimization techniques may exploit the type and multiplicity information defined in the graph schema (or metamodel) [29,22] or rely upon runtime statistics of the instance graph [11,38,39].

In case of incremental graph query engines, the structure and the content of caches have the most significant impact on query performance. Therefore, optimization is directed to reduce execution time and memory consumption imposed by a complex network of caches [37].
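The contrast between full reevaluation and incremental evaluation described in this introduction can be illustrated with a minimal sketch. It is not the implementation of IncQuery-D or any cited engine: the class `IncrementalJoin`, the function `full_join`, and the account names are hypothetical, and the sketch assumes set semantics and a single two-hop pattern (an edge a→b followed by an edge b→c). It caches both inputs indexed by the join key, so each change only touches the matches that share that key.

```python
from collections import defaultdict


def full_join(left_edges, right_edges):
    """Baseline: recompute every (a, b, c) match from scratch."""
    return {(a, b, c)
            for (a, b) in left_edges
            for (b2, c) in right_edges
            if b == b2}


class IncrementalJoin:
    """Hypothetical Rete-style join node for R(a, b) joined with S(b, c) on b."""

    def __init__(self):
        self.left = defaultdict(set)    # cache: b -> set of a
        self.right = defaultdict(set)   # cache: b -> set of c
        self.results = set()            # cached matches (a, b, c)

    def insert_left(self, a, b):
        """Add edge (a, b) to R; return only the newly created matches."""
        if a in self.left[b]:
            return set()
        self.left[b].add(a)
        delta = {(a, b, c) for c in self.right[b]}
        self.results |= delta
        return delta

    def insert_right(self, b, c):
        """Add edge (b, c) to S; return only the newly created matches."""
        if c in self.right[b]:
            return set()
        self.right[b].add(c)
        delta = {(a, b, c) for a in self.left[b]}
        self.results |= delta
        return delta

    def delete_left(self, a, b):
        """Remove edge (a, b) from R; return the matches that disappear."""
        if a not in self.left.get(b, ()):
            return set()
        self.left[b].discard(a)
        delta = {(a, b, c) for c in self.right.get(b, ())}
        self.results -= delta
        return delta


# Usage: each update returns the change in the result set, not the whole set.
join = IncrementalJoin()
join.insert_left("acct1", "acct2")
new_matches = join.insert_right("acct2", "acct3")
```

The price of this speedup is the memory held by the `left`, `right`, and `results` caches, which is why optimization for incremental engines targets the structure and content of such caches.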
“…The IncQuery-D [38] system is an incremental graph query engine, built on top of the components of the Viatra Query framework [43] (later known as EMF-IncQuery [41]). IncQuery-D reused the query parser and compiler of EMF-IncQuery, but used a different query engine, tailored for scalable distributed query evaluation and operating on RDF data sets.…”
Section: Query Compilation In Graph Transformation Systems (mentioning)
Abstract. Graph database systems are increasingly adapted for storing and processing heterogeneous network-like datasets. Many challenging applications with near real-time requirements (such as financial fraud detection, on-the-fly model validation and root cause analysis) can be formalised as graph problems and tackled with graph databases efficiently. However, as no standard graph query language has yet emerged, users are subjected to the possibility of vendor lock-in. The openCypher group aims to define an open specification for a declarative graph query language. However, creating an openCypher-compatible query engine requires significant research and engineering efforts. Meanwhile, model-driven language workbenches support the creation of domain-specific languages by providing high-level tools to create parsers, editors and compilers. In this paper, we present an approach to build a compiler and optimizer for openCypher using model-driven technologies, which allows developers to define declarative optimization rules.
“…However, none of this work addresses distribution or asynchronicity. To address the scalability of queries, Szárnyas et al, [46] present an adaption of incremental graph search techniques, like EMF-IncQuery. They propose an architecture for distributed and incremental queries.…”
Abstract. The models@run.time paradigm promotes the use of models during the execution of cyber-physical systems to represent their context and to reason about their runtime behaviour. However, current modeling techniques cannot cope simultaneously with the large-scale, distributed, and constantly changing nature of these systems. In this paper, we introduce a distributed models@run.time approach, combining ideas from reactive programming, peer-to-peer distribution, and large-scale models@run.time. We define distributed models as observable streams of chunks that are exchanged between nodes in a peer-to-peer manner. A lazy loading strategy allows transparent access to the complete virtual model from every node, although chunks are actually distributed across nodes. Observers and automatic reloading of chunks enable a reactive programming style. We integrated our approach into the Kevoree Modeling Framework and demonstrate that it enables frequently changing, reactive distributed models that can scale to millions of elements and several thousand nodes.