Dean De Leo scite author profile

Teseo and the Analysis of Structural Dynamic Graphs

2021

Proc. VLDB Endow.

We present Teseo, a new system for the storage and analysis of dynamic structural graphs in main memory, with transactional support. Teseo introduces a novel design based on sparse arrays (large arrays interleaved with gaps) and a fat tree, where the graph is ultimately stored. Our design contrasts with early systems for the analysis of dynamic graphs, which often lack transactional support and are anchored to a vertex table as a primary index. We claim that the vertex table implies several constraints, often neglected, that can impair the generality, robustness, and extensibility of these systems. We compare Teseo with other dynamic graph systems, showing high resilience to workload and input changes while achieving comparable, if not superior, update throughput and raw scan latency.

Packed Memory Arrays - Rewired

2019

The physical memory layout of a tree-based index structure deteriorates over time as it sustains more updates, so that scans become nonsequential at the physical level, and therefore slower. Packed Memory Arrays (PMAs) prevent this by managing all data in a sequential sparse array. PMAs have been studied mostly theoretically but suffer from practical problems, as we show in this paper. We study and fix these problems, resulting in an improved data structure: the Rewired Memory Array (RMA). We compare RMA with the main previous PMA implementations as well as state-of-the-art tree index structures, and show on a wide variety of data and query distributions that RMA can reach competitive update and point lookup performance while always providing superior scan performance, close to that of dense column scans.
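
To make the idea of a "sequential sparse array" concrete, here is a minimal sketch of a sorted array interleaved with gaps. It is a toy illustration, not the RMA or any published PMA implementation: the class name `SparseSortedArray`, the `max_density` threshold, the linear position search, and the simple "double and respread" rebalancing policy are all assumptions made for brevity.

```python
from typing import Iterator, List, Optional


class SparseSortedArray:
    """Toy sorted array interleaved with gaps (None). Hypothetical sketch."""

    def __init__(self, capacity: int = 8, max_density: float = 0.75) -> None:
        self.slots: List[Optional[int]] = [None] * capacity
        self.count = 0
        self.max_density = max_density  # assumed threshold, not from the paper

    def scan(self) -> Iterator[int]:
        """Sequential scan: one pass over the array, skipping the gaps."""
        return (v for v in self.slots if v is not None)

    def _rebalance(self, capacity: int) -> None:
        """Spread all elements evenly over `capacity` slots."""
        values = list(self.scan())
        self.slots = [None] * capacity
        for i, v in enumerate(values):
            self.slots[i * capacity // len(values)] = v

    def insert(self, key: int) -> None:
        # Grow and respread when one more element would exceed the density.
        if self.count + 1 > self.max_density * len(self.slots):
            self._rebalance(2 * len(self.slots))
        n = len(self.slots)
        # First occupied slot holding a value greater than key (n if none).
        # A real PMA would use binary search or an index instead.
        pos = next((i for i, v in enumerate(self.slots)
                    if v is not None and v > key), n)
        # Prefer the nearest gap to the left of pos.
        left = next((i for i in range(pos - 1, -1, -1)
                     if self.slots[i] is None), None)
        if left is not None:
            # Shift slots[left+1:pos] one step left, freeing slot pos-1.
            self.slots[left:pos - 1] = self.slots[left + 1:pos]
            self.slots[pos - 1] = key
        else:
            # No gap on the left, so one must exist to the right of pos.
            right = next(i for i in range(pos, n) if self.slots[i] is None)
            # Shift slots[pos:right] one step right, freeing slot pos.
            self.slots[pos + 1:right + 1] = self.slots[pos:right]
            self.slots[pos] = key
        self.count += 1


pma = SparseSortedArray()
for k in [5, 1, 9, 3, 7, 2, 8, 6, 4, 0]:
    pma.insert(k)
in_order = list(pma.scan())
```

Because the gaps absorb most insertions locally, only a few neighbouring elements move per update, and a scan remains a single pass over contiguous memory.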

Fast Concurrent Reads and Updates with PMAs

2019

Fast navigation through graphs with O(1) cost relies on compact storage of graphs in dense arrays, but is not efficiently updatable. In this paper we propose storage of updatable graphs in Packed Memory Arrays (PMAs), and tackle the problem of supporting concurrent updates and reads. So far, there has been no work on concurrently updating PMAs. We propose two novel techniques to perform concurrent scans and updates in the data structure and evaluate our implementation against other existing alternatives, showing that PMAs can in some cases be on par with data structures optimised for writes, while providing at least one order of magnitude higher throughput for reads.

Extending SQL for Computing Shortest Paths

2017

Reachability and shortest paths are two of the most common queries performed on graphs. While graph frameworks and property graph databases provide extensive and convenient built-in support for these operations, they are still both clunky and inefficient to perform on standard SQL DBMSs. In this paper, we present an extension to the standard SQL language to compute both reachability predicates and many-to-many shortest path queries. We first describe a methodology to represent a directed graph starting from virtual table expressions. Second, we introduce a new type of operator to compute shortest paths on the given graph. Our semantics abide by the rules of operating with table expressions, ensuring that the closure property of the relational algebra is retained. Finally, we developed a prototype implementation of our extension on top of MonetDB, an existing open source relational DBMS. Our preliminary results show that dynamically building our representation of the underlying graph still dominates the query time. Currently, this cost can only be amortized when executing multiple shortest paths on the same graph.
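
The amortization argument can be sketched outside SQL. The Python below is only an illustration under assumptions: it does not reproduce the paper's SQL syntax or its MonetDB operator, the function names are hypothetical, the edge table is modelled as a list of `(src, dst)` rows, and paths are unweighted (hop counts via breadth-first search). The point it shows is that the graph is built once from the rows and then reused by every source/destination pair.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, Iterable, List, Optional, Tuple

Edge = Tuple[Hashable, Hashable]


def build_graph(rows: Iterable[Edge]) -> Dict[Hashable, List[Hashable]]:
    """Turn (src, dst) rows into adjacency lists. This is the one-off cost."""
    adjacency: Dict[Hashable, List[Hashable]] = defaultdict(list)
    for src, dst in rows:
        adjacency[src].append(dst)
    return adjacency


def bfs_distances(adjacency: Dict[Hashable, List[Hashable]],
                  source: Hashable) -> Dict[Hashable, int]:
    """Hop distance from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_paths(rows: Iterable[Edge],
                   pairs: Iterable[Edge]) -> Dict[Edge, Optional[int]]:
    """Many-to-many query: None means the destination is unreachable."""
    adjacency = build_graph(rows)  # built once, shared by all pairs
    per_source: Dict[Hashable, Dict[Hashable, int]] = {}
    result: Dict[Edge, Optional[int]] = {}
    for s, t in pairs:
        if s not in per_source:
            per_source[s] = bfs_distances(adjacency, s)
        result[(s, t)] = per_source[s].get(t)
    return result


edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
answers = shortest_paths(edges, [(1, 4), (4, 1), (2, 2)])
```

A reachability predicate falls out of the same result: a pair is reachable exactly when its distance is not `None`.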

Errata for "Teseo and the Analysis of Structural Dynamic Graphs"

Fuchs

2021

Proc. VLDB Endow.

In our paper [4], we experimentally evaluated our work, Teseo, together with five other systems under the LDBC Graphalytics benchmark [6]. We developed and publicly released [2] an ad-hoc driver for the purpose. Since the paper was published, a bug [1] has been found in the driver. Due to this bug, the completion times reported for Graphalytics exceed their actual values by 1 second or slightly more. This issue affects the results of Table 2, Figure 8 and Table 3 reported in our paper [4]. Still, because the bug affected all evaluated systems equally and concerns only the measurements, most of the comparisons and the general conclusions in the paper still hold.