Memory-efficient hash joins

Barber, Ronald; Lohman, Guy M.; Pandis, Ippokratis; Raman, V.; Sidle, Richard; Attaluri, Gopi; Chainani, Naresh; Lightstone, Sam; Sharpe, Dave

doi:10.14778/2735496.2735499

Cited by 67 publications

(35 citation statements)

References 20 publications

“…Hash table lookup throughput is the main bottleneck of the join operation, and its performance strictly depends on the number of dependent memory accesses (i.e., number of pointers chased) required to locate an item. A lookup in the hash table can result in an arbitrary number of memory accesses as state-of-the-art hash tables offer a tradeoff between performance (i.e., number of chained memory accesses) and space efficiency [4,6,7]. Moreover, when the build relation keys follow a skewed value distribution, hash collisions are unavoidable as some build keys are identical but carry different payloads.…”

Section: Hash Tables (mentioning)

confidence: 99%

Asynchronous memory access chaining

Kocberber, Falsafi, Grot 2015

Proc. VLDB Endow.

In-memory databases rely on pointer-intensive data structures to quickly locate data in memory. A single lookup operation in such data structures often exhibits long-latency memory stalls due to dependent pointer dereferences. Hiding the memory latency by launching additional memory accesses for other lookups is an effective way of improving performance of pointer-chasing codes (e.g., hash table probes, tree traversals). The ability to exploit such inter-lookup parallelism is beyond the reach of modern out-of-order cores due to the limited size of their instruction window. Instead, recent work has proposed software prefetching techniques that exploit inter-lookup parallelism by arranging a set of independent lookups into a group or a pipeline, and navigate their respective pointer chains in a synchronized fashion. While these techniques work well for highly regular access patterns, they break down in the face of irregularity across lookups. Such irregularity includes variable-length pointer chains, early exit, and read/write dependencies. This work introduces Asynchronous Memory Access Chaining (AMAC), a new approach for exploiting inter-lookup parallelism to hide the memory access latency. AMAC achieves high dynamism in dealing with irregularity across lookups by maintaining the state of each lookup separately from that of other lookups. This feature enables AMAC to initiate a new lookup as soon as any of the in-flight lookups complete. In contrast, the static arrangement of lookups into a group or pipeline in existing techniques precludes such adaptivity. Our results show that AMAC matches or outperforms state-of-the-art prefetching techniques on regular access patterns, while delivering up to 2.3x higher performance under irregular data structure lookups. AMAC fully utilizes the available microarchitectural resources, generating the maximum number of memory accesses allowed by hardware in both single- and multi-threaded execution modes.
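The control flow described in this abstract can be illustrated with a small Python sketch. It is not the authors' implementation: Python cannot issue hardware prefetches, so the sketch only shows how per-lookup state kept in independent slots lets a finished lookup be replaced immediately, even when chain lengths differ. The names `Node`, `build_table`, `amac_style_probe`, and the `in_flight` parameter are hypothetical, and build keys are assumed to be unique.

```python
from collections import deque


class Node:
    """One entry in a bucket's linked chain."""

    def __init__(self, key, payload, next_node=None):
        self.key = key
        self.payload = payload
        self.next = next_node


def build_table(pairs, num_buckets):
    """Build a chained hash table; each bucket heads a linked list."""
    buckets = [None] * num_buckets
    for key, payload in pairs:
        b = hash(key) % num_buckets
        buckets[b] = Node(key, payload, buckets[b])  # prepend to the chain
    return buckets


def amac_style_probe(buckets, keys, in_flight=4):
    """Probe every key, advancing several independent lookups in turn.

    Each slot holds [key, current_node]. In a native implementation the
    step that follows node.next would issue a prefetch and then switch
    to another slot; here the switch is only simulated.
    """
    pending = deque(keys)
    results = {}

    def start_lookup():
        key = pending.popleft()
        return [key, buckets[hash(key) % len(buckets)]]

    slots = [start_lookup() for _ in range(min(in_flight, len(pending)))]
    i = 0
    while slots:
        i %= len(slots)
        key, node = slots[i]
        if node is not None and node.key != key:
            slots[i][1] = node.next  # one dependent access, then move on
            i += 1
        else:
            # Lookup finished: either a match or the end of the chain.
            results[key] = node.payload if node is not None else None
            if pending:
                slots[i] = start_lookup()  # refill this slot right away
                i += 1
            else:
                slots.pop(i)  # nothing left to start; retire the slot
    return results


table = build_table([(k, k * 10) for k in range(8)], num_buckets=2)
print(amac_style_probe(table, [3, 100, 0, 7, 5], in_flight=2))
```

With two buckets the chains are deliberately long and of uneven useful length, so lookups finish at different times. A group-based scheme would wait for the slowest lookup in the group, whereas here each slot is refilled independently.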

“…The discovered solution applies a hash-join bloom filter in the HSJOIN (#2). A bloom filter is a space-efficient, probabilistic data structure to test whether an element is a member of a set by hashing the values and performing a bit comparison between them [3]. False positives can occur; however, false negatives never occur.…”

Section: Learning Engine (mentioning)

confidence: 99%
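As a rough illustration of the Bloom filter behaviour quoted above, the following Python sketch sets and tests bits chosen by several hash functions. It is a minimal sketch, not the filter used by any particular database engine: the `BloomFilter` class, its parameters, and the use of SHA-256 with a seed byte to derive the hash functions are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Bit-array membership test with false positives but no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing a seed byte (assumption).
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Hash-join style use: filter probe keys before touching the hash table.
build_keys = [10, 20, 30]
bloom = BloomFilter()
for k in build_keys:
    bloom.add(k)

probe_keys = [5, 10, 15, 20, 25, 30]
survivors = [k for k in probe_keys if bloom.might_contain(k)]
```

Every build key is guaranteed to survive the filter. A non-matching probe key may occasionally survive as well, which is why the join still has to check the hash table afterwards.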

“…As the acronym suggests, the language is able to retrieve data stored in the RDF format. A SPARQL query consists of a set of triple patterns similar to RDF triples. In the query, each of the subject, predicate, and object may be a variable.…”

Section: Matching Engine (mentioning)

confidence: 99%
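The idea of triple patterns with variables can be sketched in a few lines of Python. This is only an illustration of the matching concept, not SPARQL syntax and not the schema of the system described in the citing paper: the operator names, the `type` and `input` predicates, and the helper functions `match_pattern` and `query` are hypothetical. Terms starting with `?` are treated as variables.

```python
def match_pattern(pattern, triple, bindings):
    """Return bindings extended by this match, or None if it fails."""
    new_bindings = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must agree with any value it is already bound to.
            if new_bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_bindings


def query(patterns, triples):
    """Evaluate a list of triple patterns as a conjunction."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical plan graph stored as (subject, predicate, object) triples.
triples = [
    ("op2", "type", "HSJOIN"),
    ("op2", "input", "op3"),
    ("op3", "type", "TBSCAN"),
    ("op4", "type", "NLJOIN"),
]

# Find every hash join and each of its inputs.
patterns = [("?j", "type", "HSJOIN"), ("?j", "input", "?child")]
print(query(patterns, triples))
```

Because `?j` appears in both patterns, the second pattern only matches triples whose subject equals the join operator bound by the first.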

Guided automated learning for query workload re-optimization

et al. 2019

Query optimization is a hallmark of database systems. When a SQL query runs more expensively than is viable or warranted, determination of the performance issues is usually performed manually in consultation with experts through the analysis of the query's execution plan (QEP). However, this is an excessively time-consuming, error-prone, and costly process. GALO is a novel system that automates this process. The tool automatically learns recurring problem patterns in query plans over workloads in an offline learning phase, to build a knowledge base of plan-rewrite remedies. It then uses the knowledge base online to re-optimize queries, often quite drastically. GALO's knowledge base is built on RDF and SPARQL, W3C graph database standards, which are well suited for manipulating and querying over SQL query plans, which are graphs themselves. GALO acts as a third tier of re-optimization, after query rewrite and cost-based optimization, as a query plan rewrite. For generality, the context of knowledge base problem patterns, including table and column names, is abstracted with canonical symbol labels. Since the knowledge base is not tied to the context of supplied QEPs, table and column names are matched automatically during the re-optimization phase. Thus, problem patterns learned over a particular query workload can be applied in other query workloads. GALO's knowledge base is also an invaluable tool for database experts to debug query performance issues by tracking to known issues and solutions, as well as for the development team to refine the optimizer with newly tuned techniques. We demonstrate an experimental study of the effectiveness of our techniques over synthetic TPC-DS and real IBM client query workloads.

“…Wildfire also uses non-partitioned hash joins and Concise Hash Tables, as described in [5]. In addition to column scans, hash joins, and inserts, Wildfire has support for many other evaluators, such as hash-based group by, predicate evaluation, expression evaluation, and updates of in-memory (non-persistent) indexes.…”

Section: Wildfire Engine: Storage and Processing (mentioning)

confidence: 99%

Wildfire

Barber, Huras, Lohman et al. 2016

Proceedings of the 2016 International Conference on Management of Data

We demonstrate Hybrid Transactional and Analytics Processing (HTAP) on the Spark platform by the Wildfire prototype, which can ingest up to ≈6 million inserts per second per node and simultaneously perform complex SQL analytics queries. Here, a simplified mobile application uses Wildfire to recommend advertising to mobile customers based upon their distance from stores and their interest in products sold by these stores, while continuously graphing analytics results as those customers move and respond to the ads with purchases.
