Clemens Lutz scite author profile

Modern Stream Processing Engines (SPEs) process large data volumes under tight latency constraints. Many SPEs execute processing pipelines using message passing on shared-nothing architectures and apply a partition-based scale-out strategy to handle high-velocity input streams. Furthermore, many state-of-the-art SPEs rely on a Java Virtual Machine to achieve platform independence and speed up system development by abstracting from the underlying hardware. In this paper, we show that taking the underlying hardware into account is essential to exploit modern hardware efficiently. To this end, we conduct an extensive experimental analysis of current SPEs and SPE design alternatives optimized for modern hardware. Our analysis highlights potential bottlenecks and reveals that state-of-the-art SPEs are not capable of fully exploiting current and emerging hardware trends, such as multi-core processors and high-speed networks. Based on our analysis, we describe a set of design changes to the common architecture of SPEs to scale-up on modern hardware. We show that the single-node throughput can be increased by up to two orders of magnitude compared to state-of-the-art SPEs by applying specialized code generation, fusing operators, batch-style parallelization strategies, and optimized windowing. This speedup allows for deploying typical streaming applications on a single or a few nodes instead of large clusters.

show abstract

Pump Up the Volume

Lutz

Breß

Zeuch

et al. 2020

View full text Add to dashboard Cite

An Energy-Efficient Stream Join for the Internet of Things

Michalke

Grulich

Lutz

et al. 2021

View full text Add to dashboard Cite

Efficient k-means on GPUs

Lutz

Breß

Rabl

et al. 2018

View full text Add to dashboard Cite

RStore: A Direct-Access DRAM-based Data Store

Trivedi

Stuedi

Metzler

et al. 2015

View full text Add to dashboard Cite

Distributed DRAM stores have become an attractive option for providing fast data accesses to analytics applications. To accelerate the performance of these stores, researchers have proposed using RDMA technology. RDMA offers high bandwidth and low latency data access by carefully separating resource setup from IO operations, and making IO operations fast by using rich network semantics and offloading. Despite recent interest, leveraging the full potential of RDMA in a distributed environment remains a challenging task. In this paper, we present RDMA Store or RStore, a DRAMbased data store that delivers high performance by extending RDMA's separation philosophy to a distributed setting. RStore achieves high aggregate bandwidth (705 Gb/s) and close-tohardware latency on our 12-machine testbed. We developed a distributed graph processing framework and a Key-Value sorter using RStore's unique memory-like API. The graph processing framework, which relies on RStore for low-latency graph access, outperforms state-of-the-art systems by margins of 2.6−4.2× when calculating PageRank. The Key-Value sorter can sort 256 GB of data in 31.7 sec, which is 8× better than Hadoop TeraSort in a similar setting.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Clemens Lutz

Analyzing efficient stream processing on modern hardware

Pump Up the Volume

An Energy-Efficient Stream Join for the Internet of Things

Efficient k-means on GPUs

RStore: A Direct-Access DRAM-based Data Store

Contact Info

Product

Resources

About