Abstract: The constant growth of data and its importance in driving Machine Learning and Big Data is pushing storage systems towards ever-increasing I/O bandwidth and lower latency requirements. In recent years, the Non-Volatile Memory Express (NVMe) standard has enabled SSD drives to deliver high I/O rates by allowing the storage to be connected directly to the processing chip via the fastest available interconnect. In parallel, the adoption of FPGAs in data centers is creating opportunities to accelerate various applications and/or Operating System (OS) operations. While FPGAs in data centers have mostly been connected to x86 servers via PCIe, heterogeneous SoCs are now available with multi-core CPUs and FPGAs integrated on the same die and connected by an on-chip interconnect. In this paper, we present how to rethink and accelerate NVMe performance on heterogeneous SoCs with integrated FPGAs, providing a first research insight into the performance benefits of such an approach. We provide an analysis of the Linux block I/O layer and showcase the relationship between the system's performance and its I/O implementation. Consequently, we introduce an FPGA-based fast path which accelerates access to the NVMe drive. Our comparative evaluation demonstrates that our FPGA-based FastPath achieves up to 71% lower latency and up to 5x higher I/O performance against the baseline system on a Zynq development board.
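The per-request latency that the abstract's analysis of the Linux block I/O layer targets can be observed even from user space. The following is a minimal sketch (not the paper's methodology; the file name, sizes, and iteration count are illustrative assumptions) that times small synchronous reads, each of which crosses the syscall boundary and the kernel block I/O path:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadLatency {
    // Times small synchronous reads against a temporary file and returns
    // the mean per-read latency in nanoseconds. Every read crosses the
    // syscall boundary and the kernel block I/O layer; reads here will
    // often be served from the page cache, so this measures the software
    // path rather than raw device latency.
    static long meanReadLatencyNanos(int reads) throws IOException {
        Path tmp = Files.createTempFile("iobench", ".dat");
        try {
            int fileSize = 4 * 1024 * 1024; // 4 MiB backing file
            Files.write(tmp, new byte[fileSize]);
            ByteBuffer buf = ByteBuffer.allocate(4096); // 4 KiB requests
            long total = 0;
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                for (int i = 0; i < reads; i++) {
                    buf.clear();
                    long off = (i * 16384L) % (fileSize - 4096);
                    long t0 = System.nanoTime();
                    ch.read(buf, off);
                    total += System.nanoTime() - t0;
                }
            }
            return total / reads;
        } finally {
            Files.delete(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("mean 4KiB read latency: "
                + meanReadLatencyNanos(256) + " ns");
    }
}
```

A fast path, in hardware or software, aims to shrink exactly this per-request overhead between the application and the NVMe device.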
In recent years, heterogeneous computing has emerged as the vital way to increase computers' performance and energy efficiency by combining diverse hardware devices, such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The rationale behind this trend is that different parts of an application can be offloaded from the main CPU to diverse devices, which can efficiently execute these parts as co-processors. FPGAs are among the most widely used co-processors, typically employed for accelerating specific workloads due to their flexible hardware and energy-efficient characteristics. These characteristics have made them prevalent in a broad spectrum of computing systems, ranging from low-power embedded systems to high-end data centers and cloud infrastructures. However, these hardware characteristics come at the cost of programmability. Developers who create their applications using high-level programming languages (e.g., Java, Python) are required to familiarize themselves with a hardware description language (e.g., VHDL, Verilog) or, more recently, heterogeneous programming models (e.g., OpenCL, HLS) in order to exploit the co-processors' capacity and tune the performance of their applications. Currently, the above-mentioned heterogeneous programming models exclusively support compilation from compiled languages, such as C and C++. Thus, the integration of heterogeneous co-processors into the software ecosystem of managed programming languages (e.g., Java, Python) is not seamless. In this paper, we rethink the engineering trade-offs that we encountered, in terms of transparency and compilation overheads, while integrating FPGAs into high-level managed programming languages. We present a novel approach that enables runtime code specialization techniques for seamless and high-performance execution of Java programs on FPGAs.
The proposed solution is prototyped in the context of the Java programming language and TornadoVM, an open-source programming framework for Java execution on heterogeneous hardware. Finally, we evaluate the proposed solution for FPGA execution against both sequential and multithreaded Java implementations, showcasing up to 224x and 19.8x performance speedups, respectively, and up to 13.82x compared to TornadoVM running on an Intel integrated GPU. We also provide a breakdown analysis of the proposed compiler optimizations for FPGA execution, as a means to project their impact on the applications' characteristics.
ACM CCS 2012: Software and its engineering → Runtime environments.
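Frameworks such as TornadoVM target loop-parallel Java kernels whose iterations are independent. As a hedged illustration (plain Java with no TornadoVM dependency; in TornadoVM the loop would typically carry an @Parallel annotation and be dispatched through the framework's task API, details of which are not covered in the abstract), the kind of kernel amenable to FPGA offload looks like:

```java
public class VectorMul {
    // A data-parallel kernel: each iteration is independent of the
    // others, which is what makes it a candidate for offload to a
    // GPU or FPGA by a framework like TornadoVM.
    static void mul(float[] a, float[] b, float[] c) {
        for (int i = 0; i < c.length; i++) { // @Parallel in TornadoVM
            c[i] = a[i] * b[i];
        }
    }

    public static void main(String[] args) {
        int n = 1024;
        float[] a = new float[n], b = new float[n], c = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2f;
        }
        mul(a, b, c);
        System.out.println(c[10]); // prints 20.0
    }
}
```

The runtime specialization described in the abstract is what turns such plain Java loops into FPGA bitstream executions without the developer writing VHDL or OpenCL by hand.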
Graph mining is an important research area within the domain of data mining, and one of its most challenging tasks is frequent subgraph mining. To the best of our knowledge, this work presents the first FPGA-based implementation of gSpan, the most efficient and well-known algorithm for the Frequent Subgraph Mining (FSM) problem. The proposed system, named High Performance Computing-gSpan (HPC-gSpan), achieves a manyfold speedup over the official software implementation in the gboost library executed on a high-end CPU, across various real-world datasets.
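gSpan grows frequent subgraphs outward from frequent single edges, pruning any edge whose support (the number of graphs containing it) falls below a threshold before DFS-code extension begins. A simplified sketch of that first support-counting step (assuming undirected graphs represented by their sets of distinct labeled edges; the representation is an illustrative assumption, not HPC-gSpan's data layout) might look like:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EdgeSupport {
    // Counts, for each labeled edge, how many graphs in the database
    // contain it at least once -- the "support" value gSpan uses to
    // discard infrequent edges before subgraph extension starts.
    static Map<String, Integer> support(List<Set<String>> graphs) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> edges : graphs) {
            // Sets hold distinct labels, so each graph contributes
            // at most 1 to any edge's support.
            for (String e : edges) {
                counts.merge(e, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Each graph is its set of distinct edge labels, e.g.
        // "C-single-O" for a carbon-oxygen single bond.
        List<Set<String>> db = List.of(
                Set.of("C-single-O", "C-single-C"),
                Set.of("C-single-O"),
                Set.of("C-single-C", "C-double-O"));
        System.out.println(support(db).get("C-single-O")); // prints 2
    }
}
```

The subsequent, far more expensive phase of gSpan, enumerating extensions by canonical DFS codes and testing subgraph isomorphism, is the part that stands to gain the most from FPGA acceleration.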