G.R. Brown scite author profile

With the emergence of accelerator devices such as multicores, graphics-processing units (GPUs), and field-programmable gate arrays (FPGAs), application designers are confronted with the problem of searching a huge design space that has been shown to have widely varying performance and energy metrics for different accelerators, different application domains, and different use cases. To address this problem, numerous studies have evaluated specific applications across different accelerators. In this paper, we analyze an important domain of applications, referred to as sliding-window applications, when executing on FPGAs, GPUs, and multicores. For each device, we present optimization strategies and analyze use cases where each device is most effective. The results show that FPGAs can achieve speedup of up to 11x and 57x compared to GPUs and multicores, respectively, while also using orders of magnitude less energy.


A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors

Fowers

Brown

Wernsing

et al. 2013

ACM Trans. Archit. Code Optim.


Recent architectural trends have focused on increased parallelism via multicore processors and increased heterogeneity via accelerator devices (e.g., graphics-processing units, field-programmable gate arrays). Although these architectures have significant performance and energy potential, application designers face many device-specific challenges when choosing an appropriate accelerator or when customizing an algorithm for an accelerator. To help address this problem, in this article we thoroughly evaluate convolution, one of the most common operations in digital-signal processing, on multicores, graphics-processing units, and field-programmable gate arrays. Whereas many previous application studies evaluate a specific usage of an application, this article assists designers with design space exploration for numerous use cases by analyzing effects of different input sizes, different algorithms, and different devices, while also determining Pareto-optimal trade-offs between performance and energy.
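The notion of a Pareto-optimal trade-off between performance and energy can be illustrated with a minimal sketch. It assumes each device or algorithm configuration is summarized by one execution time and one energy value, both to be minimized. The `Measurement` type, the `pareto_front` helper, the configuration labels, and all numbers are hypothetical and are not taken from the article.

```python
from typing import List, NamedTuple


class Measurement(NamedTuple):
    label: str       # hypothetical configuration name
    time_s: float    # execution time in seconds (lower is better)
    energy_j: float  # energy in joules (lower is better)


def pareto_front(points: List[Measurement]) -> List[Measurement]:
    """Return the points that no other point beats in both time and energy."""
    front = []
    for p in points:
        # q dominates p if q is no worse in both metrics and strictly
        # better in at least one of them.
        dominated = any(
            q.time_s <= p.time_s
            and q.energy_j <= p.energy_j
            and (q.time_s < p.time_s or q.energy_j < p.energy_j)
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front, key=lambda m: m.time_s)


# Made-up numbers purely for illustration; not results from the article.
samples = [
    Measurement("config_a", 1.0, 9.0),
    Measurement("config_b", 2.0, 4.0),
    Measurement("config_c", 3.0, 5.0),
    Measurement("config_d", 4.0, 1.0),
]
front_labels = [m.label for m in pareto_front(samples)]
```

In this toy data set `config_c` is slower and uses more energy than `config_b`, so it drops out. The remaining three configurations each offer a different balance of speed and energy.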


A Tradeoff Analysis of FPGAs, GPUs, and Multicores for Sliding-Window Applications

Cooke

Fowers

Brown

et al. 2015

ACM Trans. Reconfigurable Technol. Syst.


The increasing usage of hardware accelerators such as Field-Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs) has significantly increased application design complexity. Such complexity results from a larger design space created by numerous combinations of accelerators, algorithms, and hw/sw partitions. Exploration of this increased design space is critical due to widely varying performance and energy consumption for each accelerator when used for different application domains and different use cases. To address this problem, numerous studies have evaluated specific applications across different architectures. In this article, we analyze an important domain of applications, referred to as sliding-window applications, implemented on FPGAs, GPUs, and multicore CPUs. For each device, we present optimization strategies and analyze use cases where each device is most effective. The results show that, for large input sizes, FPGAs can achieve speedups of up to 5.6× and 58× compared to GPUs and multicore CPUs, respectively, while also using up to an order of magnitude less energy. For small input sizes and applications with frequency-domain algorithms, GPUs generally provide the best performance and energy.
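For readers unfamiliar with the term, a sliding-window computation applies the same kernel to every overlapping window of an input. The sketch below is a one-dimensional illustration that uses a sum of absolute differences against a template as the per-window kernel. The choice of kernel, the function name `sliding_window_sad`, and the sample data are assumptions made for illustration; the abstract does not specify which kernels were evaluated.

```python
from typing import List, Sequence


def sliding_window_sad(signal: Sequence[int], template: Sequence[int]) -> List[int]:
    """Sum of absolute differences between the template and each window."""
    n, k = len(signal), len(template)
    if k == 0 or k > n:
        return []  # no complete window fits in the signal
    return [
        # Each output depends only on one window, so every window can be
        # processed independently and in parallel.
        sum(abs(signal[i + j] - template[j]) for j in range(k))
        for i in range(n - k + 1)
    ]


# Illustrative data only.
scores = sliding_window_sad([1, 2, 3, 4, 5], [2, 3, 4])
best_offset = scores.index(min(scores))  # position of the closest match
```

Because neighboring windows overlap and are independent of one another, this access pattern offers both data reuse and parallelism, which is the kind of structure that accelerators can exploit.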


The CMS Modular Track Finder boards, MTF6 and MTF7

et al. 2013


To accommodate the increase in energy and luminosity of the upgraded LHC, the CMS Endcap Muon Level 1 Trigger system has to be significantly modified. To provide the best track reconstruction, the Trigger system must now import all available trigger primitives generated by Cathode Strip Chambers and by other regional subsystems, such as Resistive Plate Chambers. In addition to massive input bandwidth, this also requires a significant increase in logic and memory resources. To satisfy these requirements, a new Sector Processor unit for muon track finding is being designed. This unit follows the micro-TCA standard recently adopted by CMS. It consists of three modules. The Core Logic module houses the large FPGA that contains the processing logic and multi-gigabit serial links for data exchange. The Optical module contains optical receivers and transmitters; it communicates with the Core Logic module via a custom backplane section. The Look-Up Table module contains a large amount of low-latency memory that is used to assign the final transverse momentum of the muon candidate tracks. The name of the unit, Modular Track Finder, reflects the modular approach used in the design. Presented here are the details of the hardware design of the prototype unit based on Xilinx's Virtex-6 FPGA family, MTF6, as well as results of the conducted tests. Also presented are plans for the pre-production prototype based on the Virtex-7 FPGA family, MTF7.

Keywords: Trigger concepts and systems (hardware and software); Large detector systems for particle and astroparticle physics; Particle tracking detectors (Gaseous detectors)


Radiation hardened PowerPC 603e™ based single board computer

Brown


A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
