Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. Its key feature is the ability to perform computations in hardware to increase performance, while retaining much of the flexibility of a software solution. In this survey, we explore the hardware aspects of reconfigurable computing machines, from single chip architectures to multi-chip systems, including internal structures and external coupling. We also focus on the software that targets these machines, such as compilation tools that map high-level algorithms directly to the reconfigurable substrate. Finally, we consider the issues involved in run-time reconfigurable systems, which reuse the configurable hardware during program execution.
Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we have designed and built a composable, reconfigurable fabric to accelerate portions of large-scale software services. Each instantiation of the fabric consists of a 6x8 2-D torus of high-end Stratix V FPGAs embedded into a half-rack of 48 machines. One FPGA is placed into each server, accessible through PCIe, and wired directly to other FPGAs with pairs of 10 Gb SAS cables. In this paper, we describe a medium-scale deployment of this fabric on a bed of 1,632 servers, and measure its efficacy in accelerating the Bing web search engine. We describe the requirements and architecture of the system, detail the critical engineering challenges and solutions needed to make the system robust in the presence of failures, and measure the performance, power, and resilience of the system when ranking candidate documents. Under high load, the large-scale reconfigurable fabric improves the ranking throughput of each server by 95% for a fixed latency distribution or, while maintaining equivalent throughput, reduces the tail latency by 29%.
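The 6x8 2-D torus topology described above gives every FPGA exactly four direct neighbours, with links wrapping around at the grid edges. The following sketch illustrates that addressing scheme; the row-major node numbering and the `torus_neighbors` helper are illustrative assumptions, not part of the paper:

```python
# Hypothetical sketch of 2-D torus addressing for the 6x8 FPGA grid
# described in the abstract (row-major node numbering is assumed).
ROWS, COLS = 6, 8

def torus_neighbors(node):
    """Return the four neighbours (N, S, E, W) of a node, with wraparound."""
    r, c = divmod(node, COLS)
    return [
        ((r - 1) % ROWS) * COLS + c,  # north (wraps top to bottom)
        ((r + 1) % ROWS) * COLS + c,  # south
        r * COLS + (c + 1) % COLS,    # east (wraps right to left)
        r * COLS + (c - 1) % COLS,    # west
    ]

# Every node has degree 4, so the 48-node fabric has 48 * 4 / 2 = 96
# bidirectional links; frozenset de-duplicates the two directions.
links = {frozenset((n, m)) for n in range(ROWS * COLS) for m in torus_neighbors(n)}
print(len(links))  # 96
```

The wraparound keeps the worst-case hop count low without long dedicated cables spanning the whole rack, which is consistent with the short SAS cabling the abstract describes.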
The ATLAS IBL Collaboration

During the shutdown of the CERN Large Hadron Collider in 2013-2014, an additional pixel layer was installed between the existing Pixel detector of the ATLAS experiment and a new, smaller radius beam pipe. The motivation for this new pixel layer, the Insertable B-Layer (IBL), was to maintain or improve the robustness and performance of the ATLAS tracking system, given the higher instantaneous and integrated luminosities realised following the shutdown. Because of the extreme radiation and collision-rate environment, several new radiation-tolerant sensor and electronic technologies were utilised for this layer. This paper reports on the IBL construction and integration prior to its operation in the ATLAS detector.

The ATLAS [1] general-purpose detector is used for the study of proton-proton (pp) and heavy-ion collisions at the CERN Large Hadron Collider (LHC) [2]. It successfully collected data at pp collision energies of 7 and 8 TeV in the period 2010-2012, known as Run 1. Following an LHC shutdown in 2013-2014 (LS1), it has collected data since 2015 at a pp collision energy of 13 TeV (the so-called Run 2).

The ATLAS inner tracking detector (ID) [1, 3] provides charged-particle tracking with high efficiency in the pseudorapidity range |η| < 2.5. With increasing radial distance from the interaction region, it consists of silicon pixel and micro-strip detectors, followed by a transition radiation tracker (TRT), all surrounded by a superconducting solenoid providing a 2 T magnetic field.

The original ATLAS pixel detector [4, 5], referred to in this paper as the Pixel detector, was the innermost part of the ID during Run 1. It consists of three barrel layers (named the B-Layer, Layer 1 and Layer 2 with increasing radius) and three disks on each side of the interaction region, to guarantee at least three space points over the full tracking |η| range.
It was designed to operate for the Phase-I period of the LHC, that is, with a peak luminosity of 1 × 10^34 cm^-2 s^-1 and an integrated luminosity of approximately 340 fb^-1, corresponding to a total ionising dose (TID) of up to 50 MRad and a non-ionising energy loss (NIEL) fluence of up to 1 × 10^15 n_eq/cm^2. However, for luminosities exceeding 2 × 10^34 cm^-2 s^-1, which are now expected during Phase-I operation, the read-out efficiency of the Pixel layers will deteriorate. This paper describes the construction and surface integration of an additional pixel layer, the Insertable B-Layer (IBL) [6], installed during the LS1 shutdown between the B-Layer and a new, smaller radius beam pipe. The main motivations for the IBL were to maintain the full ID tracking performance and robustness during Phase-I operation, despite read-out bandwidth limitations of the Pixel layers (in particular the B-Layer) at the expected Phase-I peak luminosity, and accumulated radiation damage to the silicon sensors and front-end electronics. The IBL is designed to operate until the end of Phase-I, when a full tracker upgrade is planned [7] for high-luminosity LHC (HL-LHC) operation from approximately ...
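The dose figure above is quoted in the non-SI unit MRad. As a quick unit sanity check (my conversion, not from the paper): 1 rad = 0.01 Gy, so the 50 MRad design TID corresponds to 500 kGy.

```python
# Unit check on the quoted radiation tolerance: 1 rad = 0.01 Gy,
# so a 50 MRad total ionising dose equals 500 kGy.
RAD_TO_GY = 0.01
tid_rad = 50e6                       # 50 MRad expressed in rad
tid_kgy = tid_rad * RAD_TO_GY / 1e3  # Gy -> kGy
print(tid_kgy)  # 500.0
```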
Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we designed and built a composable, reconfigurable hardware fabric based on field-programmable gate arrays (FPGAs). Each server in the fabric contains one FPGA, and all FPGAs within a 48-server rack are interconnected over a low-latency, high-bandwidth network. We describe a medium-scale deployment of this fabric on a bed of 1,632 servers, and measure its effectiveness in accelerating the ranking component of the Bing web search engine. We describe the requirements and architecture of the system, detail the critical engineering challenges and solutions needed to make the system robust in the presence of failures, and measure the performance, power, and resilience of the system. Under high load, the large-scale reconfigurable fabric improves the ranking throughput of each server by 95% at a desirable latency distribution or reduces tail latency by 29% at a fixed throughput. In other words, the reconfigurable fabric enables the same throughput using only half the number of servers.
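The closing claim follows from simple arithmetic: a 95% per-server throughput improvement means each accelerated server does the work of 1.95 baseline servers, so a fixed aggregate load needs roughly half as many machines (1/1.95 ≈ 0.51). A one-line check:

```python
# A 95% throughput gain per server implies each accelerated server
# replaces 1.95 baseline servers, so the required server count for a
# fixed aggregate load shrinks to 1/1.95 of the original, about half.
speedup = 1.95
fraction_of_servers_needed = 1 / speedup
print(round(fraction_of_servers_needed, 2))  # 0.51
```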
Asynchronous design has been an active area of research since at least the mid-1950s, but has yet to achieve widespread use. We examine the benefits and problems inherent in asynchronous computations, and in some of the more notable design methodologies. These include Huffman asynchronous circuits, burst-mode circuits, micropipelines, template-based and trace theory-based delay-insensitive circuits, signal transition graphs, change diagrams, and compilation-based quasi-delay-insensitive circuits.
FPGA-based reprogrammable systems are revolutionizing some forms of computation and digital logic. As logic emulation systems, they provide orders-of-magnitude speedup over software simulation. As custom-computing machines, they achieve the highest-performance implementation for many types of applications. As multi-mode systems, they yield significant hardware savings and provide truly generic hardware. In this paper we discuss the promise and problems of reprogrammable systems, including an overview of the chip and system architectures of reprogrammable systems, as well as their applications. We also discuss the challenges and opportunities of future reprogrammable systems.