Tom Van Court scite author profile

Higher global bandwidth requirements for many applications and lower network cost have motivated the use of the Dragonfly network topology for high performance computing systems. In this paper we present the architecture of the Cray Cascade system, a distributed memory system based on the Dragonfly [1] network topology. We describe the structure of the system, its Dragonfly network and the routing algorithms. We describe a set of advanced features supporting both mainstream high performance computing applications and emerging global address space programming models. We present a combination of performance results from prototype systems and simulation data for large systems. We demonstrate the value of the Dragonfly topology and the benefits obtained through extensive use of adaptive routing.

Families of FPGA-based accelerators for approximate string matching

Court

Herbordt

2007

Microprocessors and Microsystems

Dynamic programming for approximate string matching is a large family of different algorithms, which vary significantly in purpose, complexity, and hardware utilization. Many implementations have reported impressive speed-ups, but have typically been point solutions: highly specialized and addressing only one or a few of the many possible options. The problem to be solved is creating a hardware description that implements a broad range of behavioral options without losing efficiency due to feature bloat. We report a set of three component types that address different parts of the approximate string matching problem. This allows each application to choose the feature set required, then make maximum use of the FPGA fabric according to that application's specific resource requirements. Multiple, interchangeable implementations are available for each component type. We show that these methods allow the efficient generation of a large, if not complete, family of accelerators for this application. This flexibility was obtained while retaining high performance: we have evaluated a sample against serial reference codes and found speed-ups of 150× to 400× over a high-end PC.
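For readers unfamiliar with the algorithm family the abstract refers to, the following is a minimal software sketch of one member: edit-distance matching of a pattern against every position of a text (often attributed to Sellers). It is not taken from the paper and says nothing about the paper's three component types; the function name `approx_find` and its parameters are illustrative assumptions.

```python
def approx_find(pattern: str, text: str, max_dist: int) -> list[tuple[int, int]]:
    """Return (end_index, distance) for each text position where some
    substring ending there matches `pattern` within `max_dist` edits.

    Illustrative sketch only; not the accelerator design from the paper.
    """
    m = len(pattern)
    # prev[i] = edit distance between pattern[:i] and the best substring
    # ending just before the current text character. Row 0 stays 0 so
    # that a match may begin anywhere in the text.
    prev = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # pattern character left unmatched
            )
        if cur[m] <= max_dist:
            hits.append((j, cur[m]))
        prev = cur
    return hits


print(approx_find("abc", "xxabcxx", 0))  # [(4, 0)]
print(approx_find("abc", "abd", 1))      # [(1, 1), (2, 1)]
```

Each cell depends only on its left, upper, and upper-left neighbors, so all cells on one anti-diagonal can be computed independently. Changing the scoring rule in the inner `min` is one example of the kind of behavioral option that distinguishes members of this family.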

FPGA Acceleration of Rigid Molecule Interactions

Court

Herbordt

2004

Modeling of molecule interactions often uses two or more successive models of increasing complexity. Rigid models based on correlation techniques are common as early screening passes (to detect interactions worth costlier examination) and are often at the heart of later passes as well. Even these rigid models are time-consuming when applied to large models at 10^3 to 10^5 different three-axis rotations. This paper presents an FPGA structure for performing the correlations efficiently by using a systolic array for 3-D correlation and arithmetic tailored to the application. The system includes a novel addressing technique for performing a three-axis rotation of a 3-D voxel model using modest amounts of logic and nearly no cost in time or buffer space. We compare our FPGA implementation with one on a PC using the standard transform-based method and find a speed-up of a factor of 200. We present extensions for handling implementation technologies with different performance characteristics and for handling models too large to fit on-chip.
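As a rough illustration of what rigid, correlation-based screening computes, the sketch below scores a small voxel model against a larger one by direct summation over every translation in a window, and repeats this for a handful of rotations. It is an assumption-laden toy: the sparse dictionary representation, the names `correlate`, `rotate_z90`, and `best_pose`, and the use of four quarter turns about one axis (in place of the 10^3 to 10^5 three-axis rotations mentioned above) are all illustrative, and it does not reproduce the paper's systolic array or its rotation addressing technique.

```python
from itertools import product

# Sparse voxel model: (x, y, z) -> value. Hypothetical representation.
Voxels = dict[tuple[int, int, int], float]


def correlate(receptor: Voxels, ligand: Voxels, shifts: range) -> dict:
    """Direct 3-D correlation: score every translation in the window."""
    scores = {}
    for dx, dy, dz in product(shifts, repeat=3):
        scores[(dx, dy, dz)] = sum(
            v * receptor.get((x + dx, y + dy, z + dz), 0.0)
            for (x, y, z), v in ligand.items()
        )
    return scores


def rotate_z90(vox: Voxels) -> Voxels:
    """Rotate a voxel model a quarter turn about the z axis."""
    return {(-y, x, z): v for (x, y, z), v in vox.items()}


def best_pose(receptor: Voxels, ligand: Voxels, shifts: range):
    """Return (score, quarter_turns, shift) for the best-scoring pose."""
    best = None
    rotated = ligand
    for quarter_turns in range(4):
        scores = correlate(receptor, rotated, shifts)
        shift, score = max(scores.items(), key=lambda kv: kv[1])
        if best is None or score > best[0]:
            best = (score, quarter_turns, shift)
        rotated = rotate_z90(rotated)
    return best


receptor = {(2, 0, 0): 1.0, (2, 1, 0): 1.0}
ligand = {(0, 0, 0): 1.0, (1, 0, 0): 1.0}
print(best_pose(receptor, ligand, range(-3, 4)))  # (2.0, 1, (2, 0, 0))
```

The cost grows with the number of rotations times the number of translations times the model size, which is why the screening pass is expensive even for rigid models.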

Processing Repetitive Sequence Structures with Mismatches at Streaming Rate

Conti

Court

Herbordt

2004

Cray Cascade: A scalable HPC system based on a Dragonfly network