Large-scale protein sequence comparison is an important but compute-intensive task in molecular biology. The popular BLASTP software for this task has become a bottleneck for proteomic database search. One third of this software's time is spent executing the Smith-Waterman dynamic programming algorithm. This work describes a novel FPGA design for banded Smith-Waterman, an algorithmic variant tuned to the needs of BLASTP. This design has been implemented in Mercury BLASTP, our FPGA-accelerated version of the BLASTP algorithm. We show that Mercury BLASTP runs 6-16 times faster than software BLASTP on a modern CPU while delivering 99% identical results.
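The banding idea behind this variant can be illustrated with a minimal software sketch. It is not the FPGA design described above, whose details this abstract does not give. It assumes a simple match/mismatch score and a linear gap penalty, whereas BLASTP itself uses a substitution matrix and affine gap costs. The function name and the `band` and `diag` parameters are hypothetical, chosen for illustration only. Only cells within `band` diagonals of a chosen diagonal are scored, so the work per row is bounded by the band width rather than by the length of the second sequence.

```python
NEG_INF = float("-inf")


def banded_smith_waterman(a, b, band=3, diag=0, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a and b inside a diagonal band.

    Only cells (i, j) with |(j - i) - diag| <= band are evaluated.
    Scoring scheme (assumed for illustration): match/mismatch, linear gaps.
    """
    best = 0
    prev = {}  # scores of the previous row, keyed by column index j
    for i in range(1, len(a) + 1):
        cur = {}
        lo = max(1, i + diag - band)        # first in-band column of row i
        hi = min(len(b), i + diag + band)   # last in-band column of row i
        for j in range(lo, hi + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                               # local alignment: restart here
                prev.get(j - 1, 0) + sub,        # diagonal move (always in band)
                prev.get(j, NEG_INF) + gap,      # gap in b; out-of-band is excluded
                cur.get(j - 1, NEG_INF) + gap,   # gap in a; out-of-band is excluded
            )
            cur[j] = score
            best = max(best, score)
        prev = cur
    return best


if __name__ == "__main__":
    # The match lies three diagonals away from the main diagonal,
    # so a band of width 0 misses it and a band of width 3 finds it.
    print(banded_smith_waterman("AAAACGT", "ACGT", band=0))
    print(banded_smith_waterman("AAAACGT", "ACGT", band=3))
```

In this sketch a narrow band can miss an alignment that lies far from the chosen diagonal, which is the kind of trade-off that would explain results that are close to, but not exactly, identical to those of the unbanded algorithm.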
Large-scale protein sequence comparison is an important but compute-intensive task in molecular biology. BLASTP is the most popular tool for comparative analysis of protein sequences. In recent years, an exponential increase in the size of protein sequence databases has required either exponentially more running time or a cluster of machines to keep pace. To address this problem, we have designed and built a high-performance FPGA-accelerated version of BLASTP, Mercury BLASTP. In this paper, we describe the architecture of the portions of the application that are accelerated in the FPGA, and we also describe the integration of these FPGA-accelerated portions with the existing BLASTP software. We have implemented Mercury BLASTP on a commodity workstation with two Xilinx Virtex-II 6000 FPGAs. We show that the new design runs 11-15 times faster than software BLASTP on a modern CPU while delivering close to 99% identical results.
Abstract: In this paper we describe the SVP, a Softcore Vector Processor targeted toward computational biology and streaming applications. The SVP is a software-programmable architecture constructed from predefined hardware building blocks. We leverage the flexibility and power of an FPGA to enhance a streaming vector processor design. Each functional unit includes an instruction controller, parallel processing elements with shared registers, and a memory unit that provides access to both local and streaming data. We target the Smith-Waterman sequence alignment algorithm and report estimated performance numbers based on a Virtex-4 FPGA implementation.
I. INTRODUCTION
The Softcore Vector Processor was developed in response to the need to accelerate a wide range of computational biology applications. The design philosophy of the SVP is based on two important goals: adaptability and high performance.
The SVP is designed to be a software-programmable and hardware-customizable platform. Instruction-based execution allows support for a large number of algorithms in computational biology. However, different classes of applications require different subsets of hardware resources, which argues for a customizable hardware design built from primitives. The second goal of the project is to achieve programmability without sacrificing performance. The SVP was designed to perform competitively with the full-custom solutions available today.
Currently available genome databases are growing exponentially in size [1], making it difficult for software analysis tools to keep up. A number of hardware solutions using special-purpose VLSI or reconfigurable hardware such as Field Programmable Gate Arrays (FPGAs) have been proposed to bridge this gap. Systems such as BISP and Kestrel [2], built on custom ASICs, support high-performance designs but are expensive to manufacture and cannot be adapted for applications that require differing hardware resources. FPGA solutions tend to be specialized, and supporting a new application requires a laborious redesign of the hardware.
The SVP is a vector processor targeting algorithms that can be easily mapped to a systolic array of processing elements (PEs). Programmable instructions execute on PEs that operate on different units of data, achieving fine-grained parallelism. Sequence analysis tools commonly use the streaming paradigm, where a large genomic database is streamed through