As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems. However, network sizes are becoming increasingly large to meet the demands of practical applications, which poses a significant challenge to constructing high-performance implementations of deep learning neural networks. To improve performance while maintaining low power cost, in this paper we design DLAU, a scalable accelerator architecture for large-scale deep learning networks using FPGA as the hardware prototype. The DLAU accelerator employs three pipelined processing units to improve throughput and utilizes tile techniques to exploit locality in deep learning applications. Experimental results on a state-of-the-art Xilinx FPGA board demonstrate that the DLAU accelerator achieves up to 36.1x speedup compared to Intel Core2 processors, with power consumption of 234 mW.
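As an illustration of the tiling idea mentioned in the abstract, the following is a minimal sketch of a tiled matrix-vector multiplication of the kind such accelerators typically map onto FPGA processing units. The tile size TILE and the helper name fc_layer_tiled are assumptions for this example, not details taken from the paper.

/* Sketch: y = W * x for one fully connected layer, computed tile by tile so
 * that each TILE-sized block of weights and inputs can be buffered locally
 * (on-chip) before the inner multiply-accumulate loop runs. */
#include <stddef.h>

#define TILE 32  /* assumed tile size; chosen to fit on-chip buffers */

void fc_layer_tiled(const float *W, const float *x, float *y,
                    size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++)
        y[i] = 0.0f;

    for (size_t j0 = 0; j0 < cols; j0 += TILE) {          /* input tile   */
        size_t j_end = (j0 + TILE < cols) ? j0 + TILE : cols;
        for (size_t i = 0; i < rows; i++) {                /* output rows  */
            float acc = y[i];
            for (size_t j = j0; j < j_end; j++)            /* inner MAC    */
                acc += W[i * cols + j] * x[j];
            y[i] = acc;
        }
    }
}

The point of the tiled loop order is that only a TILE-sized slice of x and the corresponding columns of W need to be resident at a time, which is what makes large layers feasible with limited on-chip memory.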
Comparing newly determined sequences against subject sequences stored in databases is a critical task in bioinformatics. Fortunately, recent surveys report that state-of-the-art aligners are already fast enough to handle the huge volume of short sequence reads in reasonable time. However, present aligners remain quite inefficient at aligning the long sequence reads (>400 bp) generated by next-generation sequencing (NGS) technology. Moreover, the challenge grows more serious as both the lengths and the volumes of sequence reads keep increasing with improvements in sequencing technology. Thus, enhancing the performance of long-read alignment is an urgent need. In this paper, we propose a novel FPGA-based system to improve the efficiency of long-read mapping. Compared to the state-of-the-art long-read aligner BWA-SW, our accelerating platform achieves high performance with almost the same sensitivity. Experiments demonstrate that, for reads with lengths ranging from 512 up to 4,096 base pairs, the described system obtains a 10x-48x speedup for the bottleneck of the software. For the whole mapping procedure, the FPGA-based platform achieves a 1.8x-3.3x speedup over the BWA-SW aligner, reducing the alignment time from weeks to days.
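For context on the kind of kernel such accelerators target, the following is a minimal sketch of the Smith-Waterman local-alignment recurrence that dynamic-programming aligners such as BWA-SW spend most of their time evaluating. The abstract does not name the bottleneck explicitly, so treating it as a Smith-Waterman-style kernel, along with the scoring constants and the helper name sw_score, is an assumption made for this example.

/* Sketch: local alignment score of a read (length m) against a reference
 * segment (length n) with a linear gap penalty, using O(n) memory per row. */
#include <stdlib.h>
#include <string.h>

#define MATCH     1
#define MISMATCH -3
#define GAP      -2   /* linear gap penalty for simplicity */

static int max4(int a, int b, int c, int d)
{
    int m = a > b ? a : b;
    m = m > c ? m : c;
    return m > d ? m : d;
}

int sw_score(const char *read, size_t m, const char *ref, size_t n)
{
    int *prev = calloc(n + 1, sizeof(int));   /* previous DP row */
    int *curr = calloc(n + 1, sizeof(int));   /* current DP row  */
    int best = 0;

    for (size_t i = 1; i <= m; i++) {
        for (size_t j = 1; j <= n; j++) {
            int sub = (read[i - 1] == ref[j - 1]) ? MATCH : MISMATCH;
            curr[j] = max4(0,
                           prev[j - 1] + sub,   /* match / mismatch */
                           prev[j] + GAP,       /* gap in read      */
                           curr[j - 1] + GAP);  /* gap in reference */
            if (curr[j] > best)
                best = curr[j];
        }
        int *tmp = prev; prev = curr; curr = tmp;
        memset(curr, 0, (n + 1) * sizeof(int));
    }
    free(prev);
    free(curr);
    return best;
}

Cells on the same anti-diagonal of this recurrence are independent of one another, which is the parallelism that systolic-array style FPGA implementations typically exploit to obtain the kind of kernel speedups reported above.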
Recently, machine learning has been widely used in applications and cloud services, and as an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems. To give users a better experience, high-performance implementations of deep learning applications are very important. As a common means of accelerating algorithms, FPGAs offer high performance, low power consumption, small size, and other advantages. We therefore use an FPGA to design a deep learning accelerator that focuses on the implementation of the prediction process, data access optimization, and the pipeline structure. Compared with a 2.3 GHz Core 2 CPU, our accelerator achieves promising results.
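To make the "prediction process" concrete, the following is a minimal sketch of one layer of a feed-forward pass, y = sigmoid(W*x + b). The function name predict_layer and the choice of a sigmoid activation are assumptions made for this example, not details taken from the paper.

/* Sketch: one fully connected layer of the prediction pass. In a pipelined
 * accelerator the multiply-accumulate loop and the activation step would run
 * as separate hardware stages; here they are shown sequentially. */
#include <math.h>
#include <stddef.h>

static float sigmoidf(float v)
{
    return 1.0f / (1.0f + expf(-v));
}

void predict_layer(const float *W, const float *b, const float *x,
                   float *y, size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++) {
        float acc = b[i];
        for (size_t j = 0; j < cols; j++)
            acc += W[i * cols + j] * x[j];   /* multiply-accumulate stage */
        y[i] = sigmoidf(acc);                /* activation stage          */
    }
}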