Ningyi Xu scite author profile

Advances in object detection algorithms have led to their wide use in autonomous systems. However, because of the high computational complexity of convolutional neural networks (CNNs), stringent latency requirements are hard to meet for real-time object detection. To address this problem, this paper proposes a low-latency accelerator architecture. A fine-grained column-based pipeline architecture with a padding-skip technique is implemented to reduce the start-up time of the pipeline. To cut down the computation time of the CNN, a double signed-multiplication correcting circuit is introduced. In addition, a pooling unit with a shared buffer is proposed to reduce the storage cost of the pooling layer. To demonstrate the new architecture, we implement the YOLOv2-tiny (you-only-look-once) deep neural network with an input size of 1280×384 on a ZC706 development board, improving latency by 2.125× to 2.34× compared with previous FPGA accelerators for YOLOv2-tiny.
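The abstract does not describe how the double signed-multiplication correcting circuit works. As a rough illustration only, the sketch below shows one commonly used idea that the name suggests: packing two signed 8-bit operands into one wide word so that a single multiplier yields two products, then correcting the upper product for the borrow caused by a negative lower product. The function name, the 8-bit operand width, and the 18-bit field width are assumptions made for this example, not details taken from the paper.

```python
SHIFT = 18  # assumed field width: wide enough for a signed 16-bit product


def packed_signed_mul(a, b, c):
    """Return (a*c, b*c) for signed 8-bit a, b, c using one wide multiply."""
    for v in (a, b, c):
        if not -128 <= v <= 127:
            raise ValueError("operands must fit in signed 8 bits")

    # Pack a into the upper field and b into the lower field.
    packed = (a << SHIFT) + b

    # One multiplication produces (a*c << SHIFT) + b*c.
    prod = packed * c

    # Extract the lower field and sign-extend it to recover b*c.
    low = prod & ((1 << SHIFT) - 1)
    if low >= 1 << (SHIFT - 1):
        low -= 1 << SHIFT

    # Correction step: a negative b*c borrows 1 from the upper field,
    # so remove the signed lower product before shifting down.
    high = (prod - low) >> SHIFT
    return high, low
```

For example, `packed_signed_mul(5, -3, 7)` returns `(35, -21)`. Without the correction step, the upper field would read 34, because the negative lower product borrows from it.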


Efficient Compression Methods for Wire-Spread-Based Stochastic Computing Deep Neural Networks

Wang et al. 2022, IEEE Trans. Circuits Syst. II


Correction: A Hybrid CPU-GPU Accelerated Framework for Fast Mapping of High-Resolution Human Brain Connectome

Wang, Du, Xia et al. 2013, PLoS ONE


Recently, a combination of non-invasive neuroimaging techniques and graph theoretical approaches has provided a unique opportunity for understanding the patterns of structural and functional connectivity of the human brain (referred to as the human brain connectome). A very large amount of brain imaging data has now been collected, and high-resolution connectome research places very high demands on computational capability. In this paper, we propose a hybrid CPU-GPU framework to accelerate the computation of the human brain connectome. We applied this framework to a publicly available resting-state functional MRI dataset from 197 participants. For each subject, we first computed Pearson's correlation coefficient between every pair of gray-matter voxel time series, and then constructed unweighted, undirected brain networks with 58 k nodes and a sparsity range from 0.02% to 0.17%. Next, graph properties of the functional brain networks were quantified, analyzed and compared with those of 15 corresponding random networks. With our proposed accelerating framework, the above process took 80 to 150 minutes per network, depending on the network sparsity. Further analyses revealed that high-resolution functional brain networks have efficient small-world properties, significant modular structure, a power-law degree distribution and highly connected nodes in the medial frontal and parietal cortical regions. These results are largely compatible with previous human brain network studies. Taken together, our proposed framework can substantially enhance the applicability and efficacy of high-resolution (voxel-based) brain network analysis, and has the potential to accelerate the mapping of the human brain connectome in normal and disease states.
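The network construction step described in this abstract (pairwise Pearson correlation followed by thresholding to a target sparsity) can be sketched on a toy scale as follows. This is a minimal pure-Python illustration, not the authors' CPU-GPU implementation; the function names are hypothetical, sparsity is passed as a fraction rather than a percentage, and keeping the strongest correlations as edges is an assumed thresholding rule.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def binary_network(series, sparsity):
    """Build an unweighted, undirected adjacency matrix.

    series   : list of time series, one per node (voxel)
    sparsity : fraction of all possible edges to keep (assumed convention)
    """
    n = len(series)
    pairs = list(combinations(range(n), 2))

    # Rank node pairs from strongest to weakest correlation.
    ranked = sorted(
        pairs,
        key=lambda p: pearson(series[p[0]], series[p[1]]),
        reverse=True,
    )

    # Keep only the top fraction of pairs as edges.
    n_edges = max(1, round(sparsity * len(pairs)))
    adj = [[0] * n for _ in range(n)]
    for i, j in ranked[:n_edges]:
        adj[i][j] = adj[j][i] = 1
    return adj
```

At the scale reported in the abstract (58 k nodes), the all-pairs correlation is the expensive part, which is the kind of workload a GPU-accelerated framework targets; this sketch is only meant to make the two steps concrete.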


The Colored Concept Map and Its Application in Learning Assistance Program

Zhao

Hui-ling

et al. 2012




Low-Complexity Precision-Scalable Multiply-Accumulate Unit Architectures for Deep Neural Network Accelerators

A Low-Latency FPGA Implementation for Real-Time Object Detection
