Deep neural networks (DNNs) have been playing an increasingly important role in a wide range of areas such as computer vision and voice recognition. While training and validation become gradually feasible with high-end general purpose processors such as graphical processor units (GPU), high throughput inferences in embedded hardware platforms with low hardware resources and power consumption efficiency are still challenging. Binarized neural networks (BNNs) are emerging as a promising method to overcome these challenges by reducing bit widths of DNN data representations with many optimal prior solutions. However, accuracy degradation compared to the same architecture with full precision is a considerable problem of the BNN, while the binary neural networks still contain significant redundancy for optimization. In this paper, to address the limitations, we implement a streaming accelerator architecture with three optimization techniques: pipelining-unrolling for streaming each layer, weight reuse for parallel computation, and MAC (multiplication-accumulation) compression. Our method first constructs streaming architecture by pipelining-unrolling method to maximize throughput. Next, the weight reuse method with the K-mean cluster is applied to reduce the complexity of the popcount operation. Finally, MAC compression reduces hardware resources used for remaining computation on MAC operations. The implemented hardware accelerator integrated into a state-of-the-art field programable gate array (FPGA) provides the maximum performance of the classification at 1531k frames per second with 98.4% accuracy for the MNIST dataset and 205K frame per second with 80.2% accuracy for the Cifar-10 dataset. Besides, the proposed design's ratio FPS/LUTs is approximately 57 (MNIST) and 0.707 (Cifar-10), which is much lower than the state-of-the-art design with a comparable throughput and inference accuracy.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.