Explicit and implicit information needs of people with depression: a qualitative investigation of problems reported on an online depression support forum

Deep neural networks (DNNs) have attracted significant attention for their excellent accuracy especially in areas such as computer vision and artificial intelligence. To enhance their performance, technologies for their hardware acceleration are being studied. FPGA technology is a promising choice for hardware acceleration, given its low power consumption and high flexibility which makes it suitable particularly for embedded systems. However, complex DNN models may need more computing and memory resources than those available in many current FPGAs. This paper presents FP-BNN, a Binarized Neural Network (BNN) for FPGAs, which drastically cuts down the hardware consumption while maintaining acceptable accuracy. We introduce a Resource-Aware Model Analysis (RAMA) method, and remove the bottleneck involving multipliers by bit-level XNOR and shifting operations, and the bottleneck of parameter access by data quantization and optimized on-chip storage. We evaluate the FP-BNN accelerator designs for MNIST multi-layer perceptrons (MLP), Cifar-10 ConvNet, and AlexNet on a Stratix-V FPGA system. An inference performance of Tera opartions per second with acceptable accuracy loss is obtained, which shows improvement in speed and energy efficiency over other computing platforms.

show abstract

Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns

Yin

Ouyang

et al. 2017

IEEE Trans. VLSI Syst.

253

113

View full text Add to dashboard Cite

A High Energy Efficient Reconfigurable Hybrid Neural Network Processor for Deep Learning Applications

Yin

Ouyang

Tang

et al. 2018

IEEE J. Solid-State Circuits

176

View full text Add to dashboard Cite

A Survey of Coarse-Grained Reconfigurable Architecture and Design

et al. 2019

View full text Add to dashboard Cite

As general-purpose processors have hit the power wall and chip fabrication cost escalates alarmingly, coarsegrained reconfigurable architectures (CGRAs) are attracting increasing interest from both academia and industry, because they offer the performance and energy efficiency of hardware with the flexibility of software. However, CGRAs are not yet mature in terms of programmability, productivity, and adaptability. This article reviews the architecture and design of CGRAs thoroughly for the purpose of exploiting their full potential. First, a novel multidimensional taxonomy is proposed. Second, major challenges and the corresponding state-of-the-art techniques are surveyed and analyzed. Finally, the future development is discussed. CCS Concepts: • Computer systems organization → Reconfigurable computing; • Hardware → Reconfigurable logic and FPGAs; • Theory of computation → Models of computation;

show abstract

A 141 UW, 2.46 PJ/Neuron Binarized Convolutional Neural Network Based Self-Learning Speech Recognition Processor in 28NM CMOS

Yin

Ouyang

Zheng

et al. 2018

View full text Add to dashboard Cite

Algorithm and Architecture of a Low-Complexity and High-Parallelism Preprocessing-Based K -Best Detector for Large-Scale MIMO Systems

Peng

Liu

Zhou

et al. 2018

IEEE Trans. Signal Process.

View full text Add to dashboard Cite

Polyhedral model based mapping optimization of loop nests for CGRAs

Liu

Yin

Liu

et al. 2013

View full text Add to dashboard Cite

The coarse-grained reconfigurable architecture (CGRA) is a promising platform that provides both high performance and high power-efficiency. The compute-intensive portions of an application (e.g. loops) are often mapped onto CGRA for acceleration. To optimize the mapping of loop nests to CGRA, this paper makes two contributions: i) Establishing a precise CGRA performance model and formulating the loop nests mapping as a nonlinear optimization problem based on polyhedral model, ii) Extracting an efficient heuristic loop transformation and mapping algorithm (PolyMAP) to improve mapping performance. Experiment results on most kernels of the PolyBench and real-life applications show that our proposed approach can improve the performance of the kernels by 21% on average, as compared to one of the best existing mapping algorithm, EPIMap. The runtime complexity of PolyMAP is also acceptable.

show abstract

A Multilevel Cell STT-MRAM-Based Computing In-Memory Accelerator for Binary Convolutional Neural Network

et al. 2018

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Shaojun Wei

FP-BNN: Binarized neural network on FPGA

Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns

A High Energy Efficient Reconfigurable Hybrid Neural Network Processor for Deep Learning Applications

A Survey of Coarse-Grained Reconfigurable Architecture and Design

A 141 UW, 2.46 PJ/Neuron Binarized Convolutional Neural Network Based Self-Learning Speech Recognition Processor in 28NM CMOS

Algorithm and Architecture of a Low-Complexity and High-Parallelism Preprocessing-Based K -Best Detector for Large-Scale MIMO Systems

Polyhedral model based mapping optimization of loop nests for CGRAs

A Multilevel Cell STT-MRAM-Based Computing In-Memory Accelerator for Binary Convolutional Neural Network

Contact Info

Product

Resources

About