TinBiNN: Tiny Binarized Neural Network Overlay in about 5,000 4-LUTs and 5mW
2019 · Preprint · DOI: 10.48550/arxiv.1903.06630

Cited by 3 publications (4 citation statements) · References: 0 publications
“…Some acceleration approaches include using mobile GPUs [1], custom-designed application-specific integrated circuits (ASICs) [25][26][27][28], as well as FPGAs [2]. FPGA acceleration of embedded neural networks is of special interest since it brings together the power-performance benefits of dedicated circuits and the capability to deploy microarchitectures optimized for the target neural network model [29][30][31][32][33][34]. FPGA neural network accelerators have demonstrated orders of magnitude higher power efficiency compared to general-purpose processing units when applied to complex networks, such as image or video recognition [35,36].…”
Section: Embedded Neural Network Acceleration (mentioning)
Confidence: 99%
“…For 8-bit data retrieval, a custom extension of the RV32IMC ISA, named LVE, is defined in [14] and applied in [15]. It contains instructions for allocating 16- and 8-bit data as vector arrays in the scratchpad (SP) memory.…”
Section: A. Optimize DRAM Usage (mentioning)
Confidence: 99%
“…Almost all the implementations apply parallel processing in one way or another. The RISC-V factor is evident in the custom SIMD extensions used to manage parallel instructions, e.g. [15], [34], [29]. [35] introduces parallel processing with two RISC-V processors: one in a simplified configuration for general-purpose activity, and a second, more capable processor for the CNN execution.…”
Section: E. CPU-Accelerator Configuration (mentioning)
Confidence: 99%
“…Finally, low-cost FPGAs enable professional practice. The iCE40UP5K powering the UPduino has 5280 logic elements, enough to implement interesting designs such as ARM or RISC-V processors [11], neural networks [12], audio synthesizers, or arcade games. Students develop their designs in Verilog or VHDL (or a high-level HDL of their choice, such as Chisel, SpinalHDL, myHDL, etc.).…”
Section: Low-Cost FPGAs (mentioning)
Confidence: 99%