2019
DOI: 10.1109/tcad.2017.2785257

Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks

Cited by 271 publications (280 citation statements)
References 30 publications
“…NeuFlow by Farabet et al [46] is one of the first works that tackled the problem of using FPGAs for DL, in particular, for vision systems. Caffeine by Zhang et al [201] is a hardware and software co-designed library to support CNNs on FPGAs. On the hardware side, it provides a high-level synthesis implementation of an FPGA accelerator for CNNs.…”
Section: Infrastructure (mentioning)
confidence: 99%
“…DNNBuilder [16] and FP-DNN [28] propose end-to-end tools that can automatically generate optimized FPGA-based accelerators from high-level DNN symbolic descriptions in Caffe/Tensorflow frameworks. Caffeine [27] is another automation tool that provides guidelines for choosing FPGA hardware parameters, such as the number of processing elements (PEs), bit precision of variables, and parallel data factors. By using these automation tools, it is easier to bridge the gap between fast DNN construction in popular machine learning frameworks and slow implementation of targeted hardware accelerators.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…[17] and [18] used an OpenGL-designed architecture to accelerate AlexNet and VGG on Arria 10. A reusable CNN engine with a unified framework and a scalable PE array was proposed in [19], which provided an end-to-end solution for deploying CNN models from Caffe onto an FPGA. The motivation matches well with the gap between deep learning researchers and hardware, but there is still space to improve the performance and resource utilization.…”
Section: Related Work (mentioning)
confidence: 99%