2016 IEEE 27th International Conference on Application-Specific Systems, Architectures and Processors (ASAP)
DOI: 10.1109/asap.2016.7760779

F-CNN: An FPGA-based framework for training Convolutional Neural Networks

Abstract: This paper presents a novel reconfigurable framework for training Convolutional Neural Networks (CNNs). The proposed framework is based on reconfiguring a streaming datapath at runtime to cover the training cycle for the various layers in a CNN. The streaming datapath can support various parameterized modules which can be customized to produce implementations with different trade-offs in performance and resource usage. The modules follow the same input and output data layout, simplifying configuration…
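
The abstract's idea of interchangeable layer modules that share one streaming data layout can be illustrated with a short software sketch. The code below is not the authors' implementation: the Tile layout, LayerModule interface, and training_step scheduler are hypothetical names introduced only to show how a runtime-reconfigured pipeline could chain forward and backward modules that agree on a common data format.

```cpp
// Minimal sketch (assumed, not F-CNN's code) of layer modules sharing one
// streaming data layout, so a scheduler can chain them into a training datapath.
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical common data layout: a flat feature-map tile streamed between modules.
struct Tile {
    std::size_t channels, height, width;
    std::vector<float> data;   // channels * height * width values, streamed in order
};

// Hypothetical module interface: every layer (conv, pooling, fully connected, ...)
// exposes the same forward/backward signature over the shared Tile layout.
class LayerModule {
public:
    virtual ~LayerModule() = default;
    virtual Tile forward(const Tile& in) = 0;
    virtual Tile backward(const Tile& grad_out) = 0;
};

// Trivial example module (identity) just to keep the sketch self-contained.
class IdentityModule : public LayerModule {
public:
    Tile forward(const Tile& in) override { return in; }
    Tile backward(const Tile& grad_out) override { return grad_out; }
};

// Runtime "reconfiguration" modelled in software: one training step streams the
// batch forward through each module, then streams gradients back in reverse.
void training_step(std::vector<std::unique_ptr<LayerModule>>& pipeline,
                   Tile batch, Tile grad) {
    for (auto& m : pipeline) batch = m->forward(batch);       // forward pass
    for (auto it = pipeline.rbegin(); it != pipeline.rend(); ++it)
        grad = (*it)->backward(grad);                         // backward pass
}

int main() {
    std::vector<std::unique_ptr<LayerModule>> pipeline;
    pipeline.emplace_back(std::make_unique<IdentityModule>());
    Tile batch{3, 32, 32, std::vector<float>(3 * 32 * 32, 0.0f)};
    Tile grad = batch;
    training_step(pipeline, batch, grad);
    return 0;
}
```

In the actual F-CNN framework the datapath is reconfigured on the FPGA at runtime; the software loop above only mirrors the scheduling idea of a fixed per-module data layout.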

Cited by 43 publications (14 citation statements)
References 16 publications
“…Moreover, TABLE 6 summarizes the comparison with the state-of-the-art ANN training accelerators for MNIST classification [40][41][42]. Small networks are selected for ANNs to present a better comparison with our work.…”
Section: F. Fractional Precision (mentioning)
confidence: 99%
“…Other companies such as Microsoft [30,31] and Amazon's "AWS EC2 F1" instance followed suit in using FPGA clusters within their data centres and servers for back-end training and inference at a lower power cost, highlighting the trend for low-power solutions utilising FPGAs. CNN training on FPGA platforms has not been investigated thoroughly, with only two exceptions that focus on batch training, which uses FPGA platforms as replacements for GPU clusters in offline training [28,32]. In [27] Wenlai et al. presented F-CNN, the first CPU/FPGA hybrid design for deploying and training CNN networks.…”
Section: Related Work (mentioning)
confidence: 99%
“…This requires the introduction of significant resource overheads since it does not fully consider the overlap in calculations within the forward pass. In [32] Venkataramanaiah et al. extend work from [28] and introduce a hardware CNN training RTL compiler. Their work is purely FPGA-based and relies on static processing element arrays for convolutional calculations.…”
Section: Related Work (mentioning)
confidence: 99%
“…For example, a 102-convolutional-layer CNN model, which contains 42.4 M parameters, costs 14 ms to classify a 224 × 224 × 3 scene image while a simple 4-convolutional-layer CNN model costs 8.77 ms and only contains 1 M parameters, as detailed in Section 3.1 of this paper. This is an unacceptable cost of time and storage space in special situations, such as embedded devices [52][53][54] or during on-orbit processing [55]. In contrast, a small and shallow model is fast and uses little space, but will not yield accurate and precise results when trained directly on ground truth data [33].…”
Section: Introduction (mentioning)
confidence: 99%