2016
DOI: 10.3390/electronics5040088

GPGPU Accelerated Deep Object Classification on a Heterogeneous Mobile Platform

Abstract: Deep convolutional neural networks achieve state-of-the-art performance in image classification. The computational and memory requirements of such networks are however huge, and that is an issue on embedded devices due to their constraints. Most of this complexity derives from the convolutional layers and in particular from the matrix multiplications they entail. This paper proposes a complete approach to image classification providing common layers used in neural networks. Namely, the proposed approa…

Cited by 9 publications
(14 citation statements)
References 19 publications
“…The transformation step can be performed by copying the tiles of pixels from original image/filter to the matrices in a specific order. This is an operation where the sequence has to be considered and can be performed efficiently using the CPU [15]. Once transformed, these matrices can be multiplied efficiently using concurrent resources of GPU.…”
Section: Heterogeneous and GPU-only Convmm Layers (mentioning)
confidence: 99%
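The transform-then-multiply idea in the statement above can be sketched in a few lines. This is only an illustrative, pure-Python sketch and not the cited paper's implementation: the function names (`im2col`, `matmul`, `conv2d_via_matmul`) are hypothetical, the image is single-channel with stride 1 and no padding, and the plain `matmul` merely stands in for the GPU matrix multiplication.

```python
def im2col(image, kh, kw):
    """Copy every kh x kw tile of a 2-D image into one row of a matrix."""
    h, w = len(image), len(image[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            # Row-major order inside the tile; the filters are flattened
            # in the same order so that the dot products line up.
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def matmul(a, b):
    """Plain matrix product (assumed stand-in for the GPU multiplication)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def conv2d_via_matmul(image, kernels):
    """Apply several equally sized filters with one matrix multiplication.

    Returns one row per output pixel and one column per filter. As is usual
    in CNNs, the filters are not flipped (cross-correlation).
    """
    kh, kw = len(kernels[0]), len(kernels[0][0])
    patches = im2col(image, kh, kw)                       # (out_h*out_w) x (kh*kw)
    flat = [[v for row in k for v in row] for k in kernels]
    filter_matrix = [list(col) for col in zip(*flat)]     # (kh*kw) x n_filters
    return matmul(patches, filter_matrix)


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],   # sums the main diagonal of each tile
           [[1, 1], [1, 1]]]   # sums the whole tile
result = conv2d_via_matmul(image, kernels)
```

In this sketch the copying loop in `im2col` is the order-sensitive step that the statement assigns to the CPU, while the single `matmul` call is the part that would be offloaded to the GPU.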
“…Once transformed, these matrices can be multiplied efficiently using concurrent resources of GPU. Using the heterogeneous resources (CPU-GPU) of a system, an algorithm like this can benefit in terms of execution time [15,21,22]. However, extra memory transfers caused by the heterogeneous implementation can also break the overall performance.…”
Section: Heterogeneous and GPU-only Convmm Layers (mentioning)
confidence: 99%