2019
DOI: 10.1007/978-3-030-31756-0_3
Scaling Analysis of Specialized Tensor Processing Architectures for Deep Learning Models

Cited by 23 publications (3 citation statements)
References 39 publications
“…It should be noted that these results were obtained without detriment to the accuracy and loss for the relatively simple DNN and small images (28x28). The current investigations of the network size and image size impact are under way, and their results will be published elsewhere [15]. In addition to the Google TPU architecture, specialized tensor processing hardware is available in other modern GPU cards such as the Tesla V100 and Titan V by NVIDIA, based on the Volta microarchitecture with 640 specialized Tensor Core Units (TCUs); their influence on training and prediction speedup is under investigation and will be reported elsewhere [15].…”
Section: Discussion
confidence: 99%
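
The Tensor Core speedup mentioned in the quoted discussion is normally obtained by running the model in mixed precision, so that matrix multiplications execute in float16 on the TCUs while trainable variables stay in float32. The following is a minimal, illustrative Keras sketch (not taken from the cited paper), assuming TensorFlow 2.4+ and an MNIST-scale network with 28x28 inputs as in the quoted setup; the layer sizes are arbitrary:

import tensorflow as tf
from tensorflow.keras import layers, mixed_precision

# Compute in float16 so Volta/Turing Tensor Cores can be used;
# trainable variables are kept in float32 for numerical stability.
mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10),
    # Keep the final softmax in float32 to avoid float16 overflow/underflow.
    layers.Activation("softmax", dtype="float32"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

With this policy the same training script runs unchanged on GPUs without Tensor Cores; the float16 kernels are simply dispatched to the TCUs when the hardware (e.g. V100, Titan V) provides them.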
“…Since the model size limits the memory available for the batch of images, other techniques could be useful for squeezing the model size, such as quantization [15], [16] and pruning [17], [18]. These results can be used to select optimal parameters for applications where a large batch of data needs to be processed, for example, real-time road condition monitoring in advanced driver assistance systems (ADAS), where specialized architectures such as TPU, TCU, and FPGA-based solutions can be deployed [19].…”
Section: Discussion
confidence: 99%
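
As a concrete illustration of the quantization route mentioned in the quoted discussion, the sketch below applies post-training dynamic-range quantization with the TensorFlow Lite converter; it assumes an already trained tf.keras model (e.g. the one sketched earlier) and an arbitrary output file name:

import tensorflow as tf

# `model` is assumed to be a trained tf.keras model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Dynamic-range quantization: weights are stored as 8-bit integers,
# giving roughly a 4x reduction in model size versus float32.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model_dynamic_range.tflite", "wb") as f:
    f.write(tflite_model)

A smaller model leaves more device memory for the image batch itself; pruning [17], [18] can be applied before conversion for a further reduction.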