2017
DOI: 10.1145/3140659.3080246

In-Datacenter Performance Analysis of a Tensor Processing Unit

Abstract: Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better…
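As a quick sanity check on the quoted 92 TOPS figure: with 65,536 MACs, each counted as two operations (a multiply and an add) per cycle, and the 700 MHz clock rate reported in the paper, the peak-throughput arithmetic works out as:

```python
macs = 65_536        # 256 x 256 array of 8-bit MAC units
ops_per_mac = 2      # one multiply + one add per MAC per cycle
clock_hz = 700e6     # TPU clock rate (700 MHz, per the paper)

peak_tops = macs * ops_per_mac * clock_hz / 1e12
print(round(peak_tops, 2))  # -> 91.75, quoted in the abstract as ~92 TOPS
```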


Cited by 1,113 publications (1,218 citation statements)
References 41 publications (37 reference statements)
“…In addition, 28 MB of software-managed on-chip memory is included to store the intermediate results and the inputs of the Matrix Multiply Unit. The datapath occupies 67% of the TPU floorplan, while the area occupied for control is only 2% [17]. This contrasts with state-of-the-art server CPUs and GPUs, in which the control structures occupy significant chip area and lead to increased power consumption.…”
Section: Google's Tensor Processing Unit (mentioning)
confidence: 99%
“…Based on a projection that voice-based search would significantly increase the computational demands of Google's datacenters, a custom ASIC chip, called the Tensor Processing Unit (TPU), was designed and deployed by Google in 2015 [17]. The TPU is aimed at accelerating the inference phase of different types of neural network applications, including multi-layer perceptrons (MLP), convolutional neural networks (CNN), and recurrent neural networks (RNN) [18].…”
Section: Google's Tensor Processing Unit (mentioning)
confidence: 99%
“…Mesh topology is a strikingly popular way to organize PEs; for example, Google's TPU [12], the DianNao family [15], and MIT's Eyeriss [16] (see Fig. 6).…”
Section: I. Spatial Dataflow Architecture (mentioning)
confidence: 99%
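The mesh of PEs this citation refers to is typically operated as a systolic array. A minimal NumPy sketch of an output-stationary systolic matrix multiply, as a functional simulation only (the skew term models wavefront timing, not the actual TPU microarchitecture, whose matrix unit is weight-stationary):

```python
import numpy as np

def systolic_matmul(A, B):
    """Simulate an output-stationary systolic array computing A (m x k) @ B (k x n).

    PE (i, j) accumulates one output element. Operands are skewed by (i + j)
    so that A's rows flow rightward and B's columns flow downward through the
    mesh, one hop per cycle, and each PE performs one MAC per cycle.
    """
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    acc = np.zeros((m, n), dtype=A.dtype)
    # Total cycles for the last operand to reach the far-corner PE:
    for t in range(m + n + k - 2):
        for i in range(m):
            for j in range(n):
                s = t - i - j            # which partial product arrives this cycle
                if 0 <= s < k:
                    acc[i, j] += A[i, s] * B[s, j]  # one MAC per PE per cycle
    return acc
```

The skew `t - i - j` is what makes the computation "systolic": data reaches PE (i, j) only after propagating i + j hops through the mesh, so no PE ever needs a global broadcast.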
“…To overcome this, research and development of special-purpose processors for convolution processing is under way [4]. Among these, Google's TPU (Tensor Processing Unit) reported results showing improved computation speed by performing only specific functions [5][6]. Accordingly, this paper proposes an ALU that can rapidly compute the multiplications and additions in convolution and pooling operations and supports parallel processing.…”
Section: Introduction: In the field of machine learning, the CNN algorithm boasts a high recognition rate in image recognition and classification (unclassified)
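The multiply-add pattern of convolution and pooling that the cited ALU targets can be sketched as follows. This is a plain NumPy illustration of the two operations, not the authors' hardware design:

```python
import numpy as np

def conv2d_valid(x, w):
    """2-D 'valid' convolution (cross-correlation, as used in CNNs).

    Every output pixel is a window of element-wise multiplies reduced by
    addition, i.e. exactly the MAC workload a convolution ALU accelerates.
    """
    H, W = x.shape
    kh, kw = w.shape
    out = np.empty((H - kh + 1, W - kw + 1), dtype=x.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def max_pool2(x):
    """2x2 max pooling with stride 2 (odd trailing rows/columns dropped)."""
    H, W = x.shape
    return x[:H // 2 * 2, :W // 2 * 2].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))
```

Both functions reduce a window to a single value; the convolution's window is fully parallel across output pixels, which is why the citation emphasizes a parallelizable multiply-add unit.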