Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X-30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X-80X higher. Moreover, using the GPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
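The 8-bit arithmetic the abstract describes can be sketched in a few lines. This is an assumed, minimal model of quantized inference math, not the TPU's actual datapath: float tensors are quantized to int8 with per-tensor scales, products are accumulated in int32 (as a wide-accumulator MAC array does in hardware), and the result is rescaled back to float.

```python
import numpy as np

def quantize(x, scale):
    """Symmetric int8 quantization with a per-tensor scale."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def int8_matmul(a_q, b_q):
    """int8 x int8 products accumulated in int32 to avoid overflow."""
    return a_q.astype(np.int32) @ b_q.astype(np.int32)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float32)   # "activations"
b = rng.standard_normal((8, 3)).astype(np.float32)   # "weights"

sa = np.abs(a).max() / 127.0
sb = np.abs(b).max() / 127.0
acc = int8_matmul(quantize(a, sa), quantize(b, sb))
approx = acc.astype(np.float32) * (sa * sb)          # dequantize the result

print(np.max(np.abs(approx - a @ b)))  # small quantization error
```

The design point this illustrates is why 8-bit MACs are so cheap: the multipliers are narrow, and only the accumulators need to be wide enough (32 bits here) to sum many products without overflow.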
Abstract: This paper addresses the fixed-time leader-follower consensus problem for high-order integrator multi-agent systems subject to matched external disturbances. A new cascade control structure, based on a fixed-time distributed observer, is developed to achieve fixed-time consensus tracking control. A simulation example is included to show the efficacy and performance of the proposed control structure with respect to different initial conditions.
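A toy version of leader-follower consensus can be simulated directly. This sketch makes strong simplifying assumptions relative to the abstract: single-integrator followers rather than high-order ones, no disturbances, and no distributed observer. It only illustrates the fixed-time ingredient itself, a protocol mixing a sub-unity and a super-unity power of the local error, which is the standard route to convergence-time bounds independent of initial conditions. The graph, gains, and exponents are illustrative choices.

```python
import numpy as np

def sig(v, p):
    """Elementwise signed power |v|^p * sign(v)."""
    return np.sign(v) * np.abs(v) ** p

A = np.array([[0., 1., 0.],   # follower-follower adjacency (a line graph)
              [1., 0., 1.],
              [0., 1., 0.]])
b = np.array([1., 0., 0.])    # only follower 0 receives the leader state

x0 = 2.0                           # static leader
x = np.array([10.0, -5.0, 7.0])    # follower initial states
k1, k2, dt = 2.0, 2.0, 1e-3

for _ in range(20000):             # 20 s of Euler integration
    e = A.sum(1) * x - A @ x + b * (x - x0)   # local consensus errors
    x = x + dt * (-k1 * sig(e, 0.5) - k2 * sig(e, 1.5))

print(np.max(np.abs(x - x0)))      # all followers close to the leader
```

The `sig(e, 1.5)` term dominates when errors are large (fast pull-in from any initial condition), while `sig(e, 0.5)` dominates near agreement (finite-time settling), which together is what "fixed-time" buys over plain asymptotic consensus.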
A continuous output feedback control scheme rendering the closed-loop double-integrator system globally stable in finite time is presented. In particular, the convergence time is independent of initial conditions. The bi-limit homogeneous technique is used for controller and observer designs with fixed-time convergence. A continuous output feedback control law is then proposed for the nominal double-integrator system and its perturbed version. Homogeneity and Lyapunov techniques are used to ensure fixed-time stability of the closed-loop system under the output feedback control framework. Finally, the efficiency of the proposed algorithms is illustrated by numerical simulations.
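The bi-limit idea can be illustrated on the double integrator itself. This is a state-feedback sketch, not the paper's output-feedback design: it blends a low-power homogeneous term (dominant near the origin, giving finite-time settling) with a high-power term (dominant far away, bounding the time to reach a neighborhood of the origin regardless of how large the initial state is). The gains and exponents are assumptions chosen for the demo.

```python
import math

def sig(v, p):
    """Signed power |v|^p * sign(v)."""
    return math.copysign(abs(v) ** p, v)

k, dt = 2.0, 1e-4
x1, x2 = 5.0, -3.0                   # x1dot = x2, x2dot = u
for _ in range(int(20.0 / dt)):      # 20 s of Euler integration
    u = (-k * sig(x1, 1/3) - k * sig(x2, 1/2)    # near-origin (finite-time) part
         - k * sig(x1, 3)  - k * sig(x2, 3/2))   # far-field (fixed-time) part
    x1, x2 = x1 + dt * x2, x2 + dt * u

print(abs(x1), abs(x2))              # both driven close to zero
```

The exponent pairs follow the usual homogeneity bookkeeping for a chain of two integrators: (1/3, 1/2) gives negative homogeneity degree near zero, (3, 3/2) gives positive degree at infinity, which is exactly the two-limit structure the bi-limit technique exploits.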
The finite-time tracking control for a Reusable Launch Vehicle (RLV) with unmatched disturbance is investigated. An adaptive multivariable disturbance compensation (AMDC) scheme is proposed to estimate external disturbances whose bounds are not known. Based on this estimate, a continuous multivariable homogeneous second-order sliding mode controller is designed to ensure that attitude tracking is achieved in finite time. A proof of the finite-time convergence of the closed-loop system under the integrated controller and disturbance observer is derived using the Lyapunov technique. A key feature of the proposed control scheme is that it requires no information on the bounds of the disturbance and its gradient beyond their existence. At the same time, finite-time convergence, nominal performance recovery, and chattering alleviation are guaranteed. Finally, simulation tests are provided to demonstrate the effectiveness of the proposed control scheme.
Index Terms: Adaptive multivariable disturbance observer, finite-time convergence, Reusable Launch Vehicle.
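The estimate-then-cancel structure can be shown on a scalar toy plant. This is a basic linear disturbance observer, far simpler than the paper's adaptive multivariable design, but it demonstrates the same compensation principle: build an online estimate of the disturbance and subtract it in the control law, so the nominal feedback sees an almost disturbance-free plant. The plant, gains, and disturbance signal are all illustrative assumptions.

```python
import math

# Plant: xdot = u + d, with d unknown. Observer: dhat = z + l*x,
# zdot = -l*(u + dhat), which gives dhat_dot = l*(d - dhat),
# i.e. the estimate tracks d with time constant 1/l.
l = 20.0            # observer gain: larger l -> faster disturbance tracking
dt, T = 1e-3, 10.0
x, z = 1.0, -20.0   # z chosen so the initial estimate dhat = z + l*x is 0

for k in range(int(T / dt)):
    t = k * dt
    d = 2.0 + math.sin(t)          # unknown disturbance (its bound is never used)
    dhat = z + l * x               # disturbance estimate
    u = -x - dhat                  # stabilize and compensate
    z += dt * (-l * (u + dhat))    # observer state update
    x += dt * (u + d)              # plant update

print(abs(dhat - d), abs(x))       # estimate tracks d; x held near zero
```

Note that the observer uses only `x` and `u`, never a bound on `d`, which mirrors (in a much weaker, linear form) the abstract's claim that only existence of the disturbance and its gradient is assumed.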