Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X-30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X-80X higher. Moreover, using the GPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
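The 8-bit arithmetic the abstract describes can be sketched in a few lines. This is an assumed, minimal model of quantized inference math, not the TPU's actual datapath: float tensors are quantized to int8 with per-tensor scales, products are accumulated in int32 (as a wide-accumulator MAC array does in hardware), and the result is rescaled back to float.

```python
import numpy as np

def quantize(x, scale):
    """Symmetric int8 quantization with a per-tensor scale."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def int8_matmul(a_q, b_q):
    """int8 x int8 products accumulated in int32 to avoid overflow."""
    return a_q.astype(np.int32) @ b_q.astype(np.int32)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float32)   # "activations"
b = rng.standard_normal((8, 3)).astype(np.float32)   # "weights"

sa = np.abs(a).max() / 127.0
sb = np.abs(b).max() / 127.0
acc = int8_matmul(quantize(a, sa), quantize(b, sb))
approx = acc.astype(np.float32) * (sa * sb)          # dequantize the result

print(np.max(np.abs(approx - a @ b)))  # small quantization error
```

The design point this illustrates is why 8-bit MACs are so cheap: the multipliers are narrow, and only the accumulators need to be wide enough (32 bits here) to sum many products without overflow.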
Abstract: This paper addresses the fixed-time leader-follower consensus problem for high-order integrator multi-agent systems subject to matched external disturbances. A new cascade control structure, based on a fixed-time distributed observer, is developed to achieve fixed-time consensus tracking control. A simulation example is included to show the efficacy and performance of the proposed control structure with respect to different initial conditions.
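A toy version of leader-follower consensus can be simulated directly. This sketch makes strong simplifying assumptions relative to the abstract: single-integrator followers rather than high-order ones, no disturbances, and no distributed observer. It only illustrates the fixed-time ingredient itself, a protocol mixing a sub-unity and a super-unity power of the local error, which is the standard route to convergence-time bounds independent of initial conditions. The graph, gains, and exponents are illustrative choices.

```python
import numpy as np

def sig(v, p):
    """Elementwise signed power |v|^p * sign(v)."""
    return np.sign(v) * np.abs(v) ** p

A = np.array([[0., 1., 0.],   # follower-follower adjacency (a line graph)
              [1., 0., 1.],
              [0., 1., 0.]])
b = np.array([1., 0., 0.])    # only follower 0 receives the leader state

x0 = 2.0                           # static leader
x = np.array([10.0, -5.0, 7.0])    # follower initial states
k1, k2, dt = 2.0, 2.0, 1e-3

for _ in range(20000):             # 20 s of Euler integration
    e = A.sum(1) * x - A @ x + b * (x - x0)   # local consensus errors
    x = x + dt * (-k1 * sig(e, 0.5) - k2 * sig(e, 1.5))

print(np.max(np.abs(x - x0)))      # all followers close to the leader
```

The `sig(e, 1.5)` term dominates when errors are large (fast pull-in from any initial condition), while `sig(e, 0.5)` dominates near agreement (finite-time settling), which together is what "fixed-time" buys over plain asymptotic consensus.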
A continuous output feedback control scheme rendering the closed-loop double-integrator system globally stable in finite time is presented. In particular, the convergence time is independent of initial conditions. The bi-limit homogeneous technique is used for controller and observer designs with fixed-time convergence. A continuous output feedback control law is then proposed for the nominal double-integrator system and its perturbed version. Homogeneity and Lyapunov techniques are used to ensure fixed-time stability of the closed-loop system under the output feedback control framework. Finally, the efficiency of the proposed algorithms is illustrated by numerical simulations.
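The bi-limit idea can be illustrated on the double integrator itself. This is a state-feedback sketch, not the paper's output-feedback design: it blends a low-power homogeneous term (dominant near the origin, giving finite-time settling) with a high-power term (dominant far away, bounding the time to reach a neighborhood of the origin regardless of how large the initial state is). The gains and exponents are assumptions chosen for the demo.

```python
import math

def sig(v, p):
    """Signed power |v|^p * sign(v)."""
    return math.copysign(abs(v) ** p, v)

k, dt = 2.0, 1e-4
x1, x2 = 5.0, -3.0                   # x1dot = x2, x2dot = u
for _ in range(int(20.0 / dt)):      # 20 s of Euler integration
    u = (-k * sig(x1, 1/3) - k * sig(x2, 1/2)    # near-origin (finite-time) part
         - k * sig(x1, 3)  - k * sig(x2, 3/2))   # far-field (fixed-time) part
    x1, x2 = x1 + dt * x2, x2 + dt * u

print(abs(x1), abs(x2))              # both driven close to zero
```

The exponent pairs follow the usual homogeneity bookkeeping for a chain of two integrators: (1/3, 1/2) gives negative homogeneity degree near zero, (3, 3/2) gives positive degree at infinity, which is exactly the two-limit structure the bi-limit technique exploits.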
The finite-time tracking control for a Reusable Launch Vehicle (RLV) with unmatched disturbance is investigated. An adaptive multivariable disturbance compensation (AMDC) scheme is proposed to estimate external disturbances whose bounds are not known. Based on this estimate, a continuous multivariable homogeneous second-order sliding mode controller is designed to ensure that attitude tracking is achieved in finite time. A proof of the finite-time convergence of the closed-loop system under the integrated controller and disturbance observer is derived using the Lyapunov technique. A key feature of the proposed control scheme is that it requires no information on the bounds of the disturbance and its gradient beyond their existence. At the same time, finite-time convergence, nominal performance recovery, and chattering alleviation are guaranteed. Finally, simulation tests are provided to demonstrate the effectiveness of the proposed control scheme.
Index Terms: Adaptive multivariable disturbance observer, finite-time convergence, Reusable Launch Vehicle.
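The estimate-then-cancel structure can be shown on a scalar toy plant. This is a basic linear disturbance observer, far simpler than the paper's adaptive multivariable design, but it demonstrates the same compensation principle: build an online estimate of the disturbance and subtract it in the control law, so the nominal feedback sees an almost disturbance-free plant. The plant, gains, and disturbance signal are all illustrative assumptions.

```python
import math

# Plant: xdot = u + d, with d unknown. Observer: dhat = z + l*x,
# zdot = -l*(u + dhat), which gives dhat_dot = l*(d - dhat),
# i.e. the estimate tracks d with time constant 1/l.
l = 20.0            # observer gain: larger l -> faster disturbance tracking
dt, T = 1e-3, 10.0
x, z = 1.0, -20.0   # z chosen so the initial estimate dhat = z + l*x is 0

for k in range(int(T / dt)):
    t = k * dt
    d = 2.0 + math.sin(t)          # unknown disturbance (its bound is never used)
    dhat = z + l * x               # disturbance estimate
    u = -x - dhat                  # stabilize and compensate
    z += dt * (-l * (u + dhat))    # observer state update
    x += dt * (u + d)              # plant update

print(abs(dhat - d), abs(x))       # estimate tracks d; x held near zero
```

Note that the observer uses only `x` and `u`, never a bound on `d`, which mirrors (in a much weaker, linear form) the abstract's claim that only existence of the disturbance and its gradient is assumed.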