Recently, the posit numerical format has shown promise for DNN data representation and compute at ultra-low precision (5- to 8-bit). However, the majority of studies focus only on DNN inference. In this work, we propose DNN training using posits and compare it with floating-point training. We evaluate on both the MNIST and Fashion MNIST datasets, where 16-bit posits outperform 16-bit floating point for end-to-end DNN training.

Index Terms: Deep neural networks, low-precision arithmetic, posit numerical format

I. INTRODUCTION

Edge computing offers a decentralized alternative to cloud-based datacenters [1] and brings intelligence to the edge of mobile networks. However, training deep neural networks (DNNs) on the edge remains a challenge. The difficulty arises from the significant cost of multiply-and-accumulate (MAC) units, which perform an operation ubiquitous in all DNNs. In a 45 nm CMOS process, the energy consumption of addition doubles from 16-bit floats to 32-bit floats, and that of multiplication grows by ∼4× [2]. Memory access cost increases by ∼10× when moving from an 8 kB to a 1 MB memory with a 64-bit cache [2]. In general, there is a gap between the memory storage, bandwidth, compute requirements, and energy consumption of modern DNNs and the hardware resources available on edge devices [3].

An apparent solution to this gap is to compress such networks, reducing their compute requirements to match putative edge resources. Several groups have proposed new compute- and memory-efficient DNN architectures [4]-[6] and parameter-efficient neural networks, using methods such as DNN pruning [7], distillation [8], and low-precision arithmetic [9], [10]. Among these approaches, low-precision arithmetic is noted for its ability to reduce the memory capacity, bandwidth, latency, and energy consumption associated with MAC units in DNNs and to increase the level of data parallelism [9], [11], [12]. The ultimate goal of low-precision DNN design is to reduce the hardware complexity of a high-precision DNN model to a level suitable for edge devices without significantly degrading performance. To address the gaps in previous studies, we are motivated to study low-precision posits for DNN training on the edge.

II. POSIT NUMERICAL FORMAT

An alternative to IEEE-754 floating-point numbers, posits were recently introduced and exhibit a tapered-precision characteristic: values near ±1 receive more fraction bits than values of very large or small magnitude, because the variable-length regime field trades dynamic range against fraction precision.
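To make the format concrete, the following minimal Python sketch decodes a posit's fields (sign, regime, exponent, fraction) into a real value. The function name and the choice of n = 8 bits with es = 1 exponent bit are illustrative assumptions, not taken from the paper:

```python
# Minimal posit decoder sketch (illustrative; n and es are assumptions).
# value = (-1)^sign * useed^k * 2^exp * (1 + frac), where useed = 2^(2^es)
# and k is determined by the run length of the regime bits.
def decode_posit(bits: int, n: int = 8, es: int = 1) -> float:
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):            # the single NaR (not-a-real) pattern
        return float("nan")

    sign = bits >> (n - 1)
    if sign:                             # posits negate via two's complement
        bits = (-bits) & ((1 << n) - 1)

    body = bits & ((1 << (n - 1)) - 1)   # strip the sign bit
    # Regime: a run of identical bits terminated by the opposite bit.
    first = (body >> (n - 2)) & 1
    run = 0
    for i in range(n - 2, -1, -1):
        if (body >> i) & 1 == first:
            run += 1
        else:
            break
    k = run - 1 if first else -run

    # Exponent and fraction occupy whatever bits remain after the regime;
    # this is what gives posits their tapered precision.
    rem = max(n - 1 - run - 1, 0)        # bits left after regime + terminator
    tail = body & ((1 << rem) - 1)
    e_bits = min(es, rem)
    exp = (tail >> (rem - e_bits)) if e_bits else 0
    exp <<= (es - e_bits)                # truncated exponent bits read as zero
    frac_bits = rem - e_bits
    frac = (tail & ((1 << frac_bits) - 1)) / (1 << frac_bits) if frac_bits else 0.0

    useed = 2 ** (2 ** es)
    value = (useed ** k) * (2 ** exp) * (1.0 + frac)
    return -value if sign else value
```

For example, decode_posit(0b01000000) yields 1.0 and decode_posit(0b01100000) yields 4.0 (one extra regime bit multiplies the value by useed = 4 when es = 1).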
We present a novel dynamic configuration technique for deep neural networks that permits step-wise energy-accuracy tradeoffs at runtime. Our technique adjusts the number of channels in the network dynamically depending on response time, power, and accuracy targets. To enable this dynamic configuration, we co-design a new training algorithm in which the network is trained incrementally, such that the weights of channels trained in earlier steps are fixed. Our technique provides the flexibility of multiple networks while storing and utilizing a single set of weights. We evaluate our approach using both an ASIC-based hardware accelerator and a low-power embedded GPGPU and show that it leads to only a small or negligible loss in final network accuracy. We analyze the performance of the proposed methodology using three well-known networks on the MNIST, CIFAR-10, and SVHN datasets, achieving up to 95% energy reduction with less than 1% accuracy loss across the three benchmarks. In addition, compared to prior work on dynamic network reconfiguration, our approach yields approximately 50% savings in storage requirements while achieving similar accuracy.
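The abstract does not give code; as a rough sketch of the incremental, channel-wise training idea (weights of channels trained in earlier steps stay fixed), the following hypothetical PyTorch fragment zeroes gradients for the frozen channel prefix. The class name, step sizes, and layer shapes are invented for illustration:

```python
# Hypothetical sketch: output channels are grouped into width steps, and
# channels belonging to already-trained steps receive zero gradient.
import torch.nn as nn

class SteppedConv(nn.Conv2d):
    def __init__(self, in_ch, out_ch, k, steps):
        super().__init__(in_ch, out_ch, k, padding=k // 2)
        self.steps = steps        # cumulative channel counts, e.g. [16, 32, 64]
        self.frozen = 0           # channels fixed from earlier training steps
        self.weight.register_hook(self._mask_grad)  # bias omitted for brevity

    def _mask_grad(self, grad):
        grad = grad.clone()
        grad[: self.frozen] = 0   # earlier-step output channels stay unchanged
        return grad

    def set_step(self, step):
        self.frozen = self.steps[step - 1] if step > 0 else 0

conv = SteppedConv(3, 64, 3, steps=[16, 32, 64])
for step in range(len(conv.steps)):
    conv.set_step(step)
    width = conv.steps[step]
    # ... train with only the first `width` output channels active;
    # gradients for channels below conv.frozen are zeroed by the hook.
```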
While deep neural networks (DNNs) push the state-of-the-art in many machine learning applications, they often require millions of expensive floating-point operations for each input classification. This computational overhead limits the applicability of DNNs to low-power, embedded platforms and incurs high cost in data centers, motivating recent interest in low-power, low-latency DNNs based on fixed-point, ternary, or even binary data precision. While recent works in this area offer promising results, they often incur large accuracy drops compared to floating-point networks. We propose a novel approach to map floating-point DNNs to 8-bit dynamic fixed-point networks with integer power-of-two weights, with no change in network architecture. Our dynamic fixed-point DNNs allow different radix points between layers. During inference, power-of-two weights allow multiplications to be replaced with arithmetic shifts, while the 8-bit fixed-point representation simplifies both the buffer and adder design. In addition, we propose a hardware accelerator design that achieves low-power, low-latency inference with insignificant accuracy degradation. Using our custom accelerator design with the CIFAR-10 and ImageNet datasets, we show that our method achieves significant power and energy savings while increasing classification accuracy.
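To illustrate how power-of-two weights let multiplications be replaced with shifts, here is a small Python sketch. The rounding rule (nearest power of two in the log domain), the exponent range, and the function names are assumptions for illustration, not the paper's exact procedure:

```python
# Illustrative sketch: quantize a weight to sign * 2**e so that multiplying a
# fixed-point activation by it reduces to an arithmetic shift.
import math

def to_pow2(w: float, min_exp: int = -7, max_exp: int = 0):
    """Round |w| to the nearest power of two (log domain), clamped to range."""
    if w == 0:
        return 0, 0
    e = int(round(math.log2(abs(w))))
    e = max(min_exp, min(max_exp, e))
    return (1 if w > 0 else -1), e

def mac_shift(x_fixed: int, sign: int, e: int) -> int:
    """Compute x * (sign * 2**e) on a fixed-point activation via shifting."""
    y = x_fixed << e if e >= 0 else x_fixed >> -e
    return sign * y

sign, e = to_pow2(0.3)        # -> (1, -2), i.e. the weight becomes 0.25
print(mac_shift(64, sign, e)) # 64 * 0.25 = 16, computed as a right shift
```

Because every weight is a signed power of two, the multiplier array in a MAC unit can be replaced by shifters, which is what enables the accelerator's power and latency savings.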