We propose a method for minimizing global buffer accesses for convolution operations within a deep learning accelerator by maximizing data reuse through a local register file, thereby substituting inexpensive local register file accesses for power-hungry global buffer accesses. To fully exploit the merits of data reuse, this study proposes rearranging the computational sequence in the accelerator. Once input data are read from the global buffer, all repeated reads of the same data are served by the local register file, yielding significant power savings. Furthermore, unlike prior works that equip each computation unit with its own local register file, the proposed method shares a local register file along each column of the 2D computation array, saving resources and control overhead. The proposed accelerator is implemented on an off-the-shelf field-programmable gate array to verify its functionality and resource utilization, and its performance improvement is then demonstrated relative to popular deep learning accelerators. Our evaluation indicates that the proposed accelerator reduces the number of global-buffer accesses by nearly 86.8%, consequently saving up to 72.3% of the power consumed by input data memory accesses, with only a minor increase in resource usage compared to a conventional deep learning accelerator.
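The benefit of serving repeated reads from a local register file (LRF) can be illustrated with a first-order access-counting sketch. This is not the paper's design or its exact baseline; the layer dimensions and the assumption that every input pixel is needed by k × k output windows (stride 1, no padding) are illustrative.

```python
# Sketch (assumed counting model, not the paper's RTL): global-buffer
# reads for one convolution layer, comparing a naive dataflow (every
# input pixel is re-fetched from the global buffer each time a window
# needs it) against a reuse dataflow where each pixel is fetched once
# and all further reads hit the local register file (LRF).

def global_buffer_reads(h, w, k, reuse_lrf):
    """Global-buffer input reads for an h x w input, k x k kernel, stride 1."""
    out_h, out_w = h - k + 1, w - k + 1
    if reuse_lrf:
        # Each input pixel crosses the global buffer exactly once;
        # the repeated uses are served by the LRF.
        return h * w
    # Naive: every output position re-reads its k x k input window.
    return out_h * out_w * k * k

naive = global_buffer_reads(128, 128, 3, reuse_lrf=False)
reuse = global_buffer_reads(128, 128, 3, reuse_lrf=True)
saving = 1 - reuse / naive
print(f"reads: naive={naive}, lrf={reuse}, saving={saving:.1%}")
```

Because a global-buffer access costs substantially more energy than a register read, the access-count reduction translates almost directly into the memory-access power savings the abstract reports.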
In a channel state information (CSI) based indoor positioning system, positioning performance is susceptible to multipath fading effects, especially in non-line-of-sight environments. We propose a transformer-based indoor positioning system (TIPS) to address this challenge. TIPS uses a self-attention mechanism to process continuous WiFi CSI observed along predetermined routes as fingerprints in a given indoor environment. In natural language processing terms, each route is treated as a sentence and each position along the route as a word. Consequently, predicting a position from the fingerprints becomes the task of predicting the current word from the previous words, which the proposed TIPS solves efficiently. To fully exploit the relations among positions, we propose embedding direction-of-arrival (DoA) information on top of the collected CSI as input to TIPS. The transformer can thereby better capture the dependencies among positions along the route and significantly boost positioning accuracy. To demonstrate the superiority of TIPS in a radio frequency (RF) environment, we present a hardware implementation of an RF testbed consisting of an emulator of a WiFi access point and user equipment. Extensive computer simulations and experimental tests demonstrate that TIPS reduces the positioning error to 20 cm, a significant improvement over current state-of-the-art models.
INDEX TERMS CSI, DoA, indoor positioning system, transformer.
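The route-as-sentence idea can be sketched with a minimal causal self-attention layer: each position on the route attends only to earlier positions, mirroring "predict the current word from previous words". This is an illustrative NumPy sketch, not the authors' TIPS architecture; the dimensions, random weights, and the simple concatenation of DoA features onto the CSI are all assumptions.

```python
import numpy as np

# Illustrative sketch (not the authors' TIPS model): treat a route of
# CSI fingerprints as a "sentence" and apply causal self-attention so
# each position attends only to earlier positions on the route.

rng = np.random.default_rng(0)
T, d_csi, d_doa = 8, 16, 4               # route length, CSI dim, DoA dim (assumed)
csi = rng.standard_normal((T, d_csi))    # stand-in for collected CSI fingerprints
doa = rng.standard_normal((T, d_doa))    # stand-in for DoA features
x = np.concatenate([csi, doa], axis=1)   # DoA embedded on top of the CSI
d = x.shape[1]

Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d)
scores[np.triu_indices(T, k=1)] = -np.inf   # causal mask: no future positions
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)     # row-wise softmax
out = attn @ V                              # (T, d) context vector per position
print(out.shape)
```

In a full model, each context vector would feed a regression head that outputs the coordinates of the current position; here the sketch only shows how the causal mask restricts each position to its predecessors on the route.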
This paper presents a novel method for minimizing the power consumption of the weight data movements required by a convolutional operation performed on a two-dimensional multiplier–accumulator (MAC) array of a deep neural-network accelerator. The proposed technique employs a local register file (LRF) at each MAC unit such that once weight pixels are read from the global buffer into the LRF, they are reused from the LRF as many times as needed instead of being repeatedly fetched from the global buffer in each convolutional operation. One of the most evident merits of the proposed method is that it is completely free from the burden of data transfer between neighboring MAC units. Our simulations show that the proposed method provides power savings of approximately 83.33% and 97.62% relative to the two conventional methods, respectively, when the dimensions of the input data matrix and the weight matrix are 128 × 128 and 5 × 5. The power savings increase as the dimensions of the input data matrix or the weight matrix increase.
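A first-order counting sketch shows why holding weights in a per-MAC LRF eliminates nearly all global-buffer weight traffic. Note this is an assumed model with a naive re-fetch baseline; it is not the paper's two conventional comparison methods, so it does not reproduce the 83.33%/97.62% figures.

```python
# First-order sketch (assumed counting model, not the paper's exact
# baselines): global-buffer weight reads for a stride-1 convolution
# when weights are re-fetched for every output position versus held
# in a per-MAC local register file (LRF) after a single fetch.

def weight_reads(h, w, k, use_lrf):
    out_positions = (h - k + 1) * (w - k + 1)
    if use_lrf:
        return k * k                  # each weight crosses the global buffer once
    return out_positions * k * k      # naive: weights re-fetched per output pixel

no_lrf = weight_reads(128, 128, 5, use_lrf=False)
lrf = weight_reads(128, 128, 5, use_lrf=True)
print(no_lrf, lrf, f"saving={1 - lrf / no_lrf:.2%}")
```

The sketch also reflects the abstract's closing observation: as the input or weight matrix grows, the naive read count grows with it while the LRF read count stays fixed at k × k, so the savings increase with problem size.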