There is an increasing demand for a novel computing structure for data-intensive applications such as artificial intelligence and virtual reality. Processing-in-memory (PIM) is a promising alternative that reduces the overhead caused by data movement. Many studies have examined how to use PIM to exploit the bandwidth gained from the through-silicon via (TSV). One approach is to design a PIM architecture optimized for a specific application; the other is to identify the tasks that benefit most from being offloaded to PIM. The goal of this paper is to make PIM, a newly introduced technology, easy to apply to various applications. The target system is a programmable GPU-based PIM. Essential but simple task offloading conditions are proposed so that as many candidate tasks as possible are secured whenever PIM offers any potential benefit. The PIM design options are then explored in a way that actively reflects the characteristics of the candidate tasks. When determining offloading conditions, it is difficult to consider the three objectives of time, energy, and power simultaneously. Thus, the problem is divided into two sub-problems. The first offloading condition is designed based on time-energy constraints, whereas the second is modeled to satisfy time-power constraints. Throughout the process, the offloading conditions and the PIM design options are configured in a complementary manner to reduce the number of tasks excluded from offloading. In the simulation results, the suitability of the two modeled offloading conditions and the proposed PIM design is verified using various benchmarks, and they are then compared with previous works in terms of processing speed and energy.

INDEX TERMS High bandwidth memory, near-data-processing, processing-in-memory, task offloading.

The associate editor coordinating the review of this manuscript and approving it for publication was Cihun-Siyong Gong.

I. INTRODUCTION

Recently, more and more attention has been paid to applications that require massive data processing, such as artificial intelligence and virtual reality. The parallelism of the graphics processing unit (GPU) has been used to increase the processing speed of such applications. However, improving efficiency in terms of data movement has not been a major concern [1]. Therefore, there is an increasing demand for a novel computing structure optimized for the execution of data-intensive applications. Near-data-processing (NDP), which puts the processing unit close to the data, is a promising alternative for reducing the overhead caused by data movement. When the processor is placed near memory, the approach is called processing-in-memory (PIM). An example is the packaging of a processing unit within a memory module or inside a DRAM chip. Recently, through-silicon via (TSV) technology has enabled a CPU, GPU, or hardware accelerator to be mounted on the logic die of 3-dimensional stacked memory. Thanks to this, PIM technology, which vertically stacks DRAM and a logic die with pr...
In this paper, we introduce a low-power packet detection scheme based on cross-correlation and complex multiplication for the MB-OFDM UWB system. The proposed scheme compares the received signal power with a calculated value, obtained as follows. The received signal is filtered by a correlator, and the correlator output is multiplied by a delayed copy of itself; the calculated value is the absolute value of the complex multiplier output. The filter tap coefficients of the correlator are the sign values of the known preamble. This scheme improves packet detection performance under the multipath fading channel model for the MB-OFDM UWB system. Through simulation over 4 block-fading channels, we show that the proposed scheme reduces the degradation of packet detection performance caused by sampling timing offset and the fading channel environment in the MB-OFDM UWB system.
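The detection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: a Barker-13 sequence stands in for the real MB-OFDM preamble, the preamble is taken to be real-valued and repeated once, the delay equals the preamble length, the delayed correlator output is conjugated before the multiplication, and the decision rule (metric greater than `threshold * length * power`) with `threshold = 0.5` is a hypothetical normalization. The function names are likewise hypothetical.

```python
# Illustrative sketch only; names, delay, threshold and test preamble are assumptions.

def sign_taps(preamble):
    """Correlator taps: the sign (+1/-1) of each known (real-valued) preamble sample."""
    return [1.0 if p >= 0 else -1.0 for p in preamble]


def correlate(rx, taps, n):
    """Correlator output for the window of received samples starting at index n."""
    return sum(t * rx[n + k] for k, t in enumerate(taps))


def window_power(rx, n, length):
    """Received signal power summed over the same window."""
    return sum(abs(x) ** 2 for x in rx[n:n + length])


def detect_packet(rx, preamble, delay, threshold=0.5):
    """Return the first index where the delay-and-multiply metric exceeds
    the scaled received power, or None if no packet is detected."""
    taps = sign_taps(preamble)
    length = len(taps)
    for n in range(delay, len(rx) - length + 1):
        c_now = correlate(rx, taps, n)
        c_old = correlate(rx, taps, n - delay)
        # Complex multiplication of the current and delayed correlator outputs
        # (conjugating the delayed output is an assumption), then absolute value.
        metric = abs(c_now * complex(c_old).conjugate())
        power = window_power(rx, n, length)
        if metric > threshold * length * power:
            return n
    return None


# Stand-in preamble (Barker-13), not the MB-OFDM UWB preamble.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
gain = 0.5 + 0.5j  # hypothetical flat channel gain
# 20 silent samples, the preamble sent twice, then 20 silent samples.
rx = [0j] * 20 + [gain * b for b in BARKER13] * 2 + [0j] * 20

print(detect_packet(rx, BARKER13, delay=13))  # prints 33
```

In this toy input the metric first exceeds the threshold at index 33, where the current window covers the second preamble copy and the delayed window covers the first. Because the taps are only +1 or -1, the correlator needs additions and subtractions rather than full multiplications, which is consistent with the low-power motivation stated above.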
In this paper, we investigate the performance of the W-CDMA system with a smart antenna. A realistic wideband channel is assumed, an example of which is the ITU-R M.1225 channel model. It is also assumed that the multipaths are clustered. A beamforming-RAKE receiver structure for the W-CDMA system is proposed, and its performance is analyzed under the assumption of perfect channel estimation. The probability density function (pdf) of the SINR (Signal to Interference and Noise Ratio) for different numbers of antennas and users is presented, and the BER (Bit Error Rate) is presented based on it. As a result, the performance of the W-CDMA system with a smart antenna in the realistic wideband channel is considerably improved.