2020
DOI: 10.1109/jxcdc.2020.2987605

Accurate Inference With Inaccurate RRAM Devices: A Joint Algorithm-Design Solution

Abstract: Resistive random access memory (RRAM) is a promising technology for energy-efficient neuromorphic accelerators. However, when a pretrained deep neural network (DNN) model is programmed to an RRAM array for inference, the model suffers from accuracy degradation due to RRAM nonidealities, such as device variations, quantization error, and stuck-at-faults. Previous solutions involving multiple read-verify-write (R-V-W) passes to the RRAM cells require cell-by-cell compensation and, thus, an excessive amount of processing…
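
The nonidealities the abstract names (device variations, quantization error, and stuck-at faults) can be made concrete with a small simulation. The following is a minimal sketch, not the paper's method: the conductance-range model, the function name program_weights_to_rram, and every parameter value (number of levels, variation sigma, fault probabilities) are illustrative assumptions.

import numpy as np

def program_weights_to_rram(w, levels=16, sigma=0.05,
                            p_stuck_off=0.01, p_stuck_on=0.01, rng=None):
    # Toy model of writing a weight matrix to an RRAM array, applying the
    # three nonidealities named in the abstract: quantization to a finite
    # number of conductance states, multiplicative device variation, and
    # stuck-at faults. All parameter values are illustrative assumptions.
    rng = np.random.default_rng() if rng is None else rng
    w_max = np.abs(w).max()
    step = 2 * w_max / (levels - 1)                      # quantization error
    w_q = np.round(w / step) * step
    w_dev = w_q * rng.normal(1.0, sigma, size=w.shape)   # device variation
    stuck_off = rng.random(w.shape) < p_stuck_off        # cells stuck at zero
    stuck_on = rng.random(w.shape) < p_stuck_on          # cells stuck at full scale
    w_dev[stuck_off] = 0.0
    w_dev[stuck_on] = np.sign(w[stuck_on]) * w_max
    return w_dev

# Compare an ideal layer output with the output after "programming".
w = 0.1 * np.random.randn(128, 64)
x = np.random.randn(64)
y_ideal, y_rram = w @ x, program_weights_to_rram(w) @ x
print(np.linalg.norm(y_rram - y_ideal) / np.linalg.norm(y_ideal))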

Cited by 29 publications (32 citation statements)
References 29 publications
“…This would be a fundamentally necessary feature for synapse weights to learn or classify the characteristics of image patterns with analog levels. These synaptic properties related to quantization are generally consistent with those reported in the literature, although they may be slightly affected by the neural network structure, the number of parameters, and the complexity of the image dataset [32][33]. The weight values obtained from software training can be converted into the conductance of synaptic devices.…”
Section: Results (supporting)
confidence: 81%
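
The conversion mentioned in the statement above, from software-trained weights to device conductances, is typically done by mapping the signed weight range onto a bounded, quantized conductance window. A minimal sketch follows, assuming a linear map onto a differential conductance pair; the window (g_min, g_max), the number of levels, and the helper name weights_to_conductance are assumptions for illustration, not taken from the cited work.

import numpy as np

def weights_to_conductance(w, g_min=1e-6, g_max=1e-4, levels=32):
    # Map signed weights onto a differential pair (G_pos, G_neg), each
    # quantized to `levels` evenly spaced analog states. The linear mapping
    # and the conductance window are illustrative assumptions.
    scale = (g_max - g_min) / np.abs(w).max()        # siemens per unit weight
    g_pos = g_min + np.clip(w, 0.0, None) * scale
    g_neg = g_min + np.clip(-w, 0.0, None) * scale
    step = (g_max - g_min) / (levels - 1)
    quantize = lambda g: g_min + np.round((g - g_min) / step) * step
    return quantize(g_pos), quantize(g_neg), scale

w = 0.3 * np.random.randn(4, 4)
g_pos, g_neg, scale = weights_to_conductance(w)
w_readback = (g_pos - g_neg) / scale                 # effective weight seen by the array
print(np.abs(w - w_readback).max())                  # error <= half a conductance step, in weight units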
“…However, these large models require on the order of millions of multiply-accumulate operations, which are fundamentally computationally intensive [6]. These data-intensive operations, together with the large number of model parameters, mean that these models require large memory and memory bandwidth in order to achieve reasonable performance [7]-[10]. As a result, these deep models cannot be deployed on resource-constrained edge computing devices with limited computing resources and power budgets [7], [11], such as battery-powered mobile and Internet of Things (IoT) devices.…”
Section: Introduction (mentioning)
confidence: 99%
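
To make the memory and compute point above concrete, a rough back-of-envelope calculation; the figures below are generic assumptions (a ~25M-parameter vision model stored in FP32 and a single 3x3 convolution layer), not numbers from the cited works.

params = 25_000_000                      # parameters in a mid-sized CNN (assumed)
print("weight memory (MB):", params * 4 / 1e6)        # FP32 -> about 100 MB

# MACs for one 3x3 conv layer: H * W * C_in * C_out * k * k
h, w, c_in, c_out, k = 56, 56, 64, 64, 3
print("MACs for one layer (millions):", h * w * c_in * c_out * k * k / 1e6)  # about 116 M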
“…Hence, DNN models with L1/TopK BatchNorm have a better noise-resistant property than DNN models with L2 BatchNorm. Furthermore, the relation between the loss gradient of the weights of a model with BatchNorm and that of a model without BatchNorm is given in equation (10) [24], where L̂ and L are the loss of the model with and without BatchNorm, respectively, σ_j is the standard deviation of the BatchNorm, γ is the BatchNorm layer's trainable parameter, y_j and ŷ_j are the outputs of the model with and without BatchNorm, respectively, ∇ is the gradient, and m is the batch size.…”
Section: Introduction (mentioning)
confidence: 99%
“…Despite their unprecedented level of performance and the improvements in their design in recent years, DL models require high computational and energy resources during training and inference [2], [3]. The high computational resource requirement stems from the computationally intense fundamental operations these models perform during training and inference, such as vector-matrix dot products and matrix-matrix multiplications [4], [5]. This is further compounded by the rapid growth in the number of these operations as model size increases.…”
Section: Introduction (mentioning)
confidence: 99%
“…This tight requirement, and the desire to fix the compute and memory-transfer bottlenecks of the current generation of hardware, has led to significant interest in specialized analog hardware for DL, which has the potential to deliver at least 2X better performance than conventional digital hardware in both speed and energy efficiency [8], [9]. In fact, such hardware can deliver a projected throughput of multiple tera-operations per second (TOPS) while achieving femtojoule energy budgets per multiply-and-accumulate (MAC) operation [5], [10]-[12]. The improvement can be attributed to the use of non-volatile memory crossbar arrays to encode DL model weights and biases, a form of computing known as in-memory computing.…”
Section: Introduction (mentioning)
confidence: 99%
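
The in-memory computing idea this excerpt describes can be summarized in a few lines: input activations are applied as row voltages, weights are stored as cell conductances, and each column current is a dot product computed in a single analog step (Ohm's law plus Kirchhoff's current law). A conceptual sketch, with assumed voltage and conductance ranges rather than a real circuit model:

import numpy as np

voltages = np.array([0.2, 0.0, 0.1, 0.3])               # row inputs, in volts (assumed)
conductances = np.random.uniform(1e-6, 1e-4, (4, 8))    # cell weights, in siemens (assumed range)

# Each column current I_j = sum_i V_i * G_ij: a full vector-matrix
# multiply performed where the weights are stored, in one analog step.
column_currents = voltages @ conductances
print(column_currents)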