Massimo Giordano scite author profile

Neural-network training can be slow and energy intensive, owing to the need to transfer the weight data for the network between conventional digital memory chips and processor chips. Analogue non-volatile memory can accelerate the neural-network training algorithm known as backpropagation by performing parallelized multiply-accumulate operations in the analogue domain at the location of the weight data. However, the classification accuracies of such in situ training using non-volatile-memory hardware have generally been less than those of software-based training, owing to insufficient dynamic range and excessive weight-update asymmetry. Here we demonstrate mixed hardware-software neural-network implementations that involve up to 204,900 synapses and that combine long-term storage in phase-change memory, near-linear updates of volatile capacitors and weight-data transfer with 'polarity inversion' to cancel out inherent device-to-device variations. We achieve generalization accuracies (on previously unseen data) equivalent to those of software-based training on various commonly used machine-learning test datasets (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100). The computational energy efficiency of 28,065 billion operations per second per watt and throughput per area of 3.6 trillion operations per second per square millimetre that we calculate for our implementation exceed those of today's graphical processing units by two orders of magnitude. This work provides a path towards hardware accelerators that are both fast and energy efficient, particularly on fully connected neural-network layers.
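The in-memory multiply-accumulate step described in this abstract can be pictured with a small idealized sketch. It assumes a crossbar in which each row carries an input voltage and each column sums the currents through its devices, so that column current j equals the sum over rows i of V_i times G_ij. The function name `crossbar_mac` and the plain-list representation are illustrative assumptions, not taken from the paper, and the sketch ignores device noise, wire resistance, and the signed-weight encoding used in the actual hardware.

```python
def crossbar_mac(voltages, conductances):
    """Idealized crossbar read: column current I_j = sum_i V_i * G_ij.

    voltages     -- list of row input voltages (length n_rows)
    conductances -- n_rows x n_cols nested list of device conductances
    Hypothetical helper for illustration only; a real array computes all
    columns in parallel in the analogue domain.
    """
    n_rows = len(conductances)
    if len(voltages) != n_rows:
        raise ValueError("one voltage is required per crossbar row")
    n_cols = len(conductances[0])
    # Each column accumulates the currents contributed by every row.
    return [
        sum(voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


if __name__ == "__main__":
    currents = crossbar_mac([1.0, 2.0], [[1.0, 0.0], [0.5, 2.0]])
    print(currents)
```

In software the loop visits every device in turn, whereas the appeal of the analogue approach is that the same sums occur simultaneously where the weights are stored.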


Perspective on training fully connected networks with resistive memories: Device requirements for multiple conductances of varying significance

Cristiano, Giordano, Ambrogio et al. 2018


Novel Deep Neural Network (DNN) accelerators based on crossbar arrays of non-volatile memories (NVMs)—such as Phase-Change Memory or Resistive Memory—can implement multiply-accumulate operations in a highly parallelized fashion. In such systems, computation occurs in the analog domain at the location of weight data encoded into the conductances of the NVM devices. This allows DNN training of fully-connected layers to be performed faster and with less energy. Using a mixed-hardware-software experiment, we recently showed that by encoding each weight into four distinct physical devices—a “Most Significant Conductance” pair (MSP) and a “Least Significant Conductance” pair (LSP)—we can train DNNs to software-equivalent accuracy despite the imperfections of real analog memory devices. We surmised that, by dividing the task of updating and maintaining weight values between the two conductance pairs, this approach should significantly relax the otherwise quite stringent device requirements. In this paper, we quantify these relaxed requirements for analog memory devices exhibiting a saturating conductance response, assuming either an immediate or a delayed steep initial slope in conductance change. We discuss requirements on the LSP imposed by the “Open Loop Tuning” performed after each training example and on the MSP due to the “Closed Loop Tuning” performed periodically for weight transfer between the conductance pairs. Using simulations to evaluate the final generalization accuracy of a trained four-neuron-layer fully-connected network, we quantify the required dynamic range (as controlled by the size of the steep initial jump), the tolerable device-to-device variability in both maximum conductance and maximum conductance change, the tolerable pulse-to-pulse variability in conductance change, and the tolerable device yield, for both the LSP and MSP devices. 
We also investigate various Closed Loop Tuning strategies and describe the impact of the MSP/LSP approach on device endurance.
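The division of labour between the two conductance pairs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the weight is taken to be `gain * (G_plus - G_minus) + (g_plus - g_minus)`, where the `gain` factor, the class name `FourDeviceWeight`, and the method names are all hypothetical. Devices are treated as ideal and linear, so the saturating response, variability, and yield effects that the abstract studies are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class FourDeviceWeight:
    """One weight held in an MSP (G_plus, G_minus) and an LSP (g_plus, g_minus).

    All names and the gain value are illustrative assumptions.
    """
    G_plus: float = 0.0
    G_minus: float = 0.0
    g_plus: float = 0.0
    g_minus: float = 0.0
    gain: float = 4.0  # assumed significance of the MSP relative to the LSP

    def value(self) -> float:
        # Effective weight contributed by both pairs.
        return self.gain * (self.G_plus - self.G_minus) + (self.g_plus - self.g_minus)

    def update(self, delta: float) -> None:
        # Open-loop style update: nudge the LSP only, with no read-back.
        if delta >= 0:
            self.g_plus += delta
        else:
            self.g_minus += -delta

    def transfer(self) -> None:
        # Idealized closed-loop transfer: read the total weight, program the
        # MSP to hold it, then reset the LSP so it can absorb new updates.
        target = self.value()
        self.G_plus = max(target, 0.0) / self.gain
        self.G_minus = max(-target, 0.0) / self.gain
        self.g_plus = 0.0
        self.g_minus = 0.0


if __name__ == "__main__":
    w = FourDeviceWeight()
    w.update(0.5)
    w.update(-0.25)
    before = w.value()
    w.transfer()
    print(before, w.value(), w.g_plus, w.g_minus)
```

In this idealized model the transfer leaves the weight value unchanged while emptying the LSP. With real devices the MSP would be programmed iteratively to within some tolerance, which is where the requirements discussed in the abstract come in.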


Training fully connected networks with resistive memories: impact of device failures

et al. 2019


CHIMERA: A 0.92 TOPS, 2.2 TOPS/W Edge AI Accelerator with 2 MByte On-Chip Foundry Resistive RAM for Efficient Training and Inference

Giordano, Prabhu, Koul et al. 2021

