Neural-network training can be slow and energy intensive, owing to the need to transfer the weight data for the network between conventional digital memory chips and processor chips. Analogue non-volatile memory can accelerate the neural-network training algorithm known as backpropagation by performing parallelized multiply-accumulate operations in the analogue domain at the location of the weight data. However, the classification accuracies of such in situ training using non-volatile-memory hardware have generally been less than those of software-based training, owing to insufficient dynamic range and excessive weight-update asymmetry. Here we demonstrate mixed hardware-software neural-network implementations that involve up to 204,900 synapses and that combine long-term storage in phase-change memory, near-linear updates of volatile capacitors and weight-data transfer with 'polarity inversion' to cancel out inherent device-to-device variations. We achieve generalization accuracies (on previously unseen data) equivalent to those of software-based training on various commonly used machine-learning test datasets (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100). The computational energy efficiency of 28,065 billion operations per second per watt and throughput per area of 3.6 trillion operations per second per square millimetre that we calculate for our implementation exceed those of today's graphics processing units by two orders of magnitude. This work provides a path towards hardware accelerators that are both fast and energy efficient, particularly on fully connected neural-network layers.
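The parallelized multiply-accumulate described above maps naturally onto a crossbar read: weights are stored as conductance pairs, inputs are applied as row voltages, and column currents deliver the sums. The sketch below is a rough NumPy illustration of that idea, not the authors' code; the layer sizes, noise level and names (g_plus, g_minus, crossbar_mac) are placeholders.

```python
# Minimal NumPy sketch (not the paper's implementation) of an analogue
# multiply-accumulate on a crossbar: each weight is the difference of a
# conductance pair, and additive read noise stands in for device non-idealities.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 784, 250                                # e.g. one fully connected layer
g_plus = rng.uniform(0.0, 1.0, size=(n_in, n_out))    # normalised conductances (assumed)
g_minus = rng.uniform(0.0, 1.0, size=(n_in, n_out))

def crossbar_mac(x, g_plus, g_minus, noise_std=0.01):
    """Emulate one analogue MAC: effective weight = G+ - G-,
    column currents = inputs summed through the conductances."""
    w_eff = g_plus - g_minus                           # differential weight encoding
    currents = x @ w_eff                               # Ohm's law + Kirchhoff summation
    return currents + noise_std * rng.standard_normal(currents.shape)

x = rng.uniform(0.0, 1.0, size=(1, n_in))              # input activations as row voltages
y = crossbar_mac(x, g_plus, g_minus)
print(y.shape)                                         # (1, 250)
```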
High-temperature data retention is a critical hurdle for the commercialization of emerging non-volatile memories. For Conductive-Bridge RAM (CBRAM) [1], we discuss high-temperature retention in terms of the physics of quantum point contacts, and we report on a family of CBRAM cells that achieve excellent retention at temperatures exceeding 200 °C.
We report the first demonstration of 200 mm InGaAs-on-insulator (InGaAs-o-I) fabricated by direct wafer bonding from a donor wafer consisting of a III-V heteroepitaxial structure grown on a 200 mm silicon wafer. The measured threading dislocation density of the In₀.₅₃Ga₀.₄₇As (InGaAs) active layer is 3.5 × 10⁹ cm⁻², and it does not degrade after the bonding and layer-transfer steps. The surface roughness of the InGaAs layer can be improved by a chemical-mechanical polishing step, reaching values as low as 0.4 nm root-mean-square. The electron Hall mobility in a 450 nm thick InGaAs-o-I layer reaches values of up to 6000 cm²/V·s, and working pseudo-MOS transistors are demonstrated with an extracted electron mobility in the range of 2000–3000 cm²/V·s. Finally, the fabrication of an InGaAs-o-I substrate with an active layer as thin as 90 nm is achieved with a buried oxide of 50 nm. These results open the way to very large scale production of III-V-o-I advanced substrates for future CMOS technology nodes.
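For readers unfamiliar with where Hall mobility figures such as the ones above come from, the short sketch below works through a generic Hall-measurement calculation (sheet carrier density from the Hall voltage, mobility from the sheet resistance). The numerical readings are placeholders chosen for illustration only and are not measurement data from this work.

```python
# Illustrative Hall-mobility calculation (placeholder numbers, not data from the paper).
q = 1.602e-19          # elementary charge, C

# Assumed example readings from a Hall-bar / van der Pauw style measurement:
I = 1e-4               # drive current, A
B = 0.5                # magnetic field, T
V_H = 2.6e-3           # measured Hall voltage, V
R_sheet = 200.0        # measured sheet resistance, ohm/sq

n_s = I * B / (q * abs(V_H))          # sheet carrier density, m^-2
mu = 1.0 / (q * n_s * R_sheet)        # Hall mobility, m^2/(V*s)

print(f"n_s = {n_s * 1e-4:.3e} cm^-2")        # convert m^-2 -> cm^-2
print(f"mu  = {mu * 1e4:.0f} cm^2/(V*s)")     # convert m^2/(V*s) -> cm^2/(V*s)
```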
Novel Deep Neural Network (DNN) accelerators based on crossbar arrays of non-volatile memories (NVMs)—such as Phase-Change Memory or Resistive Memory—can implement multiply-accumulate operations in a highly parallelized fashion. In such systems, computation occurs in the analog domain at the location of weight data encoded into the conductances of the NVM devices. This allows DNN training of fully-connected layers to be performed faster and with less energy. Using a mixed-hardware-software experiment, we recently showed that by encoding each weight into four distinct physical devices—a “Most Significant Conductance” pair (MSP) and a “Least Significant Conductance” pair (LSP)—we can train DNNs to software-equivalent accuracy despite the imperfections of real analog memory devices. We surmised that, by dividing the task of updating and maintaining weight values between the two conductance pairs, this approach should significantly relax the otherwise quite stringent device requirements. In this paper, we quantify these relaxed requirements for analog memory devices exhibiting a saturating conductance response, assuming either an immediate or a delayed steep initial slope in conductance change. We discuss requirements on the LSP imposed by the “Open Loop Tuning” performed after each training example and on the MSP due to the “Closed Loop Tuning” performed periodically for weight transfer between the conductance pairs. Using simulations to evaluate the final generalization accuracy of a trained four-neuron-layer fully-connected network, we quantify the required dynamic range (as controlled by the size of the steep initial jump), the tolerable device-to-device variability in both maximum conductance and maximum conductance change, the tolerable pulse-to-pulse variability in conductance change, and the tolerable device yield, for both the LSP and MSP devices. We also investigate various Closed Loop Tuning strategies and describe the impact of the MSP/LSP approach on device endurance.
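As a rough illustration of the MSP/LSP bookkeeping described above (assumptions only, not the authors' implementation), the sketch below encodes a scalar weight as F · (MSP⁺ − MSP⁻) + (LSP⁺ − LSP⁻), applies open-loop pulses with a saturating conductance response after each example, and periodically performs an idealized closed-loop transfer of the accumulated weight into the MSP. The gain factor F, baseline conductances and pulse model are all placeholder choices.

```python
# Toy sketch of the MSP/LSP weight scheme (assumed parameters, idealized transfer).
import numpy as np

G_MAX, DG0, F = 1.0, 0.05, 5.0        # saturation level, initial jump size, gain factor

def pulse(g, sign):
    """Saturating conductance response: each pulse adds less as g approaches G_MAX."""
    return np.clip(g + sign * DG0 * (1.0 - g / G_MAX), 0.0, G_MAX)

def weight(msp_p, msp_m, lsp_p, lsp_m):
    return F * (msp_p - msp_m) + (lsp_p - lsp_m)

msp_p = msp_m = lsp_p = lsp_m = 0.2    # scalar example; arrays work the same way

# Open-loop tuning: fire one pulse on the LSP in the direction of the desired update.
for step, desired in enumerate([+1, +1, -1, +1, +1, +1, -1, +1], start=1):
    if desired > 0:
        lsp_p = pulse(lsp_p, +1)
    else:
        lsp_m = pulse(lsp_m, +1)

    # Closed-loop tuning every 4 examples: read the weight, re-program the MSP to
    # carry it (done iteratively in hardware, idealized here), then reset the LSP.
    if step % 4 == 0:
        w = weight(msp_p, msp_m, lsp_p, lsp_m)
        msp_p, msp_m = 0.2 + max(w, 0) / F, 0.2 + max(-w, 0) / F
        lsp_p = lsp_m = 0.2

print(weight(msp_p, msp_m, lsp_p, lsp_m))
```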