Continual learning is the ability to acquire a new task or new knowledge without losing previously acquired information. Achieving continual learning in artificial intelligence (AI) is currently prevented by catastrophic forgetting, where training on a new task erases previously learned tasks. Here, we present a new concept of a neural network capable of combining supervised convolutional learning with bio-inspired unsupervised learning. Brain-inspired concepts such as spike-timing-dependent plasticity (STDP) and neural redundancy are shown to enable continual learning and prevent catastrophic forgetting without compromising the accuracy achievable with state-of-the-art neural networks. Unsupervised learning by STDP is demonstrated by hardware experiments with a one-layer perceptron adopting phase-change memory (PCM) synapses. Finally, we demonstrate full test-set classification of the Modified National Institute of Standards and Technology (MNIST) database with an accuracy of 98% and continual learning of up to 30% non-trained classes with 83% average accuracy.

INDEX TERMS Catastrophic forgetting, continual learning, convolutional neural network (CNN), neuromorphic engineering, phase-change memory (PCM), spike-timing-dependent plasticity (STDP), supervised learning, unsupervised learning.
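As a rough software sketch of the unsupervised STDP learning mentioned above (not the PCM hardware experiment), a pair-based STDP rule and a simplified winner-take-all training loop for a one-layer perceptron might look as follows; all parameters, the 4x16 layer size, and the rate-based winner update are illustrative assumptions:

```python
import numpy as np

A_POT, A_DEP = 0.05, 0.03   # illustrative potentiation/depression rates
TAU = 10.0                  # plasticity time window in ms (assumed)

def stdp_update(w, dt):
    """Pair-based STDP: dt = t_post - t_pre (ms). A presynaptic spike
    shortly before the postsynaptic one potentiates; otherwise depress."""
    if dt > 0:
        w = w + A_POT * np.exp(-dt / TAU)
    else:
        w = w - A_DEP * np.exp(dt / TAU)
    return float(np.clip(w, 0.0, 1.0))

def wta_step(W, x):
    """Simplified winner-take-all step: the most activated neuron fires and
    its synapses are updated (potentiate active inputs, depress silent ones)."""
    winner = int(np.argmax(W @ x))
    dw = np.where(x > 0, A_POT, -A_DEP)
    W[winner] = np.clip(W[winner] + dw, 0.0, 1.0)
    return winner

rng = np.random.default_rng(0)
W = rng.uniform(0.2, 0.8, size=(4, 16))      # 4 redundant neurons, 16 inputs
x = np.zeros(16)
x[[0, 5, 10, 15]] = 1.0                      # one fixed binary input pattern
for _ in range(50):
    win = wta_step(W, x)                     # the same neuron specializes
```

After repeated presentations, the winning neuron's synapses saturate high on the active inputs and low elsewhere, which is the specialization that supports learning a non-trained class.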
Data-intensive computing applications, such as object recognition, time-series prediction, and optimization tasks, are becoming increasingly important in several fields, including smart mobility, health, and industry. Because of the large amount of data involved in the computation, the conventional von Neumann architecture suffers from excessive latency and energy consumption due to the memory bottleneck. A more efficient approach is in-memory computing (IMC), where computational operations are carried out directly where the data reside. IMC can take advantage of the rich physics of memory devices, such as their ability to store analog values for matrix-vector multiplication (MVM) and their stochasticity, which is highly valuable for optimization and constraint satisfaction problems (CSPs). This article presents a stochastic spiking neuron based on a phase-change memory (PCM) device for the solution of CSPs within a Hopfield recurrent neural network (RNN). In the RNN, the PCM cell is used as the integrating element of a stochastic neuron, supporting the solution of a typical CSP, namely a Sudoku puzzle, in hardware. Finally, the ability to solve Sudoku puzzles using RNNs with PCM-based neurons is studied for increasing Sudoku size by a compact simulation model, thus supporting our PCM-based RNN for data-intensive computing.

INDEX TERMS Phase-change memory (PCM), artificial synapses, Hopfield neural network, stochastic process, optimization.

I. INTRODUCTION

Optimization problems are among the most computationally intensive tasks in several application fields, such as industry, finance, and transport. In general, optimization is carried out over several iterations to identify the global minimum of a certain cost function. In each iteration, a conventional digital system must access the memory to fetch input data and write back the temporary output, which is time- and energy-consuming.
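The stochastic Hopfield scheme outlined above can be illustrated in software with a toy constraint, namely "activate exactly one of four options" (a single Sudoku cell). The weights, bias, annealing schedule, and the sigmoid-sampled temperature standing in for the PCM neuron's intrinsic stochasticity are all illustrative assumptions, not the hardware implementation:

```python
import numpy as np

# Toy CSP: pick exactly one of 4 options (one digit of a Sudoku cell).
# Mutual inhibition plus a positive bias encode the "exactly one"
# constraint in the Hopfield energy E = -1/2 x'Wx - b'x.
W = -2.0 * (np.ones((4, 4)) - np.eye(4))  # inhibitory, no self-connection
b = np.ones(4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.zeros(4)

# Asynchronous stochastic updates with annealing: the temperature T plays
# the role of the PCM neuron's stochasticity (illustrative schedule).
for T in np.geomspace(1.0, 0.05, 60):
    for i in rng.permutation(4):
        h = W[i] @ x + b[i]                  # local field on neuron i
        x[i] = float(rng.random() < sigmoid(h / T))
```

As the temperature is lowered, the network settles into a one-hot state that satisfies the constraint, i.e., a minimum of the energy function.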
To enable more efficient optimization, a non-von Neumann architecture can be adopted to eliminate the latency and energy spent shuttling data between the memory and the central processing unit (CPU) [1]. An example of a non-von Neumann computing architecture is the concept of in-memory computing (IMC), where the computation is executed directly within the memory array. For instance, IMC can efficiently accelerate the typical multiply-accumulate (MAC) operation, which is the foundation of modern digital accelerators for artificial intelligence (AI) and optimization [2]. Emerging memory devices, such as phase-change memory (PCM) [3], [4] and resistive random access memory (RRAM) [5], [6], offer scalable, efficient, and CMOS-compatible solutions to store analog information as
Artificial neural networks (ANNs) can outperform the human ability of object recognition through supervised training of synaptic parameters with large datasets. Unlike the human brain, however, ANNs cannot learn continually, i.e., acquire new information without catastrophically forgetting previous knowledge. To solve this issue, we present a novel hybrid neural network based on CMOS logic and phase-change memory (PCM) synapses, mixing a supervised convolutional neural network (CNN) with bio-inspired unsupervised learning and neuronal redundancy. We demonstrate high classification accuracy on the MNIST and CIFAR10 datasets (98% and 85%, respectively) and energy-efficient continual learning of up to 30% of non-trained classes with 83% average accuracy.

Continual learning. ANNs, such as multi-layer perceptrons and CNNs, show high and stable accuracy for pattern and object recognition [1]. However, as presented in Fig. 1a for the sequential training of 'A' and 'B' subsets in a standard CNN, they lack the 'plasticity' necessary for learning continually [2]. To overcome this limitation, also known as the "stability-plasticity dilemma", we propose a novel neural network based on stable trained convolutional filters and bio-inspired unsupervised spike-timing-dependent plasticity (STDP), Fig. 1b [3].

Supervised-unsupervised network. Fig. 2a shows the main blocks of our architecture: (i) the CNN, (ii) the combinational logic, and (iii) the unsupervised winner-take-all (WTA) network. First, the CNN is trained with a subset of input patterns, called trained classes, to develop the filters by offline supervised training. Then, the whole network is operated, where new patterns, called non-trained classes, can also be learnt and classified thanks to the unsupervised network (Fig. 2b). In the CNN, input images are convolved with 20x20 trained filters that are extracted from the first convolutional layer of the networks in Fig. 3. Two types of filters are used: the class filters (Fig.
3a), each trained to recognize only a specific class of the dataset, and the feature filters (Fig. 3b), each specialized in extracting a certain feature [4]. The filters were trained with respect to a fixed threshold for directly mapping a specific feature. This generates a pattern of binary responses, equal to either VDD, when the feature is found, or 0, when it is not. Fig. 4a shows the circuit used for convolution, exploiting analogue in-memory matrix-vector multiplication (MVM) with PCM devices [5]. We always used 16 filters, namely 7 class filters and 9 feature filters, resulting in a 4x4 feature map. Fig. 4b shows the average feature maps for the MNIST dataset with 3 non-trained classes. These patterns have variable density P (i.e., the fraction of active signals over the total); thus, they must be equalized before the unsupervised block. Fig. 4c shows the equalization layer, i.e., a combinational logic yielding a 4x4 output pattern with P = 25% (Fig. 4d). The logic gives higher priority to class filters. Finally, the equalized pattern feeds t...
Memory devices, such as phase-change memory (PCM), have recently shown significant breakthroughs in terms of compactness, 3D stacking capability, and speedup for deep-learning neural accelerators. However, PCM is affected by conductance drift, which prevents a precise definition of the synaptic weights in artificial neural networks. Here, we propose an efficient system-level methodology to develop drift-resilient multi-layer perceptron (MLP) networks. The procedure guarantees high testing accuracy under conductance drift of the devices and enables the use of only positive weights. We validate the methodology using the MNIST, rand-MNIST, and Fashion-MNIST datasets, thus offering a roadmap for the implementation of integrated non-volatile-memory-based neural networks. We finally analyse the proposed architecture in terms of throughput and energy efficiency. This work highlights the relevance of robust PCM-based neural network design for improving computational capability and optimizing energy efficiency.
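For intuition, PCM conductance drift is commonly modeled as a power law, G(t) = G0 * (t/t0)^(-nu). The sketch below (an illustrative assumption, not the methodology proposed in this work) shows that if every device drifted with the same exponent, all crossbar MVM outputs would scale by a common factor, leaving normalized outputs and the predicted class unchanged; in practice the exponent varies from device to device, which is why a system-level mitigation is needed:

```python
import numpy as np

NU = 0.06    # drift exponent, order of magnitude typical for PCM (assumed)
T0 = 1.0     # reference time in seconds at programming

def drifted(G0, t):
    """Power-law conductance drift: G(t) = G0 * (t / T0) ** (-NU)."""
    return G0 * (t / T0) ** (-NU)

rng = np.random.default_rng(1)
G0 = rng.uniform(1e-6, 100e-6, size=(8, 4))  # programmed conductance matrix
x = rng.uniform(0.0, 1.0, size=8)            # input voltages

y_fresh = G0.T @ x                 # crossbar MVM right after programming
y_old = drifted(G0, 1e6).T @ x     # same read ~12 days later

# With a single shared exponent, every output shrinks by the same factor
# (1e6)**(-NU) ~ 0.44, so the ratios between outputs are preserved exactly.
```

The raw currents decay over time, but the relative ordering of the outputs, and hence the classification, survives under this idealized uniform-drift assumption.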