2017
DOI: 10.1049/el.2017.2219

Quantisation and pooling method for low‐inference‐latency spiking neural networks

Abstract: Spiking neural networks (SNNs) converted from conventional deep neural networks (DNNs) have shown great potential as a solution for fast and efficient recognition. A layer-wise quantisation method based on retraining is proposed to quantise the activations of the DNN, which reduces the number of time steps required by the converted SNN to achieve minimal accuracy loss. A pooling function is incorporated into the convolutional layers to eliminate up to 20% of the spiking neurons. The converted SNNs achieved 99.15% accuracy on MNIS…
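The layer-wise activation quantisation described in the abstract lends itself to a short illustration. The sketch below assumes PyTorch, a uniform quantiser on a clipped ReLU, and a straight-through estimator so the network can be retrained after each layer is quantised; the paper's exact quantisation scheme and retraining schedule are not reproduced here, and the names `QuantReLU` and `quantise_layerwise` are hypothetical.

```python
# Minimal sketch of layer-wise activation quantisation with retraining
# (assumptions: PyTorch, uniform quantiser, straight-through estimator).
import torch
import torch.nn as nn

class QuantReLU(nn.Module):
    """ReLU clipped to [0, 1] and quantised to `levels` discrete values.

    With `levels` discrete activation values, a rate-coded spiking neuron can
    represent the activation in roughly `levels` time steps, which is the
    intuition for reducing the inference latency of the converted SNN.
    """
    def __init__(self, levels: int = 16):
        super().__init__()
        self.levels = levels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.clamp(x, 0.0, 1.0)                       # bounded activation
        q = torch.round(x * (self.levels - 1)) / (self.levels - 1)
        # Straight-through estimator: forward uses q, backward uses identity,
        # so the network remains trainable after quantisation is inserted.
        return x + (q - x).detach()

def quantise_layerwise(model: nn.Sequential, levels: int, retrain_fn):
    """Quantise one activation layer at a time, retraining briefly after each."""
    for idx, layer in enumerate(model):
        if isinstance(layer, nn.ReLU):
            model[idx] = QuantReLU(levels)
            retrain_fn(model)   # user-supplied short fine-tuning pass
    return model
```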

Cited by 12 publications (3 citation statements). References 4 publications.
“…When applying NPTD, if the pruning thresholds are assigned individually to every neuron, the search space becomes prohibitively large, and so does the memory required to store the thresholds. Following previous literature (Bengio et al., 2006; Lin et al., 2017; Wang et al., 2019), where layer-wise search algorithms are used to find optimum design points such as quantization bit-widths or approximation parameters, the pruning thresholds of the NPTD are searched per layer in this work.…”
Section: Neuron Pruning in Temporal Domains (NPTD)
confidence: 99%
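As an illustration of the layer-wise search this statement refers to, the following is a minimal greedy sketch: each layer's pruning threshold is pushed as far as possible while the accuracy drop stays within a tolerance. The hooks `evaluate` and `apply_threshold` are hypothetical placeholders, not part of any cited work's API.

```python
# Hypothetical greedy layer-wise parameter search, in the spirit of the
# layer-wise searches cited above (not code from the cited papers).
def layerwise_threshold_search(model, layers, candidates, evaluate,
                               apply_threshold, max_acc_drop=0.5):
    baseline = evaluate(model)                 # reference accuracy (%)
    chosen = {}
    for layer in layers:                       # optimise one layer at a time
        best = 0.0                             # 0.0 == no pruning
        for th in sorted(candidates):          # increasingly aggressive thresholds
            apply_threshold(model, layer, th)
            if baseline - evaluate(model) <= max_acc_drop:
                best = th                      # accuracy still within tolerance
            else:
                break
        apply_threshold(model, layer, best)    # fix this layer, move to the next
        chosen[layer] = best
    return chosen
```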
“…However, even in the spike domain, MaxPooling tends to produce higher classification accuracy (Rueckauer et al., 2017) than the aforementioned alternatives. When implementing spiking MaxPooling, researchers are drawn to several popular approaches: rate-based spike accumulation (Hu and Pfeiffer, 2016; Chen et al., 2018; Kim et al., 2020), time-to-first-spike (Masquelier and Thorpe, 2007; Zhao et al., 2014; Li J. et al., 2017; Mozafari et al., 2019), and lateral inhibition or temporal winner-take-all (Orchard et al., 2015; Lin et al., 2017).…”
Section: Sub-sampling by Pooling Operation
confidence: 99%
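Of the approaches listed above, rate-based spike accumulation is perhaps the simplest to sketch: the pooling unit forwards the spike of the input that has accumulated the most spikes so far. The NumPy function below is an illustrative reading of that idea, assuming 2x2 windows and a running spike count per input neuron; it is not code from the cited papers.

```python
# Sketch of rate-based spiking max-pooling over 2x2 windows.
import numpy as np

def spiking_maxpool_step(spikes, counts, k=2):
    """spikes, counts: (H, W) arrays for the current time step / running totals.
    Returns the pooled spike map and the updated counts."""
    counts = counts + spikes                   # accumulate firing counts (rates)
    H, W = spikes.shape
    out = np.zeros((H // k, W // k), dtype=spikes.dtype)
    for i in range(H // k):
        for j in range(W // k):
            win_s = spikes[i*k:(i+1)*k, j*k:(j+1)*k]
            win_c = counts[i*k:(i+1)*k, j*k:(j+1)*k]
            # Forward only the spike of the currently most active input.
            r, c = np.unravel_index(np.argmax(win_c), win_c.shape)
            out[i, j] = win_s[r, c]
    return out, counts
```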
“…To improve the real-time performance of SNNs, [8] proposed two optimization methods to normalize the network weights, namely model-based normalization and data-based normalization, so that the neuron activations are sufficiently small to prevent overestimation of the output activations. A retraining-based layer-wise quantization method to quantize the neuron activations, together with the incorporation of the pooling function into other layers to reduce the required number of neurons, was proposed in [25]; the authors reported that these methods can build hardware-friendly SNNs with ultra-low inference latency.…”
Section: Inference Latency
confidence: 99%
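For context, data-based normalisation of the kind described in [8] rescales each layer's weights by the maximum activation observed on a batch of training data, so that converted neurons rarely saturate their firing rate. The sketch below assumes PyTorch and an ordered list of alternating Linear/Conv2d and ReLU modules; it is an illustrative reconstruction under those assumptions, not the cited authors' code.

```python
# Sketch of data-based weight normalisation for DNN-to-SNN conversion.
import torch
import torch.nn as nn

@torch.no_grad()
def data_based_normalisation(layers, sample_batch):
    """`layers`: modules applied in order (Linear/Conv2d interleaved with ReLU)."""
    x = sample_batch
    prev_scale = 1.0
    for layer in layers:
        x = layer(x)                                   # forward with original weights
        if isinstance(layer, (nn.Linear, nn.Conv2d)):
            scale = max(x.max().item(), 1e-12)         # max pre-activation on the data
            # W_l <- W_l * lambda_{l-1} / lambda_l,  b_l <- b_l / lambda_l
            layer.weight.mul_(prev_scale / scale)
            if layer.bias is not None:
                layer.bias.mul_(1.0 / scale)
            prev_scale = scale
```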