Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective

Chen, Wuyang; Gong, Xinyu; Wang, Zhangyang

doi:10.48550/arxiv.2102.11535

Cited by 13 publications

(21 citation statements)

References 52 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…In very recent works, the key focus has been the efficiency of the NAS technique [1,91,92,94,105] owing to the growing size of dataset and architecture. Interestingly, a line of work suggests the concept of NAS without training where the networks do not require training during the search stage [12,58,88]. This can significantly reduce the computational cost for searching optimal architecture.…”

Section: Neural Architecture Searchmentioning

confidence: 99%

“…This makes it difficult to search for an optimal SNN architecture with NAS techniques that train the architecture candidate multiple times [2,78,[107][108][109] or train a complex supernet [8,29,55,86]. To minimize the training budget, our work is motivated by the previous works [12,58,88] which demonstrate that the optimal architecture can be founded without any training process. Specifically, Mellor et al [58] provide the interesting observation that the architecture having distinctive representations across different data samples is likely to achieve higher posttraining performance.…”

Section: Nas Without Trainingmentioning

confidence: 99%

“…NAS without training approaches in ANN domain [12,58] are based on the theoretical concept of linear region in neural networks [32,33,59,67,87]. That is, each piecewise linear function (such as, ReLU) divides the input space into multiple linear regions.…”

Section: Linear Regions From Lif Neuronsmentioning

confidence: 99%

“…As SNNs have a significantly slower training process compared to ANNs (e.g., training SNN on MNIST with NVIDIA V100 GPU takes 11.43× more latency compared to the same ANN architecture [53]), the above NAS approaches are difficult to be applied on SNNs. On the other hand, recent works [12,58,88] have proposed efficient NAS approaches that search the best neuron cell from initialized networks without any training. Specifically, [58] shows that the network architecture with a high representation power at initialization is likely to achieve higher post-training accuracy.…”

Section: Introductionmentioning

confidence: 99%

“…For the first time, we showcase a NAS technique for finding better SNN architecture on the image recognition task. (ii) Motivated by the prior work [12,58,88], we find an SNN-friendly architecture by comparing temporal activation without any training process. Eliminating the training cost to find the optimal architecture brings a huge advantage for SNNs that require significantly longer training time compared to ANNs.…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Neural Architecture Search for Spiking Neural Networks

Kim¹,

Li²,

Park³

et al. 2022

Preprint

View full text Add to dashboard Cite

Spiking Neural Networks (SNNs) have gained huge attention as a potential energy-efficient alternative to conventional Artificial Neural Networks (ANNs) due to their inherent high-sparsity activation. However, most prior SNN methods use ANN-like architectures (e.g., VGG-Net or ResNet), which could provide sub-optimal performance for temporal sequence processing of binary information in SNNs. To address this, in this paper, we introduce a novel Neural Architecture Search (NAS) approach for finding better SNN architectures. Inspired by recent NAS approaches that find the optimal architecture from activation patterns at initialization, we select the architecture that can represent diverse spike activation patterns across different data samples without training. Furthermore, to leverage the temporal correlation among the spikes, we search for feed forward connections as well as backward connections (i.e., temporal feedback connections) between layers. Interestingly, SNASNet found by our search algorithm achieves higher performance with backward connections, demonstrating the importance of designing SNN architecture for suitably using temporal information. We conduct extensive experiments on three image recognition benchmarks where we show that SNASNet achieves state-of-the-art performance with significantly lower timesteps (5 timesteps). The code has been released at https://github.com/Intelligent-Computing-Lab-Yale/Neural-Architecture-Search-for-Spiking-Neural-Networks.

show abstract

Section: Neural Architecture Searchmentioning

confidence: 99%

Section: Nas Without Trainingmentioning

confidence: 99%

Section: Linear Regions From Lif Neuronsmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Neural Architecture Search for Spiking Neural Networks

Kim¹,

Li²,

Park³

et al. 2022

Preprint

View full text Add to dashboard Cite

show abstract

Towards Physical Plausibility in Neuroevolution Systems

Cortês,

Lourenço,

Machado

2024

Lecture Notes in Computer Science

View full text Add to dashboard Cite

Fisher Task Distance and its Application in Neural Architecture Search

Soltani

Dong

et al. 2022

IEEE Access

View full text Add to dashboard Cite

We formulate an asymmetric (or non-commutative) distance between tasks based on Fisher Information Matrices, called Fisher task distance. This distance represents the complexity of transferring the knowledge of one task to another. We provide a proof of consistency for our distance through theorems and experiments on various classification tasks from MNIST, CIFAR-10, CIFAR-100, ImageNet, and Taskonomy datasets. Next, we construct an online neural architecture search framework using the Fisher task distance, in which we have access to the past learned tasks. By using the Fisher task distance, we can identify the closest learned tasks to the target task, and utilize the knowledge learned from these related tasks on the target task. Here, we show how the proposed distance between a target task and a set of learned tasks can be used to reduce the neural architecture search space for the target task. The complexity reduction in search space for task-specific architectures is achieved by building on the optimized architectures for similar tasks instead of doing a full search and without using this side information. Experimental results for tasks in MNIST, CIFAR-10, CIFAR-100, ImageNet datasets demonstrate the efficacy of the proposed approach and its improvements, in terms of the performance and the number of parameters, over other gradient-based search methods, such as ENAS, DARTS, PC-DARTS.

show abstract

Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective

Cited by 13 publications

References 52 publications

Neural Architecture Search for Spiking Neural Networks

Neural Architecture Search for Spiking Neural Networks

Towards Physical Plausibility in Neuroevolution Systems

Fisher Task Distance and its Application in Neural Architecture Search

Contact Info

Product

Resources

About