2018
DOI: 10.48550/arxiv.1806.09055
Preprint

DARTS: Differentiable Architecture Search

Abstract: This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels i…


Cited by 623 publications (1,525 citation statements)
References 20 publications
“…However, such methods require training the searched architecture from scratch for each search step, which is extremely computationally expensive. To address this, weight-sharing approaches have been proposed [4,7,8,10,29,55,66,80,86,93,103]. They train the supernet once which includes all architecture candidates.…”
Section: Neural Architecture Search
Mentioning confidence: 99%
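
The weight-sharing idea quoted above can be pictured with a minimal sketch (assumed PyTorch-style code, not taken from any of the cited papers; the operation choices are illustrative): all candidate operations live once inside a shared supernet, and any sampled sub-architecture reuses their weights instead of being trained from scratch.

```python
import torch.nn as nn

class SupernetLayer(nn.Module):
    """One searchable layer of a weight-sharing supernet (illustrative sketch)."""

    def __init__(self, channels):
        super().__init__()
        # Every architecture candidate is instantiated exactly once,
        # so its weights are shared by all sub-networks that select it.
        self.candidates = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # 3x3 conv
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),  # 5x5 conv
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),         # pooling
            nn.Identity(),                                            # skip
        ])

    def forward(self, x, op_index):
        # A sampled sub-architecture picks one candidate per layer and
        # reuses the shared weights; no per-architecture retraining.
        return self.candidates[op_index](x)
```

Training such a supernet once and then scoring candidate paths with the shared weights is what removes the from-scratch training cost mentioned in the statement.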
“…They train the supernet once which includes all architecture candidates. For instance, Darts [55] jointly optimizes the network parameters and the importance of each architecture candidate. Also, SPOS [29] trains the weight parameters with uniform forward path sampling and finds the optimal architecture via evolutionary strategy.…”
Section: Neural Architecture Search
Mentioning confidence: 99%
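
A hedged sketch of the relaxation that DARTS-style joint optimization relies on (PyTorch-style and illustrative; in DARTS the architecture parameters are kept separate from the network weights and updated on validation data, which is omitted here for brevity): each candidate operation is weighted by a softmax over learnable architecture parameters, so operation importance can be optimized by gradient descent alongside the weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted mixture of candidate operations (illustrative sketch)."""

    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        # One architecture parameter per candidate; its softmax value acts as
        # the "importance" of that operation.
        self.alpha = nn.Parameter(torch.zeros(len(ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        # Continuous relaxation: the output is a weighted sum of all candidates,
        # so the architecture choice itself becomes differentiable.
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

SPOS, as quoted above, instead trains the shared weights by sampling a single path uniformly at each step and defers the architecture decision to an evolutionary search over the trained supernet.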
“…Neural architecture search (NAS): Studies in the NAS domain [8,11,10,12,14,15,16,17,18] target automatic and fast design of DL models for the task in hand. pDarts [6] is an optimized version of the Darts algorithm [13] which is one of the most benchmarked algorithms in the NAS domain. In pDarts, a network is formed by stacking multiple cells together.…”
Section: ℳM
Mentioning confidence: 99%
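
The cell-stacking pattern mentioned for pDarts can be sketched as follows (illustrative names, not the pDARTS or DARTS API): a searched cell is treated as a reusable block, and the full network is a stack of such cells followed by a classifier head.

```python
import torch.nn as nn

def build_network(cell_factory, num_cells, channels, num_classes):
    # cell_factory is a hypothetical callable returning one searched cell
    # (an nn.Module mapping `channels` feature maps to `channels` feature maps).
    cells = [cell_factory(channels) for _ in range(num_cells)]
    return nn.Sequential(
        *cells,
        nn.AdaptiveAvgPool2d(1),   # global average pooling
        nn.Flatten(),
        nn.Linear(channels, num_classes),
    )
```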
“…After relaxation, the goal is to jointly learn the architecture and the weights within all the mixed operations by solving a bilevel optimization problem, which can be posed as to find the mixing probabilities so that validation loss is minimized given weights that are already optimized on the training set. At the end of the search, the architecture is obtained by replacing each mixed operation with the most likely one [10].…”
Section: Architecture Searching and Beyond
Mentioning confidence: 99%
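
Written out, the bilevel problem described in this statement is (notation as in the DARTS paper, with α the architecture mixing parameters and w the network weights):

```latex
\begin{aligned}
\min_{\alpha}\quad & \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha),\,\alpha\bigr) \\
\text{s.t.}\quad   & w^{*}(\alpha) \;=\; \arg\min_{w}\; \mathcal{L}_{\mathrm{train}}(w,\,\alpha)
\end{aligned}
```

At the end of the search, each mixed operation on an edge (i, j) is discretized by keeping its strongest candidate, i.e. the operation o maximizing the architecture weight α_o^(i,j), which is the "most likely one" referred to above.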