2018
DOI: 10.48550/arxiv.1806.09055
Preprint

DARTS: Differentiable Architecture Search

Abstract: This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels i…


Cited by 623 publications (1,525 citation statements)
References 20 publications
“…However, such methods require training the searched architecture from scratch for each search step, which is extremely computationally expensive. To address this, weight-sharing approaches have been proposed [4,7,8,10,29,55,66,80,86,93,103]. They train the supernet once which includes all architecture candidates.…”
Section: Neural Architecture Search
Mentioning confidence: 99%
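
The weight-sharing idea quoted above can be pictured with a minimal sketch (assumed PyTorch-style code, not taken from any of the cited papers; the operation choices are illustrative): all candidate operations live once inside a shared supernet, and any sampled sub-architecture reuses their weights instead of being trained from scratch.

```python
import torch.nn as nn

class SupernetLayer(nn.Module):
    """One searchable layer of a weight-sharing supernet (illustrative sketch)."""

    def __init__(self, channels):
        super().__init__()
        # Every architecture candidate is instantiated exactly once,
        # so its weights are shared by all sub-networks that select it.
        self.candidates = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # 3x3 conv
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),  # 5x5 conv
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),         # pooling
            nn.Identity(),                                            # skip
        ])

    def forward(self, x, op_index):
        # A sampled sub-architecture picks one candidate per layer and
        # reuses the shared weights; no per-architecture retraining.
        return self.candidates[op_index](x)
```

Training such a supernet once and then scoring candidate paths with the shared weights is what removes the from-scratch training cost mentioned in the statement.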
“…They train the supernet once which includes all architecture candidates. For instance, Darts [55] jointly optimizes the network parameters and the importance of each architecture candidate. Also, SPOS [29] trains the weight parameters with uniform forward path sampling and finds the optimal architecture via evolutionary strategy.…”
Section: Neural Architecture Search
Mentioning confidence: 99%
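
A hedged sketch of the relaxation that DARTS-style joint optimization relies on (PyTorch-style and illustrative; in DARTS the architecture parameters are kept separate from the network weights and updated on validation data, which is omitted here for brevity): each candidate operation is weighted by a softmax over learnable architecture parameters, so operation importance can be optimized by gradient descent alongside the weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted mixture of candidate operations (illustrative sketch)."""

    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        # One architecture parameter per candidate; its softmax value acts as
        # the "importance" of that operation.
        self.alpha = nn.Parameter(torch.zeros(len(ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        # Continuous relaxation: the output is a weighted sum of all candidates,
        # so the architecture choice itself becomes differentiable.
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

SPOS, as quoted above, instead trains the shared weights by sampling a single path uniformly at each step and defers the architecture decision to an evolutionary search over the trained supernet.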
“…Neural architecture search (NAS): Studies in the NAS domain [8,11,10,12,14,15,16,17,18] target automatic and fast design of DL models for the task in hand. pDarts [6] is an optimized version of the Darts algorithm [13] which is one of the most benchmarked algorithms in the NAS domain. In pDarts, a network is formed by stacking multiple cells together.…”
Section: ℳM
Mentioning confidence: 99%
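
The cell-stacking pattern mentioned for pDarts can be sketched as follows (illustrative names, not the pDARTS or DARTS API): a searched cell is treated as a reusable block, and the full network is a stack of such cells followed by a classifier head.

```python
import torch.nn as nn

def build_network(cell_factory, num_cells, channels, num_classes):
    # cell_factory is a hypothetical callable returning one searched cell
    # (an nn.Module mapping `channels` feature maps to `channels` feature maps).
    cells = [cell_factory(channels) for _ in range(num_cells)]
    return nn.Sequential(
        *cells,
        nn.AdaptiveAvgPool2d(1),   # global average pooling
        nn.Flatten(),
        nn.Linear(channels, num_classes),
    )
```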
“…After relaxation, the goal is to jointly learn the architecture and the weights within all the mixed operations by solving a bilevel optimization problem, which can be posed as to find the mixing probabilities so that validation loss is minimized given weights that are already optimized on the training set. At the end of the search, the architecture is obtained by replacing each mixed operation with the most likely one [10].…”
Section: Architecture Searching and Beyond
Mentioning confidence: 99%
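
Written out, the bilevel problem described in this statement is (notation as in the DARTS paper, with α the architecture mixing parameters and w the network weights):

```latex
\begin{aligned}
\min_{\alpha}\quad & \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha),\,\alpha\bigr) \\
\text{s.t.}\quad   & w^{*}(\alpha) \;=\; \arg\min_{w}\; \mathcal{L}_{\mathrm{train}}(w,\,\alpha)
\end{aligned}
```

At the end of the search, each mixed operation on an edge (i, j) is discretized by keeping its strongest candidate, i.e. the operation o maximizing the architecture weight α_o^(i,j), which is the "most likely one" referred to above.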