2020
DOI: 10.48550/arxiv.2009.01027
Preprint

DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, et al.

Abstract: Despite the fast development of differentiable architecture search (DARTS), it suffers from a long-standing instability issue in search performance, which severely limits its application. Existing robustifying methods draw clues from the outcome rather than identifying the causal factor. Various indicators, such as Hessian eigenvalues, have been proposed as signals of performance collapse, and the search should be stopped once an indicator reaches a preset threshold. However, these methods tend to easily reject…

Cited by 14 publications (32 citation statements)
References 15 publications

Citation statements (ordered by relevance):
“…In a word, we encourage fair competition within each group and avoid unfair competition between groups, making operation search more stable. The candidate operations searched by group annealing operation search are listed in …

Architecture | Test Err. (%) | Params (M) | Cost (GPU days) | Method
ResNet (He et al., 2016) | 22.10† | 1.7 | N/A | N/A
DenseNet-BC (Huang et al., 2017) | 17.18† | 25.6 | N/A | N/A
AmoebaNet (Real et al., 2019) | 18.93† | 3.1 | 3150 | EA
PNAS (Liu et al., 2018a) | 19.53† | 3.2 | 150 | SMBO
ENAS (Pham et al., 2018) | 19.43† | 4.6 | 0.45 | RL
DARTS (2nd) | 17.54† | 3.4 | 1 | GD
GDAS (Dong & Yang, 2019) | 18.38† | 3.4 | 0.2 | GD
P-DARTS | 17.49† | 3.6 | 0.3 | GD
DropNAS (Hong et al., 2020) | 16.39‡ | 4.4 | 0.7 | GD
DARTS- (Chu et al., 2020) | 17.16‡ | 3.4 | 0.4 | GD
DOTS | 15.75 | 4.2 | 0.2 | GD…”

Section: B. Discussion About the Differences Between DARTS and DOTS
Citation type: mentioning (confidence: 99%)
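The "fair competition within each group" described in this snippet can be pictured as a group-wise relaxation: each candidate operation is softmax-normalized only against the other members of its group, so groups never suppress one another directly. The sketch below illustrates that idea; the grouping, the operation names, and the function name are hypothetical and not taken from the cited paper.

```python
import torch
import torch.nn.functional as F

# Hypothetical grouping of DARTS-style candidate operations (illustrative only).
GROUPS = {
    "none": ["none"],
    "skip": ["skip_connect"],
    "pool": ["max_pool_3x3", "avg_pool_3x3"],
    "conv": ["sep_conv_3x3", "sep_conv_5x5", "dil_conv_3x3", "dil_conv_5x5"],
}
OPS = [op for members in GROUPS.values() for op in members]


def grouped_weights(alpha: torch.Tensor) -> torch.Tensor:
    """Normalize one edge's architecture logits within each group, so
    operations only compete against members of their own group."""
    chunks, start = [], 0
    for members in GROUPS.values():
        end = start + len(members)
        chunks.append(F.softmax(alpha[start:end], dim=0))
        start = end
    return torch.cat(chunks)


alpha = torch.randn(len(OPS), requires_grad=True)  # architecture parameters of one edge
print(dict(zip(OPS, grouped_weights(alpha).tolist())))
```

A single softmax over all operations, by contrast, lets operations from different groups suppress one another, which is the cross-group competition the snippet calls unfair.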
“…where $\gamma^{(i,j)} = \sum_{c \in \mathcal{E}_{x_j},\,(i,j) \in c} \beta_c^{x_j}$ aggregates the weight for edge (i, j) from all its combinations, as shown in Figure 2. To bridge the optimization gap in topology search, we anneal the operation weight α and edge combination weight β with the exponential schedule defined in Eq. 5.

Architecture | Test Err. (%) | Params (M) | Cost (GPU days) | Method
ENAS (Pham et al., 2018) | 2.89 | 4.6 | 0.5 | RL
NAONet-WS (Luo et al., 2018) | 3.53 | 3.1 | 0.4 | NAO
AmoebaNet-B (Real et al., 2019) | 2.55 ± 0.05 | 2.8 | 3150 | EA
Hierarchical Evolution (Liu et al., 2018b) | 3.75 ± 0.12 | 15.7 | 300 | EA
PNAS (Liu et al., 2018a) | 3.41 ± 0.09 | 3.2 | 225 | SMBO
DARTS | 3.00 | 3.3 | 0.4 | GD
SNAS | 2.85 | 2.8 | 1.5 | GD
GDAS (Dong & Yang, 2019) | 2.93 | 2.5 | 0.2 | GD
P-DARTS | 2.50 | 3.4 | 0.3 | GD
FairDARTS (Chu et al., 2019b) | 2.54 | 2.8 | 0.4 | GD
PC-DARTS | 2.57 ± 0.07 | 3.6 | 0.1 | GD
DropNAS (Hong et al., 2020) | 2.58 ± 0.14 | 4.1 | 0.6 | GD
MergeNAS | 2.73 ± 0.02 | 2.9 | 0.2 | GD
ASAP (Noy et al., 2020) | 2.68 ± 0.11 | 2.5 | 0.2 | GD
SDARTS-ADV (Chen & Hsieh, 2020) | 2.61 ± 0.02 | 3.3 | 1.3 | GD
DARTS- (Chu et al., 2020) | 2.59 ± 0.08 | 3.5 | 0.4 | GD
DOTS | 2.45 ± 0.04 | 4.2 | 0.2 | GD…”

Section: Edge Annealing Topology Search
Citation type: mentioning (confidence: 99%)
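The annealing mentioned in this snippet can be sketched as a temperature-annealed softmax: the operation weight α and edge-combination weight β are normalized with a temperature that decays as the search proceeds, and the edge weight γ^(i,j) is obtained by summing the weights of every combination that contains edge (i, j). Since Eq. 5 itself is not reproduced here, the schedule T_t = T_0 · θ^t and all names below are assumptions.

```python
import torch
import torch.nn.functional as F


def annealed_softmax(logits: torch.Tensor, step: int, t0: float = 5.0, theta: float = 0.9) -> torch.Tensor:
    """Softmax whose temperature decays exponentially with the search step
    (assumed schedule T_t = t0 * theta**step), sharpening the weights over time."""
    return F.softmax(logits / (t0 * theta ** step), dim=-1)


def edge_weight(beta: torch.Tensor, combinations, edge, step: int) -> torch.Tensor:
    """Aggregate gamma^(i,j): sum the annealed combination weights beta_c over
    every edge combination c that contains the given edge."""
    weights = annealed_softmax(beta, step)  # one weight per candidate combination
    mask = torch.tensor([edge in c for c in combinations])
    return weights[mask].sum()


# Illustrative use: 6 candidate pairs of incoming edges for one node.
combos = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
beta = torch.randn(len(combos), requires_grad=True)
print(edge_weight(beta, combos, edge=0, step=10))  # gamma for edge 0 at step 10
```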
“…Our method can implement a similar idea by aggregating sub-graphs. Methods based on different searching indicators [7, 56, 4] are also considered in our experiments. We also include traditional handcrafted models such as ResNet [17], DenseNet [22], MobileNet [20, 40, 19], and ShuffleNet [29].…”

Section: Compared Methods
Citation type: mentioning (confidence: 99%)
“…Besides, DARTS+ [25] and DARTS-ES [54] show that DARTS tends to collapse and overfit due to the unbalanced competition between model parameters and architecture parameters. Another line of work, such as DARTS- [7], TE-NAS [56], and RLNAS [4], uses performance indicators other than the validation loss to prevent search collapse. PC-DARTS [48] reduces memory usage by performing architecture search over a randomly sampled subset of operations.…”

Section: Related Work
Citation type: mentioning (confidence: 99%)
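The indicator-based strategy this snippet contrasts with DARTS- (and which the abstract above describes via Hessian eigenvalues) amounts to monitoring a collapse signal during search and stopping once it crosses a preset threshold. Below is a minimal power-iteration sketch of such a monitor; the function name, iteration count, and the way the threshold would be chosen are illustrative assumptions, not taken from any of the cited implementations.

```python
import torch


def dominant_hessian_eigenvalue(val_loss: torch.Tensor, alpha: torch.Tensor, iters: int = 20) -> float:
    """Estimate the largest-magnitude eigenvalue of the Hessian of the validation
    loss w.r.t. the architecture parameters via power iteration, using
    Hessian-vector products so the Hessian is never formed explicitly."""
    grad = torch.autograd.grad(val_loss, alpha, create_graph=True)[0]
    v = torch.randn_like(alpha)
    v = v / v.norm()
    eig = 0.0
    for _ in range(iters):
        hv = torch.autograd.grad(grad, alpha, grad_outputs=v, retain_graph=True)[0]
        eig = torch.dot(v.flatten(), hv.flatten()).item()  # Rayleigh quotient v^T H v
        v = hv / (hv.norm() + 1e-12)
    return eig


# Illustrative use inside a search loop (alpha are the architecture parameters):
#   eig = dominant_hessian_eigenvalue(val_loss, alpha)
#   if eig > threshold:   # threshold chosen relative to the value early in search
#       stop_search()     # the indicator-triggered early stop DARTS- argues against
```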
“…Neural Architecture Search. NAS aims at finding the optimal neural architectures specific to a dataset [3, 8, 11, 24, 27, 29, 35, 42, 50]. The search space and the search strategy are the most essential components of NAS.…”

Section: Related Work
Citation type: mentioning (confidence: 99%)