DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

Chu, Xiangxiang; Wang, Xiaoxing; Zhang, Bo; Lu, Shun; Wei, Xiaolin; Yan, Junchi

doi:10.48550/arxiv.2009.01027

Cited by 14 publications

(32 citation statements)

References 15 publications

“…In a word, we encourage fair competition within each group and avoid unfair competition between groups, making operation search more stable. The candidate operations searched by group annealing operation search are listed in…”

Table rows captured with the quote:
ResNet (He et al, 2016) | 22.10 † | 1.7 | N/A | N/A
DenseNet-BC (Huang et al, 2017) | 17.18 † | 25.6 | N/A | N/A
AmoebaNet (Real et al, 2019) | 18.93 † | 3.1 | 3150 | EA
PNAS (Liu et al, 2018a) | 19.53 † | 3.2 | 150 | SMBO
ENAS (Pham et al, 2018) | 19.43 † | 4.6 | 0.45 | RL
DARTS (2nd) | 17.54 † | 3.4 | 1 | GD
GDAS (Dong & Yang, 2019) | 18.38 † | 3.4 | 0.2 | GD
P-DARTS | 17.49 † | 3.6 | 0.3 | GD
DropNAS (Hong et al, 2020) | 16.39 ‡ | 4.4 | 0.7 | GD
DARTS- (Chu et al, 2020) | 17.16 ‡ | 3.4 | 0.4 | GD
DOTS | 15.75 | 4.2 | 0.2 | GD

Section: B Discussion About the Differences Between DARTS and DOTS (mentioning)

confidence: 99%

“…where $\gamma^{(i,j)} = \sum_{c \in E_{x_j},\, (i,j) \in c} \beta^{c}_{x_j}$ aggregates the weight for edge (i, j) from all its combinations, as shown in Figure 2. To bridge the optimization gap in topology search, we anneal the operation weight α and edge combination weight β with the exponential schedule defined in Eq. 5.…”

Table rows captured with the quote:
… (Pham et al, 2018) | 2.89 | 4.6 | 0.5 | RL
NAONet-WS (Luo et al, 2018) | 3.53 | 3.1 | 0.4 | NAO
AmoebaNet-B (Real et al, 2019) | 2.55 ± 0.05 | 2.8 | 3150 | EA
Hierarchical Evolution (Liu et al, 2018b) | 3.75 ± 0.12 | 15.7 | 300 | EA
PNAS (Liu et al, 2018a) | 3.41 ± 0.09 | 3.2 | 225 | SMBO
DARTS | 3.00 | 3.3 | 0.4 | GD
SNAS | 2.85 | 2.8 | 1.5 | GD
GDAS (Dong & Yang, 2019) | 2.93 | 2.5 | 0.2 | GD
P-DARTS | 2.50 | 3.4 | 0.3 | GD
FairDARTS (Chu et al, 2019b) | 2.54 | 2.8 | 0.4 | GD
PC-DARTS | 2.57 ± 0.07 | 3.6 | 0.1 | GD
DropNAS (Hong et al, 2020) | 2.58 ± 0.14 | 4.1 | 0.6 | GD
MergeNAS | 2.73 ± 0.02 | 2.9 | 0.2 | GD
ASAP (Noy et al, 2020) | 2.68 ± 0.11 | 2.5 | 0.2 | GD
SDARTS-ADV (Chen & Hsieh, 2020) | 2.61 ± 0.02 | 3.3 | 1.3 | GD
DARTS- (Chu et al, 2020) | 2.59 ± 0.08 | 3.5 | 0.4 | GD
DOTS | 2.45 ± 0.04 | 4.2 | 0.2 | GD

Section: Edge Annealing Topology Search (mentioning)

confidence: 99%
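
The edge-weight aggregation quoted above can be illustrated with a small sketch. It assumes that the combinations for a node are all k-subsets of its incoming edges, that β is a temperature softmax over one logit per combination, and that the exponential schedule has the form T(t) = T0 · decay^t. None of these details are given in the quoted passage (Eq. 5 is not reproduced there), and the function names `edge_weights` and `exponential_temperature` are hypothetical.

```python
import itertools
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def edge_weights(num_inputs, combo_logits, k=2, temperature=1.0):
    """Aggregate combination weights (beta) into per-edge weights (gamma).

    Assumption: the candidate combinations of a node are all k-subsets
    of its `num_inputs` incoming edges, in itertools order.
    """
    combos = list(itertools.combinations(range(num_inputs), k))
    if len(combos) != len(combo_logits):
        raise ValueError("expected one logit per edge combination")
    beta = softmax(combo_logits, temperature)
    gamma = [0.0] * num_inputs
    # gamma[i] sums beta over every combination that contains edge i.
    for combo, weight in zip(combos, beta):
        for edge in combo:
            gamma[edge] += weight
    return gamma


def exponential_temperature(epoch, t0=10.0, decay=0.9):
    """Assumed annealing form T(t) = t0 * decay**t (not taken from Eq. 5)."""
    return t0 * decay ** epoch


uniform = edge_weights(3, [0.0, 0.0, 0.0])
sharp = edge_weights(3, [1.0, 0.0, 0.0], temperature=0.01)
```

With equal logits every edge receives the same weight, and because each combination contributes to k edges, the per-edge weights sum to k. As the temperature falls, the weights concentrate on the edges of the highest-scoring combination.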

DOTS: Decoupling Operation and Topology in Differentiable Architecture Search

Gu, Liu, Yang et al. 2020. Preprint

Differentiable Architecture Search (DARTS) has attracted extensive attention due to its efficiency in searching for cell structures. However, DARTS mainly focuses on the operation search, leaving the cell topology implicitly depending on the searched operation weights. Hence, a problem is raised: can cell topology be well represented by the operation weights? The answer is negative because we observe that the operation weights fail to indicate the performance of cell topology. In this paper, we propose to Decouple the Operation and Topology Search (DOTS), which decouples the cell topology representation from the operation weights to make an explicit topology search. DOTS is achieved by defining an additional cell topology search space besides the original operation search space. Within the DOTS framework, we propose group annealing operation search and edge annealing topology search to bridge the optimization gap between the searched over-parameterized network and the derived child network. DOTS is efficient and only costs 0.2 and 1 GPU-day to search the state-of-the-art cell architectures on CIFAR and ImageNet, respectively. By further searching for the topology of DARTS' searched cell, we can improve DARTS' performance significantly. The code will be publicly available.

“…Our method can implement the similar idea by aggregating sub-graphs. Methods from different searching indicators [7,56,4] are also considered in our experiments. We also included the traditional handcraft models like ResNet [17], DenseNet [22], MobileNet [20,40,19] and ShuffleNet [29].…”

Section: Compared Methods (mentioning)

confidence: 99%

“…Besides, DARTS+ [25] and DARTS-ES [54] show that DARTS tends to collapse and overfit due to the unbalanced competition between model parameters and architecture parameters. Another line of work, including DARTS- [7], TE-NAS [56] and RLNAS [4], uses performance indicators other than validation loss to prevent search collapse. PC-DARTS [48] reduces the memory usage by performing architecture search in a randomly sampled subset of operations.…”

Section: Related Work (mentioning)

confidence: 99%

Mutually-aware Sub-Graphs Differentiable Architecture Search

Tan, Guo, Zhong et al. 2021. Preprint

Differentiable architecture search is prevalent in the field of NAS because of its simplicity and efficiency, where two paradigms, multi-path algorithms and single-path methods, dominate. The multi-path framework (e.g. DARTS) is intuitive but suffers from high memory usage and training collapse. Single-path methods (e.g. GDAS and Proxyless-NAS) mitigate the memory issue and shrink the gap between searching and evaluation but sacrifice performance. In this paper, we propose a conceptually simple yet efficient method to bridge these two paradigms, referred to as Mutually-aware Sub-Graphs Differentiable Architecture Search (MSG-DAS). The core of our framework is a differentiable Gumbel-TopK sampler that produces multiple mutually exclusive single-path sub-graphs. To alleviate the more severe skip-connect issue brought by the multiple sub-graphs setting, we propose a Dropblock-Identity module to stabilize the optimization. To make the best use of the available models (super-net and sub-graphs), we introduce a memory-efficient super-net guidance distillation to improve training. The proposed framework strikes a balance between flexible memory usage and searching quality. We demonstrate the effectiveness of our methods on ImageNet and CIFAR10, where the searched models show performance comparable to the most recent approaches.
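
The sampling idea in this abstract can be sketched with the standard Gumbel top-k trick: perturb each operation logit with Gumbel noise and keep the k largest, which yields k distinct operations per edge. The sketch below covers only this forward sampling step and is not differentiable. How MSG-DAS relaxes the sampler for gradient flow is not described in the abstract, and the rule used here to assemble sub-graphs (sub-graph s takes the s-th ranked operation on every edge) is an assumption. The names `gumbel_topk` and `sample_subgraphs` are hypothetical.

```python
import math
import random


def gumbel_topk(logits, k, rng):
    """Return k distinct indices sampled without replacement via Gumbel top-k."""
    if k > len(logits):
        raise ValueError("k cannot exceed the number of candidates")
    perturbed = []
    for index, logit in enumerate(logits):
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise, index))
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


def sample_subgraphs(edge_logits, k, rng):
    """Build k single-path sub-graphs that never share an operation on any edge.

    Assumed assembly rule: sub-graph s uses the s-th ranked sampled
    operation on every edge.
    """
    picks = [gumbel_topk(logits, k, rng) for logits in edge_logits]
    return [[edge_picks[s] for edge_picks in picks] for s in range(k)]


rng = random.Random(0)
edge_logits = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]  # 3 edges, 4 candidate ops
subgraphs = sample_subgraphs(edge_logits, k=2, rng=rng)
```

Each sub-graph is a list holding one operation index per edge. Because the k indices drawn for an edge are distinct, no two sub-graphs pick the same operation on the same edge.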

“…Neural Architecture Search. NAS aims at finding the optimal neural architectures specific to a dataset [3,8,11,24,27,29,35,42,50]. Search space and search strategy are the most essential components in NAS.…”

Section: Related Work (mentioning)

confidence: 99%

Edge-featured Graph Neural Architecture Search

Cai, Zhang, Han et al. 2021. Preprint

Graph neural networks (GNNs) have been successfully applied to learning representations on graphs in many relational tasks. Recently, researchers have studied neural architecture search (NAS) to reduce the dependence on human expertise and explore better GNN architectures, but they over-emphasize entity features and ignore latent relation information concealed in the edges. To solve this problem, we incorporate edge features into the graph search space and propose Edge-featured Graph Neural Architecture Search (EGNAS) to find the optimal GNN architecture. Specifically, we design rich entity and edge updating operations to learn high-order representations, which convey more generic message passing mechanisms. Moreover, the architecture topology in our search space allows exploring complex feature dependence of both entities and edges, which can be efficiently optimized by a differentiable search strategy. Experiments on three graph tasks across six datasets show EGNAS can search better GNNs with higher performance than current state-of-the-art human-designed and search-based GNNs.
