2019
DOI: 10.48550/arxiv.1905.04919
Preprint

BayesNAS: A Bayesian Approach for Neural Architecture Search

Cited by 24 publications (15 citation statements)
References 25 publications
“…Architecture Space The DARTS operation space O contains eight choices: none (zero), skip connection, separable convolution 3 × 3 and 5 × 5, dilated separable convolution 3 × 3 and 5 × 5, max pooling 3 × 3, and average pooling 3 × 3. Following previous works (Liu et al, 2018b; Chen et al, 2019; Xu et al, 2019), for evaluation phases, we stack 20 cells to compose the network and set the …

Architecture | Test Err. (%) | Params (M) | Search Cost (GPU-days) | Search Method
(Liu et al, 2018a) | 3.41(0.09) | 3.2 | 225 | SMBO
ENAS (Pham et al, 2018) | 2.89 | 4.6 | 0.5 | RL
NASNet-A | 2.65 | 3.3 | 2000 | RL
DARTS (1st) (Liu et al, 2018b) | 3.00(0.14) | 3.3 | 0.4 | gradient
DARTS (2nd) (Liu et al, 2018b) | 2.76(0.09) | 3.3 | 1.0 | gradient
SNAS (Xie et al, 2018) | 2.85(0.02) | 2.8 | 1.5 | gradient
GDAS (Dong & Yang, 2019) | 2.82 | 2.5 | 0.17 | gradient
BayesNAS (Zhou et al, 2019) | 2.81(0.04) | 3.4 | 0.2 | gradient
ProxylessNAS (Cai et al, 2018) † | 2.08 | 5.7 | 4.0 | gradient
P-DARTS (Chen et al, 2019) | 2.50 | 3.4 | 0.3 | gradient
PC-DARTS (Xu et al, 2019) | 2.57(0.07) | 3.6 | 0.1 | gradient
SDARTS-ADV (Chen & Hsieh, 2020) | 2.61(0.02) | … | … | …

† … (Han et al, 2017) as the backbone. ‡ Recorded on a single GTX 1080Ti GPU.…”
Section: Results On CIFAR-10 With DARTS Search Space
confidence: 99%
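The excerpt above enumerates the eight candidate operations of the DARTS cell search space and the 20-cell evaluation network used by the citing work. The following Python snippet is a minimal sketch of that operation list; the string identifiers and the CANDIDATE_OPS name are illustrative assumptions, not taken from any of the cited implementations.

# Minimal sketch of the eight-operation DARTS search space named in the
# excerpt above. Operation names are illustrative labels, not the exact
# identifiers used by any particular DARTS codebase.
CANDIDATE_OPS = [
    "none",          # zero operation (drops the edge)
    "skip_connect",  # identity / skip connection
    "sep_conv_3x3",  # separable convolution 3x3
    "sep_conv_5x5",  # separable convolution 5x5
    "dil_conv_3x3",  # dilated separable convolution 3x3
    "dil_conv_5x5",  # dilated separable convolution 5x5
    "max_pool_3x3",  # max pooling 3x3
    "avg_pool_3x3",  # average pooling 3x3
]

# In the evaluation phase described above, 20 cells are stacked to build
# the final network; each edge in a cell selects one operation from
# CANDIDATE_OPS.
assert len(CANDIDATE_OPS) == 8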
“…Searching on ImageNet takes a longer time than on CIFAR-10 due to the larger input size and more network parameters.

Architecture | Top-1 Err. (%) | Top-5 Err. (%) | Params (M) | Search Cost (GPU-days) | Search Method
(Real et al, 2019) | 24.3 | 7.6 | 6.4 | 3150 | evolution
PNAS (Liu et al, 2018a) | 25.8 | 8.1 | 5.1 | 225 | SMBO
MnasNet-92 (Tan et al, 2019) | 25.2 | 8.0 | 4.4 | - | RL
DARTS (2nd) (Liu et al, 2018b) | 26.7 | 8.7 | 4.7 | 4.0 | gradient
SNAS (mild) (Xie et al, 2018) | 27.3 | 9.2 | 4.3 | 1.5 | gradient
GDAS (Dong & Yang, 2019) | 26.0 | 8.5 | 5.3 | 0.21 | gradient
BayesNAS (Zhou et al, 2019) | 26.5 | 8.9 | 3.9 | 0.2 | gradient
P-DARTS (CIFAR-10) (Chen et al, 2019) | 24.4 | 7.4 | 4.9 | 0.3 | gradient
P-DARTS (CIFAR-100) (Chen et al, 2019) | 24.7 | 7.5 | 5.1 | 0.3 | gradient
PC-DARTS (CIFAR-10) (Xu et al, 2019) | 25…”
Section: Results On ImageNet With DARTS Search Space
confidence: 99%
“…NAS is usually time-consuming. We have seen great improvements from 24,000 GPU-days [26] to 0.2 GPU-days [23]. The trick is to first construct a supernet containing the complete search space and train the candidates all at once with bi-level optimization and efficient weight sharing [12].…”
Section: Neural Architecture Search
confidence: 99%
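The excerpt above summarizes the one-shot trick: build a supernet over the whole search space and train all candidates at once with bi-level optimization and weight sharing. Below is a minimal Python sketch of one such alternating, first-order update step; supernet, the two optimizers, the batches, and criterion are assumed to be supplied by the caller (e.g. PyTorch modules and optimizers) and are not part of any cited codebase.

def darts_like_search_step(supernet, w_optimizer, alpha_optimizer,
                           train_batch, val_batch, criterion):
    """One alternating, first-order bi-level update: architecture
    parameters are updated on validation data, shared supernet weights
    on training data (a sketch, not the exact procedure of the cited works)."""
    x_val, y_val = val_batch
    x_trn, y_trn = train_batch

    # Upper level: update architecture parameters (alpha) on the validation loss.
    alpha_optimizer.zero_grad()
    criterion(supernet(x_val), y_val).backward()
    alpha_optimizer.step()

    # Lower level: update the shared supernet weights (w) on the training loss.
    w_optimizer.zero_grad()
    criterion(supernet(x_trn), y_trn).backward()
    w_optimizer.step()

Because every candidate architecture reuses the supernet's weights, repeating this step over the training schedule amortizes the cost of evaluating candidates, which is how the search cost drops to a fraction of a GPU-day.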
“…Despite the inspiring success of NAS, the search space of conventional NAS algorithms is extremely large, making an exhaustive search for the optimal network computationally prohibitive. To fit within the search budget, heuristic search methods are usually leveraged; they can be mainly categorized into reinforcement learning-based [25,26], evolution-based [9,35], Bayesian optimization-based [37,29], and gradient-based methods [19,1,34,14].…”
Section: Introduction
confidence: 99%
“…Though existing one-shot NAS methods have achieved impressive performance, they often consider each layer separately while ignoring the dependencies between the operation choices at different layers, which leads to an inaccurate description and evaluation of the neural architectures during the search. For example, Gaussian processes (GP) in Bayesian optimization require that the input attributes (OPs) be independent of each other [37,29], and the crossover and mutation of OPs in evolutionary search are often carried out separately at each layer [9,35]. In fact, for a feedforward neural network, the choice at a specific layer relates to its previous layers and contributes to its subsequent layers.…”
Section: Introduction
confidence: 99%
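The independence assumption called out in the excerpt can be made concrete: if each layer's operation choice is one-hot encoded and a GP kernel is applied to the concatenation of those per-layer blocks, the similarity contributed by one layer ignores what is chosen at every other layer. The NumPy sketch below is an illustrative assumption on my part (the encoding, the RBF kernel, and NUM_OPS are not taken from the cited GP/BO methods).

# Sketch: an RBF kernel over concatenated per-layer one-hot encodings
# factorizes over layers, so each layer is scored as an independent attribute.
import numpy as np

NUM_OPS = 8  # size of the per-layer operation vocabulary (assumed)

def encode(arch):
    """arch: list of per-layer operation indices -> flat one-hot vector."""
    x = np.zeros((len(arch), NUM_OPS))
    x[np.arange(len(arch)), arch] = 1.0
    return x.ravel()

def rbf_kernel(a, b, lengthscale=1.0):
    """k(a, b) = exp(-||x_a - x_b||^2 / (2 s^2)) = prod over layers of the
    per-layer factor, so layer l's contribution ignores all other layers."""
    d = encode(a) - encode(b)
    return float(np.exp(-np.dot(d, d) / (2.0 * lengthscale ** 2)))

# Two pairs of architectures that differ only at layer 0 receive the same
# kernel value no matter what the remaining layers are, illustrating how
# inter-layer dependencies are invisible to such a surrogate.
print(rbf_kernel([0, 1, 2], [3, 1, 2]), rbf_kernel([0, 5, 6], [3, 5, 6]))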