2018
DOI: 10.1142/s1469026818500086
Speeding up the Hyperparameter Optimization of Deep Convolutional Neural Networks

Abstract: Most learning algorithms require the practitioner to manually set the values of many hyperparameters before the learning process can begin. However, with modern algorithms, the evaluation of a given hyperparameter setting can take a considerable amount of time and the search space is often very high-dimensional. We suggest using a lower-dimensional representation of the original data to quickly identify promising areas in the hyperparameter space. This information can then be used to initialize the optimizatio…
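The abstract's coarse-to-fine idea can be sketched as follows. The toy objective, the candidate count, and the two-stage split below are illustrative assumptions for this sketch, not the paper's actual experimental setup: many candidates are screened cheaply on a reduced version of the data, and only the most promising ones are re-evaluated at full cost.

```python
import random

# Hypothetical toy objective: validation error of a model for a given
# learning rate, evaluated either on the reduced data (cheap) or on the
# full data (expensive). The quadratic form and the slightly shifted
# optima are illustrative assumptions.
def val_error(lr, downsampled):
    optimum = 0.011 if downsampled else 0.010  # the two optima roughly agree
    return (lr - optimum) ** 2

random.seed(0)
# Log-uniform learning-rate candidates in [1e-4, 1].
candidates = [10 ** random.uniform(-4, 0) for _ in range(50)]

# Stage 1: screen all candidates cheaply on the lower-dimensional data.
coarse = sorted(candidates, key=lambda lr: val_error(lr, downsampled=True))

# Stage 2: initialize the expensive full-data search from the best
# coarse candidates instead of starting from scratch.
top = coarse[:5]
best = min(top, key=lambda lr: val_error(lr, downsampled=False))
print(f"best learning rate after refinement: {best:.4f}")
```

Because the cheap and expensive objectives share roughly the same optimum, the coarse screen discards most of the search space before any expensive evaluation is spent.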

Cited by 102 publications (53 citation statements)
References 22 publications
“…Methods for details). Tuning hyperparameters in deep neural networks, especially in complex models such as GANs, can be computationally intensive [60], [61]. Thus, it is quite common in deep learning research to perform one-fold cross-validation [30], [35] or even directly adopt hyperparameter selection from published work [24], [28], [29], [38], [48], [62].…”
Section: Network Training (mentioning)
confidence: 99%
“…In contrast, Bayesian optimization selects the next sampled hyperparameters based on previous evaluations. This has proven more efficient in terms of balancing exploration-exploitation of the search space, time consumption, and model performance results, compared to random search (Bergstra et al. 2013a; Hinz et al. 2018).…”
Section: Hyperparameter Optimization of Adaptive Neural Network (mentioning)
confidence: 99%
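The sequential loop this excerpt contrasts with random search can be sketched in a few lines. The 1-D objective, the quadratic surrogate, and the distance-based exploration bonus below are illustrative assumptions; real Bayesian optimization typically fits a Gaussian process and maximizes an acquisition function such as expected improvement.

```python
import numpy as np

# Toy 1-D objective standing in for validation loss as a function of a
# single hyperparameter (shape chosen only for illustration).
def loss(x):
    return (x - 0.3) ** 2 + 0.05 * np.sin(20 * x)

rng = np.random.default_rng(0)
X = list(rng.uniform(0, 1, size=3))   # a few random initial evaluations
Y = [loss(x) for x in X]

grid = np.linspace(0, 1, 201)
for _ in range(10):
    # Surrogate: quadratic least-squares fit to all evaluations so far
    # (a stand-in for the Gaussian process used in real BO).
    coef = np.polyfit(X, Y, deg=2)
    mean = np.polyval(coef, grid)
    # Acquisition: predicted loss minus an exploration bonus that grows
    # with distance to the nearest point already evaluated.
    dist = np.min(np.abs(grid[:, None] - np.array(X)[None, :]), axis=1)
    acq = mean - 0.5 * dist
    x_next = grid[int(np.argmin(acq))]
    X.append(x_next)            # unlike random search, the next sample
    Y.append(loss(x_next))      # depends on everything seen so far

print(f"best x found: {X[int(np.argmin(Y))]:.3f}")
```

The key structural difference from random search is visible in the loop: each new sample is chosen by a model fitted to all previous evaluations, trading off predicted loss against unexplored regions.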
“…Most of such techniques are based on learning curve extrapolation [25] and surrogate models using RNN predictor [52] that aim at predicting and eliminating poor architectures before full training. Another idea to estimate performance and rank designed architectures is to use simplified (proxy) metrics for training such as data subsets (mini-batches) [29] and down-sampled data (like images with lower resolution) [53].…”
Section: Architecture Search Accelerators (mentioning)
confidence: 99%
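The down-sampled-data proxy mentioned in this excerpt can be sketched directly. The toy image array, the pooling factor, and the shapes below are illustrative assumptions; the point is only that candidates would first be ranked on the cheap low-resolution copy.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((8, 32, 32))  # toy stand-in for a training set

# Average-pool each square image by `factor`, shrinking the input a
# proxy model has to process during the cheap ranking phase.
def downsample(images, factor):
    n, h, w = images.shape
    return images.reshape(
        n, h // factor, factor, w // factor, factor
    ).mean(axis=(2, 4))

proxy = downsample(images, factor=4)
print(images.shape, "->", proxy.shape)   # (8, 32, 32) -> (8, 8, 8)
# Candidate architectures would first be ranked on `proxy` (cheap),
# and only the top-ranked ones trained on the full-resolution data.
```

A 4x down-sampling cuts the pixel count per image by 16x, which is where the acceleration in this family of methods comes from.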