In this study, we provide a review of meta-heuristic methods such as Genetic Algorithms, Particle Swarm Optimization, Differential Evolution, and Bayesian Optimization that have been used extensively to optimize hyper-parameters in Convolutional Neural Networks (CNNs). We highlight the hyper-parameters that were selected for optimization in those studies, along with the value domains of those parameters. These studies reveal that the number of layers, the number of kernels and their sizes at each layer, the learning rate, and the batch size are among the hyper-parameters that affect the performance of CNNs the most.

Figure A. Structure of convolutional neural networks.

Purpose: In this study, meta-heuristic methods that have been used to optimize convolutional neural networks are investigated. A performance comparison of these methods on different image datasets is presented. The advantages and disadvantages of the optimization approaches are discussed with the aim of providing the reader with the important points that should be considered during the hyper-parameter selection process.

Results: The definition of "the best" set of hyper-parameters in convolutional neural networks depends on the problem, or in this case, on the dataset. However, it is clear from the studies that the selection of some parameters directly affects the performance of the networks. The number of layers, the number of filters in each layer and their sizes, the regularization method, the learning rate, and the batch size are among the most important parameters. Genetic Algorithms (GA) are the most widely studied technique for hyper-parameter optimization, largely because they yield successful results in most of the studies. When selecting an optimization method, one should consider the size of the problem, the available computational budget, and time; accuracy expectations should also be taken into account. For problems with a small hyper-parameter search space, methods like Grid Search are sufficient, but for problems with a large search space, meta-heuristic methods are more suitable.

Conclusion: In this study, the effect of hyper-parameter optimization methods on classification performance is investigated. GA and Particle Swarm Optimization (PSO) are the two most widely used meta-heuristics for hyper-parameter optimization. Their computational burden can be justified by the accuracy improvements they achieve. If computational resources are limited and good results are desired in a reasonable amount of time, then other methods such as TPE and SMAC are good choices.
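To make the search procedure concrete, the following is a minimal sketch of a genetic-algorithm loop over the hyper-parameters the review identifies as most influential (layers, filters, kernel size, learning rate, batch size). The value ranges, operator choices, and the placeholder fitness function are assumptions made only so the sketch runs; in practice `evaluate` would train a CNN and return its validation accuracy.

```python
import random

# Illustrative search space; the specific value ranges are assumptions.
SEARCH_SPACE = {
    "num_layers":    [2, 3, 4, 5, 6],
    "num_filters":   [16, 32, 64, 128],
    "kernel_size":   [3, 5, 7],
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3, 1e-2],
    "batch_size":    [32, 64, 128, 256],
}

def random_config():
    """Sample one candidate configuration uniformly from the space."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(config):
    """Placeholder fitness: would normally train a CNN with `config` and
    return validation accuracy. A dummy score keeps the sketch runnable."""
    return random.random()

def crossover(a, b):
    """Uniform crossover: each hyper-parameter comes from either parent."""
    return {k: random.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(config, rate=0.2):
    """Resample each hyper-parameter with probability `rate`."""
    return {k: (random.choice(SEARCH_SPACE[k]) if random.random() < rate else v)
            for k, v in config.items()}

def genetic_search(pop_size=10, generations=5, elite=2):
    population = [random_config() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[:elite]  # keep the best configurations
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - elite)]
        population = parents + children
    return max(population, key=evaluate)

if __name__ == "__main__":
    print(genetic_search())
```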
The success of Convolutional Neural Networks is highly dependent on the selected architecture and hyper-parameters. The need for automatic design of networks is especially important for complex architectures, where the parameter space is so large that trying all possible combinations is computationally infeasible. In this study, the Microcanonical Optimization algorithm, a variant of the Simulated Annealing method, is used for hyper-parameter optimization and architecture selection for Convolutional Neural Networks. To the best of our knowledge, our study provides the first attempt at applying Microcanonical Optimization to this task. The networks generated by the proposed method are compared to the networks generated by the Simulated Annealing method in terms of both accuracy and size, using six widely used image recognition datasets. Moreover, a performance comparison with the Tree Parzen Estimator, a Bayesian optimization-based approach, is also presented. It is shown that the proposed method is able to achieve competitive classification results with state-of-the-art architectures. When the size of the networks is also taken into account, the networks generated by the Microcanonical Optimization method contain far fewer parameters than the state-of-the-art architectures. Therefore, the proposed method is preferable for automatically tuning networks, especially in situations where fast training is as important as accuracy. INDEX TERMS Convolutional neural networks, hyper-parameter optimization, microcanonical optimization, tree Parzen estimator.
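For readers unfamiliar with Microcanonical Optimization, the following is a minimal sketch of its demon-based acceptance rule applied to a toy hyper-parameter encoding. The search space, neighbour move, and cost function are assumptions made only so the sketch runs and do not reproduce the paper's actual operators; `cost` would normally be one minus the validation accuracy of a trained CNN.

```python
import random

# Toy search space; ranges are illustrative assumptions.
SEARCH_SPACE = {
    "num_layers":  [2, 3, 4, 5, 6],
    "num_filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
}

def cost(config):
    """Placeholder for (1 - validation accuracy) of a CNN trained with `config`."""
    return random.random()

def neighbour(config):
    """Perturb a single randomly chosen hyper-parameter."""
    key = random.choice(list(SEARCH_SPACE))
    new = dict(config)
    new[key] = random.choice(SEARCH_SPACE[key])
    return new

def microcanonical_search(iterations=100, demon_energy=0.5):
    current = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
    current_cost = cost(current)
    best, best_cost = current, current_cost
    for _ in range(iterations):
        candidate = neighbour(current)
        delta = cost(candidate) - current_cost
        # Demon rule: improving moves are always accepted and increase the
        # demon's energy; worsening moves are accepted only if the demon can
        # "pay" for the cost increase, which then drains its energy.
        if delta <= demon_energy:
            demon_energy -= delta
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best

if __name__ == "__main__":
    print(microcanonical_search())
```

Unlike standard Simulated Annealing, this rule involves no temperature schedule or probabilistic acceptance: the demon's finite energy budget bounds how much the search can temporarily worsen the solution.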
In this study, we model CNN hyper-parameter optimization as a bi-criteria optimization problem, where the first objective is classification accuracy and the second is computational complexity, measured in terms of the number of floating-point operations. For this bi-criteria problem, we develop a Multi-Objective Simulated Annealing (MOSA) algorithm to obtain high-quality solutions with respect to both objectives. CIFAR-10 is selected as the benchmark dataset, and the MOSA trade-off fronts obtained for this dataset are compared to the fronts generated by a single-objective Simulated Annealing (SA) algorithm with respect to several front evaluation metrics such as generational distance, spacing, and spread. The comparison results suggest that the MOSA algorithm searches the objective space more effectively than the SA method. For each method, selected front solutions are trained for longer in order to observe their actual performance on the original test set. Again, the results show that MOSA performs better than SA in the multi-objective setting. The performance of the MOSA configurations is also compared to other search-generated and human-designed state-of-the-art architectures. It is shown that the network configurations generated by MOSA are not dominated by those architectures, and the proposed method can be of great use when computational complexity is as important as test accuracy.
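The notion of dominance underpinning the trade-off fronts above can be illustrated with a short sketch of the bi-criteria bookkeeping: each candidate network is scored by classification error and FLOPs (both to be minimised), and a non-dominated archive approximates the front. This is only the dominance/archive piece of a MOSA-style search, not the full algorithm; the `ConfigScore` fields and example values are illustrative assumptions, not results from the paper.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ConfigScore:
    name: str
    error: float   # 1 - test accuracy, lower is better
    gflops: float  # computational complexity, lower is better

def dominates(a: ConfigScore, b: ConfigScore) -> bool:
    """a dominates b if it is no worse in both objectives and strictly better in at least one."""
    return (a.error <= b.error and a.gflops <= b.gflops and
            (a.error < b.error or a.gflops < b.gflops))

def update_front(front: List[ConfigScore], candidate: ConfigScore) -> List[ConfigScore]:
    """Add `candidate` to the archive if no member dominates it,
    removing any members it dominates."""
    if any(dominates(member, candidate) for member in front):
        return front
    return [m for m in front if not dominates(candidate, m)] + [candidate]

if __name__ == "__main__":
    front: List[ConfigScore] = []
    for c in [ConfigScore("A", 0.08, 1.2),
              ConfigScore("B", 0.10, 0.4),
              ConfigScore("C", 0.09, 1.5)]:   # C is dominated by A
        front = update_front(front, c)
    print([c.name for c in front])            # expected: ['A', 'B']
```

A configuration is "not dominated" by the state-of-the-art architectures when no competitor is simultaneously at least as accurate and at least as cheap, which is the sense in which the MOSA solutions remain on the trade-off front.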