2020
DOI: 10.1109/access.2020.2981072
Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks

Abstract: Compared to traditional machine learning models, deep neural networks (DNN) are known to be highly sensitive to the choice of hyperparameters. While the time and effort required for manual tuning have been decreasing rapidly for well-developed and commonly used DNN architectures, DNN hyperparameter optimization will undoubtedly continue to be a major burden whenever a new DNN architecture needs to be designed, a new task needs to be solved, a new dataset needs to be addressed, or an existing DNN needs t…

Cited by 143 publications (107 citation statements)
References 15 publications
“…Because of this, we consider hyper-parameter tuning to be the essential task of this research; its main goal is to improve the baseline approach (with the initial ANN architecture and initial hyper-parameter values chosen by a human expert according to theoretical insights) by a significant margin. Examples of methods used for optimizing ANN hyper-parameters include various nature-inspired heuristics such as monarch butterfly optimization, swarm intelligence, Bayesian optimization (Cho et al, 2020), multi-threaded training (Połap et al, 2018), evolutionary optimization (Cui & Bai, 2019), genetic algorithms (Han et al, 2020), the harmony search algorithm (Kim, Geem & Han, 2020), simulated annealing (Lima, Ferreira Junior & Oliveira, 2020), Pareto optimization (Plonis et al, 2020), gradient descent optimization of a directed acyclic graph (Zhang et al, 2020), and others.…”
Section: Introduction (mentioning)
confidence: 99%
“…Whereas the basic assessments were performed using the algorithms' default hyperparameter values, we wanted to observe how much improvement is possible through hyperparameter optimization (HPO). For the most important hyperparameter, the learning rate, we used diversified Bayesian optimization [39] as the HPO algorithm and assessed the additional improvement. The learning-rate range for one-for-all and transfer learning was [10^-6.0, 10^-0.2].…”
Section: Hyperparameter Optimization (HPO) (mentioning)
confidence: 99%
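For orientation, the following is a minimal sketch of Bayesian optimization over a log-scaled learning-rate range like the one quoted above. It assumes scikit-optimize (skopt) as the BO backend and uses a synthetic stand-in objective so it runs on its own; it is not the "diversified" Bayesian optimization variant of the cited paper, and train-and-validate logic for an actual DNN would replace the surrogate objective.

import numpy as np
from skopt import gp_minimize
from skopt.space import Real

# Quoted search range [10^-6.0, 10^-0.2], sampled log-uniformly.
space = [Real(10**-6.0, 10**-0.2, prior="log-uniform", name="learning_rate")]

def objective(params):
    # Stand-in for "train the DNN with this learning rate, return validation loss".
    # A bowl-shaped function in log10(lr) is used here so the sketch is runnable;
    # it pretends the best learning rate is near 1e-3.
    lr = params[0]
    return (np.log10(lr) + 3.0) ** 2

result = gp_minimize(objective, space, n_calls=25, random_state=0)
print("best learning rate:", result.x[0], "best (surrogate) loss:", result.fun)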
“…Weight updates are performed using Equation (9), which is normalized by Equation (8), based on loss-function optimization [22].…”
Section: Root Mean Square Propagation (RMSProp) Optimization (mentioning)
confidence: 99%
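The quoted Equations (8) and (9) are not reproduced in this excerpt. As a reference point only, and assuming the quote follows the standard RMSProp rule, the update keeps a running average of squared gradients and normalizes the step by its square root:

\[
v_t = \rho\, v_{t-1} + (1-\rho)\, g_t^2, \qquad
\theta_t = \theta_{t-1} - \frac{\eta}{\sqrt{v_t} + \epsilon}\, g_t
\]

where \(g_t\) is the gradient of the loss with respect to the weights, \(\rho\) is the decay rate (commonly 0.9), \(\eta\) is the learning rate, and \(\epsilon\) is a small constant for numerical stability.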