Theoretical analysis of batch and on-line training for gradient descent learning in neural networks
2009
DOI: 10.1016/j.neucom.2009.05.017

Cited by 52 publications (15 citation statements)
References 8 publications
“…Batch update (updating weights after computing the gradient across the whole dataset) with simple back-propagation learning has been suggested to be more powerful than online learning (where weights are updated after every iteration), based on theoretical considerations [31]. We tested batch update on this QSAR benchmark with the learning rate set to 1/(2 · DatasetSize).…”
Section: Results (mentioning)
Confidence: 99%
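To make the batch-versus-online distinction in the statement above concrete, the following is a minimal sketch, not taken from the cited papers: both update schemes on a linear model with squared-error loss. The function names, the toy data, and the reuse of the quoted 1/(2 · DatasetSize) learning rate are illustrative assumptions only.

```python
# Minimal sketch (not from the cited papers): batch vs. online gradient descent
# on a linear least-squares model. The learning rate 1 / (2 * N) mirrors the
# setting quoted above; the data and model are purely illustrative.
import numpy as np

def batch_gd(X, y, epochs=100):
    """Batch update: one weight update per pass, using the gradient over all samples."""
    n, d = X.shape
    w = np.zeros(d)
    lr = 1.0 / (2.0 * n)                    # learning rate = 1 / (2 * DatasetSize)
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / n        # mean squared-error gradient over the whole set
        w -= lr * grad
    return w

def online_gd(X, y, epochs=100):
    """Online (stochastic) update: weights change after every single sample."""
    n, d = X.shape
    w = np.zeros(d)
    lr = 1.0 / (2.0 * n)
    for _ in range(epochs):
        for i in np.random.permutation(n):
            grad_i = (X[i] @ w - y[i]) * X[i]   # gradient contribution of one sample
            w -= lr * grad_i
    return w

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=200)
print(batch_gd(X, y)[:3], online_gd(X, y)[:3])
```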
“…Whereas most of the other second-order gradient-descent methods and metaheuristic algorithms can only use batch-mode training. However, batch-mode (offline) training of an FNN can at least guarantee a local minimum under a simple condition compared to stochastic/online training, and for a larger dataset, batch-mode training can be faster than stochastic training [86].…”
Section: Comments on Conventional Approaches (mentioning)
Confidence: 99%
“…The computation from the input layer to the output layer is called forward propagation. It calculates the cost function between the classification result and the ground-truth types and aims to make it as small as possible, which brings the classifier's predictions as close as possible to the real land cover types. In deep learning, a variety of gradient descent methods have been developed based on gradient descent, such as stochastic gradient descent (SGD) [52], batch gradient descent (BGD) [53] and mini-batch gradient descent (MBGD) [54]. The difference between them is that each training step uses one training sample, all training samples, or a certain number of training samples, respectively, to train the CNN model.…”
Section: F-R CNN Model (mentioning)
Confidence: 99%
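As a companion to the SGD/BGD/MBGD distinction described above, here is a small illustrative sketch (assumed, not from the cited work): a single training loop on a linear least-squares model in which only the batch_size argument changes, so that batch_size=1 reproduces SGD, batch_size=len(X) reproduces BGD, and anything in between is MBGD.

```python
# Illustrative sketch (not from the cited work): SGD, BGD, and MBGD differ only
# in how many training samples contribute to each weight update.
import numpy as np

def gradient_descent(X, y, batch_size, epochs=50, lr=0.01):
    """Generic loop: batch_size=1 -> SGD, batch_size=len(X) -> BGD, otherwise MBGD."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = np.random.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            grad = Xb.T @ (Xb @ w - yb) / len(idx)   # squared-error gradient on this batch
            w -= lr * grad
    return w

# One sample per update (SGD), all samples (BGD), or a fixed chunk (MBGD)
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5])
w_sgd  = gradient_descent(X, y, batch_size=1)
w_bgd  = gradient_descent(X, y, batch_size=len(X))
w_mbgd = gradient_descent(X, y, batch_size=32)
```

The single parameterized loop is only a convenient way to show that the three methods share the same update rule and differ solely in the amount of data averaged into each gradient step.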