Theoretical analysis of batch and on-line training for gradient descent learning in neural networks
2009
DOI: 10.1016/j.neucom.2009.05.017

Cited by 52 publications (15 citation statements)
References 8 publications
“…Batch update (updating weights after computing the gradient across the whole dataset) with simple back-propagation learning has been suggested to be more powerful than online learning (where weights are updated after every iteration), based on theoretical considerations [31]. We tested batch update on this QSAR benchmark with the learning rate set to 1/(2 · DatasetSize).…”
Section: Results (mentioning)
Confidence: 99%
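To make the batch-versus-online distinction in the statement above concrete, the following is a minimal sketch, not taken from the cited papers: both update schemes on a linear model with squared-error loss. The function names, the toy data, and the reuse of the quoted 1/(2 · DatasetSize) learning rate are illustrative assumptions only.

```python
# Minimal sketch (not from the cited papers): batch vs. online gradient descent
# on a linear least-squares model. The learning rate 1 / (2 * N) mirrors the
# setting quoted above; the data and model are purely illustrative.
import numpy as np

def batch_gd(X, y, epochs=100):
    """Batch update: one weight update per pass, using the gradient over all samples."""
    n, d = X.shape
    w = np.zeros(d)
    lr = 1.0 / (2.0 * n)                    # learning rate = 1 / (2 * DatasetSize)
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / n        # mean squared-error gradient over the whole set
        w -= lr * grad
    return w

def online_gd(X, y, epochs=100):
    """Online (stochastic) update: weights change after every single sample."""
    n, d = X.shape
    w = np.zeros(d)
    lr = 1.0 / (2.0 * n)
    for _ in range(epochs):
        for i in np.random.permutation(n):
            grad_i = (X[i] @ w - y[i]) * X[i]   # gradient contribution of one sample
            w -= lr * grad_i
    return w

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=200)
print(batch_gd(X, y)[:3], online_gd(X, y)[:3])
```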
“…Whereas most of the other second-order gradient-descent methods and metaheuristic algorithms can only use batch-mode training. However, batch-mode (offline) training of an FNN can at least guarantee a local minimum under a simple condition compared to stochastic/online training, and for a larger dataset, batch-mode training can be faster than stochastic training [86].…”
Section: Comments on Conventional Approaches (mentioning)
Confidence: 99%
“…The computation from the input layer to the output layer is called forward propagation. It calculates the cost function between the classification result and the ground-truth types and aims to make it as small as possible, which brings the classifier's predictions as close as possible to the real land cover types. In deep learning, a variety of gradient descent methods have been developed based on gradient descent, such as stochastic gradient descent (SGD) [52], batch gradient descent (BGD) [53] and mini-batch gradient descent (MBGD) [54]. The difference between them is that each training step uses one training sample, all training samples, or a certain number of training samples, respectively, to train the CNN model.…”
Section: F-R CNN Model (mentioning)
Confidence: 99%
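As a companion to the SGD/BGD/MBGD distinction described above, here is a small illustrative sketch (assumed, not from the cited work): a single training loop on a linear least-squares model in which only the batch_size argument changes, so that batch_size=1 reproduces SGD, batch_size=len(X) reproduces BGD, and anything in between is MBGD.

```python
# Illustrative sketch (not from the cited work): SGD, BGD, and MBGD differ only
# in how many training samples contribute to each weight update.
import numpy as np

def gradient_descent(X, y, batch_size, epochs=50, lr=0.01):
    """Generic loop: batch_size=1 -> SGD, batch_size=len(X) -> BGD, otherwise MBGD."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = np.random.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            grad = Xb.T @ (Xb @ w - yb) / len(idx)   # squared-error gradient on this batch
            w -= lr * grad
    return w

# One sample per update (SGD), all samples (BGD), or a fixed chunk (MBGD)
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5])
w_sgd  = gradient_descent(X, y, batch_size=1)
w_bgd  = gradient_descent(X, y, batch_size=len(X))
w_mbgd = gradient_descent(X, y, batch_size=32)
```

The single parameterized loop is only a convenient way to show that the three methods share the same update rule and differ solely in the amount of data averaged into each gradient step.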