Analysis of the back-propagation algorithm with momentum
Phansalkar & Sastry, 1994
DOI: 10.1109/72.286925

Abstract: In this letter, the back-propagation algorithm with the momentum term is analyzed. It is shown that all local minima of the sum of least squares error are stable. Other equilibrium points are unstable.
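The abstract concerns gradient descent with a momentum term and the stability of its equilibria. A minimal sketch of the update rule (in one common "heavy-ball" formulation; the step size, momentum factor, and test function here are illustrative choices, not taken from the paper) is:

```python
import numpy as np

def momentum_descent(grad, w0, lr=0.01, mu=0.9, steps=500):
    """Gradient descent with a momentum term (heavy-ball form).

    One common formulation of the update:
        v <- mu * v - lr * grad(w)
        w <- w + v
    """
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = mu * v - lr * grad(w)
        w = w + v
    return w

# Minimize f(w) = ||w - 1||^2; its unique minimum w* = [1, 1] is a
# stable equilibrium of the momentum dynamics, so the iterates settle there.
grad = lambda w: 2.0 * (w - 1.0)
w_star = momentum_descent(grad, w0=[5.0, -3.0])
```

For this quadratic the iterates spiral into the minimum and `w_star` ends up numerically equal to `[1, 1]`, consistent with the abstract's claim that local minima of the error are stable equilibria.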

Cited by 143 publications (59 citation statements). References: 1 publication.
“…Much of the existing analysis in the neural network literature (Haykin, 1999; Phansalkar & Sastry, 1994; Qian, 1999; Torii & Hagan, 2002) is restricted to the case of constant learning rate and momentum factors. There is also a large literature on the time-varying case, generally referred to as dynamic or adaptive choice of the learning rate and momentum factors (see Kamarthi and Pittner (1999) and references therein), but, to the best of our knowledge, the observations made in this paper are new. We now present the CG method from a control viewpoint, which is the inspiration for the results obtained here.…”
Section: Steepest Descent Plus Momentum Equals Frozen Conjugate Gradient
confidence: 95%
“…For brevity, this note will focus on the contributions of Qian (1999) and Torii and Hagan (2002), which are recent and clearly written time-invariant analyses of the BPM method, which has been extensively analyzed, both theoretically and experimentally (see, for example, Hagiwara and Sato (1995), Kamarthi and Pittner (1999), Phansalkar and Sastry (1994), Yu and Chen (1997), and Yu, Chen, and Cheng (1995) and references therein).…”
Section: Introduction
confidence: 99%
“…6. The MLP-based ANNs were configured as follows: the transfer function used is the sigmoid; in addition to the input and output units, the topology has one hidden layer with two units; the learning rate is equal to 0.125; momentum (Phansalkar & Sastry, 1994) is used with a rate equal to 0.9. 7.…”
Section: Discussion
confidence: 99%
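The configuration quoted above (sigmoid units, one hidden layer with two units, learning rate 0.125, momentum rate 0.9) can be sketched as a tiny backpropagation loop. This is an illustrative reconstruction, not the cited authors' code; the toy task (logical AND), the random initialization, and the iteration count are assumptions introduced only to exercise the momentum update:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# 2-2-1 network matching the quoted configuration: one hidden layer
# with two sigmoid units, learning rate 0.125, momentum rate 0.9.
W1 = rng.normal(scale=0.5, size=(2, 2)); b1 = np.zeros(2)
W2 = rng.normal(scale=0.5, size=(2, 1)); b2 = np.zeros(1)
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
lr, mu = 0.125, 0.9

# Toy task (logical AND), chosen here only to drive the update rule.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [0], [0], [1]], dtype=float)

def forward(X):
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

def loss(out):
    return 0.5 * np.mean((out - y) ** 2)

_, out0 = forward(X)
initial = loss(out0)

for _ in range(2000):
    h, out = forward(X)
    # Backprop for 0.5 * mean squared error through sigmoid activations.
    d_out = (out - y) * out * (1 - out) / len(X)
    d_h = (d_out @ W2.T) * h * (1 - h)
    gW2, gb2 = h.T @ d_out, d_out.sum(0)
    gW1, gb1 = X.T @ d_h, d_h.sum(0)
    # Momentum updates: v <- mu * v - lr * g ;  w <- w + v
    vW2 = mu * vW2 - lr * gW2; W2 += vW2
    vb2 = mu * vb2 - lr * gb2; b2 += vb2
    vW1 = mu * vW1 - lr * gW1; W1 += vW1
    vb1 = mu * vb1 - lr * gb1; b1 += vb1

_, out_final = forward(X)
final = loss(out_final)
```

After training, the squared-error loss is below its initial value, which is all the sketch is meant to show: the momentum term accelerates plain backpropagation without changing its fixed points.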
“…The most useful basic training method in the area of neural networks is the backpropagation model and its variations (Martinez, Melin, Bravo, Gonzalez & Gonzalez, 2006; Cazorla & Escolano, 2003; Hagan, Demuth & Beale, 1996; Phansalkar & Sastry, 1994). When these methods are applied in practical problems, the training time of the basic backpropagation model can be very high (Moller, 1993; Salazar, Melin & Castillo, 2008).…”
Section: Literature Review
confidence: 99%