Rescaling of variables in back propagation learning
1991
DOI: 10.1016/0893-6080(91)90006-q

Cited by 81 publications (25 citation statements). References 6 publications.
“…When this type of error is generated by overflow, which mostly occurs during the calculation of the term net_j^l, it is not crucial, owing to the characteristics of the sigmoid neuron model. However, the underflow error that occurs during the calculation of the backpropagating error signal [Rumelhart et al., 1986], denoted by δ_j, is critical and can lead to non-convergence and saturation [Rigler et al., 1991]. Even though the error associated with neuron j can be significant, δ_j may become negligible and be rounded to zero if the derivative of the activation function, f'(net_j^l), is very small.…”
Section: Imprecision Incurred in Supervised Training (mentioning)
Confidence: 99%
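To make the rounding effect in the excerpt above concrete, here is a minimal Python sketch; it is an illustration, not code from the citing paper or from Rigler et al. With the standard sigmoid and its derivative computed as s(1 − s), the derivative f'(net) evaluates to exactly zero in double precision once the net input exceeds roughly 37, so the backpropagated δ_j is rounded to zero even when the upstream error signal is large. The upstream error value 0.9 is an arbitrary placeholder.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(x):
    # Derivative computed the usual way, s * (1 - s); for net > ~37 the
    # sigmoid rounds to exactly 1.0 in double precision, so this returns 0.0.
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical hidden unit: delta_j = f'(net_j) * (backpropagated error sum).
# Even with a sizeable upstream error, a saturated net input makes delta_j vanish.
upstream = 0.9  # arbitrary placeholder for the error arriving from the layer above
for net in [0.0, 5.0, 20.0, 40.0]:
    delta = sigmoid_deriv(net) * upstream
    print(f"net = {net:5.1f}  f'(net) = {sigmoid_deriv(net):.3e}  delta = {delta:.3e}")
# At net = 40 the printed derivative and delta are exactly 0.0: the weight
# update stops even though the error at the output may still be large.
```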
“…These steps are usually constrained by problem-dependent heuristic parameters to ensure sub-minimization of the error function in each weight direction and, hopefully, to obtain monotone error reduction. However, enforcing monotone error reduction using inappropriate values for the heuristic learning parameters can considerably slow the rate of training, or even lead to divergence and to premature saturation [11,19]. Moreover, it seems that it is not possible to develop globally convergent training algorithms using heuristics, i.e.…”
Section: Global Convergence by Adapting the Search Direction (mentioning)
Confidence: 99%
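As a rough illustration of how an inappropriate heuristic step size can either slow training to a crawl or cause divergence, the Python sketch below runs fixed-step gradient descent on a simple ill-conditioned quadratic. It is a generic example, not taken from the cited references; the quadratic, its conditioning, and the three learning rates are arbitrary choices.

```python
import numpy as np

# Minimize f(w) = 0.5 * w^T A w.  The gradient is A @ w, and fixed-step
# gradient descent diverges once the learning rate exceeds 2 / lambda_max(A).
A = np.diag([1.0, 100.0])          # condition number 100
w0 = np.array([1.0, 1.0])

def run(lr, steps=200):
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * (A @ w)       # heuristic fixed learning rate
    return 0.5 * w @ A @ w          # final error value

for lr in [1e-5, 1e-2, 3e-2]:      # too small, reasonable, too large
    print(f"lr = {lr:.0e}  final error = {run(lr):.3e}")
# Expected behaviour: lr = 1e-5 makes little progress in 200 steps (slow
# training), lr = 1e-2 converges, and lr = 3e-2 blows up (divergence),
# because it exceeds 2 / lambda_max = 0.02.
```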
“…Consequently, using the l_p-norm with p < 2 can treat |e(n)| in a 'robust' manner, as discussed in the last section. Without loss of generality, the output is assumed to be uniformly distributed in [-1, 1] for simplicity [19] in the following analyses. Given this, |e(n)| will be uniformly distributed in [0, 2].…”
Section: Performance Analysis of the l_p-Norm Algorithm (mentioning)
Confidence: 99%
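For context on the excerpt above, the sketch below is an illustration under assumed conditions, not code from the cited work. It takes the desired output to be +1 (an assumption introduced here to make the [0, 2] claim concrete) with the actual output uniform on [-1, 1], checks numerically that |e(n)| is then uniform on [0, 2], and compares how the gradient weight of an l_p cost with p = 1.2 grows with |e| versus the squared-error case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: desired output d = +1, actual output y uniform on [-1, 1],
# so e = d - y and |e| are uniform on [0, 2] (mean 1.0).
y = rng.uniform(-1.0, 1.0, size=100_000)
e = 1.0 - y
print(f"|e| range = [{np.abs(e).min():.4f}, {np.abs(e).max():.4f}], "
      f"mean = {np.abs(e).mean():.4f}")

# The magnitude of d|e|^p / de is p * |e|^(p-1).  For p < 2 it grows more
# slowly with |e| than the squared-error weight 2|e|, so large (outlier)
# errors are emphasized less -- the 'robust' treatment noted in the excerpt.
def grad_weight(err, p):
    return p * abs(err) ** (p - 1)

for err in (0.1, 1.0, 2.0):
    print(f"|e| = {err:3.1f}   l2 weight = {grad_weight(err, 2.0):.2f}   "
          f"l1.2 weight = {grad_weight(err, 1.2):.2f}")
```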