Rescaling of variables in back propagation learning
1991
DOI: 10.1016/0893-6080(91)90006-q

Cited by 81 publications (25 citation statements). References 6 publications.
“…When this type of error is generated by overflow, which mostly occurs during the calculation of the term net_j^l, it is not crucial, owing to the characteristics of the sigmoid neuron model. However, the underflow error that occurs during the calculation of the backpropagating error signal [Rumelhart et al., 1986], denoted by δ_j, is critical and can lead to non-convergence and saturation [Rigler et al., 1991]. Even though the error associated with neuron j can be significant, δ_j may become negligible and be rounded to zero if the derivative of the activation function, f'(net_j^l), is very small.…”
Section: Imprecision Incurred in Supervised Training (mentioning)
Confidence: 99%
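To make the rounding effect in the excerpt above concrete, here is a minimal Python sketch; it is an illustration, not code from the citing paper or from Rigler et al. With the standard sigmoid and its derivative computed as s(1 − s), the derivative f'(net) evaluates to exactly zero in double precision once the net input exceeds roughly 37, so the backpropagated δ_j is rounded to zero even when the upstream error signal is large. The upstream error value 0.9 is an arbitrary placeholder.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(x):
    # Derivative computed the usual way, s * (1 - s); for net > ~37 the
    # sigmoid rounds to exactly 1.0 in double precision, so this returns 0.0.
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical hidden unit: delta_j = f'(net_j) * (backpropagated error sum).
# Even with a sizeable upstream error, a saturated net input makes delta_j vanish.
upstream = 0.9  # arbitrary placeholder for the error arriving from the layer above
for net in [0.0, 5.0, 20.0, 40.0]:
    delta = sigmoid_deriv(net) * upstream
    print(f"net = {net:5.1f}  f'(net) = {sigmoid_deriv(net):.3e}  delta = {delta:.3e}")
# At net = 40 the printed derivative and delta are exactly 0.0: the weight
# update stops even though the error at the output may still be large.
```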
“…These steps are usually constrained by problem-dependent heuristic parameters to ensure sub-minimization of the error function in each weight direction and, hopefully, to obtain monotone error reduction. However, enforcing monotone error reduction using inappropriate values for the heuristic learning parameters can considerably slow the rate of training, or even lead to divergence and to premature saturation [11,19]. Moreover, it seems that it is not possible to develop globally convergent training algorithms using heuristics, i.e.…”
Section: Global Convergence by Adapting the Search Direction (mentioning)
Confidence: 99%
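As a rough illustration of how an inappropriate heuristic step size can either slow training to a crawl or cause divergence, the Python sketch below runs fixed-step gradient descent on a simple ill-conditioned quadratic. It is a generic example, not taken from the cited references; the quadratic, its conditioning, and the three learning rates are arbitrary choices.

```python
import numpy as np

# Minimize f(w) = 0.5 * w^T A w.  The gradient is A @ w, and fixed-step
# gradient descent diverges once the learning rate exceeds 2 / lambda_max(A).
A = np.diag([1.0, 100.0])          # condition number 100
w0 = np.array([1.0, 1.0])

def run(lr, steps=200):
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * (A @ w)       # heuristic fixed learning rate
    return 0.5 * w @ A @ w          # final error value

for lr in [1e-5, 1e-2, 3e-2]:      # too small, reasonable, too large
    print(f"lr = {lr:.0e}  final error = {run(lr):.3e}")
# Expected behaviour: lr = 1e-5 makes little progress in 200 steps (slow
# training), lr = 1e-2 converges, and lr = 3e-2 blows up (divergence),
# because it exceeds 2 / lambda_max = 0.02.
```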
“…Consequently, using the l_p-norm with p < 2 can treat |e(n)| in a 'robust' manner, as discussed in the last section. Without loss of generality, the output is assumed to be uniformly distributed in [-1, 1] for simplicity [19] in the following analyses. Given this, |e(n)| will be uniformly distributed in [0, 2].…”
Section: Performance Analysis of the l_p-Norm Algorithm (mentioning)
Confidence: 99%
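For context on the excerpt above, the sketch below is an illustration under assumed conditions, not code from the cited work. It takes the desired output to be +1 (an assumption introduced here to make the [0, 2] claim concrete) with the actual output uniform on [-1, 1], checks numerically that |e(n)| is then uniform on [0, 2], and compares how the gradient weight of an l_p cost with p = 1.2 grows with |e| versus the squared-error case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: desired output d = +1, actual output y uniform on [-1, 1],
# so e = d - y and |e| are uniform on [0, 2] (mean 1.0).
y = rng.uniform(-1.0, 1.0, size=100_000)
e = 1.0 - y
print(f"|e| range = [{np.abs(e).min():.4f}, {np.abs(e).max():.4f}], "
      f"mean = {np.abs(e).mean():.4f}")

# The magnitude of d|e|^p / de is p * |e|^(p-1).  For p < 2 it grows more
# slowly with |e| than the squared-error weight 2|e|, so large (outlier)
# errors are emphasized less -- the 'robust' treatment noted in the excerpt.
def grad_weight(err, p):
    return p * abs(err) ** (p - 1)

for err in (0.1, 1.0, 2.0):
    print(f"|e| = {err:3.1f}   l2 weight = {grad_weight(err, 2.0):.2f}   "
          f"l1.2 weight = {grad_weight(err, 1.2):.2f}")
```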