A problem with gradient descent algorithms is that they can converge to poorly performing local minima. Global optimization algorithms address this problem, but at the cost of greatly increased training times. This work examines combining gradient descent with the global optimization technique of simulated annealing (SA). Simulated annealing, in the form of noise and weight decay, is added to resilient backpropagation (RPROP), a powerful gradient descent algorithm for training feedforward neural networks. The resulting algorithm, SARPROP, is shown through various simulations not only to escape local minima but also to maintain, and often improve, the training times of the RPROP algorithm. In addition, SARPROP may be used with a restart training phase, which allows a more thorough search of the error surface and provides an automatic annealing schedule.
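As a rough illustration of the idea, the sketch below adds two simulated-annealing ingredients to a plain RPROP step: a weight-decay term whose strength is annealed over epochs, and noise injected into the step sizes when the gradient changes sign. The constants (decay_k, temp, noise_scale) and the 2^(-temp * epoch) schedule are illustrative assumptions, not the published SARPROP update rules.

```python
# A rough sketch of an RPROP step with simulated-annealing additions in the
# spirit of SARPROP; schedules and constants are illustrative assumptions.
import numpy as np

def sarprop_step(w, grad, prev_grad, step, epoch,
                 eta_plus=1.2, eta_minus=0.5,
                 step_min=1e-6, step_max=50.0,
                 decay_k=0.01, temp=0.01, noise_scale=1e-3, rng=None):
    """One update; w, grad, prev_grad and step are arrays of the same shape."""
    if rng is None:
        rng = np.random.default_rng()

    # Annealed weight decay: the penalty gradient shrinks as training proceeds.
    anneal = 2.0 ** (-temp * epoch)
    grad = grad + decay_k * w * anneal

    sign = grad * prev_grad
    grow, shrink = sign > 0, sign < 0

    # Gradient kept its sign: grow the step (standard RPROP acceleration).
    step[grow] = np.minimum(step[grow] * eta_plus, step_max)

    # Sign flipped: shrink the step and, while the "temperature" is high,
    # add noise so small steps can still jump out of shallow local minima.
    step[shrink] = np.maximum(step[shrink] * eta_minus, step_min)
    step[shrink] += noise_scale * anneal * rng.random(int(shrink.sum()))

    w = w - np.sign(grad) * step
    prev_grad = grad.copy()
    prev_grad[shrink] = 0.0  # skip adaptation on the next step (iRPROP- style)
    return w, prev_grad, step
```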
Constructive algorithms have proved to be powerful methods for training feedforward neural networks. An important property of these algorithms is their generalization ability. A series of empirical studies was performed to examine the effect of regularization on generalization in constructive cascade algorithms. It was found that the combination of early stopping and regularization resulted in better generalization than early stopping alone. A cubic penalty term that greatly penalizes large weights was shown to be beneficial for generalization in cascade networks. An adaptive method of setting the regularization magnitude in constructive algorithms was introduced and shown to produce generalization results similar to those obtained with a fixed, user-optimized regularization setting. This adaptive method also resulted in the construction of smaller networks for more complex problems. The acasper algorithm, which incorporates the insights obtained from these empirical studies, was shown to have good generalization and network construction properties. It was compared to the cascade correlation algorithm on the Proben 1 benchmark and additional regression data sets.
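As a hedged sketch, the penalty below assumes the cubic term has the form λ Σ|w_i|³, which punishes large weights far more heavily than a quadratic term while barely touching small ones; the adaptive setting of λ described above is left out, and λ is shown as a fixed, hypothetical value.

```python
# A minimal sketch of a cubic weight penalty; the form lam * sum(|w|**3)
# and the value of lam are assumptions, not the paper's exact formulation.
import numpy as np

def cubic_penalty(w, lam=1e-4):
    """Penalty value and gradient for lam * sum(|w|**3)."""
    value = lam * np.sum(np.abs(w) ** 3)
    grad = 3.0 * lam * np.abs(w) * w  # d/dw |w|^3 = 3 * |w| * w
    return value, grad

# Usage: add `value` to the training error and `grad` to the error gradient.
```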
Determining the optimum amount of regularization to obtain the best generalization performance in feedforward neural networks is a difficult problem, and is a form of the bias-variance dilemma. This problem is addressed in the CasPer algorithm, a constructive cascade algorithm that uses weight decay. Previously, the amount of weight decay used by this algorithm was set by a parameter prior to training, often by trial and error. This limitation is overcome through the use of a pool of neurons that are candidates for insertion into the network. Each neuron in the pool has an associated decay level, and the one that produces the best generalization on a validation set is added to the network. This not only removes the need for the user to select a decay value, but also results in better generalization compared to networks with fixed, user-optimized decay values.
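A minimal sketch of this candidate-pool selection is given below, assuming a pool of decay levels and two hypothetical helpers, train_candidate and val_error, standing in for the network-specific candidate training and validation-error code.

```python
# train_candidate and val_error are hypothetical stand-ins for the
# network-specific candidate training and validation-error routines.
def select_candidate(network, decay_levels, train_candidate, val_error):
    """Train one candidate neuron per decay level; keep the one that
    generalizes best on the validation set."""
    best, best_err = None, float("inf")
    for decay in decay_levels:
        candidate = train_candidate(network, decay)
        err = val_error(network, candidate)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err

# Usage with, e.g., a geometric range of decay levels for the pool:
# pool = [1e-6 * 10 ** i for i in range(6)]
# neuron, err = select_candidate(net, pool, train_candidate, val_error)
```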