2009 International Joint Conference on Neural Networks
DOI: 10.1109/ijcnn.2009.5178798

Improving gradient-based learning algorithms for large scale feedforward networks

Abstract: Large scale neural networks have many hundreds or thousands of parameters (weights and biases) to learn, and as a result tend to have very long training times. Small scale networks can be trained quickly by using second-order information, but these fail for large architectures due to high computational cost. Other approaches employ local search strategies, which also add to the computational cost. In this paper we present a simple method, based on opposite transfer functions, which greatly improve the convergen…
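The truncated abstract does not spell out the construction, but the name suggests pairing each neuron's transfer function with a reflected counterpart and letting the network switch between the two during training. A minimal sketch of that reading, where the "opposite" of the logistic sigmoid is assumed to be its reflection phi(-x) (an illustration of the general idea, not the paper's exact definition):

```python
import numpy as np

def sigmoid(x):
    """Standard logistic transfer function."""
    return 1.0 / (1.0 + np.exp(-x))

def opposite_sigmoid(x):
    """Assumed 'opposite' transfer function: the reflection phi(-x).
    This is an illustration only; the paper's exact definition may differ."""
    return sigmoid(-x)

# The same net input produces mirrored activations under the two forms.
net_input = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(net_input))           # ~[0.12 0.38 0.50 0.62 0.88]
print(opposite_sigmoid(net_input))  # ~[0.88 0.62 0.50 0.38 0.12]
```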

Cited by 13 publications (4 citation statements) · References 19 publications
“…Additionally, the number of neurons in the first layer has been treated as a variable parameter, in order to find the number of neurons that ensures a higher accuracy of the model without overfitting the data or getting stuck in local optima [7].…”
Section: Network Testing and Validation of Modulating the Two-Hexagon
Mentioning confidence: 99%
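The statement above describes an ordinary validation-driven search over the width of the first hidden layer. As a rough sketch of such a search, assuming scikit-learn's MLPClassifier and synthetic data purely as stand-ins for the cited authors' model and measurements:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; the cited work uses its own measurements.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

best_n, best_acc = None, -1.0
for n_hidden in (5, 10, 20, 40, 80):              # candidate first-layer widths
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,), max_iter=500, random_state=0)
    net.fit(X_train, y_train)
    acc = net.score(X_val, y_val)                 # held-out accuracy guards against overfitting
    if acc > best_acc:
        best_n, best_acc = n_hidden, acc

print(f"best first-layer width: {best_n} (validation accuracy {best_acc:.3f})")
```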
“…They are a suitable means for the analytical description of a set of data, a problem that appears in many sciences and applications and can be framed as data modeling or system identification. The task is generally achieved by feeding the neural model many input-output pairs and predicting one or more output quantities through a learning process that adjusts the synaptic weights using one of many learning algorithms [6][7][8]. Second-order algorithms such as Levenberg-Marquardt [9], which take the second derivative into account, are ranked among the most efficient learning algorithms because they model the systems well.…”
Section: Introduction
Mentioning confidence: 99%
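For reference, the Levenberg-Marquardt step mentioned in this excerpt blends the Gauss-Newton direction with a damped gradient step: with J the Jacobian of the network outputs with respect to the weights, e the residual vector (targets minus outputs), and μ the damping factor, the update is Δw = (JᵀJ + μI)⁻¹Jᵀe. A schematic NumPy version of a single update; computing J is network-specific and is left to the caller:

```python
import numpy as np

def levenberg_marquardt_step(J, e, w, mu):
    """One Levenberg-Marquardt weight update.

    J  : (n_samples, n_weights) Jacobian of network outputs w.r.t. weights
    e  : (n_samples,) residuals (targets - outputs)
    w  : (n_weights,) current weights
    mu : damping factor; large mu -> small gradient-like step,
         small mu -> Gauss-Newton-like step
    """
    H = J.T @ J + mu * np.eye(J.shape[1])   # damped approximation of the Hessian
    g = J.T @ e                             # descent direction for 0.5 * ||e||^2
    return w + np.linalg.solve(H, g)

# Toy check: for a linear model y = X @ w, the step with mu -> 0
# recovers the least-squares solution in a single iteration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                # also the Jacobian dy/dw of a linear model
w_true = np.array([1.0, -2.0, 0.5])
t = X @ w_true
w0 = np.zeros(3)
print(levenberg_marquardt_step(X, t - X @ w0, w0, mu=1e-8))  # ~[1.0, -2.0, 0.5]
```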
“…Ventresca and Tizhoosh [15] utilized the opposition-based computation concept to improve the performance of large scale neural networks that have hundreds or thousands of parameters. During the learning process, in each iteration a set of neurons was selected based on a probabilistic rule.…”
Section: B. Neural Network
Mentioning confidence: 99%
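The excerpt does not reproduce the probabilistic rule itself. As a rough, hypothetical sketch of such a scheme, one might at each iteration flip each hidden neuron to an opposite transfer function with some probability p; both the selection probability and the reflected-sigmoid "opposite" below are assumptions for illustration, not the authors' actual rule:

```python
import numpy as np

rng = np.random.default_rng(0)

def select_opposite_neurons(n_hidden, p):
    """Hypothetical probabilistic rule: each hidden neuron is independently
    chosen with probability p to use its opposite transfer function this
    iteration.  Returns a boolean mask over the hidden units."""
    return rng.random(n_hidden) < p

def hidden_activations(net_input, opposite_mask):
    """Apply the usual sigmoid to unselected units and a reflected
    ('opposite') sigmoid to the selected ones -- an assumed rendering
    of the opposite-transfer-function idea, not the paper's exact code."""
    phi = 1.0 / (1.0 + np.exp(-net_input))
    phi_opp = 1.0 / (1.0 + np.exp(net_input))
    return np.where(opposite_mask, phi_opp, phi)

# Example: 8 hidden units, ~25% switched to the opposite form this iteration.
mask = select_opposite_neurons(8, p=0.25)
print(hidden_activations(np.linspace(-2, 2, 8), mask))
```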
“…These simple and efficient meta-heuristic methods mainly include Differential Evolution (DE) [3,6,7,11,12], Particle Swarm Optimization (PSO) [13][14][15], Reinforcement Learning (RL) [2,16], Biogeography-Based Optimization (BBO) [4,17], Artificial Neural Network (ANN) [18,19], Harmony Search (HS) [20,21], Ant Colony System (ACS) [22,23] and Artificial Bee Colony (ABC) [24,25]. At present, the most successful applications of OBL and its variants focus on traditional optimization fields, such as large-scale unconstrained optimization, constrained optimization, multi-objective optimization, and optimization in noisy environments.…”
Section: Introduction
Mentioning confidence: 99%
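For context, opposition-based learning (OBL) in these meta-heuristics typically evaluates, alongside each candidate x drawn from a box [a, b], the opposite point x̆ = a + b − x, and keeps the fitter of the two. A minimal sketch of that standard construction; the population, bounds, and sphere fitness below are placeholders:

```python
import numpy as np

def opposite_point(x, lower, upper):
    """Standard OBL opposite of a candidate solution, computed
    coordinate-wise on the search box [lower, upper]."""
    return lower + upper - x

def opposition_screening(population, lower, upper, fitness):
    """Evaluate each candidate and its opposite, keep the better one.
    `fitness` is minimised; it is a user-supplied placeholder here."""
    opposites = opposite_point(population, lower, upper)
    keep_orig = fitness(population) <= fitness(opposites)
    return np.where(keep_orig[:, None], population, opposites)

# Toy usage: minimise the sphere function over [-5, 5]^3.
lower, upper = np.full(3, -5.0), np.full(3, 5.0)
pop = np.random.default_rng(1).uniform(lower, upper, size=(6, 3))
sphere = lambda P: np.sum(P**2, axis=1)
print(opposition_screening(pop, lower, upper, sphere))
```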