2015 IEEE Congress on Evolutionary Computation (CEC)
DOI: 10.1109/cec.2015.7256883

Saturation in PSO neural network training: Good or evil?

Cited by 14 publications (11 citation statements). References 11 publications.
“…Instead of using a random approach such as Latin Hypercube sampling, in the future, different deterministic and pseudo-random sampling strategies such as Sparse Grid sampling or Sobol sequences can be employed to further improve the performance of the model. Furthermore, it is critical to obtain the statistics of saturation along different parts of the solution domain during the training of DNNs (Glorot and Bengio, 2010; Rakitianskaia and Engelbrecht, 2015b). Saturation occurs when the hidden units of a DNN predominantly output values close to the asymptotic ends of the activation function range, which reduces the particular PINNs model to a binary state, thus limiting the overall information capacity of the NN (Rakitianskaia and Engelbrecht, 2015a; Bai et al., 2019).…”
Section: Discussion (mentioning)
confidence: 99%
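To make "outputs close to the asymptotic ends" concrete, the sketch below computes a simple saturation statistic: the fraction of tanh hidden activations whose magnitude exceeds a threshold near the asymptotes. This is an illustrative measure only; the function name, the 0.9 threshold, and the synthetic activations are assumptions, not the exact metric used by the cited works.

```python
import numpy as np

def saturation_fraction(hidden_activations, threshold=0.9):
    """Fraction of activations near the asymptotes of tanh (illustrative measure)."""
    acts = np.asarray(hidden_activations)
    return float(np.mean(np.abs(acts) > threshold))

# Example: activations of one hidden layer over a batch of inputs;
# large pre-activation magnitudes push tanh outputs toward +/-1 (saturation).
rng = np.random.default_rng(0)
pre_activations = rng.normal(scale=4.0, size=(128, 10))
print(f"saturated fraction: {saturation_fraction(np.tanh(pre_activations)):.2f}")
```

Tracking such a statistic per layer over the course of training is one way to observe the saturation behaviour discussed above.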
“…The sigmoid is linear around the origin, and saturates (approaches asymptotes) for inputs of large magnitude. Neuron saturation is generally undesirable, since the gradient is very weak near the asymptotes, and may cause stagnation in the training algorithms [31].…”
Section: Activation Functions (mentioning)
confidence: 99%
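The weak-gradient claim is easy to verify numerically: the logistic sigmoid's derivative σ'(x) = σ(x)(1 − σ(x)) peaks at 0.25 at the origin and collapses toward zero for inputs of large magnitude. A short, self-contained check (the sample points and variable names are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# The derivative sigmoid(x) * (1 - sigmoid(x)) is 0.25 at x = 0 (linear region)
# and nearly zero once the unit is saturated.
for x in [0.0, 2.0, 5.0, 10.0]:
    s = sigmoid(x)
    print(f"x={x:5.1f}  sigmoid={s:.5f}  derivative={s * (1.0 - s):.6f}")
```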
“…Bounded activation functions such as sigmoid and hyperbolic tangent (TanH) are prone to saturation, which was shown to be detrimental to NN performance for shallow [31] and deep [12] architectures alike. Modern activation functions such as rectified linear unit (ReLU) [27] and exponential linear unit (ELU) [7] are less prone to saturation, and thus became the primary choice for deep learning [1].…”
mentioning
confidence: 99%
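A quick numerical comparison makes the contrast concrete: sigmoid and tanh pin large-magnitude inputs to their asymptotes, whereas ReLU and ELU retain a non-vanishing response on the positive side. The sketch below uses α = 1 for ELU; the chosen input values are illustrative.

```python
import numpy as np

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
sigmoid = 1.0 / (1.0 + np.exp(-x))
tanh = np.tanh(x)
relu = np.maximum(0.0, x)
elu = np.where(x > 0.0, x, np.exp(x) - 1.0)  # ELU with alpha = 1

# Bounded activations flatten out for large |x|; ReLU/ELU do not on the positive side.
for xi, s, t, r, e in zip(x, sigmoid, tanh, relu, elu):
    print(f"x={xi:6.1f}  sigmoid={s:7.4f}  tanh={t:7.4f}  relu={r:6.1f}  elu={e:8.4f}")
```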
“…While the previously mentioned literature discusses PSO's ability to train an ANN, none of the literature attempts to discuss why this may be. [37] hypothesized that the deficiency of PSO may be due to hidden layer saturation. [37] found that while a certain degree of saturation was required for ANN success, higher levels of saturation were found to be unsatisfactory and would lead to overfitting.…”
Section: Particle Swarm Optimization (mentioning)
confidence: 99%
“…[43] found that non-gradient based learning can be sensitive to the degree of saturation present in an ANN.…”
Section: Particle Swarm Optimization (mentioning)
confidence: 99%
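For readers who want to see how saturation can be monitored while PSO trains a network, the following sketch runs a basic gbest PSO over the weights of a small 2-3-1 tanh network on XOR and logs the fraction of saturated hidden units. The swarm size, inertia and acceleration coefficients, saturation threshold, and toy task are assumptions for illustration, not the experimental setup of the cited paper.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

H = 3                                    # hidden units (illustrative)
dim = 2 * H + H + H * 1 + 1              # weights + biases of a 2-H-1 network

def unpack(p):
    i = 0
    W1 = p[i:i + 2 * H].reshape(2, H); i += 2 * H
    b1 = p[i:i + H]; i += H
    W2 = p[i:i + H].reshape(H, 1); i += H
    b2 = p[i:]
    return W1, b1, W2, b2

def forward(p, X):
    W1, b1, W2, b2 = unpack(p)
    hidden = np.tanh(X @ W1 + b1)                       # bounded hidden layer
    out = 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))     # sigmoid output
    return hidden, out

def mse(p):
    _, out = forward(p, X)
    return np.mean((out - y) ** 2)

# gbest PSO with commonly used (but here assumed) coefficients.
n_particles, iters = 30, 300
w, c1, c2 = 0.729, 1.494, 1.494
pos = rng.uniform(-1, 1, (n_particles, dim))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_f = np.array([mse(p) for p in pos])
gbest = pbest[np.argmin(pbest_f)].copy()

for t in range(iters):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([mse(p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[np.argmin(pbest_f)].copy()
    if t % 100 == 0:
        hidden, _ = forward(gbest, X)
        sat = np.mean(np.abs(hidden) > 0.9)   # crude saturation measure
        print(f"iter {t:3d}  mse={pbest_f.min():.4f}  saturation={sat:.2f}")
```

Logging the saturation measure alongside the training error, as done here, is one simple way to study the relationship between hidden layer saturation and PSO training performance that these citing works discuss.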