2021
DOI: 10.1016/j.physa.2020.125517
Hidden unit specialization in layered neural networks: ReLU vs. sigmoidal activation

Cited by 47 publications (39 citation statements) · References 29 publications
“…Phase change behavior of dynamical systems using the sigmoid and ReLU activation functions is known in the literature in the context of the generalization performance of deep neural networks [34, 35]. In this section, notwithstanding the connections with [34, 35], we present a complete proof of the bifurcation analysis of non-linear dynamical systems involving the sigmoid activation function. Our results in Section 5.1 provide a more complete picture of the behavior of the dynamics in all regimes and can be readily exploited to analyze the dynamics of (9) and (10).…”
Section: Results (citation type: mentioning)
confidence: 99%
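The phase change this excerpt alludes to can be illustrated with a textbook example. The sketch below (not taken from either cited work) iterates the one-dimensional sigmoidal map x_{t+1} = tanh(w * x_t); the choice of map, the gain parameter w, and the grid of w values are illustrative assumptions, picked to exhibit the pitchfork bifurcation at w = 1 that is characteristic of sigmoidal dynamics.

```python
# Minimal sketch (illustrative, not from the cited papers): pitchfork
# bifurcation in the one-dimensional sigmoidal map x_{t+1} = tanh(w * x_t).
# The gain w plays the role of the bifurcation parameter.
import numpy as np

def iterate_map(w, x0=0.5, n_iter=500):
    """Iterate x <- tanh(w * x) and return the final, near-fixed-point value."""
    x = x0
    for _ in range(n_iter):
        x = np.tanh(w * x)
    return x

for w in [0.5, 0.9, 1.0, 1.1, 2.0]:
    x_star = iterate_map(w)
    # For w <= 1 the only fixed point is x* = 0 and it is stable (convergence
    # at w = 1 is slow); for w > 1 the origin loses stability and two
    # symmetric stable fixed points +/- x* appear.
    print(f"w = {w:4.1f} -> x* ≈ {x_star:+.4f}")
```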
“…The 3000 images of corn kernels were used to train the yolov4-tiny model after preprocessing. To simplify the calculation process, Leaky ReLU was used as the activation function of CSPDarknet [26, 27]. The expression of Leaky ReLU is shown in Equation (2):…”
Section: Evaluation of the Yolov4-tiny Model (citation type: mentioning)
confidence: 99%
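Since the quoted Equation (2) is cut off, here is a minimal sketch of the standard Leaky ReLU activation the excerpt refers to. The negative slope alpha = 0.1 is the value conventionally used in Darknet-style backbones; it is an assumption here, not a value confirmed by the cited paper.

```python
# Minimal sketch of the Leaky ReLU activation referenced in the excerpt.
# ASSUMPTION: alpha = 0.1 (the conventional Darknet default); the quoted
# Equation (2) is truncated, so the exact slope there is unknown.
import numpy as np

def leaky_relu(x, alpha=0.1):
    """Elementwise Leaky ReLU: x for x >= 0, alpha * x otherwise."""
    return np.where(x >= 0, x, alpha * x)

print(leaky_relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [-0.2  -0.05  0.  1.5]
```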
“…The ST framework offers a clean venue for analyzing optimization-related aspects of neural network models in the spirit of physical reductionism. The success of DL models in the past decade has rekindled interest in this framework, e.g., [6, 27, 40, 45]. However, despite the long tradition in the statistical physics community and the wide effort now invested by the machine learning community, the perplexing geometry of problem (2) still seems to be out of reach of existing analytic tools in the regimes encountered in practice.…”
Citation type: mentioning
confidence: 99%
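For readers unfamiliar with the framework, the following is a generic sketch of a student-teacher setup (assuming, as is standard in this literature, that "ST" abbreviates student-teacher): a fixed teacher soft committee machine labels random inputs, and a student of the same architecture is fit by gradient descent on the squared error. All sizes and hyperparameters below are illustrative assumptions, not values from any cited paper.

```python
# Generic student-teacher (ST) sketch with soft committee machines:
# f(W, x) = sum_i tanh(w_i . x). The teacher's weights are fixed; the
# student is trained on teacher-generated labels. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 20, 3, 1000                                  # input dim, hidden units, samples

def forward(W, X):
    """Soft committee machine: unweighted sum of sigmoidal hidden units."""
    return np.tanh(X @ W.T).sum(axis=1)

W_teacher = rng.standard_normal((k, d)) / np.sqrt(d)   # fixed teacher
X = rng.standard_normal((n, d))
y = forward(W_teacher, X)                              # teacher-generated labels

W_student = rng.standard_normal((k, d)) / np.sqrt(d)
lr = 0.05
for _ in range(2000):                                  # plain gradient descent on MSE
    H = np.tanh(X @ W_student.T)                       # hidden activations, (n, k)
    err = H.sum(axis=1) - y                            # residuals, (n,)
    grad = ((err[:, None] * (1 - H**2)).T @ X) / n     # dL/dW, (k, d)
    W_student -= lr * grad

print("final MSE:", np.mean((forward(W_student, X) - y) ** 2))
```

Whether the student's hidden units specialize to individual teacher units, or remain stuck in a symmetric (unspecialized) configuration, is exactly the kind of question the cited work studies.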
“…• We show that symmetry breaking makes it possible to derive analytic expressions for families of spurious minima in the form of fractional power series in d and k. Crucially, in contrast to existing approaches, which employ various limiting processes, e.g., [6, 27, 40, 45, 28, 41, 12, 32, 13], our method operates in the natural regime where d and k are finite.…”
Citation type: mentioning
confidence: 99%