2020
DOI: 10.1016/j.patrec.2020.06.025
If dropout limits trainable depth, does critical initialisation still matter? A large-scale statistical analysis on ReLU networks

Cited by 1 publication (3 citation statements: 0 supporting, 3 mentioning, 0 contrasting)
References 13 publications
“…We first consider the effect of weight perturbation by sparsifying connections as in equation (15). For a concrete example, we consider DNN with L = 4, uniform layer width α_l = 1 and disconnection probability p_l = 1/2, for which we compute the large deviation rate function I(q_L) = Φ(Q*, q*, …”
Section: Weight Sparsification (mentioning)
confidence: 99%
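The excerpt above studies sparsifying connections: each weight is independently disconnected with probability p_l. As a rough illustration of the quantity being analysed (not of the citing paper's calculation), the Python sketch below propagates an input through a random ReLU network with L = 4 layers of uniform width and Bernoulli-masked weights and records the per-layer squared length q_l; the rate function I(q_L) = Φ(Q*, q*, …) itself is derived analytically in the citing work and is not reproduced here. The width, weight scale and function names are illustrative assumptions.

import numpy as np

def sparsified_relu_lengths(x, widths, p_disconnect=0.5, sigma_w2=2.0, seed=0):
    # Propagate x through a random ReLU network whose weights are independently
    # set to zero with probability p_disconnect (assumed Gaussian weights with
    # He-style variance sigma_w2 / n_in), returning q_l = ||h_l||^2 / N_l per layer.
    rng = np.random.default_rng(seed)
    h, q = x, []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        w = rng.normal(0.0, np.sqrt(sigma_w2 / n_in), size=(n_out, n_in))
        mask = rng.random((n_out, n_in)) >= p_disconnect   # Bernoulli sparsification
        h = np.maximum((w * mask) @ h, 0.0)                # ReLU layer
        q.append(h @ h / n_out)
    return q

# L = 4 layers of uniform width (alpha_l = 1) with p_l = 1/2, as in the excerpt.
widths = [512] * 5
x0 = np.random.default_rng(1).standard_normal(widths[0])
print(sparsified_relu_lengths(x0, widths, p_disconnect=0.5))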
“…A recent line of research utilizes the mean field theory in statistical physics to investigate various DNN characteristics, such as expressive power [9], Gaussian process-like behaviors of wide DNN [10][11][12], dynamical stability in layer propagation and its impact on weight initialization [13][14][15] and function similarity and entropy in the function space [16]. By assuming large layer-width and random weights, such techniques harness the specific type of nonlinearity used and many degrees of freedom to provide valuable analytical insights.…”
Section: Introduction (mentioning)
confidence: 99%
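As a minimal sketch of the kind of mean-field signal-propagation argument this line of work builds on (and of the "critical initialisation" in the paper's title), the recursion below iterates the length map of a wide random ReLU layer: for z ~ N(0, q), E[relu(z)^2] = q/2, and modelling dropout with keep probability p as multiplicative noise rescaled by 1/p (an assumption following standard mean-field treatments, not a result quoted from either paper) gives q_{l+1} = σ_w^2 q_l / (2p) + σ_b^2, which is depth-stable only at σ_w^2 = 2p when σ_b^2 = 0.

def relu_length_map(q, sigma_w2, sigma_b2, keep_prob=1.0):
    # One step of the assumed mean-field length map for a wide random ReLU
    # layer with inverted dropout: q' = sigma_w^2 * q / (2 * keep_prob) + sigma_b^2.
    return sigma_w2 * q / (2.0 * keep_prob) + sigma_b2

def propagate_length(q0, depth, sigma_w2, sigma_b2, keep_prob):
    # Iterate the map over `depth` layers and return the final length.
    q = q0
    for _ in range(depth):
        q = relu_length_map(q, sigma_w2, sigma_b2, keep_prob)
    return q

# Away from sigma_w^2 = 2 * keep_prob the signal length grows or vanishes
# geometrically with depth; at that critical point it is preserved.
p = 0.8
for sigma_w2 in (1.0, 2.0 * p, 3.0):
    print(sigma_w2, propagate_length(1.0, depth=30, sigma_w2=sigma_w2,
                                     sigma_b2=0.0, keep_prob=p))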