2020
DOI: 10.1007/s42979-020-00312-x

LayerOut: Freezing Layers in Deep Neural Networks

Abstract: Deep networks involve a huge amount of computation during the training phase and are prone to over-fitting. To ameliorate these issues, several conventional techniques such as DropOut, DropConnect, Guided Dropout, Stochastic Depth, and BlockDrop have been proposed. These techniques regularize a neural network by dropping nodes, connections, layers, or blocks within the network. However, these conventional regularization techniques suffer from the limitation that they are suited either for fully connected networks or Re…
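
The abstract describes regularizing a network by freezing layers rather than dropping them, and a later citation statement notes that the method freezes layers stochastically. A minimal PyTorch sketch of that general idea is given below; the freezing probability, the per-epoch re-freezing schedule, and the toy network are illustrative assumptions, not the paper's exact algorithm.

```python
import random

import torch
import torch.nn as nn

# Illustrative sketch only: stochastic layer freezing in the spirit of LayerOut.
# The freezing probability and the per-epoch schedule are assumptions, not the
# paper's exact procedure.


class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(784, 256), nn.Linear(256, 256), nn.Linear(256, 10)]
        )

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:
                x = torch.relu(x)
        return x


def refreeze(model, freeze_prob=0.5):
    """Freeze each hidden layer with probability freeze_prob for the next epoch.

    A frozen layer still takes part in the forward and backward pass, but its
    weights receive no update because requires_grad is False.
    """
    for layer in model.layers[:-1]:  # keep the output layer trainable
        frozen = random.random() < freeze_prob
        for p in layer.parameters():
            p.requires_grad_(not frozen)


model = SmallNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    refreeze(model, freeze_prob=0.5)
    x = torch.randn(32, 784)                  # dummy batch for illustration
    y = torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()                          # frozen layers are not updated
```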

Cited by 17 publications (10 citation statements)
References 17 publications

“…Therefore, in this study, only the dense layer or convolutional blocks were set to be trainable, followed by the output of three soybean tolerance classes. Other layers of each pre-trained model were frozen and their weights were not updated by the optimizer during the training process in order to reduce the risk of overfitting [38]. The different models were given the same tuning-parameter options so that they could be compared at the same level.…”
Section: Methods
confidence: 99%
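
The freezing described in this statement is commonly done by disabling gradient updates for the pre-trained backbone and attaching a new trainable head. A minimal sketch is shown below; the torchvision ResNet-18 backbone and the three-class output layer are assumptions used for illustration, not the cited study's exact setup.

```python
import torch
import torch.nn as nn
from torchvision import models

# Sketch only: freeze a pre-trained backbone and train just a new head.
# ResNet-18 and the 3-class head are illustrative assumptions; the cited
# study's exact pre-trained architectures may differ.

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze every pre-trained layer: its weights will not be updated.
for p in model.parameters():
    p.requires_grad = False

# Replace the final fully connected layer with a trainable 3-class head.
model.fc = nn.Linear(model.fc.in_features, 3)  # new layer defaults to trainable

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```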
“…Thus, the appropriate learning rate depends on the model architecture and the training dataset. During the fine-tuning process, reducing the number of trainable layers by freezing some of them can shorten the training time, since the number of parameters to be updated is reduced [23]. Xiao et al. demonstrated that freezing some layers during the training process might improve model accuracy if the less-updated layers are frozen [24].…”
Section: Fine-tuning of Prediction Models
confidence: 99%
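
The training-time argument here can be made concrete by counting trainable parameters before and after freezing. The sketch below assumes a torchvision ResNet-18 and an arbitrary choice of which stages to freeze; both are illustrative and not taken from the cited work.

```python
import torch.nn as nn
from torchvision import models

# Sketch of the training-cost argument: freezing layers shrinks the set of
# parameters the optimizer has to update. ResNet-18 and the choice to keep
# only layer4 and the classifier trainable are illustrative assumptions.


def count_trainable(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


model = models.resnet18(weights=None)
print("all layers trainable:", count_trainable(model))        # ~11.7M parameters

# Freeze the early stages; only layer4 and the fc head remain trainable.
for name, p in model.named_parameters():
    if not (name.startswith("layer4") or name.startswith("fc")):
        p.requires_grad = False

print("after freezing early layers:", count_trainable(model))
```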
“…In summary, achieving fast convergence and high accuracy requires keeping the trained layers in full precision (activations and parameters in the forward and backward pass). The work in [6] stochastically freezes layers of an NN to speed up training, but keeps the frozen layers at full precision, which limits the achievable speedup. Also, due to its stochastic nature, it is not applicable under a hard computation constraint.…”
Section: Quantization and Freezing in Centralized
confidence: 99%
“…However, quantized gradient computation still suffers from reduced accuracy [7]. Another branch of work studies partial freezing of parameters during training to reduce the number of gradients to be computed [6]. The performance gains, however, are limited, especially if layers towards the beginning of the NN are trained, which requires expensive backpropagation through most layers.…”
Section: Introduction
confidence: 99%