2019
DOI: 10.1007/978-3-030-20890-5_30

Stochastic Normalizations as Bayesian Learning

Abstract: In this work we investigate the reasons why Batch Normalization (BN) improves the generalization performance of deep networks. We argue that one major reason, distinguishing it from data-independent normalization methods, is the randomness of batch statistics. This randomness appears in the parameters rather than in activations and admits an interpretation as practical Bayesian learning. We apply this idea to other (deterministic) normalization techniques that are oblivious to the batch size. We show that their …

Cited by 10 publications (7 citation statements)
References 7 publications
“…We further investigate how batch size affects the training performance of batch normalized networks (Figure 1), from the perspective of a model's representational capacity. Several works [41,18,17] have shown that batch size is related to the magnitude of stochasticity [2,46] introduced by BN, which also affects the model's training performance. However, the stochasticity analysis [18] is specific to normalization along the batch dimension, and cannot explain why GN with a large group number has significantly worse performance (Figure 2), while our work provides a unified analysis for batch and group normalized networks.…”
Section: Discussion of Previous Work (mentioning)
confidence: 99%
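
To make the batch-size dependence in the statement above concrete, here is a minimal NumPy sketch (not taken from the cited papers; the Gaussian activations and the batch sizes are illustrative assumptions). It shows that the mini-batch mean BN subtracts fluctuates more for small batches, which is the magnitude of stochasticity the cited analyses tie to batch size.

# Illustrative sketch (assumption: one channel with Gaussian activations).
# The fluctuation of the mini-batch mean, i.e. the noise BN injects,
# shrinks roughly as 1/sqrt(batch size).
import numpy as np

rng = np.random.default_rng(0)
activations = rng.normal(loc=1.0, scale=2.0, size=100_000)

for batch_size in (2, 8, 32, 128):
    batch_means = [
        rng.choice(activations, size=batch_size, replace=False).mean()
        for _ in range(2_000)
    ]
    print(f"batch size {batch_size:4d}: std of mini-batch mean = {np.std(batch_means):.3f}")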
“…BN standardizes the activations within a mini-batch of data, which improves the conditioning of optimization and accelerates training [20,3,40]. The stochasticity of normalization introduced along the batch dimension is believed to benefit generalization [51,41,18]. However, this stochasticity also results in differences between the training distribution (using mini-batch statistics) and the test distribution (using estimated population statistics) [19], which is believed to be the main cause of BN's small-batch-size problem: BN's error increases rapidly as the batch size becomes smaller [51].…”
Section: Introduction (mentioning)
confidence: 99%
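
The train/test discrepancy described above can be sketched in a few lines of PyTorch (a hypothetical example; the layer width, batch size, and data are arbitrary choices, not taken from the cited works). In train() mode BatchNorm1d normalizes with the current mini-batch statistics, while in eval() mode it uses the running population estimates, so the very same input is transformed differently.

# Hedged sketch of BN's train-vs-test behaviour (arbitrary sizes and data).
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(num_features=4)   # default momentum 0.1
x = torch.randn(2, 4) * 3 + 5         # a tiny mini-batch of 2 samples

bn.train()
y_train = bn(x)   # normalized with the mean/var of these 2 samples
bn.eval()
y_eval = bn(x)    # normalized with running_mean / running_var instead

print("max |train - eval| output gap:", (y_train - y_eval).abs().max().item())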
“…The proposed DJ approach is a flexible procedure that is applicable to a wide range of DL methods. Shekhovtsov et al. [390] investigated the cause of the improved generalization performance of deep networks under Batch Normalization (BN). They argued that the randomness of batch statistics was one of the prime reasons.…”
Section: Other UQ Techniques (mentioning)
confidence: 99%
“…One important property of BN is its ability to improve the generalization of DNNs. It is believed that such an improvement comes from the stochasticity/noise introduced by normalization over batch data [8], [105], [205]. Both the normalized output (Eqn.17) and the population statistics (Eqn.18) can be viewed as stochastic variables, because they depend on the mini-batch inputs, which are sampled from the dataset.…”
Section: Stochasticity for Generalization (mentioning)
confidence: 99%
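
As a rough illustration of that view (an assumption-laden NumPy sketch, not code from the cited survey): normalizing one fixed activation inside different randomly drawn mini-batches yields different outputs, so the BN transform of a given example is itself a random variable induced by the sampling of its batch mates.

# Sketch: the BN output of a fixed example depends on its mini-batch.
import numpy as np

rng = np.random.default_rng(1)
dataset = rng.normal(size=10_000)   # activations of one channel over a dataset
x = 0.7                             # a fixed example

outputs = []
for _ in range(5):
    batch = np.append(rng.choice(dataset, size=31), x)   # mini-batch containing x
    mu, var = batch.mean(), batch.var()
    outputs.append((x - mu) / np.sqrt(var + 1e-5))        # BN transform of x

print("normalized values of the same example:", np.round(outputs, 3))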