2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00053

SSN: Learning Sparse Switchable Normalization via SparsestMax

Cited by 44 publications (18 citation statements). References 13 publications.
“…An intuitive explanation is that sparseness can effectively prevent the model from overfitting. Similar results are also presented in the recently proposed Sparse Switchable Normalization (SSN) [46]. This implies that we could increase the sparsity of the ratios to reduce the computation of multiple normalizers while maintaining good performance.…”
Section: Ablation Study (supporting)
confidence: 77%
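
To make the role of the ratios concrete, below is a minimal sketch of a switchable normalization layer that mixes IN, LN, and BN statistics with learned importance ratios; the cited SSN work replaces the softmax over those ratios with SparsestMax so that the ratios become completely sparse (one-hot), leaving only one normalizer's statistics to compute. The module name, parameterization, and use of PyTorch are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchableNorm2d(nn.Module):
    # Illustrative sketch for NCHW inputs: IN, LN, and BN statistics are
    # mixed by learned importance ratios. Running BN statistics for
    # inference are omitted for brevity.
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_features, 1, 1))
        # importance logits over (IN, LN, BN) for means and variances
        self.mean_logits = nn.Parameter(torch.zeros(3))
        self.var_logits = nn.Parameter(torch.zeros(3))

    def forward(self, x):
        # IN statistics: per sample, per channel
        mean_in = x.mean(dim=(2, 3), keepdim=True)
        var_in = x.var(dim=(2, 3), keepdim=True, unbiased=False)
        # LN statistics: per sample, over all channels and positions
        mean_ln = x.mean(dim=(1, 2, 3), keepdim=True)
        var_ln = x.var(dim=(1, 2, 3), keepdim=True, unbiased=False)
        # BN statistics: per channel, over the whole batch
        mean_bn = x.mean(dim=(0, 2, 3), keepdim=True)
        var_bn = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)

        # Plain switchable normalization uses softmax; SSN would use
        # SparsestMax here so the ratios converge to a one-hot vector.
        wm = F.softmax(self.mean_logits, dim=0)
        wv = F.softmax(self.var_logits, dim=0)
        mean = wm[0] * mean_in + wm[1] * mean_ln + wm[2] * mean_bn
        var = wv[0] * var_in + wv[1] * var_ln + wv[2] * var_bn

        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return x_hat * self.weight + self.bias

Once the ratios are one-hot, only the selected normalizer's statistics need to be computed, which is the computational saving the excerpt refers to.
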
“…Moreover, investigating the other normalizers, such as instance normalization (IN) (Ulyanov et al, 2016) and layer normalization (LN) (Ba et al, 2016), is also important. Understanding the characteristics of these normalizers should be the first step toward analyzing some recent best practices such as whitening (Luo, 2017b;a), switchable normalization (Shao et al, 2019), and switchable whitening (Pan et al, 2019).…”
Section: Discussion (mentioning)
confidence: 99%
“…This type of method can restore the performance in small-batch cases to some extent. However, instance-level normalization hardly meets industrial or commercial needs so far, because these methods have to compute instance-level statistics in both training and inference, which introduces additional nonlinear operations into the inference procedure and dramatically increases consumption (Shao et al, 2019). In contrast, vanilla BN uses statistics computed over the whole training data, rather than over a batch of samples, once training has finished.…”
Section: Introduction (mentioning)
confidence: 99%
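
To illustrate the inference-time distinction drawn in that excerpt, the following hypothetical PyTorch snippet contrasts BN, which evaluates with fixed population statistics once training has finished, against IN, which must recompute per-instance statistics for every test input. The tensor shapes and module choices are assumptions made only for illustration.

import torch

x = torch.randn(4, 8, 16, 16)  # a batch of inputs in N, C, H, W layout

# BN at inference: normalization uses the precomputed running_mean and
# running_var, so it reduces to a fixed per-channel affine transform.
bn = torch.nn.BatchNorm2d(8).eval()
with torch.no_grad():
    y_bn = bn(x)

# IN at inference: mean and variance are recomputed over (H, W) for every
# sample and channel, which is the extra test-time cost the excerpt notes.
inn = torch.nn.InstanceNorm2d(8).eval()
with torch.no_grad():
    y_in = inn(x)
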