2018
DOI: 10.48550/arxiv.1803.04386
Preprint

Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches

Abstract: Stochastic neural net weights are used in a variety of contexts, including regularization, Bayesian neural nets, exploration in reinforcement learning, and evolution strategies. Unfortunately, due to the large number of weights, all the examples in a mini-batch typically share the same weight perturbation, thereby limiting the variance reduction effect of large mini-batches. We introduce flipout, an efficient method for decorrelating the gradients within a mini-batch by implicitly sampling pseudo-independent w…
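The mechanism the abstract summarizes can be illustrated with a small NumPy sketch (an illustration of the flipout identity, not the authors' code): a single base perturbation ΔW is shared across the batch, but multiplying it elementwise by per-example rank-one sign matrices s_n r_nᵀ yields pseudo-independent per-example perturbations, and the batched forward pass never materializes per-example weight matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

batch, d_in, d_out = 4, 5, 3
x = rng.normal(size=(batch, d_in))         # mini-batch of inputs
W = rng.normal(size=(d_in, d_out))         # mean weights
dW = 0.1 * rng.normal(size=(d_in, d_out))  # one base perturbation, shared by the batch

# Independent random sign vectors per example.
s = rng.choice([-1.0, 1.0], size=(batch, d_in))
r = rng.choice([-1.0, 1.0], size=(batch, d_out))

# Flipout forward pass: y_n = x_n W + ((x_n * s_n) dW) * r_n,
# computed with two matrix multiplies for the whole batch.
y = x @ W + ((x * s) @ dW) * r

# Equivalent (but expensive) explicit per-example computation, for checking:
y_check = np.stack([
    x[n] @ (W + dW * np.outer(s[n], r[n]))
    for n in range(batch)
])
assert np.allclose(y, y_check)
```

Because the sign flips are independent across examples, the per-example gradients decorrelate, which is what restores the variance reduction of large mini-batches.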

Cited by 62 publications (84 citation statements) | References 12 publications

“…Uncertainty Methods. We evaluate MC-Dropout (DO) [4], MC-DropConnect (DC) [11], Deep Ensembles (DE) [10], Direct Uncertainty Quantification (DUQ) [15], Variational Inference with Flipout (VI) [16], and Gradient-based uncertainty (GD) [12]. This selection covers scalable as well as approximate methods and recent advances.…”
Section: Methods
confidence: 99%
“…This transforms the model into a stochastic one. Flipout [16] is used as an additional formulation on top of a stochastic perturbation model that reduces variance, greatly improving learning stability and performance.…”
Section: Deep Ensembles (DE)
confidence: 99%
“…Compared with the classical strategy that perturbs weights in the entire space [40,16,42,10], we focus on characterizing the weight loss landscape on the new task with respect to the important subspace representing the old task. The important subspace can be effectively calculated from the examples sampled from the replay buffer M after each task's training.…”
Section: Sharpness Evaluation
confidence: 99%
“…1. We propose an adaptation of the four state-of-the-art UQ methods in deep learning -Bayes By Backprop [3], Flipout [24], Neural Linear Model [23,15], and Deep Evidential Regression [2,13] -to the case of solving DEs. A DE can be expressed as Lu − f = 0, where L is the differential operator, u(x) is the solution that we wish to find on some (possibly multidimensional) domain x, and f is a known forcing function.…”
Section: Introduction
confidence: 99%