2021
DOI: 10.48550/arxiv.2110.06435
Preprint

Dropout Prediction Uncertainty Estimation Using Neuron Activation Strength

Abstract: It is well-known that deep neural networks generate different predictions even given the same model configuration and training dataset. It thus becomes more and more important to study prediction variation, the variation of the predictions on a given input example, in neural network models. Dropout has been commonly used in various applications to quantify prediction variations. However, using dropout in practice can be expensive as it requires running dropout inferences many times to estimate prediction varia…
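As context for the dropout baseline described in the abstract, the sketch below shows how prediction variation is typically estimated with Monte Carlo dropout, i.e., by running many stochastic forward passes and measuring how much the predictions spread. The toy model, layer sizes, and number of samples are assumptions for illustration; this is not the paper's proposed method.

```python
# Minimal sketch of Monte Carlo dropout prediction-variation estimation.
# Illustrates the expensive baseline described in the abstract; the toy
# model and hyperparameters below are assumptions, not from the paper.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 1),
    nn.Sigmoid(),
)

def mc_dropout_variation(model, x, num_samples=100):
    """Run num_samples stochastic forward passes with dropout enabled
    and return the per-example standard deviation of the predictions."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(num_samples)])  # (S, N, 1)
    return preds.std(dim=0)  # prediction variation per input example

x = torch.randn(8, 16)        # a batch of 8 toy examples
variation = mc_dropout_variation(model, x)
print(variation.squeeze(-1))  # higher values -> less stable predictions
```

The cost the abstract points to comes from the `num_samples` forward passes needed for every input at inference time.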

Cited by 2 publications (4 citation statements) | References 35 publications

Citation statements
“…Unlike classification applications that focus on the final label, in recommendation systems, the exact predicted engagement probabilities can make a difference. Theoretically, we can use different statistics averaged over multiple models [11, 46, 53], such as standard deviations or KL divergences. In [46], various L_p norms relative to an average prediction of a set of models were considered.…”
Section: Prediction Difference (mentioning)
confidence: 99%
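To make the quoted statistics concrete, here is a hedged sketch of prediction-difference measures computed over an ensemble's predicted probabilities: the per-example standard deviation, the average binary KL divergence from each model's prediction to the ensemble mean, and an L_p norm of deviations from the average prediction. The array shapes, function name, and the binary-KL formulation are assumptions for illustration, not code from the cited works.

```python
# Illustrative ensemble prediction-difference statistics (assumed shapes
# and names; not code from the cited papers).
import numpy as np

def prediction_difference_stats(probs, p=2, eps=1e-12):
    """probs: array of shape (num_models, num_examples) with predicted
    engagement probabilities from independently trained models."""
    mean = probs.mean(axis=0)  # average prediction per example

    # Standard deviation of the predictions across models.
    std = probs.std(axis=0)

    # Average binary KL divergence from each model's prediction to the mean.
    kl = np.mean(
        probs * np.log((probs + eps) / (mean + eps))
        + (1 - probs) * np.log((1 - probs + eps) / (1 - mean + eps)),
        axis=0,
    )

    # L_p norm of each example's deviations from the average prediction.
    lp = (np.abs(probs - mean) ** p).sum(axis=0) ** (1.0 / p)

    return std, kl, lp

probs = np.random.uniform(0.01, 0.99, size=(5, 4))  # 5 models, 4 examples
std, kl, lp = prediction_difference_stats(probs)
```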
“…Despite its importance, it has received very little attention in academic publications. Only recently, a series of empirical works [11, 13, 17, 46-49, 53] demonstrated it. An initial theoretical framework for reproducibility in optimization appears only in very recent work [2], which demonstrates the problem for the much simpler case of convex optimization.…”
(mentioning)
confidence: 99%
“…Beyond practical deployments of machine learned models, the reproducibility crisis in the machine learning academic world has also been well-documented: see [Pineau et al., 2021] and the references therein for an excellent discussion of the reasons for irreproducibility (insufficient exploration of hyperparameters and experimental setups, lack of sufficient documentation, inaccessible code, and different computational hardware) and for mitigation recommendations. However, recent papers [D'Amour et al., 2020, Dusenberry et al., 2020, Snapp and Shamir, 2021, Summers and Dinneen, 2021, Yu et al., 2021] have also demonstrated that even when models are trained on identical datasets with identical optimization algorithms, architectures, and hyperparameters, they can produce significantly different predictions on the same example. This type of irreproducibility may be caused by multiple factors [D'Amour et al., 2020, Fort et al., 2020, Frankle et al., 2020, Shallue et al., 2018, Snapp and Shamir, 2021, Summers and Dinneen, 2021], such as non-convexity of the objective, random initialization, nondeterminism in training (data shuffling, parallelism, random schedules, hardware used), and round-off quantization errors.…”
Section: Introduction (mentioning)
confidence: 99%
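For readers who want to see this effect directly, below is a small, hedged demonstration; the library choice, model size, and synthetic dataset are assumptions, not from the cited papers. Because scikit-learn training is deterministic for a fixed seed, the seed is varied explicitly to stand in for the uncontrolled sources of randomness (initialization, data shuffling, parallelism) discussed in the quote.

```python
# Two runs with identical data, architecture, and hyperparameters but
# different seeds can disagree noticeably on individual examples.
# Illustrative only; not an experiment from the cited works.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

def train(seed):
    # Same architecture and hyperparameters; only the seed (which controls
    # weight initialization and data shuffling) differs between runs.
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200, random_state=seed)
    clf.fit(X, y)
    return clf.predict_proba(X)[:, 1]

p1, p2 = train(seed=1), train(seed=2)
print("max per-example prediction difference:", np.abs(p1 - p2).max())
print("mean per-example prediction difference:", np.abs(p1 - p2).mean())
```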