Adrià Garriga-Alonso scite author profile

3 Publications

75 Citation Statements Received

115 Citation Statements Given

Affiliations

University of Cambridge

Publications

Bayesian Neural Network Priors Revisited

Fortuin, Garriga-Alonso, Ober et al. 2021

Preprint

Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, such simplistic priors are unlikely to either accurately reflect our true beliefs about the weight distributions, or to give optimal performance. We study summary statistics of neural network weights in different networks trained using SGD. We find that fully connected networks (FCNNs) display heavy-tailed weight distributions, while convolutional neural network (CNN) weights display strong spatial correlations. Building these observations into the respective priors leads to improved performance on a variety of image classification datasets. Moreover, we find that these priors also mitigate the cold posterior effect in FCNNs, while in CNNs we see strong improvements at all temperatures, and hence no reduction in the cold posterior effect.
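To make the contrast between an isotropic Gaussian prior and a heavy-tailed prior concrete, the sketch below compares the log-density each assigns to a single weight. The abstract does not say which heavy-tailed family the authors use; the Student-t distribution, the degrees of freedom `nu=3.0`, and the function names here are assumptions chosen for illustration only.

```python
import math

def gaussian_logpdf(w, sigma=1.0):
    """Log-density of a zero-mean Gaussian prior on a single weight."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * (w / sigma) ** 2

def student_t_logpdf(w, nu=3.0, sigma=1.0):
    """Log-density of a zero-mean Student-t prior (assumed heavy-tailed family)."""
    z = w / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1.0) / 2.0 * math.log1p(z * z / nu)
    )

# Compare how strongly each prior penalizes a small and a large weight.
for w in (0.0, 5.0):
    print(w, gaussian_logpdf(w), student_t_logpdf(w))
```

Near zero the two log-densities are similar, but for a large weight the Gaussian penalty grows quadratically while the Student-t penalty grows only logarithmically, which is what lets a heavy-tailed prior tolerate occasional large weights.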

Data augmentation in Bayesian neural networks and the cold posterior effect

Nabarro, Stoil, Garriga-Alonso et al. 2021

Preprint

Data augmentation is a highly effective approach for improving performance in deep neural networks. The standard view is that it creates an enlarged dataset by adding synthetic data, which raises a problem when combining it with Bayesian inference: how much data are we really conditioning on? This question is particularly relevant to recent observations linking data augmentation to the cold posterior effect. We investigate various principled ways of finding a log-likelihood for augmented datasets. Our approach prescribes augmenting the same underlying image multiple times, both at test and train-time, and averaging either the logits or the predictive probabilities. Empirically, we observe the best performance with averaging probabilities. While there are interactions with the cold posterior effect, neither averaging logits nor averaging probabilities eliminates it.

Introduction

Data augmentation (Shorten & Khoshgoftaar, 2019) is a fundamental technique for obtaining high performance in modern neural networks (NNs). In computer vision, data augmentation involves creating synthetic training examples by making small modifications, such as a rotation or crop, to the input image.

At the same time, Bayesian inference allows us to reason about uncertainty in neural network weights (MacKay, 1992; Welling & Teh, 2011; Blundell et al., 2015; Fortuin, 2021) given limited data. Bayesian inference is particularly important in safety-critical settings such as self-driving cars or medical imaging, where it is crucial to be able to hand over to a human when uncertainty is too large.

* equal contribution † equal contribution

Preprint. Under review.
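The two ways of combining predictions over several augmentations of the same image, averaging logits versus averaging predictive probabilities, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the example logits are hypothetical, and a real model would produce the logits from augmented inputs.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def average_logits(logits_per_aug):
    """Average the logits across augmentations, then apply softmax once."""
    n = len(logits_per_aug)
    mean_logits = [sum(col) / n for col in zip(*logits_per_aug)]
    return softmax(mean_logits)

def average_probs(logits_per_aug):
    """Apply softmax per augmentation, then average the probabilities."""
    n = len(logits_per_aug)
    probs = [softmax(row) for row in logits_per_aug]
    return [sum(col) / n for col in zip(*probs)]

# Hypothetical logits for one image under three augmentations, two classes.
logits_per_aug = [[4.0, 0.0], [0.0, 1.0], [0.5, 0.0]]
p_logit_avg = average_logits(logits_per_aug)
p_prob_avg = average_probs(logits_per_aug)
print(p_logit_avg, p_prob_avg)
```

Averaging logits is equivalent to taking a normalized geometric mean of the per-augmentation probabilities, whereas averaging probabilities is an arithmetic mean, so the two generally give different predictive distributions.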

BNNpriors: A library for Bayesian neural network inference with different prior distributions

et al. 2021

Bayesian neural networks have shown great promise in many applications where calibrated uncertainty estimates are crucial and can often also lead to a higher predictive performance. However, it remains challenging to choose a good prior distribution over their weights. While isotropic Gaussian priors are often chosen in practice due to their simplicity, they do not reflect our true prior beliefs well and can lead to suboptimal performance. Our new library, BNNpriors, enables state-of-the-art Markov Chain Monte Carlo inference on Bayesian neural networks with a wide range of predefined priors, including heavy-tailed ones, hierarchical ones, and mixture priors. Moreover, it follows a modular approach that eases the design and implementation of new custom priors. It has facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks and will hopefully catalyze future research as well as practical applications in this area.
