Arsenii Ashukha scite author profile

Uncertainty Estimation via Stochastic Batch Normalization

In this work, we investigate the Batch Normalization technique and propose a probabilistic interpretation of it. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginalized log-likelihood. Then, according to the new probabilistic model, we design an algorithm that acts consistently during training and testing. However, inference becomes computationally inefficient. To reduce memory and computational cost, we propose Stochastic Batch Normalization, an efficient approximation of the proper inference procedure. This method provides us with a scalable uncertainty estimation technique. We demonstrate the performance of Stochastic Batch Normalization on popular architectures (including deep convolutional architectures: VGG-like networks and ResNets) on the MNIST and CIFAR-10 datasets.

Automating Control of Overestimation Bias for Reinforcement Learning

Kuznetsov, Grishin, Tsypin et al. 2022

Preprint

The majority of high-performing off-policy reinforcement learning algorithms use aggregated overestimation bias control techniques. However, most of them rely on pre-defined bias correction policies that are either not flexible enough or require environment-specific tuning of hyperparameters. In this work, we present a data-driven approach for automatic bias control. We demonstrate its effectiveness on three algorithms: Truncated Quantile Critics, Weighted Delayed DDPG, and Maxmin Q-learning. Our approach eliminates the need for an extensive hyperparameter search. We show that it leads to a significant reduction in the actual number of interactions while, in most cases, matching the performance of a resource-demanding grid search. While reducing the bias improves performance on average, eliminating the aggregated bias does not always lead to the best performance. To the best of our knowledge, this is the first time this has been shown on complex environments, which highlights important pitfalls of overestimation control.

Greedy Policy Search: A Simple Baseline for Learnable Test-Time Augmentation

Molchanov, Lyzhov, Molchanova et al. 2020

Preprint

Arsenii Ashukha

Resolution-robust Large Mask Inpainting with Fourier Convolutions

Uncertainty Estimation via Stochastic Batch Normalization

Automating Control of Overestimation Bias for Reinforcement Learning

Greedy Policy Search: A Simple Baseline for Learnable Test-Time Augmentation
