Andrei Atanov scite author profile

Andrei Atanov

10 Publications

96 Citation Statements Received

80 Citation Statements Given

How they've been cited

130

How they cite others

131

Affiliations

École Polytechnique Fédérale de Lausanne, National Research University Higher School of Economics, Samsung (Russia)

Publications

Uncertainty Estimation via Stochastic Batch Normalization

Atanov, Ashukha, Molchanov et al. 2019

In this work, we investigate the Batch Normalization technique and propose its probabilistic interpretation. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginalized log-likelihood. Then, according to the new probabilistic model, we design an algorithm which acts consistently during train and test. However, inference becomes computationally inefficient. To reduce memory and computational cost, we propose Stochastic Batch Normalization, an efficient approximation of the proper inference procedure. This method provides us with a scalable uncertainty estimation technique. We demonstrate the performance of Stochastic Batch Normalization on popular architectures (including deep convolutional architectures: VGG-like and ResNets) on the MNIST and CIFAR-10 datasets.

MultiMAE: Multi-modal Multi-task Masked Autoencoders

Bachmann, Mizrahi, Atanov et al. 2022

MultiMAE: Multi-modal Multi-task Masked Autoencoders

Bachmann, Mizrahi, Atanov et al. 2022

Preprint

Uncertainty Estimation via Stochastic Batch Normalization

Atanov, Ashukha, Molchanov et al. 2018

Preprint

The Deep Weight Prior

Atanov, Ashukha, Struminsky et al. 2018

Preprint

Bayesian inference is known to provide a general framework for incorporating prior knowledge or specific properties into machine learning models by carefully choosing a prior distribution. In this work, we propose a new type of prior distribution for convolutional neural networks, the deep weight prior (dwp), that exploits generative models to encourage a specific structure in trained convolutional filters, e.g., spatial correlations. We define dwp in the form of an implicit distribution and propose a method for variational inference with this type of implicit prior. In experiments, we show that dwp improves the performance of Bayesian neural networks when training data are limited, and that initializing weights with samples from dwp accelerates the training of conventional convolutional neural networks.

3D Common Corruptions and Data Augmentation

Kar, Yeo, Atanov et al. 2022

Simple Control Baselines for Evaluating Transfer Learning

Atanov, Xu, Beker et al. 2022

Preprint

Transfer learning has witnessed remarkable progress in recent years, for example, with the introduction of augmentation-based contrastive self-supervised learning methods. While a number of large-scale empirical studies on the transfer performance of such models have been conducted, there is not yet an agreed-upon set of control baselines, evaluation practices, and metrics to report, which often hinders a nuanced and calibrated understanding of the real efficacy of the methods. We share an evaluation standard that aims to quantify and communicate transfer learning performance in an informative and accessible setup. This is done by baking a number of simple yet critical control baselines into the evaluation method, particularly the 'blind-guess' (quantifying the dataset bias), 'scratch-model' (quantifying the architectural contribution), and 'maximal-supervision' (quantifying the upper bound). To demonstrate how the evaluation standard can be employed, we provide an example empirical study investigating a few basic questions about self-supervised learning. For example, using this standard, the study shows that the effectiveness of existing self-supervised pre-training methods is skewed towards image classification tasks versus dense pixel-wise predictions. In general, we encourage using and reporting the suggested control baselines when evaluating transfer learning in order to gain a more meaningful and informative understanding.

3D Common Corruptions and Data Augmentation

Kar, Yeo, Atanov et al. 2022

Preprint
