Max B. Paulus scite author profile

Max B. Paulus

5 Publications

28 Citation Statements Received

130 Citation Statements Given

Publications

A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning

Huijben, Kool, Paulus, et al. 2023

IEEE Trans. Pattern Anal. Mach. Intell.

A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning

Huijben, Kool, Paulus, et al. 2021

Preprint

The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by its unnormalized (log-)probabilities. Over the past years, the machine learning community has proposed several extensions of this trick to facilitate, e.g., drawing multiple samples, sampling from structured domains, or gradient estimation for error backpropagation in neural network optimization. The goal of this survey article is to present background about the Gumbel-max trick, and to provide a structured overview of its extensions to ease algorithm selection. Moreover, it presents a comprehensive outline of (machine learning) literature in which Gumbel-based algorithms have been leveraged, reviews commonly-made design choices, and sketches a future perspective.
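
The sampling step described in this abstract can be illustrated with a minimal NumPy sketch. It is not taken from the survey: the function name `gumbel_max_sample`, the example probabilities, and the constant shift of 5.0 are assumptions chosen for illustration. The sketch adds independent standard Gumbel noise to each unnormalized log-probability and returns the index of the largest perturbed value.

```python
import numpy as np


def gumbel_max_sample(logits, rng):
    """Draw one category index from unnormalized log-probabilities (hypothetical helper)."""
    logits = np.asarray(logits, dtype=float)
    # One independent standard Gumbel(0, 1) draw per category.
    gumbel_noise = rng.gumbel(loc=0.0, scale=1.0, size=logits.shape)
    # The argmax of the perturbed logits follows the categorical
    # distribution softmax(logits).
    return int(np.argmax(logits + gumbel_noise))


rng = np.random.default_rng(0)
target = np.array([0.1, 0.6, 0.3])
# Adding a constant leaves the distribution unchanged, so the
# log-probabilities do not need to be normalized.
logits = np.log(target) + 5.0

draws = [gumbel_max_sample(logits, rng) for _ in range(20000)]
empirical = np.bincount(draws, minlength=len(target)) / len(draws)
print(empirical)
```

With enough draws, the empirical frequencies should approach the target probabilities, even though the logits were shifted and never normalized.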

Instance-wise algorithm configuration with graph neural networks

Romeo, Ferrari, Scheurer, et al. 2022

Preprint

Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator

Paulus, Maddison, Krause 2020

Preprint

Gradient estimation in models with discrete latent variables is a challenging problem, because the simplest unbiased estimators tend to have high variance. To counteract this, modern estimators either introduce bias, rely on multiple function evaluations, or use learned, input-dependent baselines. Thus, there is a need for estimators that require minimal tuning, are computationally cheap, and have low mean squared error. In this paper, we show that the variance of the straight-through variant of the popular Gumbel-Softmax estimator can be reduced through Rao-Blackwellization without increasing the number of function evaluations. This provably reduces the mean squared error. We empirically demonstrate that this leads to variance reduction, faster convergence, and generally improved performance in two unsupervised latent variable models.
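
For orientation, the following NumPy sketch shows the two quantities that the plain straight-through Gumbel-Softmax estimator works with. It is not the Rao-Blackwellized estimator proposed in the paper. The function name `gumbel_softmax`, the temperature value, and the example logits are assumptions for illustration. NumPy has no automatic differentiation, so only the forward computation is shown: the hard one-hot sample would be used in the forward pass, while gradients would be taken through the soft relaxation.

```python
import numpy as np


def gumbel_softmax(logits, temperature, rng):
    """Return (soft, hard) Gumbel-Softmax samples (hypothetical helper)."""
    logits = np.asarray(logits, dtype=float)
    # Perturb the logits with independent standard Gumbel noise.
    gumbel_noise = rng.gumbel(size=logits.shape)
    scores = (logits + gumbel_noise) / temperature
    # Subtract the maximum before exponentiating for numerical stability.
    scores = scores - scores.max()
    soft = np.exp(scores) / np.exp(scores).sum()
    # Discretize: one-hot vector at the largest relaxed coordinate.
    hard = np.zeros_like(soft)
    hard[np.argmax(soft)] = 1.0
    return soft, hard


rng = np.random.default_rng(0)
soft, hard = gumbel_softmax(np.log([0.2, 0.5, 0.3]), temperature=0.5, rng=rng)
print(soft)
print(hard)
```

Lower temperatures push the soft sample closer to the one-hot vector. In an autodiff framework the mismatch between the discrete forward value and the relaxed backward pass is what makes the straight-through estimator biased.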

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

Miladinović, Shridhar, Jain, et al. 2022

Preprint
