2019
DOI: 10.48550/arxiv.1905.01413
Preprint
ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables

Mingzhang Yin, Yuguang Yue, Mingyuan Zhou

Abstract: To address the challenge of backpropagating the gradient through categorical variables, we propose the augment-REINFORCE-swap-merge (ARSM) gradient estimator that is unbiased and has low variance. ARSM first uses variable augmentation, REINFORCE, and Rao-Blackwellization to re-express the gradient as an expectation under the Dirichlet distribution, then uses variable swapping to construct differently expressed but equivalent expectations, and finally shares common random numbers between these expectations to achieve significant variance reduction.
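As a concrete illustration of the three steps named in the abstract (augment, swap, merge), here is a minimal single-sample sketch in numpy. It is a reading of the abstract's description, not the authors' reference implementation: the function name `arsm_gradient` and the per-row mean baseline are choices made here (any baseline that is constant across the swap index is valid, because the merge weights 1/C − π_m sum to zero).

```python
import numpy as np

def arsm_gradient(phi, f, rng=None):
    """One-sample ARSM-style estimate of d/dphi E_{z ~ Cat(softmax(phi))}[f(z)].

    Sketch based on the abstract: augment with pi ~ Dirichlet(1,...,1),
    swap coordinate pairs of pi to build pseudo-actions, and merge with
    a shared baseline. `f` maps a category index to a scalar reward.
    """
    rng = rng or np.random.default_rng()
    C = phi.shape[0]
    pi = rng.dirichlet(np.ones(C))  # augmentation: uniform draw on the simplex

    # Pseudo-actions z^{c<->m} = argmin_i pi^{c<->m}_i * exp(-phi_i),
    # where pi^{c<->m} swaps the c-th and m-th coordinates of pi.
    z_swap = np.empty((C, C), dtype=int)
    for c in range(C):
        for m in range(C):
            pi_sw = pi.copy()
            pi_sw[c], pi_sw[m] = pi[m], pi[c]
            z_swap[c, m] = np.argmin(pi_sw * np.exp(-phi))

    f_swap = np.vectorize(f)(z_swap)             # f at every pseudo-action
    f_bar = f_swap.mean(axis=1, keepdims=True)   # merge: mean over swaps as baseline

    # Weights (1/C - pi_m) sum to zero over m, so the baseline preserves unbiasedness.
    return ((f_swap - f_bar) * (1.0 / C - pi)).sum(axis=1)
```

For a quick sanity check, one can average this estimate over many samples and compare against the exact gradient obtained by enumerating all C categories.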

Cited by 1 publication (2 citation statements)
References 14 publications
“…2. Storchastic should be flexible enough to allow implementing a wide range of reviewed gradient estimation methods, including score-function-based methods with complex sampling techniques [23,25,50] or control variates [14,46], and other methods such as measure-valued derivatives [16,38] and SPSA [45] which are missing AD implementations [34]. 3.…”
Section: Requirements of the Storchastic Framework
confidence: 99%
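The quoted requirement groups ARSM with score-function-based methods and control variates. For reference, below is a minimal generic sketch of a score-function (REINFORCE) estimator with a scalar baseline serving as control variate. This is plain numpy written for illustration, not Storchastic's actual API; the function name and the baseline argument are assumptions made here.

```python
import numpy as np

def score_function_grad(phi, f, baseline=0.0, rng=None):
    """One-sample REINFORCE estimate of d/dphi E_{z ~ Cat(softmax(phi))}[f(z)],
    with a scalar control variate (baseline) subtracted from f(z).

    Generic illustration only; not the Storchastic framework's API.
    """
    rng = rng or np.random.default_rng()
    p = np.exp(phi - phi.max())
    p /= p.sum()                               # softmax(phi)
    z = rng.choice(phi.shape[0], p=p)          # sample a category
    score = np.eye(phi.shape[0])[z] - p        # grad_phi log p(z | phi)
    return (f(z) - baseline) * score           # unbiased for any constant baseline
```

A running average of past f(z) values is a common baseline choice: since the score function has zero expectation, subtracting it leaves the estimator unbiased while reducing variance.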
“…We show these conditions hold for a wide variety of gradient estimation methods for first-order differentiation. For score-function-related methods such as RELAX [14], ARM [50,51], MAPO [25] and the unordered set estimator [23], the conditions also hold for any-order differentiation. Storchastic enables us to only have to prove these conditions locally, meaning that we can combine different gradient estimation methods for different stochastic nodes.…”
Section: Introduction
confidence: 99%