2023
DOI: 10.48550/arxiv.2302.01687
Preprint

Better Training of GFlowNets with Local Credit and Incomplete Trajectories

Abstract: Generative Flow Networks or GFlowNets are related to Monte-Carlo Markov chain methods (as they sample from a distribution specified by an energy function), reinforcement learning (as they learn a policy to sample composed objects through a sequence of steps), generative models (as they learn to represent and sample from a distribution) and amortized variational methods (as they can be used to learn to approximate and sample from an otherwise intractable posterior, given a prior and a likelihood). They are trai…

Citations: cited by 1 publication (16 citation statements)
References: 16 publications
“…Partial inference is a promising paradigm to resolve this issue by incorporating local credits (Pan et al, 2023a). Specifically, the partial inference aims to evaluate individual transitions or sub-trajectories, i.e., local credits, and provide informative training signals for identifying the specific contributions of actions.…”
Section: Partial Inference for GFlowNets
Citation type: mentioning
confidence: 99%
“…To sample from the Boltzmann distribution, GFlowNet trains the policy to assign action selection probability based on energy of terminal state (Bengio et al, 2021a;b; Malkin et al, 2022a), e.g., a high probability to the action responsible for the low terminal energy. However, such training has fundamental limitations in credit assignment, as it is hard to identify the action responsible for terminal energy (Pan et al, 2023a). This limitation stems from solely relying on the terminal energy associated with multiple actions, lacking the information to identify the contribution of individual actions, akin to challenges in RL with sparse reward (Arjona-Medina et al, 2019; Ren et al, 2022).…”
Section: Introduction
Citation type: mentioning
confidence: 99%