Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

Zhang, Pushi; Chen, Xiaoyu; Li, Zhao; Xiong, Wei; Qin, Tao; Liu, Tie-Yan

doi:10.48550/arxiv.2110.13578

Cited by 1 publication

(2 citation statements)

References 5 publications


“…MMD has been recently used in deep reinforcement learning to model the distribution of future returns (Nguyen-Tang et al., 2021; Zhang et al., 2021).…”

Section: Distance Between Probability Distributions (mentioning)

confidence: 99%


Sample-based Uncertainty Quantification with a Single Deterministic Neural Network

Kanazawa, Gupta

2022

Preprint


Development of an accurate, flexible, and numerically efficient uncertainty quantification (UQ) method is one of the fundamental challenges in machine learning. Previously, a UQ method called DISCO Nets was proposed (Bouchacourt et al., 2016) that trains a neural network by minimizing the so-called energy score on training data. This method has shown superior performance on a hand pose estimation task in computer vision, but it remained unclear whether it works as well for regression on tabular data, and how it compares with more recent UQ methods such as NGBoost. In this paper, we propose an improved neural architecture of DISCO Nets that admits more stable and smooth training. We benchmark this approach on miscellaneous real-world tabular datasets and confirm that it is competitive with or even superior to standard UQ baselines. We also provide a new elementary proof for the validity of using the energy score to learn predictive distributions. Further, we point out that DISCO Nets in their original form ignore epistemic uncertainty and capture only aleatoric uncertainty. We propose a simple fix to this problem.


“…It is worthwhile to note that the size of the prediction ensemble in DISCO Net/DN+ is arbitrary, and can be increased at no extra cost in the test phase, in contrast to the existing methods (Nguyen-Tang et al., 2021; Zhang et al., 2021) that have a fixed number of network outputs to model the ensemble.…”

Section: Generative Ensemble Prediction Based On Energy Scores (mentioning)

confidence: 99%
