2021
DOI: 10.48550/arxiv.2110.13578
Preprint

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

Abstract: A growing trend for value-based reinforcement learning (RL) algorithms is to capture more information than scalar value functions in the value network. One of the most well-known methods in this branch is distributional RL, which models the return distribution instead of a scalar value. In another line of work, hybrid reward architectures (HRA) in RL have been studied to model source-specific value functions for each source of reward, which has also been shown to be beneficial in performance. To fully inherit the benefits of d…

Cited by 1 publication
(2 citation statements)
References 5 publications
“…MMD has been recently used in deep reinforcement learning to model the distribution of future returns (Nguyen-Tang et al, 2021;Zhang et al, 2021).…”
Section: Distance Between Probability Distributions (mentioning)
confidence: 99%
“…It is worthwhile to note that the size of the prediction ensemble in DISCO Net/DN+ is arbitrary, and can be increased at no extra cost in the test phase, in contrast to the existing methods (Nguyen-Tang et al, 2021;Zhang et al, 2021) that have a fixed number of network outputs to model the ensemble.…”
Section: Generative Ensemble Prediction Based On Energy Scores (mentioning)
confidence: 99%