2018
DOI: 10.1609/aaai.v32i1.11798
Action Branching Architectures for Deep Reinforcement Learning

Abstract: Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via discretization. In this paper, we propose a novel neural architecture featuring a shared decision module followed by se…

Cited by 116 publications (46 citation statements)
References 20 publications (21 reference statements)
“…To tackle this problem, the BDQ algorithm in (Tavakoli et al, 2018) provides a special network structure, namely the branching dueling deep Q-network (or branching network for short), which allows the number of outputs of a deep Q-network to increase linearly with the number of components, as illustrated in figure 2.…”
Section: Branching Dueling Q-Learning (BDQ) (mentioning)
confidence: 99%
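The linear-versus-combinatorial contrast in the statement above can be made concrete with a small sketch. This is an illustrative toy, not the authors' implementation: the function names are invented, and the dueling aggregation follows the per-branch form Q_d(s, a_d) = V(s) + A_d(s, a_d) - mean over a' of A_d(s, a').

```python
# Minimal sketch (hypothetical names) contrasting the output-layer size of a
# flat discrete-action head with a branching head, plus the per-branch dueling
# aggregation used to combine a state value with branch advantages.

def flat_head_outputs(n_dims: int, n_sub_actions: int) -> int:
    """Flat head: one output per joint action -> combinatorial growth."""
    return n_sub_actions ** n_dims

def branching_head_outputs(n_dims: int, n_sub_actions: int) -> int:
    """Branching head: one branch per action dimension -> linear growth."""
    return n_dims * n_sub_actions

def dueling_q_values(state_value, advantages):
    """Per-branch dueling aggregation:
    Q_d(s, a_d) = V(s) + A_d(s, a_d) - mean_{a'} A_d(s, a')."""
    q = []
    for branch in advantages:
        mean_adv = sum(branch) / len(branch)
        q.append([state_value + a - mean_adv for a in branch])
    return q

# Six action dimensions, eleven sub-actions each:
print(flat_head_outputs(6, 11))       # 11**6 = 1771561 outputs
print(branching_head_outputs(6, 11))  # 6 * 11 = 66 outputs
```

The branching head thus keeps the network's output layer tractable even when the joint action space is astronomically large, which is the property the citing papers rely on.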
“…To address these issues, in this paper we customize WQMIX to effectively optimize maintenance decisions for large-scale multi-component systems in the fully observable setting. In particular, the separate agent networks are replaced by a single branching dueling network (branching network) (Tavakoli, Pardo, & Kormushev, 2018) to take advantage of full observability. The branching structure keeps the size of the deep Q-network's output layer growing only linearly with the number of components, thereby avoiding the curse of dimensionality.…”
Section: Introduction (mentioning)
confidence: 99%
“…To address the fourth expected contribution, we customize a MADRL algorithm, namely WQMIX (Rashid, Farquhar, Peng, & Whiteson, 2020), for the case where system states can be fully observed, in order to obtain cost-effective policies. The algorithm leverages the branching dueling network architecture (Tavakoli, Pardo, & Kormushev, 2018) to achieve a linear increase in the size of the output layer of deep Q-networks as the number of system components grows, and the monotonic decomposition scheme for joint action-value functions (Rashid et al, 2020) to keep maintenance decisions consistent at the component and system levels.…”
Section: Current Work (mentioning)
confidence: 99%
“…The low-level problem is treated as a discrete control problem with two action dimensions: price and quantity. We utilize the Branching Dueling Q-Network (Tavakoli, Pardo, and Kormushev 2018). Formally, we have two action dimensions with |p_l| = n_p discrete relative price levels and |q_l| = n_q discrete quantity proportions.…”
Section: Low-Level RL with Action Branching (mentioning)
confidence: 99%
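For the two-dimensional (price, quantity) action space in the statement above, action branching also makes greedy selection cheap: the joint greedy action is just the independent argmax in each branch. The sketch below is illustrative only; the Q-values and variable names are invented, not taken from the citing paper.

```python
# Hypothetical sketch of per-branch greedy action selection for a
# two-dimensional (price, quantity) branched action space.

def greedy_joint_action(q_price, q_quantity):
    """With action branching, the greedy joint action is the argmax taken
    independently per branch: n_p + n_q comparisons instead of n_p * n_q."""
    price_idx = max(range(len(q_price)), key=q_price.__getitem__)
    qty_idx = max(range(len(q_quantity)), key=q_quantity.__getitem__)
    return price_idx, qty_idx

# Three price levels, two quantity proportions (toy Q-values):
print(greedy_joint_action([0.1, 0.7, 0.3], [0.2, 0.05]))  # -> (1, 0)
```

This per-branch decomposition is what lets the low-level controller scale the discretization resolution (n_p, n_q) without the joint action enumeration ever being materialized.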