Action decoupled SAC reinforcement learning with discrete-continuous hybrid action spaces (2023)
DOI: 10.1016/j.neucom.2023.03.054

Cited by 6 publications (2 citation statements)
References 9 publications
“…In response to the diverse obstacle-avoidance decisions and their distinct descriptive parameters, a hybrid action space describes this situation more effectively, providing a foundation for this research. In addition, some scholars have developed deep reinforcement learning networks for hybrid action learning [32,33], which have been widely used in decision-making problems such as games [34,35] and resource management [36,37].…”
Section: Introduction (mentioning)
confidence: 99%
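The cited works operate over a discrete-continuous hybrid action space: the agent picks a discrete action type and, jointly, the continuous parameters that describe it. Below is a minimal, hypothetical PyTorch-style sketch of a policy head for such a space; the class name, layer sizes, and sampling scheme are illustrative assumptions, not the architecture from the indexed paper.

```python
import torch
import torch.nn as nn

class HybridPolicy(nn.Module):
    """Illustrative policy head for a discrete-continuous hybrid action space.

    Emits (i) logits over K discrete action types and (ii) a mean/log-std pair
    for the continuous parameters of each type. Sizes and structure are
    assumptions for this sketch, not the paper's model.
    """

    def __init__(self, obs_dim: int, n_discrete: int, param_dim: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.discrete_head = nn.Linear(hidden, n_discrete)           # logits over action types
        self.mu_head = nn.Linear(hidden, n_discrete * param_dim)     # per-type parameter means
        self.log_std_head = nn.Linear(hidden, n_discrete * param_dim)
        self.n_discrete, self.param_dim = n_discrete, param_dim

    def forward(self, obs: torch.Tensor):
        h = self.trunk(obs)
        logits = self.discrete_head(h)
        mu = self.mu_head(h).view(-1, self.n_discrete, self.param_dim)
        log_std = self.log_std_head(h).view(-1, self.n_discrete, self.param_dim).clamp(-5, 2)
        return logits, mu, log_std

    @torch.no_grad()
    def act(self, obs: torch.Tensor):
        logits, mu, log_std = self.forward(obs)
        k = torch.distributions.Categorical(logits=logits).sample()        # discrete choice
        idx = k.view(-1, 1, 1).expand(-1, 1, self.param_dim)
        mu_k = mu.gather(1, idx).squeeze(1)                                 # params of the chosen type
        std_k = log_std.gather(1, idx).squeeze(1).exp()
        params = torch.tanh(mu_k + std_k * torch.randn_like(std_k))         # squashed continuous params
        return k, params
```

An action-decoupled treatment, as the indexed paper's title suggests, would instead train separate policy and critic components for the discrete and continuous parts; the joint head above is simply the most compact way to picture the hybrid action space itself.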
“…Chen et al. combined the prioritized experience replay (PER) technique with the soft actor-critic (SAC) algorithm for deep-reinforcement-learning path planning, which improves sample utilization and the success rate of path planning [23]. Xu et al. decoupled the hybrid action space, reduced robot crosstalk using a centralized-training decentralized-execution framework, and optimized the soft actor-critic (SAC) algorithm to improve its convergence and robustness [24]. Tian et al. added hierarchical learning (HL) and particle swarm optimization (PSO) to an improved deep deterministic policy gradient (DDPG) algorithm, using the replay buffer design to improve path accuracy, which in turn improves the convergence speed and accuracy of path planning [25].…”
Section: Introduction (mentioning)
confidence: 99%
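The citation statement above mentions combining prioritized experience replay (PER) with SAC [23], i.e., sampling transitions with large temporal-difference error more often and correcting the induced bias with importance-sampling weights. The sketch below is a generic proportional-prioritization buffer illustrating that idea, not the implementation from [23]; the class and parameter names are assumptions.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (generic sketch, not the code from [23]).

    Transitions are sampled with probability p_i^alpha / sum_j p_j^alpha, where
    p_i = |TD error| + eps; importance-sampling weights correct the sampling bias.
    """

    def __init__(self, capacity: int, alpha: float = 0.6, eps: float = 1e-5):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities, self.pos = [], np.zeros(capacity, dtype=np.float64), 0

    def add(self, transition):
        max_p = self.priorities[: len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p        # new samples get max priority so they are seen at least once
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size: int, beta: float = 0.4):
        p = self.priorities[: len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()                 # normalize importance-sampling weights for stability
        batch = [self.data[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors):
        self.priorities[idx] = np.abs(td_errors) + self.eps
```

In a PER+SAC loop of this kind, the returned `weights` would multiply the per-sample critic loss, and `update_priorities` would be called with the freshly computed TD errors after each gradient step.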