2022
DOI: 10.1109/access.2022.3203072
Improved Q-Learning Applied to Dynamic Obstacle Avoidance and Path Planning

Abstract: Due to the complexity of interactive environments, dynamic obstacle-avoidance path planning poses a significant challenge to agent mobility. Dynamic path planning is a complex multi-constraint combinatorial optimization problem. Some existing algorithms easily fall into local optima when solving such problems, leading to defects in convergence speed and accuracy. Reinforcement learning has certain advantages in solving decision-sequence problems in complex environments. A Q-learning algorithm is a reinfo…

Cited by 10 publications (4 citation statements)
References 35 publications
“…There are different approaches to reinforcement learning; the techniques most commonly used in path planning are Monte Carlo [64], Q-learning [65], Deep Q-Network [66], Twin Delayed Deep Deterministic Policy Gradient [67], [68], Deep Deterministic Policy Gradient [69], [70], Soft Actor-Critic [71], Asynchronous Advantage Actor-Critic [72], [73], Trust Region Policy Optimization [74], and Proximal Policy Optimization (PPO) [72]. Each RL technique offers a performance improvement to path-planning methods.…”
Section: Figure 11 Reinforcement Learning Process
confidence: 99%
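As context for the Q-learning family these statements cite, the following is a minimal sketch of the standard tabular Q-learning update; the grid size, learning rate, discount factor, and exploration rate are illustrative assumptions, not values from the paper.

import numpy as np

n_states, n_actions = 100, 4            # assumed 10x10 grid world, 4 moves
alpha, gamma, epsilon = 0.1, 0.95, 0.1  # assumed hyperparameters
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def select_action(state):
    # epsilon-greedy exploration over the current Q estimates
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))

def update(state, action, reward, next_state):
    # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])

Methods such as DQN, DDPG, SAC, A3C, TRPO, and PPO replace the table Q with a learned function approximator, but the temporal-difference target above is the common starting point.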
“…Further, in the study by Orozco-Rosas et al. [13], a new path-planning algorithm with excellent performance was developed by combining Q-learning and the artificial potential field method. Wang et al. [19] introduced priority weights into the Q-learning algorithm and applied it to dynamic obstacle-avoidance path planning.…”
Section: A Path Planning
confidence: 99%
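One plausible reading of "combining Q-learning and the artificial potential field method" is to shape the Q-learning reward with an APF term. The sketch below is an illustrative assumption, not the actual method of Orozco-Rosas et al. [13]; the gains K_ATT and K_REP, the influence radius RHO0, and the helper functions are all hypothetical.

import math

K_ATT, K_REP, RHO0 = 1.0, 0.5, 3.0  # assumed gains and influence radius

def apf_potential(pos, goal, obstacles):
    # attractive term pulls the agent toward the goal
    u = 0.5 * K_ATT * math.dist(pos, goal) ** 2
    # repulsive terms push it away from obstacles inside the radius
    for obs in obstacles:
        rho = math.dist(pos, obs)
        if 0 < rho < RHO0:
            u += 0.5 * K_REP * (1.0 / rho - 1.0 / RHO0) ** 2
    return u

def shaped_reward(base_reward, pos, next_pos, goal, obstacles):
    # reward moving downhill on the potential surface
    return base_reward + (apf_potential(pos, goal, obstacles)
                          - apf_potential(next_pos, goal, obstacles))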
“…Some existing algorithms can easily get stuck in local optima when avoiding moving obstacles in complex interactive environments, resulting in defects in convergence speed and accuracy. An improved Q-learning algorithm was proposed that greatly improves convergence speed and accuracy and finds better paths in dynamic obstacle path planning [14]. However, as the problem size increases, the Q-table in Q-learning expands, increasing the complexity of the algorithm.…”
Section: Introduction
confidence: 99%
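A back-of-the-envelope sketch of the Q-table growth this statement describes; the grid sizes and the four-action assumption are chosen only to show the scaling and are not figures from the paper.

n_actions = 4
for side in (10, 100, 1000):
    n_states = side * side            # one state per grid cell
    entries = n_states * n_actions
    # assuming 8 bytes per float64 Q-value
    print(f"{side}x{side} grid: {entries:,} Q-entries "
          f"(~{entries * 8 / 1e6:.1f} MB)")

The table grows linearly in the number of states, and the state count itself grows with map resolution and any extra state features (e.g. obstacle positions), which is the usual motivation for moving to function approximation such as DQN.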