2021
DOI: 10.1155/2021/4294841
Improved Q-Learning Method for Multirobot Formation and Path Planning with Concave Obstacles

Abstract: To address formation and path planning for multirobot systems in an unknown environment, a path planning method for multirobot formation based on improved Q-learning is proposed. Under the leader-following approach, the leader robot uses an improved Q-learning algorithm to plan the path, and the follower robot achieves a tracking str…
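The abstract centers on an improved Q-learning planner for the leader robot. For orientation only, here is a minimal sketch of the plain tabular Q-learning update that such methods build on; the grid size, learning rate, discount factor, and function names are illustrative assumptions, not the paper's improved variant:

```python
import numpy as np

# Plain tabular Q-learning update (the paper's improvements are not reproduced).
n_states, n_actions = 100, 4        # assumption: 10x10 grid, 4 movement actions
alpha, gamma = 0.1, 0.9             # assumption: learning rate and discount factor

Q = np.zeros((n_states, n_actions))

def q_update(s, a, r, s_next):
    """One Q-learning backup: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
```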

Cited by 2 publications (3 citation statements) · References 19 publications

Citation statements:
“…With the help of artificial intelligence, every field of society has gained broad room for development: AI changes not only how each field develops but also its overall framework, development concepts, and mode of operation. Among them, artificial intelligence is widely used in the field of education, which has also attracted the attention of universities and many researchers [4].…”
Section: Introduction
confidence: 99%
“…When the value of ε is closer to 1, the mobile robot tends to explore the environment; when ε is closer to 0, it tends to exploit the action with the largest Q value in the current state. With a fixed ε, this strategy may fall into a local optimum and converge slowly, so a simulated annealing algorithm [27] is introduced to adjust the ε factor dynamically. The algorithm jumps out of local optima by evaluating the energy difference before and after each solid-temperature change during the iteration process.…”
Section: State-Action Decision Space Improvement
confidence: 99%
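The snippet above describes driving the exploration factor ε with simulated annealing. A hedged sketch of one common way to realize this in Q-learning path planning, using a Metropolis-style acceptance test; the acceptance rule and cooling schedule are assumptions, not the cited paper's exact formulation:

```python
import math
import random
import numpy as np

def sa_epsilon_greedy(Q, state, temperature):
    """Action selection where epsilon follows a simulated-annealing criterion.

    The 'energy difference' is the gap between a random action's Q value and
    the greedy action's Q value; the exploratory action is accepted with
    probability exp(delta_e / T). High T gives near-random exploration
    (epsilon ~ 1); as T cools toward 0, exploration is suppressed (epsilon ~ 0).
    """
    greedy_a = int(np.argmax(Q[state]))
    random_a = random.randrange(len(Q[state]))
    delta_e = Q[state][random_a] - Q[state][greedy_a]   # always <= 0
    epsilon = math.exp(delta_e / max(temperature, 1e-8))
    return random_a if random.random() < epsilon else greedy_a

# Illustrative geometric cooling schedule: multiply T by 0.99 each episode.
```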
“…Due to the observation space defined in the algorithm, more environmental information must be collected in a shorter time to ensure the safety of the planned path. The Euclidean distance between two consecutive action nodes in the environment space is computed so that the robot obtains well-directed environmental information, and the relationship between distance and reward is shown in equation (27), where (x_t, y_t) and (x_{t+1}, y_{t+1}) are the coordinates of the robot's movement nodes at times t and t+1, respectively.…”
Section: Dual Reward Mechanism
confidence: 99%
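Equation (27) itself is not quoted in the snippet. A hedged sketch of the Euclidean-distance term it is described as using, together with one plausible distance-to-reward shaping; the shaping function is an assumption, since only the distance between consecutive nodes is stated in the text:

```python
import math

def step_distance(x_t, y_t, x_t1, y_t1):
    """Euclidean distance between the movement nodes at times t and t+1."""
    return math.hypot(x_t1 - x_t, y_t1 - y_t)

# Assumption: reward the robot for closing the distance to the goal.
def distance_reward(node_t, node_t1, goal, w=1.0):
    d_prev = math.hypot(goal[0] - node_t[0], goal[1] - node_t[1])
    d_next = math.hypot(goal[0] - node_t1[0], goal[1] - node_t1[1])
    return w * (d_prev - d_next)    # positive when moving toward the goal
```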