2023
DOI: 10.1109/access.2023.3255007
Predator-Prey Reward Based Q-Learning Coverage Path Planning for Mobile Robot

Abstract: Coverage Path Planning (CPP) is a basic problem for mobile robots across a variety of applications. Q-Learning-based coverage path planning algorithms have recently begun to be explored. To overcome traditional Q-Learning's tendency to fall into local optima, this paper introduces new reward functions originating from the Predator-Prey model into the traditional Q-Learning-based CPP solution, yielding a comprehensive reward function that incorporates three reward…
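As a rough illustration of the idea the abstract describes, the sketch below runs tabular Q-learning on a tiny grid with a predator-prey-style shaped reward: uncovered cells act as "prey" that attract the robot "predator". The grid size, reward weights, and distance-based attraction term are all assumptions for illustration; the paper's actual three-component reward function is not reproduced here.

```python
import random

GRID = 4                                        # 4x4 grid world (assumption)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right

def nearest_prey_dist(pos, uncovered):
    """Manhattan distance from the robot to the closest uncovered cell."""
    if not uncovered:
        return 0
    return min(abs(pos[0] - x) + abs(pos[1] - y) for x, y in uncovered)

def reward(pos, uncovered):
    """Predator-prey style reward (illustrative weights): capturing 'prey'
    (a newly covered cell) pays off; revisits are penalized, more so the
    farther the robot strays from the remaining prey."""
    if pos in uncovered:
        return 10.0
    return -1.0 - 0.1 * nearest_prey_dist(pos, uncovered)

def run_episode(Q, alpha=0.5, gamma=0.9, eps=0.2, max_steps=200):
    pos = (0, 0)
    uncovered = {(x, y) for x in range(GRID) for y in range(GRID)} - {pos}
    for _ in range(max_steps):
        state = (pos, frozenset(uncovered))     # tractable only for tiny grids
        if random.random() < eps:               # epsilon-greedy exploration
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q.get((state, i), 0.0))
        dx, dy = ACTIONS[a]
        nxt = (min(max(pos[0] + dx, 0), GRID - 1),
               min(max(pos[1] + dy, 0), GRID - 1))
        r = reward(nxt, uncovered)
        uncovered.discard(nxt)
        nstate = (nxt, frozenset(uncovered))
        best_next = max(Q.get((nstate, i), 0.0) for i in range(len(ACTIONS)))
        # Standard tabular Q-learning update
        Q[(state, a)] = Q.get((state, a), 0.0) + alpha * (
            r + gamma * best_next - Q.get((state, a), 0.0))
        pos = nxt
        if not uncovered:
            break
    return GRID * GRID - len(uncovered)         # cells covered this episode

random.seed(0)
Q = {}
covered = [run_episode(Q) for _ in range(200)]
print("cells covered in final episode:", covered[-1])
```

The shaping term that pulls the robot toward the nearest uncovered cell is what discourages the cyclic revisiting that plain Q-learning can fall into; the exact reward composition in the paper differs.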

Cited by 6 publications (4 citation statements)
References 31 publications
“…It seamlessly combines these two essential components, where exploration generates a map used by the coverage path planning algorithm. In contrast, several other approaches primarily focus on either exploration or coverage separately, and their explicit integration may not be as robust or evident in those cases [5], [6], [30], [33], [35]. Another standout feature of the proposed algorithm is its commitment to power efficiency.…”
Section: Results
confidence: 99%
“…article addresses path planning for multiple UAVs to achieve sweep coverage, especially focusing on forest fire early warning and monitoring. A Predator-Prey reward-based Q-Learning CPP that overcomes local-optima challenges is studied in [35], and [36] introduces a visibility-based path planning (VPP) heuristic for optimizing visibility during UAV flights.…”
Section: Motivation
confidence: 99%
“…Hao et al. [49] proposed a dynamic fast Q-learning (DFQL) algorithm for the path-planning problem of USVs in some known marine environments, which combines Q-learning with an artificial potential field (APF) to initialize the Q-table and provide the USV with prior knowledge of the environment. Zhang et al. [50], to overcome traditional Q-learning's tendency toward local optima in coverage path planning, introduced a new reward function derived from the predator-prey model into the traditional Q-learning-based CPP solution.…”
Section: Related Work
confidence: 99%
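The APF-based Q-table initialization attributed to [49] above can be sketched roughly as follows. This is a hedged illustration with an assumed quadratic attractive potential, not the DFQL authors' exact equations: each state-action pair is seeded with the negative attractive potential of the cell the action leads to, so actions heading toward the goal start with higher Q-values before any learning.

```python
import math

GRID = 5
GOAL = (4, 4)                                   # assumed goal cell
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
K_ATT = 0.5                                     # attractive gain (assumption)

def attractive_potential(cell):
    """Quadratic attractive potential toward the goal: U = 0.5 * k * d^2."""
    d = math.hypot(cell[0] - GOAL[0], cell[1] - GOAL[1])
    return 0.5 * K_ATT * d * d

def init_q_table():
    """Seed Q0[s][a] = -U(successor of s under a): lower potential near the
    goal translates into a higher initial Q-value for that action."""
    Q = {}
    for x in range(GRID):
        for y in range(GRID):
            qs = []
            for dx, dy in ACTIONS:
                nx = min(max(x + dx, 0), GRID - 1)
                ny = min(max(y + dy, 0), GRID - 1)
                qs.append(-attractive_potential((nx, ny)))
            Q[(x, y)] = qs
    return Q

Q = init_q_table()
# From (0, 0), stepping toward the goal row should already score higher
# than stepping away, before any learning has occurred.
print(Q[(0, 0)][1] > Q[(0, 0)][0])  # → True
```

Seeding the table this way gives the learner a goal-directed prior instead of the uniform (usually zero) initialization of vanilla Q-learning, which is the sense in which the APF supplies "prior knowledge from the environment".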
“…Deep Q-learning (DQN) is an algorithm that improves upon Q-Learning [20][21][22]. The traditional reinforcement Q-learning update is given by the following formula:…”
Section: DQN
confidence: 99%
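The formula the quotation refers to is elided in this excerpt; for reference, the standard tabular Q-learning update it describes is:

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```

where \(\alpha\) is the learning rate and \(\gamma\) the discount factor.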