2022
DOI: 10.3390/drones6070166

Multiple-UAV Reinforcement Learning Algorithm Based on Improved PPO in Ray Framework

Abstract: Distributed multi-agent collaborative decision-making technology is the key to general artificial intelligence. This paper takes the self-developed Unity3D collaborative combat environment as the test scenario, setting a task that requires heterogeneous unmanned aerial vehicles (UAVs) to perform a distributed decision-making and complete cooperation task. Aiming at the problem of the traditional proximal policy optimization (PPO) algorithm’s poor performance in the field of complex multi-agent collaboration sc…

citations
Cited by 26 publications
(10 citation statements)
references
References 21 publications
“…Planning a multi-agent system is even more challenging since the planning space increases exponentially with the number of agents [24]. Due to the exponential growth of the size of the planning space with respect to the number of agents, scalability is recognized as a crucial limitation for the application of multi-agent planning algorithms to real-world scenarios [25]. Instead of using standard approaches such as domain-knowledge-supported algorithms that require human-level intervention in the planning phase, more popular techniques such as deep learning will be utilized.…”
Section: Significance Of Scalability (mentioning)
confidence: 99%
“…The samples were distributed in proportion to the data in the original dataset [9]. The next step was to find and slightly modify the architecturally constructed neural network used in studies on detecting the presence and types of landmines [10]. The selected neural network consists of four layers of two-dimensional convolution, five layers of one-dimensional maximum pooling, and nine layers of three-dimensional convolution.…”
Section: Detecting Landmines and Minefields Using AI (mentioning)
confidence: 99%
“…These training methods can train a new model of complex tasks based on the model trained from simple tasks and accelerate the convergence speed. An inheritance training method [16] based on the multi-agent proximal policy optimization method is developed to improve the generalization performance of the model. The idea of course learning is adopted in the method, and the results show that UAVs can search for and attack targets outside the training area.…”
Section: Introduction (mentioning)
confidence: 99%