“…To verify the effectiveness of the algorithm in this scenario, we conducted 500 episodes of tests on the MA-SAC algorithm in this environment and compared it with other …

(1) Initialize environment
(2) Initialize critic network and actor network
(3) Initialize max episodes, replay buffer, batch size
(4) for episode ∈ [1, episodes] do
(5)     Reset environment
(6)     Get current state s_i for each agent i
(7)     for step ∈ [1, steps] do
(8)         Select action a_i for each agent v_i
(9)         Get all agents' next states s_i′ and rewards r_i
(10)        Store <a_i, s_i, s_i′, r_i> in replay buffer D
(11)        if D size > B size then
(12)            Sample batch B from replay buffer D
(13)            for v_i, where i = 1:N do
(14)                Update the critic network
(15)                Update the actor network
(16)                Update the target network according to formulas (15), (16)
(17)            end for
(18)        end if
(19)    end for
(20) end for

Figure 4 shows the dynamic assignment process of UAVs in the task area before training. At this time, none of the three UAVs has learned any strategy, so they are in an exploration state in the environment.…”
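The training loop in steps (1)-(20) can be sketched as follows. This is a minimal, runnable stand-in, not the paper's implementation: ToyEnv, select_action, and update_networks are hypothetical placeholders for the task environment, the actors' stochastic policies, and the soft actor-critic updates referenced in formulas (15) and (16).

```python
import random
from collections import deque

N_AGENTS = 3     # three UAVs, as in the experiment
EPISODES = 5
STEPS = 20
BATCH_SIZE = 16

class ToyEnv:
    """Stand-in environment: each agent's state is a scalar position."""
    def reset(self):
        self.states = [0.0] * N_AGENTS
        return list(self.states)

    def step(self, actions):
        next_states = [s + a for s, a in zip(self.states, actions)]
        rewards = [-abs(s) for s in next_states]   # toy reward: stay near 0
        self.states = next_states
        return next_states, rewards

def select_action(state):
    # Placeholder for sampling from the actor's stochastic policy (8).
    return random.uniform(-1.0, 1.0)

def update_networks(batch, agent_idx):
    # Placeholder for the critic/actor/target-network updates (14)-(16).
    pass

replay = deque(maxlen=10_000)                       # replay buffer D

env = ToyEnv()
for episode in range(EPISODES):                     # (4)
    states = env.reset()                            # (5)-(6)
    for step in range(STEPS):                       # (7)
        actions = [select_action(s) for s in states]     # (8)
        next_states, rewards = env.step(actions)         # (9)
        for i in range(N_AGENTS):                        # (10)
            replay.append((actions[i], states[i], next_states[i], rewards[i]))
        if len(replay) > BATCH_SIZE:                     # (11)
            batch = random.sample(replay, BATCH_SIZE)    # (12)
            for i in range(N_AGENTS):                    # (13)-(16)
                update_networks(batch, i)
        states = next_states

# 300 transitions: 5 episodes x 20 steps x 3 agents
print(len(replay))
```

Note the centralized-training, decentralized-execution split: each agent selects its action from its own state in step (8), while the batched updates in steps (13)-(16) may use joint information.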
Section: Experimental Results and Analysis
“…Optimization methods include the Hungarian algorithm [15,16], the branch-and-bound method [17], and other commonly used integer linear programming methods. These algorithms are only applicable to scenarios with simple tasks and a small number of UAVs.…”
With the increasing complexity of UAV application scenarios, a single UAV can no longer meet mission requirements, and many complex tasks require the cooperation of multiple UAVs. How to coordinate UAV resources therefore becomes the key to mission completion. In this paper, a task model including multiple UAVs and unknown obstacles is constructed, and the model is transformed into a Markov decision process (MDP). In addition, considering the influence of the UAVs' strategies on one another, a multiagent reinforcement learning algorithm based on the SAC algorithm and the centralized-training, decentralized-execution framework, MA-SAC (Multi-Agent Soft Actor-Critic), is proposed to solve the MDP. Simulation results show that the algorithm can effectively handle the task allocation problem of multiple UAVs in this scenario, and its performance is better than that of other multiagent reinforcement learning algorithms.
“…For monitoring complex environments, the Hungarian algorithm can efficiently allocate monitoring tasks to UAVs based on the monitoring efficiency of the UAVs and the importance of the monitoring areas, aiming to minimize the total monitoring time or cost [22].…”
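The assignment problem the Hungarian algorithm solves here can be illustrated with a small sketch: given a cost matrix cost[u][t] (e.g., the monitoring time for UAV u on area t), find the one-to-one assignment that minimizes total cost. The cost values below are made up for illustration; for small instances a brute-force search over permutations finds the same optimum the Hungarian algorithm would, while a full Hungarian implementation (or scipy.optimize.linear_sum_assignment) scales polynomially to larger ones.

```python
from itertools import permutations

# Hypothetical monitoring times: cost[u][t] = time for UAV u on area t.
cost = [
    [4, 2, 8],   # UAV 0
    [4, 3, 7],   # UAV 1
    [3, 1, 6],   # UAV 2
]

def optimal_assignment(cost):
    """Exhaustive search for the minimum-total-cost one-to-one assignment."""
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):   # perm[u] = area assigned to UAV u
        total = sum(cost[u][perm[u]] for u in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost

assignment, total = optimal_assignment(cost)
print(assignment, total)   # e.g., UAV 0 -> area 0, UAV 1 -> area 2, UAV 2 -> area 1
```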
In recent years, the collaborative operation of multiple unmanned aerial vehicles (UAVs) has been an important advancement in drone technology. Research on multi-UAV collaborative flight path planning has garnered widespread attention in the drone field, demonstrating unique advantages in complex task execution, large-scale monitoring, and disaster response. As one of the core technologies of multi-UAV collaborative operations, progress in trajectory planning algorithms directly impacts the efficiency and safety of UAV collaborative operations. This paper first reviews the application and research progress of path-planning algorithms based on centralized and distributed control, as well as heuristic algorithms, in multi-UAV collaborative trajectory planning. It then summarizes the main technical challenges in multi-UAV path planning and proposes countermeasures for multi-UAV collaborative planning in government, business, and academia. Finally, it looks ahead to future research directions, providing ideas for subsequent studies in multi-UAV collaborative trajectory-planning technology.
“…In recent years, autonomous aerial vehicles have been drawing growing attention and broadening their spectrum of specifications [13][14][15][16].…”
Section: Equipped Drones With Network Test Handset To Load the Network
Cellular network operators have difficulty testing their networks without affecting the user experience. Testing network performance under load is a challenge because performance differs when the radio access part carries more traffic. Therefore, in this paper, deploying swarming drones is proposed to load the cellular network and scan/test its performance more realistically. In addition, manual navigation of swarming drones is not efficient enough to detect problematic regions, so particle swarm optimization (PSO) is proposed for deployment on the drone swarm to find the regions where there are performance issues. Communication among the swarming drones makes it possible to run PSO on them, and separating the loading swarm from the testing swarm yields an almost non-stochastic received signal level as an objective function. Moreover, in some situations more than one network parameter must be used to find a problematic region, so multi-objective PSO is also proposed to optimize multiple network parameters at the same time.

Cellular networks have been growing rapidly over the last few decades. This technology is widely popular because people can stay connected as they move around. Wireless communication was developed even before cellular networks, but radio communication has always suffered from resource limitations (the frequency spectrum being the main resource). Resource limitation was the main motivation for resource reuse in cellular networks [1]. For example, if a frequency band is used in a certain cell to cover a particular area, it can also be used in another cell that is farther from the first one. Resource reuse helps increase the number of users that can be served, and is the main difference between cellular network technology and other radio networks.
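The PSO search described above can be sketched in a few lines. This is a generic PSO loop, not the paper's deployment: the objective here is a made-up received-signal-level surface whose minimum marks a hypothetical weak-coverage region at (3, -2); a real drone swarm would read the level from its network-test handset instead of evaluating a formula.

```python
import random

def signal_level(x, y):
    # Hypothetical coverage surface: weakest signal at (3, -2).
    return (x - 3.0) ** 2 + (y + 2.0) ** 2

def pso(n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize signal_level: each particle stands in for a drone position."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-10, 10), rng.uniform(-10, 10)]
           for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal bests
    pbest_val = [signal_level(*p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = signal_level(*pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best, val = pso()
print(best, val)   # converges near the weak-coverage point (3, -2)
```

The multi-objective variant mentioned in the abstract would track a set of non-dominated positions (a Pareto archive) instead of a single global best.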
However, this adds to the complexity of fine-tuning the network to achieve high-performance cells, because radio interference can decrease the quality of service in these networks. First-generation cellular networks started in the 1980s as analog radio communication. Later on, the 2G network was commercially launched around the …