Fuzzy Reinforcement Learning and Curriculum Transfer Learning for Micromanagement in Multi-Robot Confrontation

Hu, Chunyang; Meng, Xu

doi:10.3390/info10110341

Cited by 5 publications

(8 citation statements)

References 33 publications


“…For example, to complete a situation assessment of one team in the football field, it is too rough to give the current situation with "strengths" or "weaknesses", and rough assessment will lead to an inaccurate real-time evaluation of the scene. Secondly, the conventional deep neural network model for situation assessment often has poor learning performance [23]. The optimization method based on the training process may be a solution to improve the performance of deep neural networks.…”

Section: Introduction (mentioning)

confidence: 99%

A Situation Assessment Method with an Improved Fuzzy Deep Neural Network for Multiple UAVs

et al. 2020


To improve the intelligence and accuracy of Situation Assessment (SA) in complex scenes, this work develops an improved fuzzy deep neural network approach to situation assessment for multiple Unmanned Aerial Vehicles (UAVs). Firstly, this work normalizes the scene data based on time series and uses the normalized data as the input for an improved fuzzy deep neural network. Secondly, adaptive momentum and Elastic SGD (Elastic Stochastic Gradient Descent) are introduced into the training process of the neural network to improve the learning performance. Lastly, in the real-time situation assessment task for multiple UAVs, conventional methods often give inaccurate results because they do not consider the fuzziness of task situations. This work uses an improved fuzzy deep neural network to calculate the results of situation assessment and normalizes these results. Then, the degree of trust of the current result, relative to each situation label, is calculated with the normalized results using fuzzy logic. Simulation results show that the proposed method outperforms competitors.


“…In a recent study, the performance comparison between the GA and RL algorithm is done using different scenarios defined for the RoboCode environment [25]. An improved Q-learning technique in Semi-Markov decision processes is validated by using the RoboCode environment in [26]. Q-learning is one of the leading off-policy RL algorithms, preferred in another recent study due to its efficiency and popularity.…”

Section: Battling (mentioning)

confidence: 99%

A Novel Behavioral Strategy for RoboCode Platform Based on Deep Q‐Learning

et al. 2021


This paper presents a new machine-learning-based behavioral strategy using the deep Q-learning algorithm for the RoboCode simulation platform. According to this strategy, a new model is proposed for the RoboCode platform, which provides an environment for simulated robots that can be programmed to battle against other robots. Compared to Atari Games, RoboCode has a fairly wide set of actions and situations. Due to the challenges of training a CNN model for such a continuous action space problem, the inputs obtained from the simulation environment were generated dynamically, and the proposed model was trained by using these inputs. The trained model battled against the predefined rival robots of the environment (standard robots) by cumulatively benefiting from the experience of these robots. The comparison between the proposed model and the standard robots of the RoboCode platform was statistically verified. Finally, the performance of the proposed model was compared with machine-learning-based customized robots (community robots). Experimental results reveal that the proposed model is mostly superior to community robots. Therefore, the deep Q-learning-based model has proven to be successful in such a complex simulation environment. It should also be noted that this new model facilitates simulation performance in adaptive and partially cluttered environments.


“…Real-time strategy (RTS) games usually have a time-varying scene, which is different from board games [9]. Among the many traditional RTS games, StarCraft has a large number of players and a large number of competitions, which requires different countermeasures, tactics and even control techniques, so it has attracted the attention of international scholars [10].…”

Section: Multi-agent Confrontation In the Real-time Strategy Game (mentioning)

confidence: 99%

“…In the game scene, the process of decision-making for multi-agent systems is regarded as an SMDPs process [9]. Figure 4 shows the SMDPs process.…”

Section: Optimal Control Problem Using Value Function In SMDPs Process (mentioning)

confidence: 99%


A Confrontation Decision-Making Method with Deep Reinforcement Learning and Knowledge Transfer for Multi-Agent System

2020

Symmetry


In this paper, deep reinforcement learning (DRL) and knowledge transfer are used to achieve effective control of the learning agent for confrontation in multi-agent systems. Firstly, a multi-agent Deep Deterministic Policy Gradient (DDPG) algorithm with parameter sharing is proposed to achieve confrontation decision-making for multiple agents. In the process of training, the information of other agents is introduced to the critic network to improve the strategy of confrontation. The parameter sharing mechanism can reduce the loss of experience storage. In the DDPG algorithm, we use four neural networks to generate the real-time action and the Q-value function respectively, and use a momentum mechanism to optimize the training process and accelerate the convergence rate of the neural network. Secondly, this paper introduces an auxiliary controller using a policy-based reinforcement learning (RL) method to provide assistant decision-making for the game agent. In addition, an effective reward function is used to help agents balance the losses of the enemy side against those of our side. Furthermore, this paper also uses the knowledge transfer method to extend the learning model to more complex scenes and improve the generalization of the proposed confrontation model. Two confrontation decision-making experiments are designed to verify the effectiveness of the proposed method. In a small-scale task scenario, the trained agent can successfully learn to fight the competitors and achieve a good winning rate. For large-scale confrontation scenarios, the knowledge transfer method can gradually improve the decision-making level of the learning agent.
